Read and Write

List of Functions

read_and_write(file_name[, file_location, …])

Read CSV or Excel files, dropping bad lines, and save them as parquet, csv, or both.

read_and_write_all(folder_path[, to, …])

Read the CSV or Excel files in a folder, dropping bad lines, and save them as parquet, csv, or both.

read_and_concat(list_of_files[, …])

Read a list of parquet or csv files and concatenate them by row or by column.

read_all_sheets(excel_file[, file_location, …])

Read an Excel file with multiple sheets and concatenate the sheets.

Definition of Functions

aio.read_and_write(file_name, file_location=None, write_directory=None, to='both', sep=',', verbose=False)

Read CSV or Excel files, dropping bad lines, and save them as parquet, csv, or both.

Parameters
file_name: str

The name of a data file (e.g., ‘data.csv’).

file_location: str or path object

The path of the folder that contains the data file.

write_directory: str or path object

The path of the folder in which to save the new file.

to: {‘csv’, ‘parquet’, ‘both’}, optional

The format to save the data. Default is ‘both’.

sep: str, default ‘,’

Delimiter to use.

verbose: bool, default False

If True, report when the file has been written successfully.

Raises
Error when the file name does not contain csv or xls.
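
The behaviour described above can be approximated with plain pandas. The following is a minimal sketch, not the library's implementation: the function name `read_and_write_sketch`, the `_clean` suffix on output files, and the use of `ValueError` are assumptions made for illustration. It also assumes pandas 1.3 or later (for `on_bad_lines`), and that "bad lines" means rows with more fields than the header.

```python
from pathlib import Path

import pandas as pd


def read_and_write_sketch(file_name, file_location=".", write_directory=".",
                          to="both", sep=","):
    """Hypothetical stand-in for illustration; not the library's code."""
    source = Path(file_location) / file_name
    suffix = source.suffix.lower()

    if suffix == ".csv":
        # Rows with too many fields are skipped instead of raising an error.
        df = pd.read_csv(source, sep=sep, on_bad_lines="skip")
    elif suffix in (".xls", ".xlsx"):
        # Needs an Excel engine such as openpyxl to be installed.
        df = pd.read_excel(source)
    else:
        raise ValueError(f"Unsupported file type: {file_name}")

    out_dir = Path(write_directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The "_clean" suffix is an assumption; it avoids overwriting the source.
    stem = f"{source.stem}_clean"

    written = []
    if to in ("csv", "both"):
        target = out_dir / f"{stem}.csv"
        df.to_csv(target, index=False)
        written.append(target)
    if to in ("parquet", "both"):
        target = out_dir / f"{stem}.parquet"
        # Needs pyarrow or fastparquet to be installed.
        df.to_parquet(target, index=False)
        written.append(target)
    return df, written
```

Calling the sketch with `to="csv"` avoids the parquet dependency, which is convenient for a quick check on a machine without pyarrow.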

aio.read_and_write_all(folder_path, to='both', write_directory=None, verbose=False)

Read the CSV or Excel files in a folder, dropping bad lines, and save them as parquet, csv, or both.

Parameters
folder_path: str or path object

The path of the folder that contains the data files.

to: {‘csv’, ‘parquet’, ‘both’}, optional

The format to save the data. Default is ‘both’.

write_directory: str or path object

The path of the folder in which to save the new files.

verbose: bool, default False

If True, report when each file has been written successfully.

Raises
Error when the file name does not contain csv or xls.
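
Processing a whole folder starts with deciding which files to read. The sketch below collects the CSV and Excel files in a folder using only the standard library. The helper name `find_data_files` is hypothetical, and silently ignoring other file types is an assumption: the documented function may raise an error for them instead.

```python
from pathlib import Path

# Extensions treated as readable data files in this sketch.
DATA_SUFFIXES = {".csv", ".xls", ".xlsx"}


def find_data_files(folder_path):
    """Return the CSV/Excel files directly inside folder_path, sorted by name."""
    folder = Path(folder_path)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in DATA_SUFFIXES
    )
```

Each returned path could then be handed to a per-file reader and writer, one file at a time.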

aio.read_and_concat(list_of_files, file_location=None, write_directory=None, parquet=True, by_row=True, save_by=None, sep=',')

Read a list of parquet or csv files and concatenate them by row or by column. If ‘save_by’ is given, the result is written to the directory under that name.

Parameters
list_of_files: list of str

List of data file names (e.g., [‘data_0.csv’, ‘data_1.csv’]).

file_location: str or path object

The path of the folder that contains the data files.

write_directory: str or path object

The path of the folder in which to save the new file.

parquet: bool, optional. Default True

Whether the input files are parquet (True) or csv (False).

by_row: bool. Default True

Whether to concatenate by row (True) or by column (False).

save_by: str, optional. Default None

If not None, the concatenated dataframe is written to the file_location under this name.

sep: str, default ‘,’

Delimiter to use.

Returns
concat_output: DataFrame

The concatenated dataframe. Optionally, the concatenated dataframe is also saved to the given directory.
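
The difference between concatenating by row and by column is easiest to see on two small frames. This is an illustrative sketch with pandas, using made-up data. Whether the library resets the row index in the same way is an assumption.

```python
import pandas as pd

# Two small frames with identical columns (illustrative data only).
left = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
right = pd.DataFrame({"id": [3, 4], "value": [30, 40]})

# By row: frames are stacked vertically; the row index is renumbered 0..3.
by_row = pd.concat([left, right], axis=0, ignore_index=True)

# By column: frames are placed side by side and aligned on the row index.
by_col = pd.concat([left, right], axis=1)

print(by_row.shape)  # (4, 2)
print(by_col.shape)  # (2, 4)
```

Column-wise concatenation keeps duplicate column names, so the side-by-side result here has two `id` and two `value` columns.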

aio.read_all_sheets(excel_file, file_location=None, write_directory=None, by_row=True, save_by=None, to='both')

Read an Excel file with multiple sheets and concatenate the sheets.

Parameters
excel_file: str

The name of an Excel file with multiple sheets (e.g., ‘data.xlsx’).

file_location: str or path object

The path of the folder that contains the Excel file.

write_directory: str or path object

The path of the folder in which to save the new file.

by_row: bool. Default True

Whether to concatenate by row (True) or by column (False).

save_by: str, optional. Default None

If not None, the concatenated dataframe is written to the file_location under this name.

to: {‘csv’, ‘parquet’, ‘both’}, optional

The format to save the data. Default is ‘both’.

Returns
concat_output: list of DataFrame

List of dataframes, one per sheet. Optionally, the concatenated dataframe is also saved to the given directory.
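
In pandas, `pd.read_excel(path, sheet_name=None)` returns a dict that maps each sheet name to a DataFrame. The sketch below shows how such a dict could be combined by row or by column. The helper `concat_sheets` is hypothetical and works on in-memory frames, so it needs no Excel engine. It is not the library's implementation.

```python
import pandas as pd


def concat_sheets(sheets, by_row=True):
    """Combine a {sheet_name: DataFrame} mapping into a single DataFrame."""
    frames = list(sheets.values())
    if by_row:
        # Stack sheets vertically and renumber the rows.
        return pd.concat(frames, axis=0, ignore_index=True)
    # Place sheets side by side, aligned on the row index.
    return pd.concat(frames, axis=1)


# Stand-in for the result of pd.read_excel("data.xlsx", sheet_name=None).
sheets = {
    "Jan": pd.DataFrame({"item": ["a", "b"], "qty": [1, 2]}),
    "Feb": pd.DataFrame({"item": ["c"], "qty": [3]}),
}

stacked = concat_sheets(sheets)
side_by_side = concat_sheets(sheets, by_row=False)
```

When sheets have different row counts, as in this example, the column-wise result pads the shorter sheet with missing values.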