Pandas Excel

Pandas provides rich Excel file operation functions, helping us conveniently read and write `.xls` and `.xlsx` files, supporting complex operations such as multiple sheets, indexing, and column selection, making it an essential tool in data analysis. | **Operation** | **Method** | **Description** | | --- | --- | --- | | Read Excel file | `pd.read_excel()` | Read Excel file and return DataFrame | | Write DataFrame to Excel | `DataFrame.to_excel()` | Write DataFrame to Excel file | | Load Excel file | `pd.ExcelFile()` | Load Excel file and access multiple sheets | | Write multiple sheets using ExcelWriter | `pd.ExcelWriter()` | Write multiple DataFrames to different sheets in the same Excel file | ### `pd.read_excel() - Read Excel File` pd.read_excel() method is used to read data from an Excel file and load it as a DataFrame. It supports reading .xls and .xlsx format files. The syntax is as follows: pandas.read_excel(io, sheet_name=0, *, header=0, names=None, index_col=None, usecols=None, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, parse_dates=False, date_parser=, date_format=None, thousands=None, decimal='.', comment=None, skipfooter=0, storage_options=None, dtype_backend=, engine_kwargs=None) Parameter Description: io: This is a required parameter that specifies the path or file object of the Excel file to read. sheet_name=0: Specifies the worksheet name or index to read. Default is 0, which means the first worksheet. header=0: Specifies the row to use as column names. Default is 0, which means the first row. names=None: A list to specify column names. If provided, it will override the column names in the file. index_col=None: Specifies the column to use as row index. Can be the column name or number. usecols=None: Specifies the columns to read. Can be a list of column names or column indices. dtype=None: Specifies the data type of columns. Can be in dictionary format, with keys as column names and values as data types. engine=None: Specifies the parsing engine. Default is None, pandas will automatically select. converters=None: A dictionary of functions for converting data. true_values=None: Specifies values that should be treated as boolean True. false_values=None: Specifies values that should be treated as boolean False. skiprows=None: Specifies the number of rows to skip or a list of rows to skip. nrows=None: Specifies the number of rows to read. na_values=None: Specifies values that should be treated as missing values. keep_default_na=True: Specifies whether to parse default missing values (such as NaN) as NA. na_filter=True: Specifies whether to convert data to NA. verbose=False: Specifies whether to output detailed progress information. parse_dates=False: Specifies whether to parse dates. date_parser=: Function used to parse dates. date_format=None: Specifies the date format. thousands=None: Specifies the thousands separator. decimal='.': Specifies the decimal character. comment=None: Specifies the comment character. skipfooter=0: Specifies the number of rows to skip at the end of the file. storage_options=None: A dictionary of parameters for cloud storage. dtype_backend=: Specifies the data type backend. engine_kwargs=None: An additional dictionary of parameters to pass to the engine. This article uses tutorial_pandas_data.xlsx as an example, you can download tutorial_pandas_data.xlsx to test. Example import pandas as pd # Read data.xlsx file df = pd.read_excel('tutorial_pandas_data.xlsx') # Print the read DataFrame print(df) The output is: Table 1 Unnamed: 1 Unnamed: 2 0 Name Age City 1 Alice 25 New York 2 Bob 30 Los Angeles 3 Charlie 35

YouTip

Pandas Excel

📂 Categories