Pandas df.rename() Function

`df.rename()` is a Pandas method that renames the column labels or row index labels of a DataFrame.

In data analysis, clear column names and index labels matter for code readability and maintainability. `rename()` lets you change any subset of column names and index labels, which makes data more standardized and easier to understand. This is especially useful when importing data, cleaning it, and generating reports.

Basic Syntax and Parameters

`rename()` is a DataFrame method, so it is called on a DataFrame object with dot notation, for example `df.rename(...)`.

Syntax Format

    DataFrame.rename(mapper=None, index=None, columns=None, axis=None, inplace=False, errors='ignore')

The full pandas signature also accepts `copy` and `level`, which this article does not cover.

Parameter Description

| Parameter | Type | Required | Description | Default Value |
|---|---|---|---|---|
| mapper | dict-like or function | Optional | Mapping applied to the labels of the axis selected by `axis`. Can be a dictionary or a function. Use either `mapper` with `axis`, or `index` / `columns`, not both. | None |
| index | dict-like or function | Optional | Renaming rules for row index labels. Equivalent to `mapper` with `axis=0`. Cannot be combined with `mapper` or `axis`; doing so raises a TypeError. | None |
| columns | dict-like or function | Optional | Renaming rules for column names. Equivalent to `mapper` with `axis=1`. Cannot be combined with `mapper` or `axis`; doing so raises a TypeError. | None |
| axis | int or str | Optional | Axis that `mapper` targets. 0 or 'index' means the row index; 1 or 'columns' means the column names. When omitted, `mapper` is applied to the row index. | None |
| inplace | bool | Optional | If True, modifies the original DataFrame and returns None; if False, returns a new DataFrame and leaves the original unchanged. | False |
| errors | str | Optional | Controls error handling for dict-like mappings. 'raise' raises a KeyError when a key is not found on the axis being renamed; 'ignore' skips keys that do not exist. | 'ignore' |

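The relationship between `mapper` / `axis` and `columns` can be checked with a short sketch. The DataFrame and variable names below are made up for illustration, and the exact error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Two equivalent ways to rename column 'A'
via_columns = df.rename(columns={'A': 'a'})
via_mapper = df.rename({'A': 'a'}, axis='columns')
print(list(via_columns.columns))  # ['a', 'B']
print(list(via_mapper.columns))   # ['a', 'B']

# Without axis, mapper targets the row index, so the columns stay as they are
no_axis = df.rename({'A': 'a'})
print(list(no_axis.columns))      # ['A', 'B']

# Mixing mapper with columns is rejected with a TypeError
mixed_error = None
try:
    df.rename({'A': 'a'}, columns={'B': 'b'})
except TypeError as exc:
    mixed_error = exc
    print(f"TypeError: {exc}")
```

Because a bare `mapper` silently targets the row index, spelling out `columns=` or `index=` tends to be the less surprising choice.
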
Return Value Description

- Returns a new DataFrame if `inplace=False`, or `None` if `inplace=True`.
- The returned DataFrame has the renamed column names or index labels.

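A minimal sketch of the two return behaviors, using a small made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Zhang San', 'Li Si'], 'age': [25, 30]})

# Default (inplace=False): a new DataFrame comes back, df is untouched
renamed = df.rename(columns={'name': 'Name'})
print(list(df.columns))       # ['name', 'age']
print(list(renamed.columns))  # ['Name', 'age']

# inplace=True: df itself changes and the call returns None
result = df.rename(columns={'age': 'Age'}, inplace=True)
print(result)                 # None
print(list(df.columns))       # ['name', 'Age']
```

A common mistake is writing `df = df.rename(..., inplace=True)`, which replaces `df` with `None`.
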
Examples

The following examples walk through the main ways to use `rename()`.

Example 1: Renaming Column Names

The `columns` parameter renames column names.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'name': ['Zhang San', 'Li Si', 'Wang Wu'],
        'age': [25, 30, 35],
        'salary': [5000, 6000, 7000]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename column names using a dictionary
    df_renamed = df.rename(columns={
        'name': 'Name',
        'age': 'Age',
        'salary': 'Salary'
    })

    print("Data after renaming column names:")
    print(df_renamed)

Expected Output:

    Original data:
            name  age  salary
    0  Zhang San   25    5000
    1      Li Si   30    6000
    2    Wang Wu   35    7000
    ==================================================
    Data after renaming column names:
            Name  Age  Salary
    0  Zhang San   25    5000
    1      Li Si   30    6000
    2    Wang Wu   35    7000

Code Explanation:

- The original column names are lowercase; the dictionary maps each one to its capitalized form.
- The `columns` parameter only affects column names, not row indices.
- This is a common data normalization operation.

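The dictionary does not have to list every column. As a small sketch with made-up data, only the keys present in the mapping are renamed:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Zhang San'], 'age': [25], 'salary': [5000]})

# Only 'salary' appears in the mapping, so the other columns keep their names
df_partial = df.rename(columns={'salary': 'monthly_salary'})
print(list(df_partial.columns))  # ['name', 'age', 'monthly_salary']
```
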
Example 2: Renaming Row Index

The `index` parameter renames row index labels.

Example

    import pandas as pd

    # Create a DataFrame with the default integer index
    data = {
        'Fruit': ['apple', 'banana', 'Orange'],
        'Quantity': [10, 20, 15]
    }
    df = pd.DataFrame(data)

    print("Original data (default index):")
    print(df)
    print("=" * 50)

    # Rename row indices using a dictionary
    df_renamed = df.rename(index={
        0: 'First row',
        1: 'Second row',
        2: 'Third row'
    })

    print("Data after renaming row index:")
    print(df_renamed)

Expected Output:

    Original data (default index):
        Fruit  Quantity
    0   apple        10
    1  banana        20
    2  Orange        15
    ==================================================
    Data after renaming row index:
                 Fruit  Quantity
    First row    apple        10
    Second row  banana        20
    Third row   Orange        15

Code Explanation:

- The original row indices are numeric: 0, 1, 2.
- The `index` parameter renames them to more meaningful labels.
- This is particularly useful when generating reports.

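When there are many rows, writing one dictionary entry per label becomes tedious. As a sketch (the `row_N` naming scheme is only an illustration), `index` also accepts a function that builds each new label from the old one:

```python
import pandas as pd

df = pd.DataFrame({'Fruit': ['apple', 'banana'], 'Quantity': [10, 20]})

# Build each new label from the old integer label
df_labeled = df.rename(index=lambda i: f"row_{i + 1}")
print(list(df_labeled.index))  # ['row_1', 'row_2']
```
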
Example 3: Renaming Both Column Names and Row Indices

You can rename column names and row indices in a single call.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'A': [1, 2, 3],
        'B': [4, 5, 6]
    }
    df = pd.DataFrame(data, index=['x', 'y', 'z'])

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename both column names and row indices
    df_renamed = df.rename(
        columns={'A': 'Column A', 'B': 'Column B'},
        index={'x': 'Row X', 'y': 'Row Y', 'z': 'Row Z'}
    )

    print("Renamed data:")
    print(df_renamed)

Expected Output:

    Original data:
       A  B
    x  1  4
    y  2  5
    z  3  6
    ==================================================
    Renamed data:
           Column A  Column B
    Row X         1         4
    Row Y         2         5
    Row Z         3         6

Code Explanation:

- The `columns` and `index` parameters can be used together in one call.
- This is convenient during data import and standardization.

Example 4: Using Functions to Rename

`rename()` also accepts functions, which allows batch transformations of column names or index labels.

Example

    import pandas as pd

    # Create a DataFrame with prefixed, uppercase column names
    data = {
        'USER_NAME': ['Zhang San', 'Li Si'],
        'USER_AGE': [25, 30],
        'USER_SALARY': [5000, 6000]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Convert column names to lowercase using a function
    df_lower = df.rename(columns=str.lower)
    print("After converting column names to lowercase:")
    print(df_lower)
    print("=" * 50)

    # Remove the USER_ prefix using a lambda function
    df_clean = df.rename(columns=lambda x: x.replace('USER_', ''))
    print("After removing prefix:")
    print(df_clean)
    print("=" * 50)

    # Add a prefix using a lambda function
    df_prefix = df.rename(columns=lambda x: 'col_' + x)
    print("After adding prefix:")
    print(df_prefix)

Expected Output:

    Original data:
       USER_NAME  USER_AGE  USER_SALARY
    0  Zhang San        25         5000
    1      Li Si        30         6000
    ==================================================
    After converting column names to lowercase:
       user_name  user_age  user_salary
    0  Zhang San        25         5000
    1      Li Si        30         6000
    ==================================================
    After removing prefix:
            NAME  AGE  SALARY
    0  Zhang San   25    5000
    1      Li Si   30    6000
    ==================================================
    After adding prefix:
       col_USER_NAME  col_USER_AGE  col_USER_SALARY
    0      Zhang San            25             5000
    1          Li Si            30             6000

Code Explanation:

- `str.lower` is the built-in string method that converts a string to lowercase; `rename()` calls it on every column name.
- `lambda` functions handle naming rules that a fixed dictionary cannot express.
- This approach suits batch processing of large numbers of column names.

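For rules with several steps, a named function is often easier to read and test than a lambda. The sketch below assumes a hypothetical helper, `to_snake_case`, and made-up column names; it is one possible cleanup rule, not a pandas feature.

```python
import pandas as pd

def to_snake_case(label):
    """Hypothetical helper: trim, lowercase, and replace spaces with underscores."""
    return str(label).strip().lower().replace(' ', '_')

df = pd.DataFrame({' User Name ': ['Zhang San'], 'User Age': [25], 'SALARY': [5000]})

# The function is applied to each column name in turn
df_clean = df.rename(columns=to_snake_case)
print(list(df_clean.columns))  # ['user_name', 'user_age', 'salary']
```
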
Example 5: Using the inplace Parameter for In-Place Modification

With `inplace=True`, `rename()` modifies the original DataFrame directly and returns `None` instead of a new object.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'name': ['Zhang San', 'Li Si'],
        'age': [25, 30]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print(f"Original data id: {id(df)}")
    print("=" * 50)

    # Modify in place using inplace=True
    df.rename(columns={'name': 'Name', 'age': 'Age'}, inplace=True)

    print("Data after modifying with inplace=True:")
    print(df)
    print(f"Modified data id: {id(df)}")

Expected Output:

    Original data:
            name  age
    0  Zhang San   25
    1      Li Si   30
    Original data id: 140234567890
    ==================================================
    Data after modifying with inplace=True:
            Name  Age
    0  Zhang San   25
    1      Li Si   30
    Modified data id: 140234567890

The `id` values differ from run to run; what matters is that both lines show the same value.

Code Explanation:

- With `inplace=True`, `id(df)` is unchanged, so `df` still refers to the same DataFrame object, which now carries the new column names.
- In-place renaming avoids keeping a second DataFrame variable around, but pandas may still create intermediate objects internally, so memory savings are not guaranteed.

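An alternative to `inplace=True` is to reassign the result, which also allows method chaining. A minimal sketch with made-up data (the `adults` filter is only an illustration):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Zhang San', 'Li Si'], 'age': [25, 30]})

# Reassignment: same visible result as inplace=True
df = df.rename(columns={'name': 'Name', 'age': 'Age'})
print(list(df.columns))  # ['Name', 'Age']

# Chaining works because rename() returns a DataFrame when inplace=False
adults = (
    df.rename(columns=str.lower)
      .query('age >= 30')
      .reset_index(drop=True)
)
print(adults)
```

With `inplace=True` the chain would break, because the first call would return `None`.
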
Example 6: Handling Non-Existent Column Names

The `errors` parameter controls what happens when a key in the mapping does not match any column.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'A': [1, 2, 3],
        'B': [4, 5, 6]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Try to rename a non-existent column with errors='ignore' (the default)
    df_renamed = df.rename(columns={'C': 'C_new'})  # Column C does not exist

    print("Renaming non-existent column (C), using default behavior:")
    print(df_renamed)  # No change

    print("=" * 50)

    # Try to rename a non-existent column with errors='raise'
    try:
        df_renamed2 = df.rename(columns={'C': 'C_new'}, errors='raise')
    except KeyError as e:
        print(f"Raises exception: {e}")

Expected Output:

    Original data:
       A  B
    0  1  4
    1  2  5
    2  3  6
    ==================================================
    Renaming non-existent column (C), using default behavior:
       A  B
    0  1  4
    1  2  5
    2  3  6
    ==================================================
    Raises exception: "['C'] not found in axis"

Code Explanation:

- By default (`errors='ignore'`), a mapping key that matches no column is skipped: nothing changes and no error is raised.
- With `errors='raise'`, a `KeyError` naming the missing labels is raised instead. The exact message text may differ between pandas versions.

Example 7: Using the axis Parameter

The `axis` parameter selects which axis a positional `mapper` is applied to.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'col1': [1, 2, 3],
        'col2': [4, 5, 6]
    }
    df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename columns using axis='columns'
    df_renamed_cols = df.rename({'col1': 'First column', 'col2': 'Second column'}, axis='columns')

    print("After renaming columns with axis='columns':")
    print(df_renamed_cols)

    print("=" * 50)

    # Rename row indices using axis='index'
    df_renamed_idx = df.rename({'row1': 'First row', 'row2': 'Second row', 'row3': 'Third row'}, axis='index')

    print("After renaming rows with axis='index':")
    print(df_renamed_idx)

Expected Output:

    Original data:
          col1  col2
    row1     1     4
    row2     2     5
    row3     3     6
    ==================================================
    After renaming columns with axis='columns':
          First column  Second column
    row1             1              4
    row2             2              5
    row3             3              6
    ==================================================
    After renaming rows with axis='index':
                col1  col2
    First row      1     4
    Second row     2     5
    Third row      3     6

Code Explanation:

- `axis='columns'` or `axis=1` applies the mapping to the column names.
- `axis='index'` or `axis=0` applies the mapping to the row index.
- This convention matches the axis numbering used by NumPy.

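The integer and string forms of `axis` are interchangeable, and `mapper` can be a function as well as a dictionary. A short sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]}, index=['row1', 'row2'])

upper_cols = df.rename(str.upper, axis=1)        # same as axis='columns'
upper_rows = df.rename(str.upper, axis='index')  # same as axis=0
print(list(upper_cols.columns))  # ['COL1', 'COL2']
print(list(upper_rows.index))    # ['ROW1', 'ROW2']
```
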
Notes

- `rename()` does not modify the original DataFrame by default. To modify it in place, pass `inplace=True`.
- Prefer the `columns` and `index` parameters over `mapper` and `axis`, as they are clearer and more intuitive.
- When renaming with a function, make sure it returns a valid label (normally a string) for every input; otherwise later column access may fail or behave unexpectedly.
- With large datasets, `inplace=True` avoids holding a second DataFrame variable, though memory savings are not guaranteed.
- Before renaming, check the current column names with `df.columns`.

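The last note can be turned into a small guard. The sketch below uses a hypothetical helper, `safe_rename_columns`, that compares the mapping keys against `df.columns` and reports any that would be silently skipped; the name and the reporting style are assumptions for illustration.

```python
import pandas as pd

def safe_rename_columns(df, mapping):
    """Hypothetical helper: rename columns, reporting keys that match nothing."""
    missing = [key for key in mapping if key not in df.columns]
    if missing:
        print(f"Not found in df.columns, skipped: {missing}")
    return df.rename(columns=mapping)

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
print(list(df.columns))  # ['A', 'B']

df_checked = safe_rename_columns(df, {'A': 'alpha', 'C': 'gamma'})
print(list(df_checked.columns))  # ['alpha', 'B']
```

If a missing key should stop the program rather than just be reported, passing `errors='raise'` to `rename()` is the simpler option.
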