Pandas df.rename() Function


df.rename() is a pandas DataFrame method that renames column labels or row index labels.

In data analysis, clear column names and index labels matter for code readability and maintainability. rename() lets you change them flexibly, making data more consistent and easier to understand. This is especially useful when importing and cleaning data or generating reports.

Basic Syntax and Parameters

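rename() is a DataFrame method, so it is called on a DataFrame object with dot notation, for example df.rename(...).

Syntax Format

    DataFrame.rename(mapper=None, index=None, columns=None, axis=None, inplace=False, errors='ignore')

The full pandas signature also accepts additional parameters, such as level for MultiIndex labels, which are not covered here.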

Parameter Description

| Parameter | Type | Required | Description | Default Value |
| --- | --- | --- | --- | --- |
| mapper | dict-like or function | Optional | Mapping rules for renaming one axis (row index or column names). Can be a dictionary or a function. Used together with the axis parameter. | None |
| index | dict-like or function | Optional | Renaming rules for the row index. An alternative to mapper with axis=0; it cannot be combined with mapper or axis. | None |
| columns | dict-like or function | Optional | Renaming rules for the column names. An alternative to mapper with axis=1; it cannot be combined with mapper or axis. | None |
| axis | int or str | Optional | Which axis mapper applies to. 0 or 'index' means the row index; 1 or 'columns' means the column names. If omitted, mapper is applied to the row index. | None |
| inplace | bool | Optional | If True, modifies the original DataFrame directly and returns None; if False, returns a new DataFrame and leaves the original unchanged. | False |
| errors | str | Optional | Controls error handling. 'raise' raises a KeyError when a key is not found; 'ignore' skips keys that do not exist. | 'ignore' |
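
As a minimal sketch of how these parameters relate (the DataFrame and label names are invented for illustration), the snippet below shows that mapper with axis='columns' gives the same result as columns=, and that pandas rejects a call mixing the two styles with a TypeError:

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Two equivalent ways to rename column "a" to "x"
via_columns = df.rename(columns={"a": "x"})
via_mapper = df.rename({"a": "x"}, axis="columns")
same_result = via_columns.equals(via_mapper)
print(same_result)

# Mixing mapper/axis with columns= in one call is rejected
mixed_error = None
try:
    df.rename({"a": "x"}, axis="columns", columns={"b": "y"})
except TypeError as exc:
    mixed_error = exc
    print(f"TypeError: {exc}")
```

Choosing one style per call avoids the error; the keyword form (columns= or index=) is usually the easier one to read.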

Return Value Description

  • Returns a new DataFrame (if inplace=False), or None (if inplace=True).
  • The returned DataFrame carries the renamed column names or index labels; the data itself is unchanged.
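
The two return behaviors can be checked directly. This is a small sketch with made-up data, not a recommendation of one style over the other:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# Default (inplace=False): a new DataFrame comes back, df keeps its old labels
renamed = df.rename(columns={"a": "x"})
print(list(df.columns), list(renamed.columns))

# inplace=True: df itself is changed and the call returns None
result = df.rename(columns={"a": "y"}, inplace=True)
print(result, list(df.columns))
```

Because the in-place call returns None, writing df = df.rename(..., inplace=True) would replace df with None.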

Examples

The following examples cover the main ways to use rename().

Example 1: Renaming Column Names

The columns parameter renames column names.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'name': ['Zhang San', 'Li Si', 'Wang Wu'],
        'age': [25, 30, 35],
        'salary': [5000, 6000, 7000]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename column names using a dictionary
    df_renamed = df.rename(columns={
        'name': 'Name',
        'age': 'Age',
        'salary': 'Salary'
    })

    print("Data after renaming column names:")
    print(df_renamed)

Expected Output:

    Original data:
            name  age  salary
    0  Zhang San   25    5000
    1      Li Si   30    6000
    2    Wang Wu   35    7000
    ==================================================
    Data after renaming column names:
            Name  Age  Salary
    0  Zhang San   25    5000
    1      Li Si   30    6000
    2    Wang Wu   35    7000

Code Explanation:

  1. The original column names are lowercase; a dictionary maps each one to its capitalized form.
  2. The columns parameter only affects column names, not row indices.
  3. This is a common data normalization operation.
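
The dictionary passed to columns does not need to cover every column. As a small sketch (the data and the new label monthly_salary are made up for illustration), only the listed key is renamed and the other columns keep their names:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Li Si"], "age": [30], "salary": [6000]})

# Only 'salary' is listed, so 'name' and 'age' keep their labels
partial = df.rename(columns={"salary": "monthly_salary"})
print(list(partial.columns))
```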

Example 2: Renaming Row Index

Use the index parameter to rename row indices.

Example

    import pandas as pd

    # Create a DataFrame with the default integer index
    data = {
        'Fruit': ['apple', 'banana', 'Orange'],
        'Quantity': [10, 20, 15]
    }
    df = pd.DataFrame(data)

    print("Original data (default index):")
    print(df)
    print("=" * 50)

    # Rename row indices using a dictionary
    df_renamed = df.rename(index={
        0: 'First row',
        1: 'Second row',
        2: 'Third row'
    })

    print("Data after renaming row index:")
    print(df_renamed)

Expected Output:

    Original data (default index):
        Fruit  Quantity
    0   apple        10
    1  banana        20
    2  Orange        15
    ==================================================
    Data after renaming row index:
                 Fruit  Quantity
    First row    apple        10
    Second row  banana        20
    Third row   Orange        15

Code Explanation:

  • The original row indices are numeric: 0, 1, 2.
  • The index parameter renames them to more meaningful labels.
  • This is particularly useful when generating reports.
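
The index parameter also accepts a function, just as columns does. The following sketch, using invented sample data and a hypothetical item_ prefix, builds each new row label from the old integer label:

```python
import pandas as pd

df = pd.DataFrame({"Fruit": ["apple", "banana"], "Quantity": [10, 20]})

# Build each new label from the old integer label (shifted to start at 1)
labelled = df.rename(index=lambda i: f"item_{i + 1}")
print(list(labelled.index))
```

A function is handy here when the index is long and a hand-written dictionary would be tedious.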

Example 3: Renaming Both Column Names and Row Indices

You can rename column names and row indices in a single call.

Example

    import pandas as pd

    # Create a DataFrame with string row labels
    data = {
        'A': [1, 2, 3],
        'B': [4, 5, 6]
    }
    df = pd.DataFrame(data, index=['x', 'y', 'z'])

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename both column names and row indices
    df_renamed = df.rename(
        columns={'A': 'Column A', 'B': 'Column B'},
        index={'x': 'Row X', 'y': 'Row Y', 'z': 'Row Z'}
    )

    print("Renamed data:")
    print(df_renamed)

Expected Output:

    Original data:
       A  B
    x  1  4
    y  2  5
    z  3  6
    ==================================================
    Renamed data:
           Column A  Column B
    Row X         1         4
    Row Y         2         5
    Row Z         3         6

Code Explanation:

  • The columns and index parameters can be used together in the same call.
  • This is convenient during data import and standardization.

Example 4: Using Functions to Rename

rename() also accepts a function, which is applied to every column name or index label. This allows batch transformations.

Example

    import pandas as pd

    # Create a DataFrame with prefixed, upper-case column names
    data = {
        'USER_NAME': ['Zhang San', 'Li Si'],
        'USER_AGE': [25, 30],
        'USER_SALARY': [5000, 6000]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Convert column names to lowercase using a function
    df_lower = df.rename(columns=str.lower)
    print("After converting column names to lowercase:")
    print(df_lower)
    print("=" * 50)

    # Remove a prefix (here USER_) using a lambda function
    df_clean = df.rename(columns=lambda x: x.replace('USER_', ''))
    print("After removing prefix:")
    print(df_clean)
    print("=" * 50)

    # Add a prefix using a lambda function
    df_prefix = df.rename(columns=lambda x: 'col_' + x)
    print("After adding prefix:")
    print(df_prefix)

Expected Output:

    Original data:
       USER_NAME  USER_AGE  USER_SALARY
    0  Zhang San        25         5000
    1      Li Si        30         6000
    ==================================================
    After converting column names to lowercase:
       user_name  user_age  user_salary
    0  Zhang San        25         5000
    1      Li Si        30         6000
    ==================================================
    After removing prefix:
            NAME  AGE  SALARY
    0  Zhang San   25    5000
    1      Li Si   30    6000
    ==================================================
    After adding prefix:
       col_USER_NAME  col_USER_AGE  col_USER_SALARY
    0      Zhang San            25             5000
    1          Li Si            30             6000

Code Explanation:

  • str.lower is the built-in string method that converts a string to lowercase; rename() calls it once for each column name.
  • Lambda functions handle more complex naming rules flexibly.
  • This approach suits batch processing of large numbers of column names.
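
When the rule grows beyond a one-liner, a named function can be passed in the same way. The sketch below assumes a hypothetical helper, to_snake_case, and made-up column names; it is one possible normalization rule, not a pandas feature:

```python
import re

import pandas as pd


def to_snake_case(label):
    """Hypothetical helper: lower-case a label and join its words with underscores."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", str(label).strip())
    return cleaned.strip("_").lower()


# Column names with inconsistent spacing, separators and case
df = pd.DataFrame({" User Name ": ["Li Si"], "User-Age": [30], "USER_SALARY": [6000]})

df_snake = df.rename(columns=to_snake_case)
print(list(df_snake.columns))
```

Keeping the rule in a named function makes it reusable across several DataFrames and easy to test on its own.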

Example 5: Using the inplace Parameter for In-Place Modification

With inplace=True, rename() modifies the original DataFrame directly and returns None instead of a new object.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'name': ['Zhang San', 'Li Si'],
        'age': [25, 30]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print(f"Original data id: {id(df)}")
    print("=" * 50)

    # Modify in place using inplace=True
    df.rename(columns={'name': 'Name', 'age': 'Age'}, inplace=True)

    print("Data modified with inplace=True:")
    print(df)
    print(f"Modified data id: {id(df)}")

Expected Output:

    Original data:
            name  age
    0  Zhang San   25
    1      Li Si   30
    Original data id: 140234567890
    ==================================================
    Data modified with inplace=True:
            Name  Age
    0  Zhang San   25
    1      Li Si   30
    Modified data id: 140234567890

The id value differs on every run; what matters is that both lines show the same number, meaning df is still the same object.

Code Explanation:

  • With inplace=True, the DataFrame's identity (id) remains unchanged, which shows that the same object was modified in place.
  • This can avoid creating a second DataFrame for large datasets, although the actual memory saving depends on the pandas version and its copy behavior.

Example 6: Handling Non-Existent Column Names

The errors parameter controls what happens when a key in the mapping does not match any existing label.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'A': [1, 2, 3],
        'B': [4, 5, 6]
    }
    df = pd.DataFrame(data)

    print("Original data:")
    print(df)
    print("=" * 50)

    # Try to rename a non-existent column with errors='ignore' (the default)
    df_renamed = df.rename(columns={'C': 'C_new'})  # Column C does not exist

    print("Renaming non-existent column (C), using default behavior:")
    print(df_renamed)  # No change
    print("=" * 50)

    # Try to rename a non-existent column with errors='raise'
    try:
        df_renamed2 = df.rename(columns={'C': 'C_new'}, errors='raise')
    except KeyError as e:
        print(f"Raises exception: {e}")

Expected Output:

    Original data:
       A  B
    0  1  4
    1  2  5
    2  3  6
    ==================================================
    Renaming non-existent column (C), using default behavior:
       A  B
    0  1  4
    1  2  5
    2  3  6
    ==================================================
    Raises exception: "['C'] not found in axis"

The exact wording of the exception message may differ between pandas versions.

Code Explanation:

  • By default (errors='ignore'), if a column doesn't exist, nothing changes and no error is raised.
  • If errors='raise' is set, a KeyError exception is raised.
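
Since the default silently skips unknown keys, a typo in a mapping can go unnoticed. One way to catch it, sketched here with made-up names (mapping, missing), is to compare the mapping keys against df.columns before renaming, or to pass errors='raise' and handle the KeyError:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
mapping = {"A": "alpha", "C": "gamma"}  # 'C' is deliberately absent from df

# Keys in the mapping that do not match any existing column
missing = sorted(set(mapping) - set(df.columns))
print(missing)

# Default errors='ignore': the matching key is renamed, the unknown one is skipped
df_ok = df.rename(columns=mapping)
print(list(df_ok.columns))

# errors='raise': the unknown key triggers a KeyError
try:
    df.rename(columns=mapping, errors="raise")
    raised = False
except KeyError:
    raised = True
print(raised)
```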

Example 7: Using the axis Parameter

The axis parameter specifies which axis a mapper passed as the first argument applies to.

Example

    import pandas as pd

    # Create a DataFrame with string row labels
    data = {
        'col1': [1, 2, 3],
        'col2': [4, 5, 6]
    }
    df = pd.DataFrame(data, index=['row1', 'row2', 'row3'])

    print("Original data:")
    print(df)
    print("=" * 50)

    # Rename columns using axis='columns'
    df_renamed_cols = df.rename({'col1': 'First column', 'col2': 'Second column'}, axis='columns')

    print("After renaming columns with axis='columns':")
    print(df_renamed_cols)
    print("=" * 50)

    # Rename row indices using axis='index'
    df_renamed_idx = df.rename({'row1': 'First row', 'row2': 'Second row', 'row3': 'Third row'}, axis='index')

    print("After renaming rows with axis='index':")
    print(df_renamed_idx)

Expected Output:

    Original data:
          col1  col2
    row1     1     4
    row2     2     5
    row3     3     6
    ==================================================
    After renaming columns with axis='columns':
          First column  Second column
    row1             1              4
    row2             2              5
    row3             3              6
    ==================================================
    After renaming rows with axis='index':
                col1  col2
    First row      1     4
    Second row     2     5
    Third row      3     6

Code Explanation:

  • axis='columns' or axis=1 applies the mapper to the column names.
  • axis='index' or axis=0 applies the mapper to the row indices.
  • This axis convention mirrors NumPy's.
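
One detail worth checking is what happens when a mapper is passed without axis. In the sketch below (made-up data), the mapper is matched against the row index, so the column key is silently ignored under the default errors='ignore':

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2]}, index=["row1", "row2"])

# No axis given: the mapper is matched against the row index, not the columns
no_axis = df.rename({"col1": "first", "row1": "r1"})
print(list(no_axis.columns), list(no_axis.index))
```

Passing columns= or axis='columns' explicitly avoids this surprise when the intent is to rename columns.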

Notes

  • rename() does not modify the original DataFrame by default. To modify it in place, pass inplace=True.
  • Prefer the columns and index parameters over mapper and axis; they are clearer and more intuitive.
  • When renaming with a function, make sure it can handle every label it receives and returns a valid label. String methods such as str.lower, for example, raise an error on integer labels.
  • On large datasets, inplace=True can avoid creating a second DataFrame, although the actual memory saving depends on the pandas version and its copy behavior.
  • Before renaming, check the current column names with df.columns.
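
The note about functions can be illustrated with a DataFrame whose column labels are not all strings. This is a sketch with an invented mix of labels; wrapping the label in str() is one possible workaround, not the only one:

```python
import pandas as pd

# Column labels here are a mix of int and str (hypothetical example)
df = pd.DataFrame([[1, "x"]], columns=[0, "Name"])

# str.lower only works on strings, so the integer label 0 makes it fail
try:
    df.rename(columns=str.lower)
    failed = False
except TypeError:
    failed = True
print(failed)

# Converting each label to str first handles both kinds of label
safe = df.rename(columns=lambda label: str(label).lower())
print(list(safe.columns))
```

Note that the workaround also turns the integer label 0 into the string '0', which may or may not be what a given dataset needs.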
