Pandas Df Reset Index

Pandas df.reset_index() Function


df.reset_index() is a function in Pandas used to reset the row index of a DataFrame.

During data processing, indices may become discontinuous or not meet your needs. reset_index() can reset the index to the default integer index (0, 1, 2, ...), or convert the current index into a column. This is very useful when cleaning, sorting, and filtering data.

Basic Syntax and Parameters

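reset_index() is a DataFrame method, called with the dot operator (.), for example df.reset_index().

Syntax Format

    DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

pandas 1.5 and later also accept the keyword arguments allow_duplicates and names, which are not covered here.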

Parameter Description

| Parameter | Type | Required | Description | Default |
|-----------|------|----------|-------------|---------|
| level | int, str, tuple or list | Optional | Index level(s) to reset. For a MultiIndex, you can reset only certain levels. By default, all levels are reset. | None |
| drop | bool | Optional | If True, discard the original index without converting it to a column; if False, keep the original index as a new column. | False |
| inplace | bool | Optional | If True, modify the original DataFrame directly without returning a new object; if False, return a new DataFrame and leave the original unchanged. | False |
| col_level | int or str | Optional | If the columns are a MultiIndex, the column level into which the index name is inserted. | 0 |
| col_fill | str | Optional | If the columns are a MultiIndex, the label used for the other column levels when the original index is converted to a column. | '' |
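
None of the examples on this page exercise col_level and col_fill, so here is a minimal sketch of how they behave. The two-level column header and the "Info" label are made up for illustration.

```python
import pandas as pd

# Hypothetical two-level column header: (group, subject)
columns = pd.MultiIndex.from_tuples([("Score", "Math"), ("Score", "English")])
df = pd.DataFrame(
    [[85, 90], [92, 88]],
    index=pd.Index(["Zhang San", "Li Si"], name="Name"),
    columns=columns,
)

# Default (col_level=0): the index name lands in level 0, level 1 is filled with ''
default_cols = df.reset_index().columns.tolist()

# col_level=1: the index name lands in level 1 instead
level1_cols = df.reset_index(col_level=1).columns.tolist()

# col_fill: label used for the remaining level of the new column
filled_cols = df.reset_index(col_level=1, col_fill="Info").columns.tolist()

print(default_cols[0])  # ('Name', '')
print(level1_cols[0])   # ('', 'Name')
print(filled_cols[0])   # ('Info', 'Name')
```

Both parameters only matter when the columns are a MultiIndex; with ordinary single-level columns they have no visible effect.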

Return Value Description

  • Returns a new DataFrame if inplace=False, or None if inplace=True.
  • In the returned DataFrame the index is reset; if drop=False, the original index is retained as a new column.
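
The two return behaviours can be checked directly. The sketch below uses a small made-up DataFrame with a shuffled index.

```python
import pandas as pd

df = pd.DataFrame({"Score": [85, 92, 78]}, index=[2, 0, 1])

# inplace=False (default): a new DataFrame comes back, df keeps its index
new_df = df.reset_index(drop=True)
original_index_after_call = df.index.tolist()

# inplace=True: df itself is changed and the call returns None
result = df.reset_index(drop=True, inplace=True)

print(new_df.index.tolist())       # [0, 1, 2]
print(original_index_after_call)   # [2, 0, 1]
print(result)                      # None
print(df.index.tolist())           # [0, 1, 2]
```

One consequence: writing df = df.reset_index(inplace=True) replaces df with None, so use either assignment or inplace=True, not both.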

Examples

Let's go through a series of examples that cover the main uses of reset_index().

Example 1: Basic Usage - Reset Index

The most common use case is resetting the index to the default integer index.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'Name': ['Zhang San', 'Li Si', 'Wang Wu'],
        'Score': [85, 92, 78]
    }
    df = pd.DataFrame(data)
    print("Original Data (Default Index):")
    print(df)
    print("=" * 50)

    # Sort data, which leaves the index out of order
    df_sorted = df.sort_values(by='Score', ascending=False)
    print("Sorted Data (Index Out of Order):")
    print(df_sorted)
    print("=" * 50)

    # Reset index
    df_reset = df_sorted.reset_index()
    print("After Resetting Index:")
    print(df_reset)

Expected Output:

    Original Data (Default Index):
            Name  Score
    0  Zhang San     85
    1      Li Si     92
    2    Wang Wu     78
    ==================================================
    Sorted Data (Index Out of Order):
            Name  Score
    1      Li Si     92
    0  Zhang San     85
    2    Wang Wu     78
    ==================================================
    After Resetting Index:
       index       Name  Score
    0      1      Li Si     92
    1      0  Zhang San     85
    2      2    Wang Wu     78

Code Explanation:

  1. After sorting, the index is 1, 0, 2 rather than the sequential 0, 1, 2.
  2. Calling reset_index() resets the index to 0, 1, 2.
  3. The original index is preserved as a new column named "index", inserted as the first column.
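
The new column is called "index" only when the original index has no name. As a small sketch with made-up data (the label "OriginalRow" is just an illustrative choice), a named index keeps its name, and rename() can relabel the default column:

```python
import pandas as pd

# A named index becomes a column carrying that name
df_named = pd.DataFrame(
    {"Score": [85, 92, 78]},
    index=pd.Index(["Zhang San", "Li Si", "Wang Wu"], name="Name"),
)
named = df_named.reset_index()

# An unnamed index becomes "index"; rename() gives it a clearer label
df_sorted = pd.DataFrame({"Score": [85, 92, 78]}).sort_values(
    by="Score", ascending=False
)
renamed = df_sorted.reset_index().rename(columns={"index": "OriginalRow"})

print(named.columns.tolist())     # ['Name', 'Score']
print(renamed.columns.tolist())   # ['OriginalRow', 'Score']
```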

Example 2: Using drop Parameter to Discard Original Index

If you don't need to retain the original index, you can use the drop=True parameter.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'Name': ['Zhang San', 'Li Si', 'Wang Wu'],
        'Score': [85, 92, 78]
    }
    df = pd.DataFrame(data)

    # Sort data
    df_sorted = df.sort_values(by='Score', ascending=False)
    print("Sorted Data:")
    print(df_sorted)
    print("=" * 50)

    # Reset index, discarding original index
    df_reset = df_sorted.reset_index(drop=True)
    print("After Resetting Index (Discarding Original Index):")
    print(df_reset)

Expected Output:

    Sorted Data:
            Name  Score
    1      Li Si     92
    0  Zhang San     85
    2    Wang Wu     78
    ==================================================
    After Resetting Index (Discarding Original Index):
            Name  Score
    0      Li Si     92
    1  Zhang San     85
    2    Wang Wu     78

Code Explanation:

  • With drop=True, the original index is discarded and not kept as a new column.
  • This is useful when you don't need to preserve original index information.

Example 3: Reset Index After Filtering Data

After filtering data, the index might be discontinuous. Using reset_index() restores a sequential index.

Example

    import pandas as pd

    # Create a DataFrame with all student scores
    data = {
        'Name': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhao Liu', 'Qian Qi'],
        'Score': [85, 92, 78, 90, 88]
    }
    df = pd.DataFrame(data)
    print("Original Data:")
    print(df)
    print("=" * 50)

    # Filter students with scores greater than 85
    df_filtered = df[df['Score'] > 85]
    print("Filtered Data (Discontinuous Index):")
    print(df_filtered)
    print("=" * 50)

    # Reset index
    df_reset = df_filtered.reset_index(drop=True)
    print("After Resetting Index:")
    print(df_reset)

Expected Output:

    Original Data:
            Name  Score
    0  Zhang San     85
    1      Li Si     92
    2    Wang Wu     78
    3   Zhao Liu     90
    4    Qian Qi     88
    ==================================================
    Filtered Data (Discontinuous Index):
           Name  Score
    1     Li Si     92
    3  Zhao Liu     90
    4   Qian Qi     88
    ==================================================
    After Resetting Index:
           Name  Score
    0     Li Si     92
    1  Zhao Liu     90
    2   Qian Qi     88

Code Explanation:

  • After filtering, the retained indices are 1, 3, 4, not continuous 0, 1, 2.
  • Using reset_index(drop=True) gives continuous indices, making subsequent processing easier.
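
One concrete reason a continuous index makes later processing easier is label-based lookup. In the sketch below (same data as Example 3), label 0 no longer exists after filtering, so .loc[0] raises a KeyError until the index is reset.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu", "Qian Qi"],
    "Score": [85, 92, 78, 90, 88],
})
df_filtered = df[df["Score"] > 85]  # keeps labels 1, 3, 4

# Label 0 was filtered out, so a label-based lookup fails
try:
    df_filtered.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

# After the reset, label 0 refers to the first remaining row
df_reset = df_filtered.reset_index(drop=True)
first_name = df_reset.loc[0, "Name"]

print(label_zero_found)  # False
print(first_name)        # Li Si
```

Position-based access with .iloc[0] works on df_filtered either way; the reset matters when code relies on labels matching positions.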

Example 4: Reset Index After Removing Duplicate Rows

After removing duplicate rows, you can also use reset_index() to tidy up the index.

Example

    import pandas as pd

    # Create a DataFrame with duplicate rows
    data = {
        'Name': ['Zhang San', 'Li Si', 'Zhang San', 'Wang Wu'],
        'City': ['Beijing', 'Shanghai', 'Beijing', 'Guangzhou']
    }
    df = pd.DataFrame(data)
    print("Original Data (With Duplicate Rows):")
    print(df)
    print("=" * 50)

    # Remove duplicate rows
    df_unique = df.drop_duplicates()
    print("After Removing Duplicates (Discontinuous Index):")
    print(df_unique)
    print("=" * 50)

    # Reset index
    df_reset = df_unique.reset_index(drop=True)
    print("After Resetting Index:")
    print(df_reset)

Expected Output:

    Original Data (With Duplicate Rows):
            Name       City
    0  Zhang San    Beijing
    1      Li Si   Shanghai
    2  Zhang San    Beijing
    3    Wang Wu  Guangzhou
    ==================================================
    After Removing Duplicates (Discontinuous Index):
            Name       City
    0  Zhang San    Beijing
    1      Li Si   Shanghai
    3    Wang Wu  Guangzhou
    ==================================================
    After Resetting Index:
            Name       City
    0  Zhang San    Beijing
    1      Li Si   Shanghai
    2    Wang Wu  Guangzhou

Code Explanation:

  • After removing duplicates, the retained indices are 0, 1, 3, not continuous 0, 1, 2.
  • Resetting the index makes the data cleaner.

Example 5: Reset Index After Handling Missing Values

After deleting rows with missing values, you can also reset the index.

Example

    import pandas as pd
    import numpy as np

    # Create a DataFrame with missing values
    data = {
        'Name': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhao Liu'],
        'Score': [85, np.nan, 92, 78]
    }
    df = pd.DataFrame(data)
    print("Original Data:")
    print(df)
    print("=" * 50)

    # Delete rows with missing values
    df_cleaned = df.dropna()
    print("After Deleting Missing Values (Discontinuous Index):")
    print(df_cleaned)
    print("=" * 50)

    # Reset index
    df_reset = df_cleaned.reset_index(drop=True)
    print("After Resetting Index:")
    print(df_reset)

Expected Output:

    Original Data:
            Name  Score
    0  Zhang San   85.0
    1      Li Si    NaN
    2    Wang Wu   92.0
    3   Zhao Liu   78.0
    ==================================================
    After Deleting Missing Values (Discontinuous Index):
            Name  Score
    0  Zhang San   85.0
    2    Wang Wu   92.0
    3   Zhao Liu   78.0
    ==================================================
    After Resetting Index:
            Name  Score
    0  Zhang San   85.0
    1    Wang Wu   92.0
    2   Zhao Liu   78.0

Code Explanation:

  • After deleting the second row (Li Si's score is NaN), the retained indices are 0, 2, 3.
  • Using reset_index(drop=True) gives continuous indices 0, 1, 2.

Example 6: Using inplace Parameter for In-Place Modification

Using inplace=True modifies the original DataFrame directly.

Example

    import pandas as pd

    # Create a DataFrame
    data = {
        'Name': ['Zhang San', 'Li Si', 'Wang Wu'],
        'Score': [85, 92, 78]
    }
    df = pd.DataFrame(data)

    # Filter and sort data, which leaves the index out of order
    df_filtered = df[df['Score'] > 80].sort_values(by='Score', ascending=False)
    print("Filtered Data (Not Reset):")
    print(df_filtered)
    print(f"Data ID: {id(df_filtered)}")
    print("=" * 50)

    # Use inplace=True to reset index in place
    df_filtered.reset_index(drop=True, inplace=True)
    print("After Resetting with inplace=True:")
    print(df_filtered)
    print(f"Data ID: {id(df_filtered)}")

Expected Output (the ID value differs from run to run; both lines show the same value):

    Filtered Data (Not Reset):
            Name  Score
    1      Li Si     92
    0  Zhang San     85
    Data ID: 140234567890
    ==================================================
    After Resetting with inplace=True:
            Name  Score
    0      Li Si     92
    1  Zhang San     85
    Data ID: 140234567890

Code Explanation:

  • With inplace=True, the variable still refers to the same DataFrame object, so id() returns the same value before and after the call.
  • This avoids rebinding the variable, but it does not guarantee lower memory use, because pandas may still build new data internally.
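
Because a call with inplace=True returns None, it cannot be used in the middle of a method chain. A common alternative, sketched here with the same data, is to keep the default inplace=False and chain the calls, assigning the final result once:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu"],
    "Score": [85, 92, 78],
})

# Filter, sort, and reset in one chain; each step returns a new DataFrame
result = (
    df[df["Score"] > 80]
    .sort_values(by="Score", ascending=False)
    .reset_index(drop=True)
)

print(result)
```

The original df is left untouched, and result has the index 0, 1.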

Example 7: Resetting Multi-Level Index

For MultiIndex, you can reset specific levels or all levels.

Example

    import pandas as pd

    # Create a DataFrame with MultiIndex
    df = pd.DataFrame(
        {'Score': [85, 92, 78, 88, 90, 95]},
        index=pd.MultiIndex.from_tuples([
            ('Class 1', 'Zhang San'), ('Class 1', 'Li Si'), ('Class 1', 'Wang Wu'),
            ('Class 2', 'Zhao Liu'), ('Class 2', 'Qian Qi'), ('Class 2', 'Sun Ba')
        ], names=['Class', 'Name'])
    )
    print("Original Data (MultiIndex):")
    print(df)
    print("=" * 50)

    # Reset all index levels
    df_reset_all = df.reset_index()
    print("Reset All Index Levels:")
    print(df_reset_all)
    print("=" * 50)

    # Reset only the first index level (Class)
    df_reset_first = df.reset_index(level=0)
    print("Reset Only Class Index:")
    print(df_reset_first)

Expected Output:

    Original Data (MultiIndex):
                       Score
    Class   Name
    Class 1 Zhang San     85
            Li Si         92
            Wang Wu       78
    Class 2 Zhao Liu      88
            Qian Qi       90
            Sun Ba        95
    ==================================================
    Reset All Index Levels:
         Class       Name  Score
    0  Class 1  Zhang San     85
    1  Class 1      Li Si     92
    2  Class 1    Wang Wu     78
    3  Class 2   Zhao Liu     88
    4  Class 2    Qian Qi     90
    5  Class 2     Sun Ba     95
    ==================================================
    Reset Only Class Index:
                 Class  Score
    Name
    Zhang San  Class 1     85
    Li Si      Class 1     92
    Wang Wu    Class 1     78
    Zhao Liu   Class 2     88
    Qian Qi    Class 2     90
    Sun Ba     Class 2     95

Code Explanation:

  • For a DataFrame with a MultiIndex, reset_index() with no arguments converts all index levels into columns.
  • The level parameter selects which level(s) to reset. With level=0, only Class becomes a column and Name remains as the index.
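
The level parameter also accepts a level name, and it can be combined with drop=True to discard a single level. A minimal sketch with a shortened version of the data above:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Class 1", "Zhang San"), ("Class 1", "Li Si"), ("Class 2", "Zhao Liu")],
    names=["Class", "Name"],
)
df = pd.DataFrame({"Score": [85, 92, 88]}, index=index)

# Select the level by name instead of by position
by_name = df.reset_index(level="Name")

# Discard the Class level entirely; Name stays as the index
dropped = df.reset_index(level="Class", drop=True)

print(by_name)
print(dropped)
```

Here by_name keeps Class as its index and gains a Name column, while dropped has only the Score column, indexed by Name.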

Notes

  • reset_index() does not modify the original DataFrame by default. To modify it in place, pass inplace=True.
  • By default (drop=False), the original index is retained as a new column. If you do not need it, pass drop=True.
  • Before calling reset_index(drop=True), make sure the index holds no information you still need, because the dropped index cannot be recovered from the result.
  • For time-series data, resetting the index turns the time index into an ordinary column. To use time-based indexing again, restore it with set_index().
  • In chained operations (e.g., filtering then sorting), the index labels can end up non-consecutive or out of order after each step. Calling reset_index() at the right point keeps the code predictable.
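
For the time-series note, a round trip might look like the following sketch. The dates, the "Date" and "Value" names, and the filter are all made up for illustration.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=3, freq="D", name="Date")
ts = pd.DataFrame({"Value": [10, 20, 30]}, index=dates)

# Move the dates into an ordinary column and work on the flat table
flat = ts.reset_index()
flat = flat[flat["Value"] > 10]

# Restore the time index so date-based selection works again
restored = flat.set_index("Date")

print(restored.loc["2024-01-02":"2024-01-03"])
```

Note that drop=True would have discarded the dates altogether, so the default drop=False is the one to use here.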
