Pandas Df Pivot
[ Pandas Common Functions](#)
* * *
`df.pivot()` is a member method of DataFrame used to **create a pivot table**. It reshapes data into a new layout based on specified row indices, column indices, and value columns.
Unlike `pivot_table()`, `pivot()` does not support aggregation operations and requires that combinations of row and column indices be unique; otherwise, it will raise an error.
**Word Definition**: `pivot` means "rotate, pivot", here referring to the "rotation" reshaping of data, similar to how Pivot tables can dynamically switch row and column dimensions.
* * *
## Basic Syntax and Parameters
`df.pivot()` is an instance method of DataFrame, called via dot notation.
### Syntax Format
DataFrame.pivot(index=None, columns, values=None)
### Parameter Description
* **Parameter**: `index`
* Type: Column name, label, list of labels, or None.
* Description: Column to use as the row index of the new DataFrame. If None, the existing row index is used.
* **Parameter**: `columns`
* Type: Column name or label.
* Description: **Required parameter**. Column to use as the column index of the new DataFrame, used to expand unique values into column names.
* **Parameter**: `values`
* Type: Column name, label, list of labels, or None.
* Description: Data column used to fill the DataFrame. If not specified, all columns except index and columns are used.
### Function Description
* **Return Value**: Returns a new DataFrame, i.e., a pivot table.
* **Limitation**: If there are duplicate combinations of index and columns, a ValueError will be raised. In such cases, use `pivot_table()`, which automatically performs aggregation.
* * *
## Examples
Let's go through a series of examples from simple to complex to fully master the usage of `df.pivot()`.
### Example 1: Basic Usage - Creating a Simple Pivot Table
## Example
import pandas as pd
# 1. Create sales data (each row represents a product-region-sales combination)
sales = pd.DataFrame({
'product': ['A','A','B','B','C','C'],
'region': ['North','South','North','South','North','South'],
'sales': [100,150,200,180,220,250]
})
print("=== Original Sales Data ===")
print(sales)
# 2. Use df.pivot() to create a pivot table
# Row index: product, Column index: region, Values: sales
pivot_result = sales.pivot(index='product', columns='region', values='sales')
print("n=== df.pivot() Pivot Table ===")
print(pivot_result)
**Expected Output:**
=== Original Sales Data === product region sales 0 A North 1001 A South 1502 B North 2003 B South 1804 C North 2205 C South 250=== df.pivot() Pivot Table === region North South product A 100 150 B 200 180 C 220 250
**Code Explanation:**
1. The original data is in long format, with each row representing the sales amount for a product in a specific region.
2. `pivot()` uses the product column as the row index, the region column as the column index, and sales as the values.
3. The result is a clear two-dimensional table that allows intuitive comparison of sales performance across different products and regions.
### Example 2: Multiple Value Columns - Pivotting Multiple Metrics Simultaneously
You can pivot multiple numeric columns at once to obtain a richer view of the data.
## Example
import pandas as pd
# 1. Create data including sales and quantity
sales = pd.DataFrame({
'product': ['A','A','B','B','A','A','B','B'],
'region': ['North','South','North','South','East','West','East','West'],
'sales': [100,150,200,180,120,130,190,210],
'quantity': [10,15,20,18,12,13,19,21]
})
print("=== Original Data ===")
print(sales)
# 2. Pivot only sales
print("n=== Sales Pivot Table ===")
sales_pivot = sales.pivot(index='product', columns='region', values='sales')
print(sales_pivot)
# 3. Pivot both sales and quantity (without specifying values, using all numeric columns)
print("n=== Multi-Metric Pivot ===")
multi_pivot = sales.pivot(index='product', columns='region')
print(multi_pivot)
**Expected Output:**
=== Original Data === product region sales quantity 0 A North 100 101 A South 150 152 B North 200 203 B South 180 184 A East 120 125 A West 130 136 B East 190 197 B West 210 21=== Sales Pivot Table === region East North South West product A 120 100 150 130 B 190 200 180 210=== Multi-Metric Pivot === sales quantity region East North South West East North South West product A 120 100 150 130 12 10 15 13 B 190 200 180 210 19 20 18 21
**Code Explanation:**
* When the `values` parameter is not specified, all columns except index and columns are used.
* Multi-column pivoting results in a multi-level column index, where the first level is the original column names and the second level is the values from the columns parameter.
* This approach facilitates simultaneous comparison of multiple metrics across different dimensions.
### Example 3: Multi-Level Indexing - Using Multiple Columns as Row Index
Multiple columns can be used to create a pivot table with hierarchical indexing.
## Example
import pandas as pd
# 1. Create data including year, quarter, and product information
data = pd.DataFrame({
'year': [2023,2023,2023,2023,2024,2024,2024,2024],
'quarter': ['Q1','Q1','Q2','Q2','Q1','Q1','Q2','Q2'],
'product': ['A','B','A','B','A','B','A','B'],
'sales': [100,150,200,180,220,250,230,270]
})
print("=== Original Data ===")
print(data)
# 2. Use multiple columns as row index (in list form)
print("n=== Multi-Level Row Index Pivot ===")
pivot_multi = data.pivot(index=['year','quarter'], columns='product', values='sales')
print(pivot_multi)
# 3. Reset index to convert back to a regular DataFrame
print("n=== After Reset Index ===")
pivot_reset = pivot_multi.reset_index()
print(pivot_reset)
**Expected Output:**
=== Original Data === year quarter product sales 0 2023 Q1 A 1001 2023 Q1 B 1502 2023 Q2 A 2003 2023 Q2 B 1804 2024 Q1 A 2205 2024 Q1 B 2506 2024 Q2 A 2307 2024 Q2 B 270=== Multi-Level Row Index Pivot === product A B year quarter 2023 Q1 100 150 2023 Q2 200 180 2024 Q1 220 250 2024 Q2 230 270=== After Reset Index === year quarter A B 0 2023 Q1 100 150 1 2023 Q2 200 180 2 2024 Q1 220 250 3 2024 Q2 230 270
**Code Explanation:**
* By using `index=['year', 'quarter']`, a pivot table with a multi-level row index can be created.
* This format is convenient for analyzing product performance across time dimensions.
* Using `reset_index()` converts the hierarchical index back into regular columns.
> **Note:** If there are duplicate combinations of index + columns in the original data, `pivot()` will raise `ValueError: Index contains duplicate entries`. At this point, use `pivot_table()`, which automatically aggregates duplicate values.
[ Pandas Common Functions](#)
YouTip