YouTip LogoYouTip

Pandas Df Pivot

[![Image 1: Pandas Common Functions](#) Pandas Common Functions](#) * * * `df.pivot()` is a member method of DataFrame used to **create a pivot table**. It reshapes data into a new layout based on specified row indices, column indices, and value columns. Unlike `pivot_table()`, `pivot()` does not support aggregation operations and requires that combinations of row and column indices be unique; otherwise, it will raise an error. **Word Definition**: `pivot` means "rotate, pivot", here referring to the "rotation" reshaping of data, similar to how Pivot tables can dynamically switch row and column dimensions. * * * ## Basic Syntax and Parameters `df.pivot()` is an instance method of DataFrame, called via dot notation. ### Syntax Format DataFrame.pivot(index=None, columns, values=None) ### Parameter Description * **Parameter**: `index` * Type: Column name, label, list of labels, or None. * Description: Column to use as the row index of the new DataFrame. If None, the existing row index is used. * **Parameter**: `columns` * Type: Column name or label. * Description: **Required parameter**. Column to use as the column index of the new DataFrame, used to expand unique values into column names. * **Parameter**: `values` * Type: Column name, label, list of labels, or None. * Description: Data column used to fill the DataFrame. If not specified, all columns except index and columns are used. ### Function Description * **Return Value**: Returns a new DataFrame, i.e., a pivot table. * **Limitation**: If there are duplicate combinations of index and columns, a ValueError will be raised. In such cases, use `pivot_table()`, which automatically performs aggregation. * * * ## Examples Let's go through a series of examples from simple to complex to fully master the usage of `df.pivot()`. ### Example 1: Basic Usage - Creating a Simple Pivot Table ## Example import pandas as pd # 1. Create sales data (each row represents a product-region-sales combination) sales = pd.DataFrame({ 'product': ['A','A','B','B','C','C'], 'region': ['North','South','North','South','North','South'], 'sales': [100,150,200,180,220,250] }) print("=== Original Sales Data ===") print(sales) # 2. Use df.pivot() to create a pivot table # Row index: product, Column index: region, Values: sales pivot_result = sales.pivot(index='product', columns='region', values='sales') print("n=== df.pivot() Pivot Table ===") print(pivot_result) **Expected Output:** === Original Sales Data === product region sales 0 A North 1001 A South 1502 B North 2003 B South 1804 C North 2205 C South 250=== df.pivot() Pivot Table === region North South product A 100 150 B 200 180 C 220 250 **Code Explanation:** 1. The original data is in long format, with each row representing the sales amount for a product in a specific region. 2. `pivot()` uses the product column as the row index, the region column as the column index, and sales as the values. 3. The result is a clear two-dimensional table that allows intuitive comparison of sales performance across different products and regions. ### Example 2: Multiple Value Columns - Pivotting Multiple Metrics Simultaneously You can pivot multiple numeric columns at once to obtain a richer view of the data. ## Example import pandas as pd # 1. Create data including sales and quantity sales = pd.DataFrame({ 'product': ['A','A','B','B','A','A','B','B'], 'region': ['North','South','North','South','East','West','East','West'], 'sales': [100,150,200,180,120,130,190,210], 'quantity': [10,15,20,18,12,13,19,21] }) print("=== Original Data ===") print(sales) # 2. Pivot only sales print("n=== Sales Pivot Table ===") sales_pivot = sales.pivot(index='product', columns='region', values='sales') print(sales_pivot) # 3. Pivot both sales and quantity (without specifying values, using all numeric columns) print("n=== Multi-Metric Pivot ===") multi_pivot = sales.pivot(index='product', columns='region') print(multi_pivot) **Expected Output:** === Original Data === product region sales quantity 0 A North 100 101 A South 150 152 B North 200 203 B South 180 184 A East 120 125 A West 130 136 B East 190 197 B West 210 21=== Sales Pivot Table === region East North South West product A 120 100 150 130 B 190 200 180 210=== Multi-Metric Pivot === sales quantity region East North South West East North South West product A 120 100 150 130 12 10 15 13 B 190 200 180 210 19 20 18 21 **Code Explanation:** * When the `values` parameter is not specified, all columns except index and columns are used. * Multi-column pivoting results in a multi-level column index, where the first level is the original column names and the second level is the values from the columns parameter. * This approach facilitates simultaneous comparison of multiple metrics across different dimensions. ### Example 3: Multi-Level Indexing - Using Multiple Columns as Row Index Multiple columns can be used to create a pivot table with hierarchical indexing. ## Example import pandas as pd # 1. Create data including year, quarter, and product information data = pd.DataFrame({ 'year': [2023,2023,2023,2023,2024,2024,2024,2024], 'quarter': ['Q1','Q1','Q2','Q2','Q1','Q1','Q2','Q2'], 'product': ['A','B','A','B','A','B','A','B'], 'sales': [100,150,200,180,220,250,230,270] }) print("=== Original Data ===") print(data) # 2. Use multiple columns as row index (in list form) print("n=== Multi-Level Row Index Pivot ===") pivot_multi = data.pivot(index=['year','quarter'], columns='product', values='sales') print(pivot_multi) # 3. Reset index to convert back to a regular DataFrame print("n=== After Reset Index ===") pivot_reset = pivot_multi.reset_index() print(pivot_reset) **Expected Output:** === Original Data === year quarter product sales 0 2023 Q1 A 1001 2023 Q1 B 1502 2023 Q2 A 2003 2023 Q2 B 1804 2024 Q1 A 2205 2024 Q1 B 2506 2024 Q2 A 2307 2024 Q2 B 270=== Multi-Level Row Index Pivot === product A B year quarter 2023 Q1 100 150 2023 Q2 200 180 2024 Q1 220 250 2024 Q2 230 270=== After Reset Index === year quarter A B 0 2023 Q1 100 150 1 2023 Q2 200 180 2 2024 Q1 220 250 3 2024 Q2 230 270 **Code Explanation:** * By using `index=['year', 'quarter']`, a pivot table with a multi-level row index can be created. * This format is convenient for analyzing product performance across time dimensions. * Using `reset_index()` converts the hierarchical index back into regular columns. > **Note:** If there are duplicate combinations of index + columns in the original data, `pivot()` will raise `ValueError: Index contains duplicate entries`. At this point, use `pivot_table()`, which automatically aggregates duplicate values. [![Image 2: Pandas Common Functions](#) Pandas Common Functions](#)
← Pandas Groupby TransformPandas Series Dt Day β†’