
Pandas Matplotlib

Data visualization is an important part of data analysis: it helps us understand and interpret patterns, trends, and relationships in data. Charts and other graphics turn complex numbers and statistics into images that are easy to read, which supports decision-making.

Pandas integrates with visualization libraries such as Matplotlib and Seaborn, making data visualization simple and efficient.

In Pandas, plotting is done mainly through the `DataFrame.plot()` and `Series.plot()` methods. Both are wrappers around the Matplotlib library that simplify chart drawing.

| **Chart Type** | **Description** | **Method** |
| --- | --- | --- |
| **Line Plot** | Shows the trend of data over time or other continuous variables | `df.plot(kind='line')` |
| **Bar Chart** | Compares data across different categories | `df.plot(kind='bar')` |
| **Horizontal Bar Chart** | Compares data across different categories, but bars are horizontal | `df.plot(kind='barh')` |
| **Histogram** | Shows the distribution of data | `df.plot(kind='hist')` |
| **Scatter Plot** | Shows the relationship between two numerical variables | `df.plot(kind='scatter', x='col1', y='col2')` |
| **Box Plot** | Shows data distribution, including median, quartiles, etc. | `df.plot(kind='box')` |
| **Density Plot** | Shows the density distribution of data | `df.plot(kind='kde')` |
| **Pie Chart** | Shows the proportion of different parts in the whole | `df.plot(kind='pie', y='col1')` |
| **Area Plot** | Shows cumulative values of data | `df.plot(kind='area')` |

The basic Pandas plotting methods cover most everyday visualization needs. For more complex visualizations, combine them with Matplotlib and Seaborn for finer chart customization.

* * *

## 1. Pandas Data Visualization Overview

The `plot()` method provided by Pandas can draw different types of charts, including line plots, bar charts, histograms, and scatter plots. It accepts many parameters for customizing the chart's style, color, labels, and more.

### 1. Basic `plot()` Method

| **Parameter** | **Description** |
| --- | --- |
| `kind` | Chart type, supports `'line'`, `'bar'`, `'barh'`, `'hist'`, `'box'`, `'kde'`, `'density'`, `'area'`, `'pie'`, `'scatter'`, etc. |
| `x` | Data column for the x-axis |
| `y` | Data column for the y-axis |
| `title` | Chart title |
| `xlabel` | X-axis label |
| `ylabel` | Y-axis label |
| `color` | Chart color |
| `figsize` | Chart size as (width, height) |
| `legend` | Whether to show the legend |

### 2. Common Chart Types

| **Chart Type** | **Description** | **Common Usage** |
| --- | --- | --- |
| **Line Plot** | Used to show data trends over time | `df.plot(kind='line')` |
| **Bar Chart** | Used to compare data between categories | `df.plot(kind='bar')` |
| **Horizontal Bar Chart** | Similar to a bar chart, but bars are horizontal | `df.plot(kind='barh')` |
| **Histogram** | Used to show data distribution (frequency distribution) | `df.plot(kind='hist')` |
| **Scatter Plot** | Used to show the relationship between two numerical variables | `df.plot(kind='scatter', x='col1', y='col2')` |
| **Box Plot** | Used to show data distribution, outliers, and quartiles | `df.plot(kind='box')` |
| **Density Plot** | Used to show the density distribution of data | `df.plot(kind='kde')` |
| **Pie Chart** | Used to show the proportion of each part to the whole | `df.plot(kind='pie', y='col1')` |
| **Area Plot** | Used to show cumulative values (similar to a line plot, but with the area under the line filled) | `df.plot(kind='area')` |

* * *

## 2. Data Visualization Examples

### 1. Line Plot

Line plots are typically used to show data trends over time.
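Because `plot()` wraps Matplotlib, the call returns a Matplotlib `Axes` object, which can then be adjusted through the Matplotlib API. The following is a minimal sketch of a two-series line plot that uses several parameters from the table above. The data, the `Costs` column, and the `Agg` backend setting are assumptions made for illustration, not part of the original tutorial.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data for illustration
df = pd.DataFrame({
    'Year': [2018, 2019, 2020, 2021],
    'Sales': [100, 150, 200, 250],
    'Costs': [80, 110, 150, 170],
})

# Passing a list to y draws one line per column; plot() returns a Matplotlib Axes
ax = df.plot(kind='line', x='Year', y=['Sales', 'Costs'],
             title='Sales and Costs', figsize=(8, 5), legend=True)

# Further customization through the Matplotlib API
ax.set_ylabel('Amount')
ax.grid(True)
```

To view the figure interactively, drop the `matplotlib.use('Agg')` line and call `plt.show()` at the end.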
#### Example

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data
data = {'Year': [2015, 2016, 2017, 2018, 2019, 2020],
        'Sales': [100, 150, 200, 250, 300, 350]}
df = pd.DataFrame(data)

# Draw line plot
df.plot(kind='line', x='Year', y='Sales', title='Sales Over Years',
        xlabel='Year', ylabel='Sales', figsize=(10, 6))
plt.show()
```

### 2. Bar Chart

Bar charts are used to show comparisons between different categories and are especially suitable for discrete data.

#### Example

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data
data = {'Category': ['A', 'B', 'C', 'D'],
        'Value': [10, 15, 7, 12]}
df = pd.DataFrame(data)

# Draw bar chart
df.plot(kind='bar', x='Category', y='Value', title='Category Values',
        xlabel='Category', ylabel='Value', figsize=(8, 5))
plt.show()
```

### 3. Scatter Plot

Scatter plots are used to show the relationship between two numerical variables.

#### Example

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data
data = {'Height': [150, 160, 170, 180, 190],
        'Weight': [50, 60, 70, 80, 90]}
df = pd.DataFrame(data)

# Draw scatter plot
df.plot(kind='scatter', x='Height', y='Weight', title='Height vs Weight',
        xlabel='Height (cm)', ylabel='Weight (kg)', figsize=(8, 5))
plt.show()
```

### 4. Histogram

Histograms are used to show data distribution, especially the frequency distribution of data.

#### Example

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample data
data = {'Scores': [55, 70, 85, 90, 60, 75, 80, 95, 100, 65]}
df = pd.DataFrame(data)

# Draw histogram
df.plot(kind='hist', y='Scores', bins=5, title='Scores Distribution', xlabel='
```