Pandas Df Dropna
* * *
`df.dropna()` is a pandas DataFrame method that removes rows or columns containing missing values.
In data analysis, missing values (`NaN`, `None`, or `NaT`) are a common problem. `dropna()` provides a simple way to clean data: you can delete rows or columns that contain missing values, or keep only the rows or columns that have at least a given number of non-missing values.
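Before dropping anything, it often helps to see how many values are missing. The sketch below uses a made-up DataFrame (the column names `score`, `city`, and `joined` are assumptions for illustration) to show that `NaN`, `None`, and `NaT` are all counted as missing by `isna()`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: NaN, None and NaT are all treated as missing
df = pd.DataFrame({
    "score": [90.0, np.nan, 75.0],
    "city": ["Beijing", None, "Shanghai"],
    "joined": pd.to_datetime(["2024-01-01", None, "2024-03-01"]),
})

# Count the missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Rows that survive the default dropna() (only rows with no missing value)
remaining = df.dropna()
print(len(remaining))
```

Here each column has one missing entry, all in the same row, so the default `dropna()` removes one row and keeps two.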
* * *
## Basic Syntax and Parameters
`dropna()` is a DataFrame method, called on a DataFrame object with the dot operator, for example `df.dropna()`.
### Syntax Format
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
### Parameter Description
| Parameter | Type | Required | Description | Default Value |
| --- | --- | --- | --- | --- |
| axis | int or str | Optional | Specifies whether to delete rows or columns. `0` or `'index'` means deleting rows containing missing values; `1` or `'columns'` means deleting columns containing missing values. | 0 |
| how | str | Optional | Specifies the deletion condition. `'any'` means deleting if there is any missing value; `'all'` means deleting only if all values are missing. | 'any' |
| thresh | int | Optional | Sets the minimum number of non-missing values required to keep a row (or column). If a row (or column) has fewer than `thresh` non-missing values, it is deleted. In recent pandas versions, `thresh` cannot be combined with `how`. | None |
| subset | array-like | Optional | Specifies which columns (or rows) to check for missing values. If `axis=0`, `subset` takes column labels; if `axis=1`, `subset` takes row labels. | None |
| inplace | bool | Optional | If `True`, modifies the original DataFrame directly without returning a new object; if `False`, returns a new DataFrame, leaving the original data unchanged. | False |
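The parameters above can be compared side by side on a small DataFrame. This is a minimal sketch with made-up data (columns `a`, `b`, `c` are assumptions for illustration), in which row 1 and column `c` are entirely missing:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is entirely missing, column "c" is entirely missing
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, np.nan, 6.0, 7.0],
    "c": [np.nan, np.nan, np.nan, np.nan],
})

# how='all': drop only the rows in which every value is missing
rows_all = df.dropna(how="all")            # keeps rows 0, 2, 3

# axis=1 with how='all': drop only the columns that are entirely missing
cols_all = df.dropna(axis=1, how="all")    # keeps columns a, b

# thresh=2: keep rows that have at least 2 non-missing values
rows_thresh = df.dropna(thresh=2)          # keeps rows 2, 3

# subset: only column "a" is checked when deciding which rows to drop
rows_subset = df.dropna(subset=["a"])      # keeps rows 0, 2, 3

print(rows_all.index.tolist())
print(cols_all.columns.tolist())
print(rows_thresh.index.tolist())
print(rows_subset.index.tolist())
```

Note that the default call `df.dropna()` would return an empty DataFrame here, because every row has a missing value in column `c`. This is a common reason to reach for `how`, `thresh`, or `subset` instead of the defaults.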
### Return Value Description
* Returns a new DataFrame (if `inplace=False`), or `None` (if `inplace=True`).
* With the default arguments, the returned DataFrame contains no missing values. With `how='all'`, `thresh`, or `subset`, some missing values may remain in the result.
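The difference between the two return modes can be checked directly. The following is a small sketch on made-up data (the columns `x` and `y` are assumptions for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", "b", None]})

# Default (inplace=False): a new DataFrame is returned, df is left untouched
cleaned = df.dropna()
rows_before = len(df)          # still 3
rows_cleaned = len(cleaned)    # 1, only row 0 has no missing value

# inplace=True: df itself is modified and the call returns None
result = df.dropna(inplace=True)
rows_after = len(df)           # now 1

print(rows_before, rows_cleaned, result, rows_after)
```

Because the in-place call returns `None`, writing `df = df.dropna(inplace=True)` would replace the DataFrame with `None`. Assigning the result of the default call (`df = df.dropna()`) avoids that mistake.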
* * *
## Examples
The following examples show how to use `dropna()` in practice.
### Example 1: Deleting Rows with Missing Values
This is the most common usage, deleting any row that contains missing values.
import pandas as pd
import numpy as np
# Create a DataFrame containing missing values
data = {
    'Name': ['Zhang San', 'Li Si', 'Wang Wu', 'Zhao Liu', 'Qian Qi'],
    'Age': [25, 30, np.nan, 35, 28],             # Wang Wu's age is missing
    'Salary': [5000, 6000, 5500, np.nan, 7000],  # Zhao Liu's salary is missing
    'Department': ['Tech', 'Marketing', 'Tech', 'Marketing', 'Tech']
}
df = pd.DataFrame(data)
print("Original Data:")
print(df)
print("=" * 50)
# Delete rows containing missing values (default behavior)
df_cleaned = df.dropna()
print("Data after deleting missing values:")
print(df_cleaned)
**Expected Output:**
Original Data:
        Name   Age  Salary Department
0  Zhang San  25.0  5000.0       Tech
1      Li Si  30.0  6000.0  Marketing
2    Wang Wu   NaN  5500.0       Tech
3   Zhao Liu  35.0     NaN  Marketing
4    Qian Qi  28.0  7000.0       Tech
==================================================
Data after deleting missing values:
        Name   Age  Salary Department
0  Zhang San  25.0  5000.0       Tech
1      Li Si  30.0  6000.0  Marketing
4    Qian Qi  28.0  7000.0       Tech
**Code Analysis:**
1. We created a DataFrame containing missing values, where Wang Wu's age is `NaN` and Zhao Liu's salary is `NaN`.
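2. Using the default `df.dropna()` (`axis=0`, `how='any'`), rows 2 and 3 are deleted because each contains a missing value; the remaining rows keep their original index labels.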
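The output of Example 1 keeps the original row labels 0, 1, and 4, leaving a gap in the index. If a continuous index is wanted, one possible follow-up step (not part of the original example) is to chain `reset_index(drop=True)`, sketched here on the same data:

```python
import numpy as np
import pandas as pd

# Same data as Example 1
df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu", "Qian Qi"],
    "Age": [25, 30, np.nan, 35, 28],
    "Salary": [5000, 6000, 5500, np.nan, 7000],
    "Department": ["Tech", "Marketing", "Tech", "Marketing", "Tech"],
})

df_cleaned = df.dropna()
print(df_cleaned.index.tolist())       # [0, 1, 4]: original labels are kept

# Renumber from 0; drop=True discards the old index instead of keeping it as a column
df_renumbered = df_cleaned.reset_index(drop=True)
print(df_renumbered.index.tolist())    # [0, 1, 2]
```

Whether to reset the index depends on the task: keeping the original labels makes it easy to trace rows back to the source data, while a fresh index is convenient for positional access.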