Pandas Loc Iloc
Selecting data is one of the most common operations in pandas. Knowing how `loc`, `iloc`, `at`, and `iat` differ, and when to use each, helps you write selection code that is both correct and efficient.
* * *
## Difference Between loc and iloc
| Feature | loc | iloc |
| --- | --- | --- |
| Indexing Method | Label-based | Integer-based |
| Slicing | Includes end position | Does not include end position |
| Single Value | Returns scalar | Returns scalar |
| Recommended Scenario | When explicit index labels exist | When selecting by position |
## Examples
import pandas as pd
# Create sample DataFrame
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu","Zhao Liu","Qi Qian"],
"Age": [25,30,28,35,22],
"City": ["Beijing","Shanghai","Guangzhou","Shenzhen","Hangzhou"]
}, index=[1, 3, 5, 7, 9])  # Note: the index labels are not consecutive
print("DataFrame:")
print(df)
print()
# loc: Use label-based indexing (includes end position)
print("df.loc[1:5] (label slicing, includes 5):")
print(df.loc[1:5])
print()
# iloc: Use position-based indexing (does not include end position)
print("df.iloc[0:2] (position slicing, does not include 2):")
print(df.iloc[0:2])
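Because the sample index holds integers, the same number means different things to the two accessors. The following is a minimal sketch of that difference, reusing part of the sample data above; the names `by_label` and `by_position` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu", "Qi Qian"],
    "Age": [25, 30, 28, 35, 22],
}, index=[1, 3, 5, 7, 9])

# loc treats 3 as an index label: the row whose label is 3
by_label = df.loc[3]

# iloc treats 3 as a position: the fourth row, whose label is 7
by_position = df.iloc[3]

print(by_label["Name"])
print(by_position["Name"])

# The slice lengths differ too: loc includes the end label, iloc excludes the end position
print(len(df.loc[1:5]), len(df.iloc[0:2]))
```

With a non-consecutive integer index like this one, mixing up the two accessors silently returns a different row instead of raising an error.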
* * *
## Usage of loc
### Selecting Rows
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu","Zhao Liu"],
"Age": [25,30,28,35],
"City": ["Beijing","Shanghai","Guangzhou","Shenzhen"]
}, index=["a","b","c","d"])
# Select single row (returns Series)
print("Select one row:")
print(df.loc["a"])
print()
# Select multiple rows
print("Select multiple rows:")
print(df.loc[["a","c"]])
print()
# Slice selection (includes start and end)
print("Slice selection:")
print(df.loc["a":"c"])
### Selecting Columns
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu"],
"Age": [25,30,28],
"City": ["Beijing","Shanghai","Guangzhou"]
}, index=["a","b","c"])
# Select single column
print("Select single column:")
print(df.loc[:,"Name"])
print()
# Select multiple columns
print("Select multiple columns:")
print(df.loc[:,["Name","City"]])
print()
# Slice selection for columns
print("Slice selection for columns:")
print(df.loc[:,"Name":"City"])
### Selecting Specific Rows and Columns (Recommended Approach)
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu","Zhao Liu"],
"Age": [25,30,28,35],
"City": ["Beijing","Shanghai","Guangzhou","Shenzhen"]
}, index=["a","b","c","d"])
# Select specific rows and columns
print("Select single value:")
print(df.loc["a", "Name"])  # Returns "Zhang San"
print(type(df.loc["a", "Name"]))  # Type is str
print()
# Select multiple rows and columns
print("Select subset:")
print(df.loc[["a","c"],["Name","City"]])
print()
# Conditional selection
print("Rows where age > 28:")
print(df.loc[df["Age"] > 28])
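A boolean mask can also be combined with a column selection in a single `loc` call. This is a minimal sketch, assuming the same sample data as above; `mask` and `subset` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
    "Age": [25, 30, 28, 35],
    "City": ["Beijing", "Shanghai", "Guangzhou", "Shenzhen"]
}, index=["a", "b", "c", "d"])

# Boolean Series: True for rows where Age is greater than 28
mask = df["Age"] > 28

# Rows are picked by the mask, columns by label
subset = df.loc[mask, ["Name", "City"]]
print(subset)
```

Passing the row filter and the column list together avoids chaining two separate selections.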
* * *
## Usage of iloc
### Selection by Position
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu","Zhao Liu"],
"Age": [25,30,28,35],
"City": ["Beijing","Shanghai","Guangzhou","Shenzhen"]
})
# Select row 0
print("Select row 0:")
print(df.iloc[0])
print()
# Select first 3 rows
print("Select first 3 rows:")
print(df.iloc[:3])
print()
# Select specific rows
print("Select specific rows:")
print(df.iloc[[0,2,3]])
print()
# Negative indexing (from end)
print("Select last row:")
print(df.iloc[-1])
print()
print("Select last 3 rows:")
print(df.iloc[-3:])
### Selecting Columns
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu"],
"Age": [25,30,28],
"City": ["Beijing","Shanghai","Guangzhou"]
})
# Select column 0
print("Column 0:")
print(df.iloc[:,0])
print()
# Select columns 1 and 2
print("Select multiple columns:")
print(df.iloc[:,[1,2]])
print()
# Slice selection
print("Slice selection for columns:")
print(df.iloc[:,0:2])
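Row and column positions can be given together, mirroring the combined `loc` form. This is a small sketch using the same sample data; `value` and `block` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu"],
    "Age": [25, 30, 28],
    "City": ["Beijing", "Shanghai", "Guangzhou"]
})

# Single cell: row position 1, column position 2
value = df.iloc[1, 2]
print(value)

# Sub-table: first two rows, columns at positions 0 and 2
block = df.iloc[0:2, [0, 2]]
print(block)
```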
* * *
## at and iat (Getting Single Values)
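`at` and `iat` are accessors designed for getting or setting a single value. For scalar access they are faster than `loc` and `iloc`, which carry extra overhead because they also have to handle slices, lists, and boolean masks.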
## Examples
import pandas as pd
import time
import numpy as np
# Create large DataFrame for performance testing
df = pd.DataFrame(np.random.randn(1000, 10), columns=[f"col_{i}" for i in range(10)])
# at: Get single value (label-based indexing)
print(f"at: {df.at[0, 'col_0']}")
# iat: Get single value (position-based indexing)
print(f"iat: {df.iat[0, 0]}")
# Performance comparison
n = 10000
start = time.time()
for _ in range(n):
    _ = df.iloc[0, 0]
print(f"iloc time: {time.time() - start:.4f}s")
start = time.time()
for _ in range(n):
    _ = df.iat[0, 0]
print(f"iat time: {time.time() - start:.4f}s")
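The example above only reads values, but both accessors can also write a single cell. This is a minimal sketch on a small, made-up DataFrame (the data is for illustration only).

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Zhang San", "Li Si"], "Age": [25, 30]},
    index=["a", "b"],
)

# at: set one cell by row label and column label
df.at["a", "Age"] = 26

# iat: set one cell by row position and column position
df.iat[1, 1] = 31

print(df)
```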
* * *
## Conditional Selection
Using boolean conditions to filter data is one of the most common operations.
## Examples
import pandas as pd
df = pd.DataFrame({
"Name": ["Zhang San","Li Si","Wang Wu","Zhao Liu"],
"Age": [25,30,28,35],
"City": ["Beijing","Shanghai","Guangzhou","Beijing"],
"Salary": [12000,15000,11000,18000]
})
# Single condition
print("Employees with age > 28:")
print(df[df["Age"] > 28])
print()
# Multiple conditions (using & | ~)
print("Beijing and salary > 12000:")
print(df[(df["City"] == "Beijing") & (df["Salary"] > 12000)])
print()
# isin filtering
print("City is Beijing or Shanghai:")
print(df[df["City"].isin(["Beijing", "Shanghai"])])
print()
# String contains
print("Name contains 'San':")
print(df[df["Name"].str.contains("San")])
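The listing above combines conditions with `&` only. The sketch below illustrates the other two operators, `|` (or) and `~` (not), on the same sample data; `either` and `not_beijing` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Zhang San", "Li Si", "Wang Wu", "Zhao Liu"],
    "Age": [25, 30, 28, 35],
    "City": ["Beijing", "Shanghai", "Guangzhou", "Beijing"],
    "Salary": [12000, 15000, 11000, 18000]
})

# | keeps rows where at least one condition holds
either = df[(df["City"] == "Shanghai") | (df["Salary"] > 15000)]
print(either)

# ~ inverts a condition: everyone who is not in Beijing
not_beijing = df[~(df["City"] == "Beijing")]
print(not_beijing)
```

Each comparison needs its own parentheses because `&`, `|`, and `~` bind more tightly than `==` and `>` in Python.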
* * *
## Notes
**1. Slicing includes end position**
`loc` slicing includes the end position, while `iloc` does not.
**2. Non-existent index will raise an error**
When you pass `loc` a label that is not in the index, it raises a `KeyError`. If some labels may be missing, use `reindex` instead: it returns a row for every requested label and fills the missing ones with `NaN`.
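A minimal sketch of both behaviours on a small, made-up DataFrame. The `index.intersection` variant at the end is an additional option not mentioned above, shown for cases where missing labels should simply be dropped.

```python
import pandas as pd

df = pd.DataFrame({"Age": [25, 30, 28]}, index=["a", "b", "c"])

# loc raises KeyError for a label that is not in the index
try:
    df.loc["z"]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

wanted = ["a", "z"]

# reindex keeps every requested label and fills missing rows with NaN
filled = df.reindex(wanted)
print(filled)

# Alternatively, keep only the labels that actually exist
present = df.index.intersection(wanted)
print(df.loc[present])
```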
**3. Prefer using `loc`**
Prefer `loc` unless you specifically need to select by position: label-based selection is more readable and less error-prone.
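One way to see why labels are less fragile: a label keeps pointing at the same row after the data is reordered, while a position does not. This is a small sketch with made-up data; `by_age`, `label_pick`, and `position_pick` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Zhang San", "Li Si", "Wang Wu"], "Age": [25, 30, 28]},
    index=["a", "b", "c"],
)

# Reorder the rows: oldest first
by_age = df.sort_values("Age", ascending=False)

# The label "a" still refers to Zhang San
label_pick = by_age.loc["a", "Name"]

# Position 0 now refers to whoever sorts first
position_pick = by_age.iloc[0, 0]

print(label_pick, "|", position_pick)
```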
> `at`/`iat` are the fastest ways to access single values, while `loc`/`iloc` are used for selecting multiple values. Choosing the appropriate method can improve code performance and readability.