# Pandas Handling Large Files (chunksize)
A large data file may not fit in memory when loaded all at once. Pandas can read such files in chunks, so the data is processed in batches.
* * *
## Reading CSV in Chunks
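### chunksize Parameter
Passing `chunksize=n` to `pd.read_csv` returns an iterator instead of a single DataFrame. Each iteration yields a DataFrame of at most `n` rows.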
#### Example
import pandas as pd
import io

# Simulate large file content (replace with a file path in actual use)
data = """id,value
1,100
2,200
3,300
4,400
5,500
6,600
7,700
8,800
9,900
10,1000
"""

# Use chunksize to read the data three rows at a time
chunks = pd.read_csv(io.StringIO(data), chunksize=3)
print("Chunk processing:")
for i, chunk in enumerate(chunks):
    print(f"\nChunk {i + 1}:")
    print(chunk)
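A common follow-up is to keep only the rows of interest from each chunk and combine them at the end, so the full file never has to sit in memory. The following is a minimal sketch rather than a definitive recipe. The helper name `filter_in_chunks` and the `min_value` threshold are hypothetical, introduced only for illustration.

```python
import io

import pandas as pd


def filter_in_chunks(source, min_value, chunksize=3):
    """Return rows whose 'value' is >= min_value, reading `source` in chunks.

    `filter_in_chunks` and `min_value` are illustrative names, not pandas API.
    """
    kept = []
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for chunk in pd.read_csv(source, chunksize=chunksize):
        kept.append(chunk[chunk["value"] >= min_value])
    # Combine the filtered pieces into one DataFrame with a fresh index.
    return pd.concat(kept, ignore_index=True)


# Same sample data as the example above (replace with a file path in real use).
data = "id,value\n" + "\n".join(f"{i},{i * 100}" for i in range(1, 11))
result = filter_in_chunks(io.StringIO(data), min_value=800)
print(result)
```

Only the filtered rows are retained between iterations, so peak memory depends on the chunk size and on how many rows pass the filter, not on the size of the file.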
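### Aggregation Processing
Because each chunk is an ordinary DataFrame, running totals can be accumulated across chunks without holding the whole file in memory.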
#### Example
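import pandas as pd
import io

# Simulate a large file: a "value" column holding the integers 1 to 1000
data = "value\n" + "\n".join(str(i) for i in range(1, 1001))

# Accumulate the sum and the row count one chunk at a time
total = 0
count = 0
for chunk in pd.read_csv(io.StringIO(data), chunksize=100):
    total += chunk["value"].sum()
    count += len(chunk)

print(f"Total: {total}, rows: {count}, mean: {total / count}")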
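The same idea extends to grouped aggregation: aggregate within each chunk, then combine the partial results. The sketch below is one possible approach under stated assumptions. The `category` column and the sample data are hypothetical and are not part of the example above.

```python
import io

import pandas as pd

# Hypothetical sample data: the 'category' and 'value' columns are
# assumptions made for this sketch.
data = "category,value\n" + "\n".join(
    f"{'abc'[i % 3]},{i}" for i in range(1, 1001)
)

partials = []
for chunk in pd.read_csv(io.StringIO(data), chunksize=100):
    # Aggregate within the chunk first; each partial result is small.
    partials.append(chunk.groupby("category")["value"].sum())

# Summing the partial sums gives the overall per-category sums.
totals = pd.concat(partials).groupby(level=0).sum()
print(totals)
```

This combining step is valid for sums, counts, minima, and maxima. A mean should be derived from a combined sum and count instead of averaging per-chunk means, because chunks may contain different numbers of rows per group.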