# Pandas Datetime
Pandas has built-in support for dates and times: it can convert strings to datetime types and perform date-based calculations and analysis.
* * *
## Creating Date and Time
### date_range Function
## Example
import pandas as pd
# Create a date range
dates = pd.date_range("2024-01-01", periods=10, freq="D")
print("Daily:")
print(dates)
print()
# Monthly (freq="M" gives month-end dates; pandas 2.2+ prefers the alias "ME")
dates_month = pd.date_range("2024-01-01", periods=12, freq="M")
print("Monthly:")
print(dates_month)
print()
# Hourly (pandas 2.2+ prefers the lowercase alias "h")
dates_hour = pd.date_range("2024-01-01 00:00", periods=24, freq="H")
print("Hourly (first 5):")
print(dates_hour[:5])
### DatetimeIndex and DatetimeArray
## Example
import pandas as pd
# Create from Timestamp objects
dt = pd.DatetimeIndex([
    pd.Timestamp("2024-01-01"),
    pd.Timestamp("2024-01-02"),
    pd.Timestamp("2024-01-03")
])
print("DatetimeIndex:")
print(dt)
print()
# Create a Series with time index
s = pd.Series(
    [100, 200, 300],
    index=pd.date_range("2024-01-01", periods=3, freq="D")
)
print("Series with date index:")
print(s)
* * *
## Converting Strings to Dates
### pd.to_datetime
## Example
import pandas as pd
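# Strings in several different formats
dates_str = ["2024-01-01", "2024/01/02", "01-03-2024", "20240104"]
# Infer the format of each string individually
# (format="mixed" requires pandas 2.0+; without it, pandas 2.x infers one
# format from the first string and raises on strings that do not match)
dt = pd.to_datetime(dates_str, format="mixed")
print("Auto-inferred format:")
print(dt)
print()
# Specify the format explicitly; every string must match it
dt2 = pd.to_datetime(["2024-01-01", "2024-01-02"], format="%Y-%m-%d")
print("Specified format:")
print(dt2)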
print()
# Handle invalid values
dt3 = pd.to_datetime(["2024-01-01","invalid","2024-01-03"], errors="coerce")
print("Handle invalid values (converted to NaT):")
print(dt3)
### read_csv Automatic Date Parsing
## Example
import pandas as pd
from io import StringIO
# Simulate CSV data
csv_data = """Date,Sales
2024-01-01,100
2024-01-02,200
2024-01-03,150
"""
# Method 1: Read, then convert the Date column
df = pd.read_csv(StringIO(csv_data))
df["Date"] = pd.to_datetime(df["Date"])
print("Convert after reading:")
print(df.dtypes)
print()
# Method 2: Parse the Date column during reading
df2 = pd.read_csv(StringIO(csv_data), parse_dates=["Date"])
print("Parse during reading:")
print(df2.dtypes)
* * *
## Accessing Date Properties
Once a Series has a datetime dtype, its components (year, month, day, weekday, and so on) can be read through the `.dt` accessor.
## Example
import pandas as pd
# Create a date Series
s = pd.Series(pd.date_range("2024-01-15", periods=5, freq="D"))
print("Date Series:")
print(s)
print()
# Extract year/month/day
print("Extract year:")
print(s.dt.year)
print()
print("Extract month:")
print(s.dt.month)
print()
print("Extract day:")
print(s.dt.day)
print()
# Extract day of week (0=Monday, 6=Sunday)
print("Day of week (number):")
print(s.dt.dayofweek)
print()
# Extract day name
print("Day name:")
print(s.dt.day_name())
### More Date Properties
| Property | Description | Example |
| --- | --- | --- |
| `year` | Year | 2024 |
| `month` | Month (1-12) | 1 |
| `day` | Day (1-31) | 15 |
| `hour` | Hour (0-23) | 10 |
| `minute` | Minute (0-59) | 30 |
| `dayofweek` | Day of week (0-6) | 0 |
| `quarter` | Quarter (1-4) | 1 |
| `is_month_start` | Is beginning of month | True/False |
| `is_month_end` | Is end of month | True/False |
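
The properties in the table that the example above does not use can be read the same way. The following is a minimal sketch; the three sample timestamps are made up for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical sample timestamps chosen to exercise each property
s = pd.Series(pd.to_datetime([
    "2024-01-01 10:30",
    "2024-01-31 23:59",
    "2024-04-15 08:05",
]))

# Collect several .dt properties side by side for comparison
props = pd.DataFrame({
    "hour": s.dt.hour,
    "minute": s.dt.minute,
    "dayofweek": s.dt.dayofweek,
    "quarter": s.dt.quarter,
    "is_month_start": s.dt.is_month_start,
    "is_month_end": s.dt.is_month_end,
})
print(props)
```

Each column has one value per timestamp, so the result can be joined back onto the original data or used directly as a filter mask.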
* * *
## Date Operations
### Date Addition and Subtraction
## Example
import pandas as pd
# Create a date
date = pd.Timestamp("2024-01-15")
print(f"Base date: {date}")
print()
# Add/subtract days
print(f"+3 days: {date + pd.Timedelta(days=3)}")
print(f"-5 days: {date - pd.Timedelta(days=5)}")
print()
# Date difference
date1 = pd.Timestamp("2024-01-01")
date2 = pd.Timestamp("2024-01-15")
print(f"Date difference: {date2 - date1}")
print(f"Days difference: {(date2 - date1).days}")
print()
# Date Series operations
dates = pd.date_range("2024-01-01", periods=5, freq="D")
print("Date + 3 days:")
print(dates + pd.Timedelta(days=3))
### Date Offsets
## Example
import pandas as pd
date = pd.Timestamp("2024-01-15")
print(f"Base date: {date}")
print()
# Roll forward to the next month start / month end
print(f"Next month start: {date + pd.offsets.MonthBegin(1)}")
print(f"End of month: {date + pd.offsets.MonthEnd(1)}")
print()
# Year offset
print(f"Add 1 year: {date + pd.DateOffset(years=1)}")
print(f"Subtract 1 month: {date + pd.DateOffset(months=-1)}")
print()
# Weekday
print(f"Next Monday: {date + pd.offsets.Week(weekday=0)}")
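
It may help to see how calendar-aware offsets differ from fixed durations. The sketch below uses dates picked for illustration (not from the original example) and compares `pd.DateOffset(months=1)` with `pd.Timedelta(days=30)`, then shows `MonthEnd(0)`, which rolls forward to the month end only when the date is not already on it.

```python
import pandas as pd

date = pd.Timestamp("2024-01-31")

# Calendar-aware: one month later, clipped to the last valid day of February
plus_month = date + pd.DateOffset(months=1)
# Fixed duration: exactly 30 days later
plus_30_days = date + pd.Timedelta(days=30)

print(f"DateOffset(months=1): {plus_month.date()}")    # 2024-02-29
print(f"Timedelta(days=30):   {plus_30_days.date()}")  # 2024-03-01

# MonthEnd(0) moves to the end of the current month and leaves
# a date that is already a month end unchanged
month_end = pd.Timestamp("2024-01-15") + pd.offsets.MonthEnd(0)
unchanged = date + pd.offsets.MonthEnd(0)
print(f"MonthEnd(0) from Jan 15: {month_end.date()}")  # 2024-01-31
print(f"MonthEnd(0) from Jan 31: {unchanged.date()}")  # 2024-01-31
```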
* * *
## Timezone Handling
## Example
import pandas as pd
# Create timezone-naive timestamps (pandas 2.2+ prefers the lowercase alias "h")
dates = pd.date_range("2024-01-01 10:00", periods=3, freq="H")
print("Without timezone:")
print(dates)
print()
# Set timezone
dates_utc = dates.tz_localize("UTC")
print("Set UTC timezone:")
print(dates_utc)
print()
# Convert timezone
dates_shanghai = dates_utc.tz_convert("Asia/Shanghai")
print("Converted to Shanghai timezone:")
print(dates_shanghai)
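
The two methods are easy to mix up: `tz_localize` attaches a timezone to naive wall-clock times without shifting them, while `tz_convert` re-expresses the same instant in another timezone. A small sketch, assuming the naive time was recorded in Shanghai local time:

```python
import pandas as pd

naive = pd.Timestamp("2024-01-01 10:00")

# Attach a timezone: the wall-clock time stays 10:00
local = naive.tz_localize("Asia/Shanghai")
# Convert: same instant, shown as 02:00 in UTC (Shanghai is UTC+8)
utc = local.tz_convert("UTC")

print(local)
print(utc)
print("Same instant:", local == utc)

# Passing None drops the timezone and keeps the current wall-clock time
naive_utc = utc.tz_localize(None)
print(naive_utc)
```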
* * *
## Practical Example: Sales Data Analysis
## Example
import pandas as pd
# Simulate sales data
df = pd.DataFrame({
    "Date": pd.date_range("2024-01-01", periods=30, freq="D"),
    "Sales": [100, 150, 120, 180, 200, 90, 80] * 4 + [100, 100]
})
# Ensure the Date column has a datetime dtype (already true here; needed when dates arrive as strings)
df["Date"] = pd.to_datetime(df["Date"])
print("Sales data:")
print(df.head(10))
print()
# Group by weekday
print("Average sales by weekday:")
weekday_sales = df.groupby(df["Date"].dt.day_name())["Sales"].mean()
print(weekday_sales)
print()
# Group by month
print("Monthly summary:")
df["Month"] = df["Date"].dt.to_period("M")
monthly_sales = df.groupby("Month")["Sales"].sum()
print(monthly_sales)
print()
# Calculate 7-day rolling average (the first 6 rows are NaN until the window fills)
df["Rolling_7D"] = df["Sales"].rolling(window=7).mean()
print("Added 7-day rolling average:")
print(df.head(10))
* * *
## Common Issues
**1. Inconsistent Date Formats**
When the input format is known, pass it to `pd.to_datetime` through the `format` parameter so that ambiguous strings such as `01-03-2024` are not misread and malformed values fail loudly.
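
As an illustration of why the format matters, the sketch below parses the same made-up strings as month-first and as day-first. Which reading is correct depends on where the data comes from, so it is an assumption to confirm against the source.

```python
import pandas as pd

# Hypothetical ambiguous date strings
raw = ["01-03-2024", "02-03-2024"]

# Month-first reading: January 3 and February 3
as_mdy = pd.to_datetime(raw, format="%m-%d-%Y")
# Day-first reading: March 1 and March 2
as_dmy = pd.to_datetime(raw, format="%d-%m-%Y")

print("Month-first:", list(as_mdy.strftime("%Y-%m-%d")))
print("Day-first:  ", list(as_dmy.strftime("%Y-%m-%d")))
```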
**2. Timezone Confusion**
When handling cross-timezone data, convert all timestamps to a single timezone, typically UTC, before comparing or combining them.
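
One way to do this at parse time is the `utc=True` argument of `pd.to_datetime`. The sketch below uses two made-up strings with different UTC offsets that describe the same instant.

```python
import pandas as pd

# Hypothetical timestamps recorded with different UTC offsets
raw = ["2024-01-01 10:00+08:00", "2024-01-01 03:00+01:00"]

# utc=True converts every value to UTC while parsing
ts = pd.to_datetime(raw, utc=True)
print(ts)

# Both strings refer to 02:00 UTC
print("Same instant:", ts[0] == ts[1])
```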
**3. Date Operations Using Timedelta**
Use `pd.Timedelta` for date addition and subtraction; adding a plain integer to a `Timestamp` raises a `TypeError` because the unit is ambiguous.
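
A short sketch of the difference, assuming pandas 1.0 or later (earlier versions handled integer addition differently):

```python
import pandas as pd

date = pd.Timestamp("2024-01-15")

# Integer arithmetic is rejected because the unit is unclear
try:
    date + 3
    integer_add_failed = False
except TypeError as exc:
    integer_add_failed = True
    print(f"TypeError: {exc}")

# An explicit Timedelta states the unit
later = date + pd.Timedelta(days=3)
print(f"+3 days: {later.date()}")  # 2024-01-18
```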
> When working with time series data, it's recommended to convert date strings to datetime types early on to leverage Pandas' powerful date features.