Pandas pd.to_datetime()
* * *
`pd.to_datetime()` is a function in the Pandas library used to **convert data into datetime type**. It can convert data of various formats such as strings and Unix timestamps into Pandas' datetime64 type.
Handling date and time is a common task in data analysis. After converting to datetime type, it becomes convenient to perform operations like extracting dates, time calculations, and timezone handling.
**Name meaning**: `to_datetime` means "convert to datetime"; the function turns time data in various formats into standard datetime objects.
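
As a small illustration of why the conversion matters, the sketch below uses made-up order and delivery dates (the variable names are purely illustrative) to show that date arithmetic becomes available once the strings are parsed.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings
ordered = pd.Series(['2023-01-01', '2023-01-15'])
delivered = pd.Series(['2023-01-04', '2023-01-20'])

# Convert both columns to datetime values
ordered_dt = pd.to_datetime(ordered)
delivered_dt = pd.to_datetime(delivered)

# Subtracting datetime Series yields Timedelta values; .dt.days extracts whole days
shipping_days = (delivered_dt - ordered_dt).dt.days
print(shipping_days.tolist())  # [3, 5]
```

Subtracting the original string Series would fail, because Python strings do not support subtraction.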
* * *
## Basic Syntax and Parameters
`pd.to_datetime()` is a top-level function: it is called on the `pd` namespace rather than as a method of a Series or DataFrame.
### Syntax Format
pd.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=False, format=None, unit='ns', origin='unix')
### Parameter Description
| Parameter | Type | Description |
| --- | --- | --- |
| arg | integer, float, string, datetime, list, Series, DataFrame | Data to be converted to datetime type. |
| errors | string | Error handling method: 'raise' (default) raises an exception; 'coerce' converts unparseable values to NaT; 'ignore' returns the original input (deprecated since pandas 2.2). |
| dayfirst | boolean | If True, the day is parsed before the month (e.g., 01/02/2023 is read as 1 February 2023). |
| yearfirst | boolean | If True, the year is parsed first (e.g., 10/11/12 is read as 2010-11-12). |
| utc | boolean | If True, returns timezone-aware datetimes in UTC. |
| format | string | Specifies the date-time format, such as '%Y-%m-%d'. |
| unit | string | When arg is numeric, specifies the unit: 'D', 's', 'ms', 'us', 'ns'. |
| origin | scalar | Reference date for numeric input. The default 'unix' means 1970-01-01. |
### Return Value Description
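* **Return Value**: Depends on the input. A scalar returns a `Timestamp`, a list or array returns a `DatetimeIndex`, and a Series returns a Series with a datetime64 dtype.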
* **Effect**: Converts input data into Pandas' datetime type for convenient date-time operations.
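
A quick way to see what comes back for a given input is to inspect the type of the result. The sketch below is only an illustration; the variable names are arbitrary.

```python
import pandas as pd

scalar = pd.to_datetime('2023-01-01')                              # single string
index = pd.to_datetime(['2023-01-01', '2023-01-02'])               # Python list
series = pd.to_datetime(pd.Series(['2023-01-01', '2023-01-02']))   # Series

print(type(scalar).__name__)   # Timestamp
print(type(index).__name__)    # DatetimeIndex
print(type(series).__name__)   # Series
```

This matters in practice because the `.dt` accessor exists only on a Series, while a `DatetimeIndex` exposes the same attributes directly (for example `index.year`).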
* * *
## Examples
The following examples progress from simple to complex and cover the main uses of `pd.to_datetime()`.
### Example 1: Basic Usage - Convert String to Datetime
import pandas as pd
# 1. Create a Series of date strings
dates = pd.Series([
'2023-01-01',
'2023-01-02',
'2023-01-03',
'2023-01-04',
'2023-01-05'
])
print("=== Original Series (Type:", dates.dtype, ")===")
print(dates)
# 2. Use pd.to_datetime() to convert to datetime type
result = pd.to_datetime(dates)
print("\n=== After pd.to_datetime() (Type:", result.dtype, ")===")
print(result)
# 3. Parse several formats in one Series (format='mixed' requires pandas 2.0 or later)
dates_mixed = pd.Series([
'2023-01-01',
'2023/02/03',
'01.04.2023'
])
print("\n=== Multiple formats parsed with format='mixed' ===")
print(pd.to_datetime(dates_mixed, format='mixed'))
# 4. Localized strings are not auto-parsed and need an explicit format
print("\n=== Localized string with explicit format ===")
print(pd.to_datetime('2023年05月06日', format='%Y年%m月%d日'))
# 5. Include time information
dates_with_time = pd.Series([
'2023-01-01 12:30:45',
'2023-02-03 08:15:30'
])
result_datetime = pd.to_datetime(dates_with_time)
print("\n=== Include time information ===")
print(result_datetime)
**Output** (pandas 2.x):
=== Original Series (Type: object )===
0 2023-01-01
1 2023-01-02
2 2023-01-03
3 2023-01-04
4 2023-01-05
dtype: object
=== After pd.to_datetime() (Type: datetime64[ns] )===
0 2023-01-01
1 2023-01-02
2 2023-01-03
3 2023-01-04
4 2023-01-05
dtype: datetime64[ns]
=== Multiple formats parsed with format='mixed' ===
0 2023-01-01
1 2023-02-03
2 2023-01-04
dtype: datetime64[ns]
=== Localized string with explicit format ===
2023-05-06 00:00:00
=== Include time information ===
0 2023-01-01 12:30:45
1 2023-02-03 08:15:30
dtype: datetime64[ns]
**Code Explanation:**
1. When every value shares one format, `pd.to_datetime()` infers it from the data. When a Series mixes several formats, pandas 2.0 and later need `format='mixed'`, which parses each value individually.
2. If only the date part is provided, the time part defaults to 00:00:00. Pandas omits the time from the display when every value falls on midnight.
3. Separators such as dashes, slashes, and dots are recognized, but an ambiguous value like '01.04.2023' is read month-first by default (January 4). Localized strings, such as dates written with Chinese year, month, and day characters, need an explicit `format`.
### Example 2: Handling Different Date Formats and the dayfirst Parameter
For different regional date formats, you can use the `dayfirst` parameter or the `format` parameter.
import pandas as pd
# 1. European-style dates (day/month/year)
europe_dates = pd.Series(['01/02/2023','02/03/2023','03/04/2023'])
print("=== European-style dates (01/02/2023 should mean 1 February) ===")
print("Default (month/day/year):", pd.to_datetime(europe_dates).dt.day.tolist())
print("dayfirst=True:", pd.to_datetime(europe_dates, dayfirst=True).dt.day.tolist())
# 2. Use format parameter to specify exact format
print("\n=== Using format parameter ===")
result = pd.to_datetime('2023-06-15 14:30:00', format='%Y-%m-%d %H:%M:%S')
print(f"Conversion result: {result}")
# 3. Mixed formats with day-first dates (format='mixed' requires pandas 2.0 or later)
complex_dates = pd.Series(['25.12.2023','15/08/2023','2023-05-01'])
print("\n=== Mixed format parsing ===")
print(pd.to_datetime(complex_dates, format='mixed', dayfirst=True))
**Output** (pandas 2.x):
=== European-style dates (01/02/2023 should mean 1 February) ===
Default (month/day/year): [2, 3, 4]
dayfirst=True: [1, 2, 3]
=== Using format parameter ===
Conversion result: 2023-06-15 14:30:00
=== Mixed format parsing ===
0 2023-12-25
1 2023-08-15
2 2023-05-01
dtype: datetime64[ns]
**Code Explanation:**
* The `format` parameter explicitly tells Pandas how to parse dates, avoiding ambiguity.
* Auto-parsing is powerful, but when there's ambiguity (like interpreting 01/02/2023), you need to use either `dayfirst` or `format` parameters to clarify.
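
When the field order is known in advance, an explicit `format` is the stricter of the two options. The following sketch uses hypothetical day/month/year strings and shows that a value in a different layout is rejected rather than silently misread.

```python
import pandas as pd

# Hypothetical day/month/year strings
raw = pd.Series(['01/02/2023', '15/08/2023'])

# An explicit format removes any guessing about field order
parsed = pd.to_datetime(raw, format='%d/%m/%Y')
print(parsed.dt.day.tolist())    # [1, 15]
print(parsed.dt.month.tolist())  # [2, 8]

# A value that does not match the format raises a ValueError
try:
    pd.to_datetime(pd.Series(['2023-02-01']), format='%d/%m/%Y')
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)  # True
```

By contrast, `dayfirst=True` is only a parsing preference, so it suits loosely formatted input, while `format` suits data with one known layout.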
### Example 3: Converting Unix Timestamps to Datetime
Unix timestamps exported from databases or systems can be converted using the `unit` parameter.
import pandas as pd
# 1. Unix timestamps (seconds)
timestamps = pd.Series([1672531200,1672617600,1672704000])
print("=== Unix Timestamps (seconds) ===")
print(timestamps.values)
# 2. Convert to datetime
result = pd.to_datetime(timestamps, unit='s')
print("\n=== After pd.to_datetime(..., unit='s') ===")
print(result)
# 3. Millisecond Unix timestamps
timestamps_ms = pd.Series([1672531200000,1672617600000,1672704000000])
result_ms = pd.to_datetime(timestamps_ms, unit='ms')
print("\n=== Millisecond timestamps ===")
print(result_ms)
# 4. Count days from a fixed origin
print("\n=== Calculated from 2023-01-01 ===")
days = pd.Series([0,1,2,3,4])
result_origin = pd.to_datetime(days, origin='2023-01-01', unit='D')
print(result_origin)
**Output** (pandas 2.x):
=== Unix Timestamps (seconds) ===
[1672531200 1672617600 1672704000]
=== After pd.to_datetime(..., unit='s') ===
0 2023-01-01
1 2023-01-02
2 2023-01-03
dtype: datetime64[ns]
=== Millisecond timestamps ===
0 2023-01-01
1 2023-01-02
2 2023-01-03
dtype: datetime64[ns]
=== Calculated from 2023-01-01 ===
0 2023-01-01
1 2023-01-02
2 2023-01-03
3 2023-01-04
4 2023-01-05
dtype: datetime64[ns]
**Code Explanation:**
* `unit='s'` indicates that the numeric values are in seconds (Unix timestamp).
* `unit='ms'` indicates millisecond timestamps.
* The `origin` parameter can specify a starting time for calculating relative times.
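
The `unit` and `origin` parameters can also be combined with units other than days. As a minimal sketch, assume a hypothetical log that stores seconds elapsed since a measurement run started (the data and names below are invented for illustration):

```python
import pandas as pd

# Hypothetical readings: seconds elapsed since the run started
elapsed = pd.Series([0, 90, 3600])
start = pd.Timestamp('2023-01-01 08:00:00')

# Each number is interpreted as seconds counted from `start`
stamps = pd.to_datetime(elapsed, unit='s', origin=start)
labels = stamps.dt.strftime('%Y-%m-%d %H:%M:%S').tolist()
print(labels)
# ['2023-01-01 08:00:00', '2023-01-01 08:01:30', '2023-01-01 09:00:00']
```

The same result could be obtained with `start + pd.to_timedelta(elapsed, unit='s')`; which form reads better is largely a matter of taste.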
### Example 4: errors Parameter Handling Invalid Dates
import pandas as pd
# 1. Series containing invalid dates
mixed_dates = pd.Series(['2023-01-01','2023-02-30','invalid','2023-03-15'])
print("=== Series with invalid dates ===")
print(mixed_dates)
# 2. errors='raise' (default) - raise an exception
# The exact class is ValueError or one of its subclasses, depending on the pandas version
print("\n=== errors='raise' ===")
try:
    pd.to_datetime(mixed_dates, errors='raise')
except ValueError:
    print("Exception: ValueError")
# 3. errors='coerce' - convert invalid dates to NaT
print("\n=== errors='coerce' ===")
result = pd.to_datetime(mixed_dates, errors='coerce')
print(result)
# 4. errors='ignore' - return the input unchanged
# Deprecated since pandas 2.2, where it emits a FutureWarning; avoid it in new code
print("\n=== errors='ignore' ===")
result_ignore = pd.to_datetime(mixed_dates, errors='ignore')
print(result_ignore)
print(f"Type: {result_ignore.dtype}")
**Output** (pandas 2.x):
=== Series with invalid dates ===
0 2023-01-01
1 2023-02-30
2 invalid
3 2023-03-15
dtype: object
=== errors='raise' ===
Exception: ValueError
=== errors='coerce' ===
0 2023-01-01
1 NaT
2 NaT
3 2023-03-15
dtype: datetime64[ns]
=== errors='ignore' ===
0 2023-01-01
1 2023-02-30
2 invalid
3 2023-03-15
dtype: object
Type: object
**Code Explanation:**
* `errors='coerce'` converts invalid dates and unparseable values to NaT (Not a Time, equivalent to missing values).
* This is very useful when dealing with messy data, ensuring continuous data processing.
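
Because `errors='coerce'` hides failures, it is often worth checking which inputs were turned into NaT. One possible approach, sketched here with hypothetical data, is to compare the parsed result against the raw input:

```python
import pandas as pd

# Hypothetical raw column: two bad strings and one genuinely missing value
raw = pd.Series(['2023-01-01', '2023-02-30', 'invalid', None, '2023-03-15'])
parsed = pd.to_datetime(raw, errors='coerce')

# NaT in the result but a value in the input means parsing failed for that row
failed = parsed.isna() & raw.notna()
bad_values = raw[failed].tolist()
print(bad_values)          # ['2023-02-30', 'invalid']
print(int(failed.sum()))   # 2
```

The missing value at position 3 also becomes NaT, but it is not counted as a failure because the input was already empty.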
* * *
## Notes
> **Important Note:**
>
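> * With the default nanosecond resolution, representable timestamps run from roughly 1677-09-21 to 2262-04-11. Values outside that range raise an `OutOfBoundsDatetime` exception unless `errors='coerce'` is used.
> * The `format` parameter states the date format explicitly. Supplying it is recommended for large datasets, because pandas can skip format inference, which usually speeds up parsing.
> * A converted Series exposes the `.dt` accessor for date-time operations such as extracting the year, month, or weekday.
> * Time zone handling relies on a time zone library. In pandas 2.x and earlier, `pytz` is installed automatically as a pandas dependency, so a separate installation is normally unnecessary.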
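
To make the notes on the `.dt` accessor and time zones concrete, here is a brief sketch. The sample strings are taken from Example 1; everything else is illustrative.

```python
import pandas as pd

stamps = pd.to_datetime(pd.Series(['2023-01-01 12:30:45', '2023-02-03 08:15:30']))

# The .dt accessor exposes individual components of each timestamp
print(stamps.dt.year.tolist())        # [2023, 2023]
print(stamps.dt.month.tolist())       # [1, 2]
print(stamps.dt.hour.tolist())        # [12, 8]
print(stamps.dt.day_name().tolist())  # ['Sunday', 'Friday']

# utc=True returns timezone-aware values instead of naive ones
aware = pd.to_datetime(pd.Series(['2023-01-01 12:30:45']), utc=True)
print(stamps.dt.tz is None)   # True: no time zone attached
print(aware.dt.tz is None)    # False: values are in UTC
```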