Cheat sheet for working with datetime, dates and time in Pandas and Python. The cheat sheet try to show most popular operations in a short form. There is also a visual representation of the cheat sheet.
Pandas is a powerful library for working with datetime data in Python. Pandas offer variaty of attributes, methods, classes to work with date and time. The picture below illustrates most of them:
The idea of this post is to save your time and effort of googling the same commands over and over. If you have an idea how to improve it - please let us know! Thank you!
Pandas Datetime Cheat Sheet
import
from datetime import datetime, timedelta
import pandas as pd
impot time
now & today
pd.to_datetime('today')
pd.to_datetime('now')
pd.to_datetime('today').normalize()
datetime.today().strftime('%d/%m/%Y')
time.strftime('%d/%m/%Y')
datetime.datetime.now().isoformat()
datetime.datetime.utcnow().isoformat()
parse date & time
pd.to_datetime('2023-01-11 16:11:26.862697')
pd.to_datetime(df['date'])
pd.to_datetime(df['date'], dayfirst=True)
pd.to_datetime(df['date'], yearfirst=True)
pd.to_datetime(df['date'], infer_datetime_format=True)
pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
pd.to_datetime(df[['year', 'month', 'day']])
attributes
dt.year
dt.month
dt.day
dt.hour
dt.minute
dt.second
dt.day_of_year
dt.dayofweek
dt.is_leap_year
dt.daysinmonth
dt.daysinmonth
methods
dto = pd.to_datetime('2023-01-11 16:11:26.862697')
dto.day_name()
dto.month_name()
dto.tz_localize('UTC')
dto.tz_convert('US/Central')
idx = pd.date_range('2023-01-01', periods=3)
idx.to_period()
dto.round('H')
dto.floor('1min')
dto.ceil('1min')
calculations
t1 = pd.to_datetime('1/1/2023 01:00')
t2 = pd.to_datetime('today')
(t2 - t1).components
(t2 - t1).seconds
(t2 - t1).total_seconds()
df['date1'].dt.year - df['date2'].dt.year
df['date'] + pd.Timedelta(days=1)
df['date'] + pd.DateOffset(hours=16)
pd.Timedelta(5, 'H')
pd.Timedelta(1, 'd').total_seconds()
select date & time
df.set_index(['date'])
df.sort_index(inplace=True, ascending=True))
df.index = df['date']
df.loc['2023']
df.loc['2022-7']
df.loc['2022-1-1']
df.loc['2019' : '2022']
df.loc[df['date'] > '2022-01-01']
df.between_time('11:10','12:15')
df[df['date'].between('2022', '2023')]
df.loc['2022-7-1 11:10' : '2023-1-1 12:15' ]
pd.Interval(t1, t2).length
pd.Interval(t1, t2).overlaps(pd.Interval(t3, t4))
time zone
pd.Timestamp.now()
pd.Timestamp.utcnow()
pd.Timestamp.now(tz='Europe/Rome').tz_localize(None)
pd.Timestamp.now(tz='Europe/Rome')
pd.Timestamp.utcnow().tz_localize(None)
pd.Timestamp.utcnow().tz_convert(None)
dto.dt.tz_localize('+0100')
read_csv
pd.read_csv('test.csv', parse_dates = ['end_date'])
pd.read_csv('test.csv', parse_dates = [['yy', 'mm', 'dd']])
pd.read_csv('test.csv', parse_dates = {'date1': ['yy', 'mm', 'dd']})
pd.read_csv('test.csv', dayfirst = True)
pd.read_csv('test.csv', keep_date_col = True)
date & time format
from datetime import datetime
now = datetime.now()
now.strftime('%X')
%a
%A
%b
%B
%c
%d
%H
%I
%j
%m
%M
%p
%S
%U
%w
%W
%x
%X
%y
%Y
%Z
%%
Working with datetime
Here are some common tasks you might want to do with datetime data in Pandas:
- Create a datetime column in a Pandas DataFrame - we can use the
pd.to_datetime
method to convert a column of strings to a column of datetime objects.
For example:
import pandas as pd
df = pd.DataFrame({'date_string': ['2022-01-01', '2022-01-02', '2022-01-03']})
df['date'] = pd.to_datetime(df['date_string'])
result is a datetime Series and new column:
0 2022-01-01
1 2022-01-02
2 2022-01-03
Name: date_string, dtype: datetime64[ns]
- Extract the year, month, or day from a datetime column - we can use the
dt
attribute of a datetime columns to extract the year, month, or day as a separate column.
For example:
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
- Filter a DataFrame by a datetime column:
dt
attribute can be used to filter a DataFrame by a specific date range.
For example:
# filter rows where the date year is 2023
df_2023 = df[df['date'].dt.year == 2023]
# filter rows where the date is in January 2023
df_january_2023 = df[(df['date'].dt.year == 2023) & (df['date'].dt.month == 1)]
- Aggregate data by a datetime column - we can use the
groupby()
method to aggregate data by a datetime column.
Example:
# calculate the mean value of a column for each year
df.groupby(df['date'].dt.year)['value'].mean()
# calculate the sum of a column for each month
df.groupby(df['date'].dt.month)['value'].sum()