Round Pandas date to nearest year, month or week
In this post you can find how to:
- solve error: ValueError: is a non-fixed frequency
- floor, ceil or round to year, month or week in Pandas
(1) Get year, month or week in Pandas
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.week
(2) Round or floor year in Pandas
df['date'].dt.to_period('Y').dt.start_time
result:
0 2021-01-01
1 2022-01-01
2 2023-01-01
3 2024-01-01
Name: date, dtype: datetime64[ns]
(3) Round or floor year in Pandas
df['date'].dt.to_period('Y').dt.end_time
result:
0 2021-12-31 23:59:59.999999999
1 2022-12-31 23:59:59.999999999
2 2023-12-31 23:59:59.999999999
3 2024-12-31 23:59:59.999999999
Name: date, dtype: datetime64[ns]
(4) Round or floor year in Pandas
df['date'].dt.to_period('W-SUN').dt.start_time
result:
0 2020-12-28
1 2022-06-27
2 2023-03-27
3 2024-10-28
Name: date, dtype: datetime64[ns]
(5) Get Weekly Period in Pandas
df['date'].dt.to_period('W-SUN')
result:
0 2020-12-28/2021-01-03
1 2022-06-27/2022-07-03
2 2023-03-27/2023-04-02
3 2024-10-28/2024-11-03
Name: date, dtype: period[W-SUN]
(6) Ceil or floor by offests
(df['date'] + pd.offsets.YearBegin(-1)).dt.date
df['date'] + pd.offsets.MonthBegin(-1)
result:
0 2020-01-01
1 2022-01-01
2 2023-01-01
3 2024-01-01
Name: date, dtype: object
ValueError: is a non-fixed frequency
There is open issue about using dt.ceil, dt.floor, dt.round with coarser target frequencies:
- YE
- ME
- W
Can't use dt.ceil, dt.floor, dt.round with coarser target frequencies
Usually you will receive error like:
ValueError: <30 * YearEnds: month=12> is a non-fixed frequency
ValueError: <YearEnd: month=12> is a non-fixed frequency
ValueError: <MonthEnd> is a non-fixed frequency
ValueError: <Week: weekday=6> is a non-fixed frequency
To solve this error you can:
- get the period by:
df['date'].dt.to_period('W-SUN')
- get the start or the end of the period:
.dt.start_time
or.dt.end_time
Sample Data
import pandas as pd
data = {
'date': [
pd.Timestamp('2021-01-01 11:04:45'),
pd.Timestamp('2022-07-01 12:035:15'),
pd.Timestamp('2023-04-01 12:16:30'),
pd.Timestamp('2024-11-01 15:57:59')
]
}
df = pd.DataFrame(data)
df