How to Extract Year-Week from DateTime in Pandas
In this tutorial, we'll see how to extract year-week from datetime column in Pandas DataFrame.
So at the end we will get:
2019-10-14 00:00:00 -> 2019-w42
2018-10-15 -> 2018-w42
2018-10-15 -> 2018-42
(1) Extract year-week as date - strftime
df['col'].dt.strftime('%Y-w%V')
(2) Extract year-week as string
df['y'].apply(str) + '-' + df['w'].apply(str)
You can find more about Pandas dates extraction
- How to Convert DateTime to Day of Week(name and number) in Pandas
- How to Extract Month and Year from DateTime column in Pandas
Setup
For this example we will use DataFrame like:
import pandas as pd
data = {
'age': [25, 30, 40, 35, 20, 40, 22],
'start_date': ["2019-10-14", "2018-10-15","2020-7-15", "2020-10-6","2020-03-8","2015-10-14","2011-12-18"],
'person': ['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim']
}
df = pd.DataFrame(data)
In this DataFrame there is a date column:
age | start_date | person | |
---|---|---|---|
0 | 25 | 2019-10-14 | Tim |
1 | 30 | 2018-10-15 | Jim |
2 | 40 | 2020-07-15 | Kim |
3 | 35 | 2020-10-06 | Bim |
4 | 20 | 2020-03-08 | Dim |
5 | 40 | 2015-10-14 | Sim |
6 | 22 | 2011-12-18 | Lim |
If data is read as a string we need to convert it to datetime by:
df['start_date'] = pd.to_datetime(df['start_date'])
In order to avoid errors like:
AttributeError: Can only use .dt accessor with datetimelike values
1: Extract year-week in Pandas
Let's start by extracting the year-week column from datetime in Pandas. The most convenient and shortest way is by using strftime
:
df['start_date'].dt.strftime('%Y-w%V')
which will result into:
0 2019-w42
1 2018-w42
2 2020-w29
3 2020-w41
4 2020-w10
5 2015-w42
6 2011-w50
Name: start_date, dtype: object
2: Extract year-week as string
Alternatively if we have columns for year and week we can combine them by +
operator.
df['w'] = df['start_date'].dt.isocalendar().week
df['y'] = df['start_date'].dt.isocalendar().year
df['yw'] = df['y'].apply(str) + '-' + df['w'].apply(str)
The output is year-week pairs:
0 2019-42
1 2018-42
2 2020-29
3 2020-41
4 2020-10
5 2015-42
6 2011-50
Name: yw, dtype: object
Rule of thumb: Be sure that columns are converted to string with .apply(str)
- to avoid errors like:
TypeError: can only perform ops with numeric values
Storing year-week values as a string might impact data analysis. For example, sorting and visualization will not work as expected.
3: Week number formats & directives
There are several possible ways to extract week numbers:
%V
- 1..53 - starting Monday (ISO standard)%W
- 0..53 - start on Sunday%U
- 0..53 - Monday first day of week
More info can be found on:
Conclusion
In this article, we learned how to week and year-week in Pandas.
We started by date format extraction, then we covered string extraction with potential problems. Finally we covered different week formats in Python and strftime
.