How to Extract Year-Week from DateTime in Pandas

In this tutorial, we'll see how to extract year-week from datetime column in Pandas DataFrame.

So at the end we will get:

  • 2019-10-14 00:00:00 -> 2019-w42
  • 2018-10-15 -> 2018-w42
  • 2018-10-15 -> 2018-42

(1) Extract year-week as date - strftime

df['col'].dt.strftime('%Y-w%V')

(2) Extract year-week as string

df['y'].apply(str) + '-' + df['w'].apply(str)

You can find more about Pandas dates extraction

Setup

For this example we will use DataFrame like:

import pandas as pd

data = {
        'age': [25, 30, 40, 35, 20, 40, 22],
        'start_date': ["2019-10-14", "2018-10-15","2020-7-15", "2020-10-6","2020-03-8","2015-10-14","2011-12-18"],
        'person': ['Tim', 'Jim', 'Kim', 'Bim', 'Dim', 'Sim', 'Lim']
       }
df = pd.DataFrame(data)

In this DataFrame there is a date column:

age start_date person
0 25 2019-10-14 Tim
1 30 2018-10-15 Jim
2 40 2020-07-15 Kim
3 35 2020-10-06 Bim
4 20 2020-03-08 Dim
5 40 2015-10-14 Sim
6 22 2011-12-18 Lim

If data is read as a string we need to convert it to datetime by:

df['start_date'] = pd.to_datetime(df['start_date'])

In order to avoid errors like:

AttributeError: Can only use .dt accessor with datetimelike values

1: Extract year-week in Pandas

Let's start by extracting the year-week column from datetime in Pandas. The most convenient and shortest way is by using strftime:

df['start_date'].dt.strftime('%Y-w%V')

which will result into:

0    2019-w42
1    2018-w42
2    2020-w29
3    2020-w41
4    2020-w10
5    2015-w42
6    2011-w50
Name: start_date, dtype: object

2: Extract year-week as string

Alternatively if we have columns for year and week we can combine them by + operator.

df['w'] = df['start_date'].dt.isocalendar().week
df['y'] = df['start_date'].dt.isocalendar().year
df['yw'] = df['y'].apply(str) + '-' + df['w'].apply(str)

The output is year-week pairs:

0    2019-42
1    2018-42
2    2020-29
3    2020-41
4    2020-10
5    2015-42
6    2011-50
Name: yw, dtype: object

Rule of thumb: Be sure that columns are converted to string with .apply(str) - to avoid errors like:

TypeError: can only perform ops with numeric values
Pro Tip 1
Storing year-week values as a string might impact data analysis. For example, sorting and visualization will not work as expected.

3: Week number formats & directives

There are several possible ways to extract week numbers:

  • %V - 1..53 - starting Monday (ISO standard)
  • %W - 0..53 - start on Sunday
  • %U - 0..53 - Monday first day of week

More info can be found on:

Conclusion

In this article, we learned how to week and year-week in Pandas.

We started by date format extraction, then we covered string extraction with potential problems. Finally we covered different week formats in Python and strftime.