How to Extract Month and Year from DateTime column in Pandas

How to Extract Month and Year from DateTime column in Pandas

In this short guide, I'll show you how to extract Month and Year from a DateTime column in Pandas DataFrame. You can also find how to convert string data to a DateTime. So at the end you will get:

01/08/2021 -> 2021-08
DD/MM/YYYY -> YYYY-MM

or any other date format. We will also cover MM/YYYY.

To start, here is the syntax that you may apply in order extract concatenation of year and month:

.dt.to_period('M')

In the next section, I'll review the steps to apply the above syntax in practice.

Step 1: Create a DataFrame with Datetime values

Lets create a DataFrame which has a single column StartDate:

dates = ['2021-08-01', '2021-08-02', '2021-08-03']
df = pd.DataFrame({'StartDate': dates})

result:

StartDate
2021-08-01
2021-08-02
2021-08-03

In order to convert string to Datetime column we are going to use:

df['StartDate'] = pd.to_datetime(df['StartDate'])

Step 2: Extract Year and Month with .dt.to_period('M') - format YYYY-MM

In order to extract from a full date only the year plus the month: 2021-08-01 -> 2021-08 we need just this line:

df['StartDate'].dt.to_period('M')

result:

0    2021-08
1    2021-08
2    2021-08

Step 3: Extract Year and Month other formats MM/YYYY

What if you like to get the month first and then the year? In this case we will use .dt.strftime in order to produce a column with format: MM/YYYY or any other format.

df['StartDate'].dt.strftime('%m/%Y')
0    08/2021
1    08/2021
2    08/2021

Note: have in mind that this solution might be really slow in case of a huge DataFrame.

Step 4: Extracting Year and Month separately and combine them

A bit faster solution than step 3 plus a trace of the month and year info will be:

  • extract month and date to separate columns
  • combine both columns into a single one
df['yyyy'] = pd.to_datetime(df['StartDate']).dt.year
df['mm'] = pd.to_datetime(df['StartDate']).dt.month
StartDate yyyy mm
2021-08-01 2021 8
2021-08-02 2021 8
2021-08-03 2021 8

and then:

df['yyyy'].astype(str) + '-'+ df['mm'].astype(str)

Note: If you don't need extra columns you can just do:

df['StartDate'].dt.year.astype(str) + "-" + df['StartDate'].dt.month.astype(str)

Notebook with all examples: Extract Month and Year from DateTime column