In this short guide, I'll show you how to extract Month and Year from a DateTime column in Pandas DataFrame. You can also find how to convert string data to a DateTime. So at the end you will get:

01/08/2021 -> 2021-08
DD/MM/YYYY -> YYYY-MM

or any other date format. We will also cover MM/YYYY.

To start, here is the syntax that you may apply in order extract concatenation of year and month:

.dt.to_period('M')

In the next section, I'll review the steps to apply the above syntax in practice.

Step 1: Create a DataFrame with Datetime values

Lets create a DataFrame which has a single column StartDate:

dates = ['2021-08-01', '2021-08-02', '2021-08-03']
df = pd.DataFrame({'StartDate': dates})

result:

StartDate
2021-08-01
2021-08-02
2021-08-03

In order to convert string to Datetime column we are going to use:

df['StartDate'] = pd.to_datetime(df['StartDate'])

Step 2: Extract Year and Month with .dt.to_period('M') - format YYYY-MM

In order to extract from a full date only the year plus the month: 2021-08-01 -> 2021-08 we need just this line:

df['StartDate'].dt.to_period('M')

result:

0    2021-08
1    2021-08
2    2021-08

Step 3: Extract Year and Month other formats MM/YYYY

What if you like to get the month first and then the year? In this case we will use .dt.strftime in order to produce a column with format: MM/YYYY or any other format.

df['StartDate'].dt.strftime('%m/%Y')
0    08/2021
1    08/2021
2    08/2021

Note: have in mind that this solution might be really slow in case of a huge DataFrame.

Step 4: Extracting Year and Month separately and combine them

A bit faster solution than step 3 plus a trace of the month and year info will be:

  • extract month and date to separate columns
  • combine both columns into a single one
df['yyyy'] = pd.to_datetime(df['StartDate']).dt.year
df['mm'] = pd.to_datetime(df['StartDate']).dt.month
StartDate yyyy mm
2021-08-01 2021 8
2021-08-02 2021 8
2021-08-03 2021 8

and then:

df['yyyy'].astype(str) + '-'+ df['mm'].astype(str)

Note: If you don't need extra columns you can just do:

df['StartDate'].dt.year.astype(str) + "-" + df['StartDate'].dt.month.astype(str)

Notebook with all examples: Extract Month and Year from DateTime column

extract-month-year-datetime-column-pandas