How to Group by multiple columns, count and map in Pandas

To group by two or multiple columns, count unique combinations and map the result we can chain two Pandas methods:

groupby()
size()

df.groupby(['col1', 'col2']).size()

The picture below shows all the steps and the final result:

Let's create a sample DataFrame and explain all the steps in details:

import pandas as pd

data = {'col1': ['a', 'c', 'a', 'b', 'a', 'c'],
    	'col2': ['d', 'a', 'c', 'e', 'd', 'a']}
df = pd.DataFrame(data)

data looks like:

	col1	col2
0	a	d
1	c	a
2	a	c
3	b	e
4	a	d
5	c	a

Group by two columns and count

To group by multiple columns in Pandas and count the combinations we can chain methods:

df_g = df.groupby(['col1', 'col2']).size().reset_index(name='counts')

This gives us a new DataFrame with counts of unique combinations from the columns. We have the original columns plus new columns - counts which contains the occurrences:

	col1	col2	counts
0	a	c	1
1	a	d	2
2	b	e	1
3	c	a	2

Map count to new column in first DataFrame

Finally we can map the count to the original DataFrame. We will use method merge and map on two columns ['col1', 'col2']:

pd.merge(df, df_g, on=['col1', 'col2'])

This gives us:

	col1	col2	counts
0	a	d	2
1	a	d	2
2	c	a	2
3	c	a	2
4	a	c	1
5	b	e	1

Or if we like to preserve the order of the original DataFrame we can use left join - how="left":

pd.merge(df, df_g, on=['col1', 'col2'], how='left')

How does it work?

we group by two columns col1 and col2
the size method counts the number of occurrences of each combination
reset_index method reset the index
rename of the new column to counts and assign counts
finally map the two DataFrames on multiple columns

> Basic concepts

> Installations

> Series

> DataFrame

> Create

> Data Types

> Exercise

> Cheat Sheet

> Basic concepts

> Row

> Column

> Index

> MultiIndex

> Exercise

> Basic concepts

> read_csv()

> read_excel()

> Kaggle

> Exercise

> read_xml()

> read_json()

> to_csv()

> to_dict()

> to_json()

> Basic concepts

> groupby()

> Reshape

> melt()

> Exercise

> Pivot

> merge()

> Filter

> Basic concepts

> replace()

> split()

> Regex

> Search

> Exercise

> Find

> Basic concepts

> apply()

> aggfunc

> Convert

> count()

> Other

> Exercise

> map()

> Basic concepts

> Data Validation

> Data Cleaning

> Duplicate

> Time Series

> Pandas Error

> Get

> Basic concepts

> Styling

> Table

> Display

> DataIsBeautiful

> Beginners

> Data Science Projects

> Newsletter

Group by two columns and count

Map count to new column in first DataFrame