How To Create a Pivot Table in Pandas?

We can use the following syntax to create a pivot table in Python using Pandas:

df_pivot = df.pivot_table(values='D', index=['A', 'B'], columns='C')

Next, we'll see the full steps to create pivot tables in Pandas using a simple example.

Steps to create pivot table:

Step 1: Get data for pivot

Suppose we have the following DataFrame which contains 4 columns:

3 string columns
2 numeric

We will pivot on multiple columns in next sections

import pandas as pd

data = {"A": ["foo", "foo", "foo", "foo", "foo",
          	"bar", "bar", "bar", "bar"],
    	"B": ["one", "one", "one", "two", "two",
          	"one", "one", "two", "two"],
    	"C": ["small", "large", "large", "small",
          	"small", "large", "small", "small",
          	"large"],
    	"D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
    	"E": [2, 4, 5, 5, 6, 6, 8, 9, 9]}

df = pd.DataFrame(data)

Let's see how to pivot based on data below:

	A	B	C	D	E
0	foo	one	small	1	2
1	foo	one	large	2	4
2	foo	one	large	2	5
3	foo	two	small	3	5
4	foo	two	small	3	6
5	bar	one	large	4	6
6	bar	one	small	5	8
7	bar	two	small	6	9
8	bar	two	large	7	9

Step 2: Select columns indexes and values

Next, determine the:

index - keys to group by on the pivot table index
columns - keys to group by on the pivot table column
values - columns used for aggregation data of the pivot table
aggfunc - functions or list of functions used for aggregation

We will demonstrate a pivot table using all of the above.

Step 3: Create the pivot table

Finally, create the pivot table using method: pivot_table() based on the following syntax:

pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean')

In our example we will create pivot table:

using column 'D' as aggregating values
for index we will have 2 columns - 'A' and 'B'
- this creates multi-index
for columns we will use column 'C'
as aggregating function we will use Python method sum

df_pivot = pd.pivot_table(df, values='D', index=['A', 'B'],
                      	columns=['C'], aggfunc=sum)
df_pivot

The pivot table looks like:

	C	large	small
A	B
bar	one	4.0	5.0
bar	two	7.0	6.0
foo	one	4.0	1.0
foo	two	NaN	6.0

Step 4: Advanced Pivot options

In the previous example we saw the basic usage of the pivot_table() method with most used options.

Alternatively, we may use more options with the following default values:

pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None,
               margins=False, dropna=True, margins_name='All', observed=False, sort=True)

Useful pivot options are:

fill_value - value to replace missing values with
dropna - exclude columns whose entries are all NaN
margins - add all row/columns (subtotal / grand totals)
sort - sort the results

Pivot table examples

Pivot Table with Multiple aggfunc

We can use multiple aggregation functions. The functions might be different for different columns:

'D' - mean
'E' - min and max

df_pivot = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
                      	aggfunc={'D': 'mean',
                               	'E': [min, max]})
df_pivot

Result:

		D	E
		mean	max	min
A	C
bar	large	5.500000	9	6
bar	small	5.500000	9	8
foo	large	2.000000	5	4
foo	small	2.333333	6	2

Pivot table replace NaN

To replace NaN values in the pivot table we can use the parameter fill_value. We can replace NaN values with 0 by:

df_pivot = pd.pivot_table(df, values='D', index=['A', 'B'],
                	columns=['C'], aggfunc=sum, fill_value=0)
df_pivot

Result of replace NaN values by 0 in pivot table:

	C	large	small
A	B
bar	one	4	5
bar	two	7	6
foo	one	4	1
foo	two	0	6

Pivot table remove NaN

To drop columns with NaN values we can use option dropna=True:

pd.pivot_table(df, values=['D'],
           	index=['A'], columns=['C', 'E'],
           	aggfunc=sum, dropna=True)

The result is pivot table without NaN columns:

pandas-pivot-drop-na

Summary

To summarize, in this article, we've seen an example of a creation pivot table in Pandas. We've briefly discussed syntax and options.

> Basic concepts

> Installations

> Series

> DataFrame

> Create

> Data Types

> Exercise

> Cheat Sheet

> Basic concepts

> Row

> Column

> Index

> MultiIndex

> Exercise

> Basic concepts

> read_csv()

> read_excel()

> Kaggle

> Exercise

> read_xml()

> read_json()

> to_csv()

> to_dict()

> to_json()

> Basic concepts

> groupby()

> Reshape

> melt()

> Exercise

> Pivot

> merge()

> Filter

> Basic concepts

> replace()

> split()

> Regex

> Search

> Exercise

> Find

> Basic concepts

> apply()

> aggfunc

> Convert

> count()

> Other

> Exercise

> map()

> Basic concepts

> Data Validation

> Data Cleaning

> Duplicate

> Time Series

> Pandas Error

> Get

> Basic concepts

> Styling

> Table

> Display

> DataIsBeautiful

> Beginners

> Data Science Projects

> Newsletter

Step 1: Get data for pivot

Step 2: Select columns indexes and values

Step 3: Create the pivot table

Step 4: Advanced Pivot options

Pivot table examples

Pivot Table with Multiple aggfunc

Pivot table replace NaN

Pivot table remove NaN

Summary

Resources