How to Use set_index With MultiIndex Columns in Pandas

Need to use the set_index method with MultiIndex columns in Pandas? This article shows how to set a single column or multiple columns as the index of a Pandas DataFrame whose columns form a MultiIndex.

To start, let's create an example DataFrame with a multi-level index for its columns:

import pandas as pd

cols = pd.MultiIndex.from_tuples([('company A', 'rank'), ('company A', 'points'), ('company B', 'rank'), ('company B', 'points')])
df = pd.DataFrame([[1,2,3,4], [2,3, 3,4]], columns=cols)

the DataFrame:

  company A        company B
       rank points      rank points
0         1      2         3      4
1         2      3         3      4

Before showing how to set the index from a given column, let's check the column names:

df.columns

the result is:

MultiIndex([('company A',   'rank'),
            ('company A', 'points'),
            ('company B',   'rank'),
            ('company B', 'points')],
           )

So each column name is a tuple with one value per level:

  • ('company A', 'rank')
  • ('company A', 'points')

To use set_index with these columns, we need to provide the full tuple for each column, with a value for every level of the hierarchical index.

So set_index applied to a single column looks like this:

df.set_index([('company A',   'rank')])

or, taking the name from the columns attribute:

df.set_index(df.columns[0])

both calls return a new DataFrame with that column as the index (df itself is not modified):

                  company A company B
                     points      rank points
(company A, rank)
1                         2         3      4
2                         3         3      4
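
The single-column case can be verified with a short, self-contained sketch. It assumes the same example data as above; the names key, indexed and indexed_alt are illustrative only:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([
    ('company A', 'rank'), ('company A', 'points'),
    ('company B', 'rank'), ('company B', 'points'),
])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)

key = ('company A', 'rank')

# Pass the full tuple inside a list; set_index returns a new DataFrame.
indexed = df.set_index([key])

# Equivalent call that reads the tuple from the columns attribute.
indexed_alt = df.set_index(df.columns[0])

# The new index is named after the whole tuple.
index_name = indexed.index.name
index_values = indexed.index.tolist()

# The chosen column is moved out of the columns; the other three remain.
remaining = list(indexed.columns)

# The original DataFrame still has all four columns.
original_column_count = len(df.columns)
```

Since the index name is the whole tuple, it is displayed as (company A, rank) above the index values.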

To use set_index with multiple columns in a Pandas DataFrame, pass a list of tuples:

df.set_index([('company A',   'rank'), ('company B',   'rank')])

output:

                                    company A company B
                                       points    points
(company A, rank) (company B, rank)
1                 3                         2         4
2                 3                         3         4
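
A minimal sketch of the multi-column case, again on the example data, shows what the resulting row index looks like. The rename_axis step at the end is an optional extra that is not part of the steps above, and the names 'rank A' and 'rank B' are hypothetical:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([
    ('company A', 'rank'), ('company A', 'points'),
    ('company B', 'rank'), ('company B', 'points'),
])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)

keys = [('company A', 'rank'), ('company B', 'rank')]

# Two keys produce a two-level row index, one level per key.
indexed = df.set_index(keys)

# Each index level is named after the full column tuple it came from.
level_names = list(indexed.index.names)
index_rows = indexed.index.tolist()

# Only the two 'points' columns remain as columns.
remaining = list(indexed.columns)

# Optional: replace the tuple names with shorter, hypothetical ones.
renamed = indexed.rename_axis(index=['rank A', 'rank B'])
renamed_names = list(renamed.index.names)
```

Shorter level names can make later lookups by level name easier to read, but the tuple names work as they are.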

Resources