Need to use the set_index method with MultiIndex columns in Pandas? This article shows how to set a single column or multiple columns as the index of a Pandas DataFrame.
To start, let's create an example DataFrame with a multi-level index for the columns:
import pandas as pd
cols = pd.MultiIndex.from_tuples([('company A', 'rank'), ('company A', 'points'), ('company B', 'rank'), ('company B', 'points')])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)
The resulting DataFrame:
| company A | company A | company B | company B |
|---|---|---|---|
| rank | points | rank | points |
| 1 | 2 | 3 | 4 |
| 2 | 3 | 3 | 4 |
Before showing how to set the index to a given column, let's check the column names:
df.columns
The result is:
MultiIndex([('company A', 'rank'),
('company A', 'points'),
('company B', 'rank'),
('company B', 'points')],
)
So each column name is a tuple of values, one value per level:
('company A', 'rank')
('company A', 'points')
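As a quick check of the point above, here is a minimal sketch that rebuilds the example DataFrame (so the snippet runs on its own) and confirms that a column label is a plain Python tuple which can be used directly to select that column. The variable names first_label and rank_a are illustrative only.

```python
import pandas as pd

# Rebuild the example DataFrame from the article
cols = pd.MultiIndex.from_tuples([
    ('company A', 'rank'), ('company A', 'points'),
    ('company B', 'rank'), ('company B', 'points'),
])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)

# Each label in a MultiIndex is a tuple with one entry per level
first_label = df.columns[0]
print(type(first_label))  # <class 'tuple'>
print(first_label)        # ('company A', 'rank')

# The full tuple selects exactly one column (a Series)
rank_a = df[('company A', 'rank')]
print(rank_a.tolist())    # [1, 2]
```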
Now, to use set_index with these columns, we need to provide every level of the hierarchical index, i.e. the full tuple.
So set_index applied to a single column looks like this:
df.set_index([('company A', 'rank')])
or, taking the name from the columns attribute:
df.set_index(df.columns[0])
This returns a new DataFrame (df itself is unchanged unless the result is assigned back):
|  | company A | company B | company B |
|---|---|---|---|
|  | points | rank | points |
| (company A, rank) |  |  |  |
| 1 | 2 | 3 | 4 |
| 2 | 3 | 3 | 4 |
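To make the result above easier to verify, the following sketch inspects the returned DataFrame instead of relying on the rendered table. It assumes the same example data as the article; indexed and same are illustrative names.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([
    ('company A', 'rank'), ('company A', 'points'),
    ('company B', 'rank'), ('company B', 'points'),
])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)

# Pass the full tuple; set_index returns a new DataFrame by default
indexed = df.set_index([('company A', 'rank')])

print(indexed.index.tolist())    # [1, 2]
print(indexed.columns.tolist())  # [('company A', 'points'), ('company B', 'rank'), ('company B', 'points')]

# The original DataFrame still has all four columns
print(df.shape)                  # (2, 4)

# Using the label taken from df.columns gives the same index values
same = df.set_index(df.columns[0])
print(same.index.tolist())       # [1, 2]
```

If the new index should replace the old DataFrame, assign the result back, for example df = df.set_index(df.columns[0]).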
To use set_index with multiple columns in a Pandas DataFrame, we can pass a list of column tuples:
df.set_index([('company A', 'rank'), ('company B', 'rank')])
Output:
|  |  | company A | company B |
|---|---|---|---|
|  |  | points | points |
| (company A, rank) | (company B, rank) |  |  |
| 1 | 3 | 2 | 4 |
| 2 | 3 | 3 | 4 |
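When there are many columns, typing each tuple by hand gets tedious. One possible approach, sketched here under the assumption that every column with a second-level label of 'rank' should go into the index, is to build the list of tuples from df.columns. The name rank_cols is hypothetical and not part of the original example.

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples([
    ('company A', 'rank'), ('company A', 'points'),
    ('company B', 'rank'), ('company B', 'points'),
])
df = pd.DataFrame([[1, 2, 3, 4], [2, 3, 3, 4]], columns=cols)

# Collect the full tuples of all columns whose second level is 'rank'
rank_cols = [col for col in df.columns if col[1] == 'rank']
print(rank_cols)  # [('company A', 'rank'), ('company B', 'rank')]

# Equivalent to df.set_index([('company A', 'rank'), ('company B', 'rank')])
indexed = df.set_index(rank_cols)

print(indexed.index.nlevels)     # 2
print(indexed.columns.tolist())  # [('company A', 'points'), ('company B', 'points')]
```

The resulting row index is itself a MultiIndex with one level per selected column, which matches the two index columns in the table above.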