How to Drop a Column in Pandas

In this quick tutorial, we will see how to drop single or multiple columns by name or index in Pandas.

We'll look at how to:

  • drop a single column with the drop() method
  • drop a column with alternatives such as del and df.pop
  • drop columns with NaN values
  • drop multiple columns.

Setup

In the post, we'll use the following DataFrame, which consists of several rows and columns:

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/softhints/Pandas-Tutorials/master/data/csv/extremes.csv')

The first rows of the DataFrame look like:

Continent      Highest point      Elevation high  Lowest point       Elevation low
Asia           Mount Everest      8848            Dead Sea           −427
South America  Aconcagua          6960            Laguna del Carbón  −105
North America  Denali             6198            Death Valley       −86
Africa         Mount Kilimanjaro  5895            Lake Assal         −155
Europe         Mount Elbrus       5642            Caspian Sea        −28
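
If the CSV is not reachable, a small stand-in DataFrame can be typed by hand. The sketch below is only an approximation of the real file: it contains just the five rows shown above and assumes the elevations are plain integers.

```python
import pandas as pd

# Offline stand-in for the CSV: the five rows shown above, typed by hand.
# Assumption: elevations are stored as plain integers.
df = pd.DataFrame({
    "Continent": ["Asia", "South America", "North America", "Africa", "Europe"],
    "Highest point": ["Mount Everest", "Aconcagua", "Denali",
                      "Mount Kilimanjaro", "Mount Elbrus"],
    "Elevation high": [8848, 6960, 6198, 5895, 5642],
    "Lowest point": ["Dead Sea", "Laguna del Carbón", "Death Valley",
                     "Lake Assal", "Caspian Sea"],
    "Elevation low": [-427, -105, -86, -155, -28],
})

print(df.shape)          # (rows, columns)
print(list(df.columns))  # column names used throughout the tutorial
```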

Step 1: Drop column by name in Pandas

Let's start by using the DataFrame method drop() to remove a single column.

To drop the column named 'Lowest point', we can use the following syntax:

df = df.drop('Lowest point', axis=1)

or the equivalent:

df = df.drop(columns='Lowest point')

By default, drop() returns a new DataFrame and leaves the original unchanged. To keep the result, either reassign it as in the syntax above or do the operation in place with the inplace parameter:

df.drop('Lowest point', axis=1, inplace=True)

Note:

The drop() method works on both axes: `axis=1` means columns, while the default `axis=0` means rows.

After the operation the DataFrame will look like:

Continent      Highest point      Elevation high  Elevation low
Asia           Mount Everest      8848            −427
South America  Aconcagua          6960            −105
North America  Denali             6198            −86
Africa         Mount Kilimanjaro  5895            −155
Europe         Mount Elbrus       5642            −28
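
To see the difference between the returned copy and the original, here is a minimal sketch on a small hand-built DataFrame (an assumption, used instead of the CSV so the example runs offline). It also shows the errors="ignore" option of drop(), which skips labels that are not present instead of raising a KeyError.

```python
import pandas as pd

# Hand-built stand-in for the tutorial data (assumption: same column names).
df = pd.DataFrame({
    "Continent": ["Asia", "Europe"],
    "Highest point": ["Mount Everest", "Mount Elbrus"],
    "Lowest point": ["Dead Sea", "Caspian Sea"],
})

# drop() returns a new DataFrame; the original is left untouched.
trimmed = df.drop(columns="Lowest point")
print(list(trimmed.columns))
print(list(df.columns))

# The column is already gone from `trimmed`; errors="ignore" avoids a KeyError.
same = trimmed.drop(columns="Lowest point", errors="ignore")
print(list(same.columns))
```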

Step 2: Drop column by index in Pandas

To drop a column by index we will combine:

  • df.columns
  • drop()

This step builds on the previous one: we first look up the column name by its position and then pass that name to drop(). To get the name of the first column we have:

df.columns[0]

The result is:

Continent

So to drop the column at index 0 we can use the following syntax, which returns a new DataFrame without that column:

df.drop(df.columns[0], axis=1)
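
As a quick illustration of the position lookup, the sketch below drops the first and the last column of a small hand-built DataFrame (an assumption, not the tutorial's CSV). Negative positions count from the end, as with any Python sequence.

```python
import pandas as pd

# Hand-built stand-in for the tutorial data.
df = pd.DataFrame({
    "Continent": ["Asia", "Europe"],
    "Highest point": ["Mount Everest", "Mount Elbrus"],
    "Elevation high": [8848, 5642],
})

first = df.columns[0]    # name of the first column
last = df.columns[-1]    # name of the last column

# Each call returns a new DataFrame; df itself is not modified.
without_first = df.drop(columns=first)
without_last = df.drop(columns=last)

print(list(without_first.columns))
print(list(without_last.columns))
```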

Step 3: Drop multiple columns by name in Pandas

Next, let's see how to drop multiple columns in Pandas, for example "Elevation high" and "Elevation low".

Again we use the drop() method, this time passing a list of column names:

df.drop(["Elevation high", "Elevation low"], axis=1)

The result is:

Continent      Highest point
Asia           Mount Everest
South America  Aconcagua
North America  Denali
Africa         Mount Kilimanjaro
Europe         Mount Elbrus

This is possible because the labels parameter accepts either a single label or a list-like of labels.

Note:

Instead of passing `labels` together with `axis=1`, you can use the `columns` parameter:

df.drop(columns=["Highest point"])
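
The two spellings can be checked against each other. The following sketch, again on a small hand-built DataFrame (an assumption), drops the same list of columns both ways and compares the results.

```python
import pandas as pd

# Hand-built stand-in for the tutorial data.
df = pd.DataFrame({
    "Continent": ["Asia", "Europe"],
    "Highest point": ["Mount Everest", "Mount Elbrus"],
    "Elevation high": [8848, 5642],
    "Elevation low": [-427, -28],
})

to_drop = ["Elevation high", "Elevation low"]

by_axis = df.drop(to_drop, axis=1)      # labels + axis form
by_columns = df.drop(columns=to_drop)   # columns= form

print(by_axis.equals(by_columns))
print(list(by_columns.columns))
```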

Step 4: Drop multiple columns by index

To drop multiple columns by index we can use syntax like:

cols = [0, 2]
df.drop(df.columns[cols], axis=1, inplace=True)

This will drop the first and the third column from the DataFrame in place.
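
A non-destructive variant of the same idea is sketched below on a small hand-built DataFrame (an assumption). Indexing df.columns with a list of positions returns the matching names, and a slice can be used for a contiguous range of positions.

```python
import pandas as pd

# Hand-built stand-in for the tutorial data.
df = pd.DataFrame({
    "Continent": ["Asia", "Europe"],
    "Highest point": ["Mount Everest", "Mount Elbrus"],
    "Elevation high": [8848, 5642],
    "Elevation low": [-427, -28],
})

cols = [0, 2]
names = df.columns[cols]              # names at positions 0 and 2
result = df.drop(columns=names)       # new DataFrame, df is unchanged

# A slice selects positions 1 and 2 (the end position is exclusive).
sliced = df.drop(columns=df.columns[1:3])

print(list(result.columns))
print(list(sliced.columns))
```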

Step 5: Drop column with NaN in Pandas

To drop a column or columns that contain NaN values, we can use the dropna() method:

df.dropna(axis=1, how='all')

With how='all', only columns in which every value is NaN are dropped.

Note:

dropna() doesn't change the DataFrame in place. We need to use the parameter inplace=True to do so.

To drop columns with NaN values using dropna(), the relevant parameters are:

  • axis=1 - for columns
  • how
    • any - if any NA values are present, drop that row or column
    • all - if all values are NA, drop that row or column
  • subset - labels along the other axis to consider, e.g. if you are dropping rows these would be a list of columns to include; if you are dropping columns these would be row labels
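
The effect of these parameters is easier to see on a tiny example. The sketch below uses a made-up DataFrame with hypothetical columns a, b and c (not the tutorial's data), where b has one NaN and c is entirely NaN.

```python
import numpy as np
import pandas as pd

# Made-up data: 'b' has a single NaN, 'c' contains only NaN.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [np.nan, 5.0, 6.0],
    "c": [np.nan, np.nan, np.nan],
})

all_nan_removed = df.dropna(axis=1, how="all")   # drops only 'c'
any_nan_removed = df.dropna(axis=1, how="any")   # drops 'b' and 'c'

# subset limits the check to the rows labeled 1 and 2,
# so the NaN in row 0 of 'b' is not taken into account.
subset_checked = df.dropna(axis=1, how="any", subset=[1, 2])

print(list(all_nan_removed.columns))
print(list(any_nan_removed.columns))
print(list(subset_checked.columns))
```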

Step 6: Drop column with del and df.pop

An alternative way to remove a column from a DataFrame is the Python keyword del:

del df["Lowest point"]

Note:

This deletes the column in place.

One more way to achieve the same behavior is by using method df.pop:

df.pop('Highest point')

This method returns the removed column as a Series:

0 Mount Everest
1 Aconcagua
2 Denali
3 Mount Kilimanjaro
4 Mount Elbrus
5 Vinson Massif
6 Puncak Jaya
Name: Highest point, dtype: object

At the same time, it removes the column from the DataFrame.
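
Both alternatives can be tried together. The following is a minimal sketch on a small hand-built DataFrame (an assumption); note that both del and pop modify the DataFrame directly, so no reassignment is needed.

```python
import pandas as pd

# Hand-built stand-in for the tutorial data.
df = pd.DataFrame({
    "Continent": ["Asia", "Europe"],
    "Highest point": ["Mount Everest", "Mount Elbrus"],
    "Lowest point": ["Dead Sea", "Caspian Sea"],
})

# del removes the column in place and returns nothing.
del df["Lowest point"]

# pop removes the column in place and hands it back as a Series.
highest = df.pop("Highest point")

print(type(highest).__name__)
print(list(df.columns))
```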

Conclusion & Resources

In this article, we looked at different ways to drop columns in Pandas.

We saw how to drop single or multiple columns, by name or by index, and how to drop columns with NaN values.

We also covered alternative ways of dropping columns: del and df.pop.

The code for the examples is available over on GitHub in a Notebook.