You can use the following code to apply a function to multiple columns in a Pandas DataFrame:

def get_date_time(row, date, time):
    return row[date] + ' ' + row[time]

df.apply(get_date_time, axis=1, date='Date', time='Time')

To apply a function to a single column, or to optimize the performance of apply, see: How to apply function to single column in Pandas

Next, you'll see several examples of how to apply a function to two or more columns in Pandas.

The DataFrame below is available from Kaggle:

Date Latitude Longitude Depth Type
12/24/2016 -5.1460 153.5166 30.00 Earthquake
12/25/2016 -43.4029 -73.9395 38.00 Earthquake
12/25/2016 -43.4810 -74.4771 14.93 Earthquake
12/27/2016 45.7192 26.5230 97.00 Earthquake
12/28/2016 38.3754 -118.8977 10.80 Earthquake

You can download it from Kaggle or load it with Python, see: How to Search and Download Kaggle Dataset to Pandas DataFrame

Option 1: Apply function to two columns in Pandas DataFrame

Suppose you would like to create a new column with the country based on the pair Latitude and Longitude.

For this purpose we will define a new function geo_rev(x), which is applied to each row and returns the country for that row:

import geocoder

def geo_rev(x):
    g = geocoder.osm([x['Latitude'], x['Longitude']], method='reverse').json
    if g:
        return g.get('country')
    else:
        return 'no country'

df.apply(geo_rev, axis=1)

Here apply is called with axis=1. The axis argument accepts:

  • 0 or 'index': apply function to each column.
  • 1 or 'columns': apply function to each row.

With axis=1 the function receives all values from the current row as a Series, and each value can be accessed by column name, for example x['Latitude'].
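The difference between the two axis values can be seen on a tiny frame. This is a minimal sketch: df_demo is a hypothetical stand-in built from three rows of the sample above, and the max() calls are only there to show what the function receives.

```python
import pandas as pd

# Hypothetical stand-in frame with two numeric columns from the sample above.
df_demo = pd.DataFrame({
    "Latitude": [-5.1460, -43.4029, 45.7192],
    "Longitude": [153.5166, -73.9395, 26.5230],
})

# axis=0: the function is called once per column and receives that column.
col_max = df_demo.apply(lambda col: col.max(), axis=0)

# axis=1: the function is called once per row and receives that row,
# so individual values can be read by column name.
row_max = df_demo.apply(lambda x: max(x["Latitude"], x["Longitude"]), axis=1)

print(col_max)  # indexed by column name
print(row_max)  # indexed by row label
```

With axis=0 the result has one entry per column, with axis=1 one entry per row.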

To store the result of the function in a new column, assign it:

df['country'] = df.apply(geo_rev, axis=1)
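Because geo_rev needs network access, the same pattern may be easier to try with an offline function. The sketch below assumes a hypothetical quadrant(x) helper that is not part of the original article; it only reads the two columns and returns a label per row.

```python
import pandas as pd

# Hypothetical stand-in frame built from the sample rows above.
df_demo = pd.DataFrame({
    "Latitude": [-5.1460, -43.4029, 45.7192],
    "Longitude": [153.5166, -73.9395, 26.5230],
})

def quadrant(x):
    # Hypothetical offline replacement for geo_rev: combines two columns
    # of the current row into a single value.
    north_south = "N" if x["Latitude"] >= 0 else "S"
    east_west = "E" if x["Longitude"] >= 0 else "W"
    return north_south + east_west

# One value per row is returned, so the result fits a new column.
df_demo["quadrant"] = df_demo.apply(quadrant, axis=1)
print(df_demo)
```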

Option 2: Apply function to multiple columns with parameters

To apply a function to a DataFrame and pass extra parameters to that function at the same time, give them to apply as keyword arguments:

def get_date_time(row, date, time):
    return row[date] + ' ' + row[time]

df.apply(get_date_time, axis=1, date='Date', time='Time')

Any additional keyword arguments are forwarded by apply to the function, so there's no limit on the number of parameters.
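A runnable version of this option might look like the sketch below. The Time values are made up for illustration, and df_demo is a hypothetical stand-in for the real dataset. It also shows the args parameter of apply, which passes the same parameters by position, and a plain column concatenation for comparison.

```python
import pandas as pd

# Hypothetical stand-in frame; the Time values are illustrative only.
df_demo = pd.DataFrame({
    "Date": ["12/24/2016", "12/25/2016"],
    "Time": ["03:58:55", "14:22:27"],
})

def get_date_time(row, date, time):
    return row[date] + " " + row[time]

# Extra parameters passed as keyword arguments.
by_keyword = df_demo.apply(get_date_time, axis=1, date="Date", time="Time")

# The same parameters passed by position through args.
by_position = df_demo.apply(get_date_time, axis=1, args=("Date", "Time"))

# For simple string concatenation the columns can also be combined directly.
vectorized = df_demo["Date"] + " " + df_demo["Time"]

print(by_keyword)
```

All three give the same values, so for an operation this simple the direct column expression is likely the most convenient choice.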

Option 3: Apply function with lambda and multiple columns

In this example we are going to combine the apply method with a lambda in order to apply a function to several columns.

Again we are going to convert Latitude and Longitude to a country, this time with a function that takes the two values as separate arguments:

import geocoder

def geo_rev(lat, lon):
    g = geocoder.osm([lat, lon], method='reverse').json
    if g:
        return g.get('country')
    else:
        return 'no country'

df.apply(lambda x: geo_rev(x['Latitude'], x['Longitude']), axis=1)

The result is:

23402    Papua Niugini
23403            Chile
23404            Chile
23405          România
23406    United States
23407    United States
23408    United States
23409               日本
23410        Indonesia
23411       no country

Option 4: Select and apply function to multiple columns

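You can select several columns from a Pandas DataFrame and apply a function to them. The row values are unpacked with *x, so the order of the selected columns has to match the order of the function's parameters: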

def geo_rev(lat, lon, mag):
    g = geocoder.osm([lat, lon], method='reverse').json
    if g:
        return g.get('country') + ' ' + str(mag)
    else:
        return 'no country'

df[['Latitude', 'Longitude', 'Magnitude']].apply(lambda x: geo_rev(*x), axis=1)

The result of this operation is:

23402    Papua Niugini 5.8
23403            Chile 7.6
23404            Chile 5.6
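The unpacking step can be checked offline with a small sketch. Here describe(lat, lon, mag) is a hypothetical stand-in for geo_rev, and df_demo uses two rows whose Magnitude values are taken from the output above.

```python
import pandas as pd

# Hypothetical stand-in frame; Magnitude values taken from the output above.
df_demo = pd.DataFrame({
    "Latitude": [-5.1460, -43.4029],
    "Longitude": [153.5166, -73.9395],
    "Magnitude": [5.8, 7.6],
})

def describe(lat, lon, mag):
    # Hypothetical offline replacement for geo_rev(lat, lon, mag).
    north_south = "N" if lat >= 0 else "S"
    east_west = "E" if lon >= 0 else "W"
    return north_south + east_west + " " + str(mag)

# The selection order decides which value lands in which parameter.
cols = ["Latitude", "Longitude", "Magnitude"]
labels = df_demo[cols].apply(lambda x: describe(*x), axis=1)
print(labels)
```

Selecting the columns in a different order would silently pass the values to the wrong parameters, so keeping the selection list next to the function signature may help avoid mistakes.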

Option 5: Apply function to multiple columns without using apply

Finally, let's see an alternative way to apply a function to several columns without the apply method.

This can be achieved by combining list and map. The column values are passed straight to the function, so no Series has to be built for each row, which gives this technique much less overhead than Pandas apply:

def geo_rev(lat, lon):
    g = geocoder.osm([lat, lon], method='reverse').json
    if g:
        return g.get('country')
    else:
        return 'no country'


list(map(geo_rev, df['Latitude'], df['Longitude']))

The advantage of this approach is speed, as the comparison below shows for a small dataset:

  • %timeit list(map( - 12 µs per loop
  • %timeit df.apply( - 760 µs per loop
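A comparison of this kind can be reproduced outside a notebook with the standard timeit module. The sketch below is not the original benchmark: add_coords is a hypothetical cheap function chosen so that only the call overhead is measured, and the absolute numbers will differ between machines.

```python
import timeit

import pandas as pd

# Hypothetical stand-in frame built from the sample rows above.
df_demo = pd.DataFrame({
    "Latitude": [-5.1460, -43.4029, 45.7192, 38.3754],
    "Longitude": [153.5166, -73.9395, 26.5230, -118.8977],
})

def add_coords(lat, lon):
    # Hypothetical cheap function, so the timing reflects call overhead only.
    return lat + lon

def with_map():
    return list(map(add_coords, df_demo["Latitude"], df_demo["Longitude"]))

def with_apply():
    return df_demo.apply(
        lambda x: add_coords(x["Latitude"], x["Longitude"]), axis=1
    )

# Both approaches produce the same values.
map_result = with_map()
apply_result = with_apply()

runs = 200
t_map = timeit.timeit(with_map, number=runs)
t_apply = timeit.timeit(with_apply, number=runs)
print(f"map:   {t_map / runs * 1e6:.1f} µs per call")
print(f"apply: {t_apply / runs * 1e6:.1f} µs per call")
```

Note that map returns a plain Python list rather than a Series. It can still be assigned to a new column, since its length matches the number of rows.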

Resources