In this short guide, I'll show you how to extract or explode a dictionary value from a column in a Pandas DataFrame. You can use:
- list or dict comprehension to extract dictionary values
- the apply() function with a lambda function to extract the value from each dictionary
Setup
For example, suppose you have a DataFrame with a column containing dictionaries:
import pandas as pd
data = {'data': [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]}
df = pd.DataFrame(data)
 | data |
---|---|
0 | {'a': 1, 'b': 2} |
1 | {'a': 3, 'b': 4} |
Extract key from dict column
To extract the values for a particular key from the dictionary in each row, we can use the following code (here for the key 'a'):
df['data'].apply(lambda x: x['a'])
which gives us a Series with the value of this key for each row:
0 1
1 3
Name: data, dtype: int64
Finally, we can create a new column from the extracted data by assigning this result to it, for example to df['a'].
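Indexing with x['a'] raises a KeyError if any dictionary lacks the key. The following is a minimal sketch of a more tolerant extraction, assuming a hypothetical frame df_missing (not part of the example above) in which one row has no 'b' key:

```python
import pandas as pd

# Hypothetical data: the second dictionary has no 'b' key
df_missing = pd.DataFrame({'data': [{'a': 1, 'b': 2}, {'a': 3}]})

# dict.get returns a default (None here) instead of raising KeyError
b_values = df_missing['data'].apply(lambda x: x.get('b'))

# Series.str.get also accepts a dictionary key; missing keys become missing values
b_values_str = df_missing['data'].str.get('b')
```

In both cases the row without the key ends up with a missing value (None or NaN), which can then be filled with fillna() if a default is needed.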
As mentioned at the start, we can also use a list comprehension to extract all values into a list:
[d.get('a') for d in df.data]
which gives us:
[1, 3]
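A dict comprehension works the same way when the result should be keyed. This is only a sketch: the variable names a_by_index and values_by_key are illustrative, and the list of keys ['a', 'b'] is assumed to be known in advance:

```python
import pandas as pd

df = pd.DataFrame({'data': [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]})

# Map each row label to the value stored under key 'a'
a_by_index = {idx: d.get('a') for idx, d in df['data'].items()}
# → {0: 1, 1: 3}

# Collect the values of each key into a list, keyed by the dictionary key
values_by_key = {key: [d.get(key) for d in df['data']] for key in ['a', 'b']}
# → {'a': [1, 3], 'b': [2, 4]}
```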
Split/Explode a column of dictionaries
To split or explode a column of dictionaries into separate columns, we can use .apply(pd.Series):
df['data'].apply(pd.Series)
This gives us a new DataFrame with columns from the exploded dictionaries:
 | a | b |
---|---|---|
0 | 1 | 2 |
1 | 3 | 4 |
A faster way to achieve similar behavior is to use pd.json_normalize():
pd.json_normalize(df['data'])
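In practice the expanded columns usually need to be attached back to the rest of the DataFrame. The sketch below assumes a hypothetical extra column 'id' to make the join visible. It passes a plain list to json_normalize, which returns a fresh 0..n-1 index, so the original index is restored before joining:

```python
import pandas as pd

# Hypothetical frame with an extra 'id' column next to the dictionaries
df = pd.DataFrame({
    'id': ['x', 'y'],
    'data': [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}],
})

# Expand the dictionaries into columns 'a' and 'b'
expanded = pd.json_normalize(df['data'].tolist())

# Reuse the original row labels so the join aligns row by row
expanded.index = df.index

# Replace the dictionary column with the expanded columns
result = df.drop(columns='data').join(expanded)
```

Restoring the index matters when df has been filtered or otherwise has a non-default index, because join() aligns on index labels rather than on position.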
Extract all values/keys from dict column
To extract all keys or all values from a column that contains dictionary data, we can list the keys with list(x.keys()) and the values with list(x.values()):
df['keys'] = df['data'].apply(lambda x: list(x.keys()))
df['values'] = df['data'].apply(lambda x: list(x.values()))
which creates new columns containing only the keys or only the values from the original dictionary:
 | data | keys | values |
---|---|---|---|
0 | {'a': 1, 'b': 2} | [a, b] | [1, 2] |
1 | {'a': 3, 'b': 4} | [a, b] | [3, 4] |
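If one row per key/value pair is preferred over lists, the pairs can be exploded into rows. This is one possible sketch using Series.explode(); the column names 'key' and 'value' are illustrative choices:

```python
import pandas as pd

df = pd.DataFrame({'data': [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]})

# Turn each dictionary into a list of (key, value) tuples, then emit one row per tuple
pairs = df['data'].apply(lambda x: list(x.items())).explode()

# Split the tuples into two columns, keeping the original row labels
long_df = pd.DataFrame(pairs.tolist(), index=pairs.index, columns=['key', 'value'])
```

The index of long_df repeats the original row label for every pair, so the result can still be joined back to the source rows.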
Summary
In this article, we looked at different solutions for extracting and exploding dictionary columns in Pandas. We focused on extraction in these code snippets, but exploding and splitting are very similar.