In this tutorial, we'll see how to solve the Pandas error "ValueError: Mixing dicts with non-Series may lead to ambiguous ordering."
Pandas raises this error when we try to create a DataFrame from a dictionary whose values mix:
- dictionaries
- non-Series list-like values, such as lists or arrays
In short, to solve this error use json_normalize():
from pandas import json_normalize
json_normalize(data)
Reproduce the error
First, let's see an example that produces this error. Suppose we have data for the TV series Game of Thrones, extracted from IMDb with the cinemagoer library.
If you would like to learn how to extract and analyze IMDb data with Python, you can follow the YouTube channel DataScientYst, where a video on this topic is planned.
We would like to create a DataFrame from this data:
import pandas as pd

data = {'title': 'Game of Thrones',
        'year': 2011,
        'kind': 'tv series',
        'taglines': ['Winter is coming.',
                     'Winter is here. (season 7)',
                     'The Great War Is Here (Season 8)',
                     'For the Throne.'],
        'number of votes': {10: 1210366, 9: 444762, 8: 192847, 7: 76850, 6: 30220,
                            5: 18478, 4: 9341, 3: 7930, 2: 7145, 1: 65485},
        'arithmetic mean': 9.0,
        'median': 1}

pd.DataFrame(data)
This raises the error mentioned in the title:
ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
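The error appears whenever the dictionary passed to pd.DataFrame contains both a dict value and a plain list (or array) value: Pandas cannot tell whether rows should be labelled by the dict keys or by the list positions. The sketch below reproduces this with two made-up columns, a and b, which are not part of the IMDb data, and shows that wrapping the dict in a Series removes the ambiguity:

```python
import pandas as pd

# Hypothetical input: 'a' is a plain list, 'b' is a dict.
mixed = {'a': [1, 2], 'b': {0: 10, 1: 20}}

try:
    pd.DataFrame(mixed)
    message = None
except ValueError as err:
    message = str(err)

print(message)

# A Series carries an explicit index, so the constructor can align
# the list against it instead of guessing the row order.
fixed = pd.DataFrame({'a': [1, 2], 'b': pd.Series({0: 10, 1: 20})})
print(fixed)
```

This only works when the list has the same length as the Series index, so it is not a general fix for deeply nested data like the example above.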
Solve the error - json_normalize
To solve this error we will use json_normalize, which flattens the nested dictionary into separate columns:
from pandas import json_normalize
json_normalize(data)
Now we can create a DataFrame from the input data without an error. The DataFrame below is transposed for readability:
column | 0 |
---|---|
title | Game of Thrones |
year | 2011 |
kind | tv series |
taglines | [Winter is coming., Winter is here. (season 7), The Great War Is Here (Season 8), For the Throne.] |
arithmetic mean | 9.0 |
median | 1 |
number of votes.10 | 1210366 |
number of votes.9 | 444762 |
number of votes.8 | 192847 |
number of votes.7 | 76850 |
number of votes.6 | 30220 |
number of votes.5 | 18478 |
number of votes.4 | 9341 |
number of votes.3 | 7930 |
number of votes.2 | 7145 |
number of votes.1 | 65485 |
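json_normalize also accepts options that change how the nested dictionary is flattened. The sketch below uses a shortened copy of the data so that it runs on its own. It shows `sep` for changing the separator in the column names and `max_level=0` for keeping the nested dictionary in a single column. Which variant fits best depends on your data.

```python
from pandas import json_normalize

# Shortened copy of the example data, for illustration only.
data = {'title': 'Game of Thrones',
        'year': 2011,
        'taglines': ['Winter is coming.', 'For the Throne.'],
        'number of votes': {10: 1210366, 9: 444762}}

# Default: nested keys are joined to the parent key with '.'
df = json_normalize(data)

# Use '_' instead of '.' in the flattened column names
df_sep = json_normalize(data, sep='_')

# max_level=0 turns flattening off: the dict stays in one column
df_flat = json_normalize(data, max_level=0)

# Transpose for readability, as in the table above
print(df.T)
```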
Solve the error - json
Alternatively, we can read only the part of the data that matters to us:
import pandas as pd
df = pd.DataFrame(data["taglines"])
which gives us:
index | 0 |
---|---|
0 | Winter is coming. |
1 | Winter is here. (season 7) |
2 | The Great War Is Here (Season 8) |
3 | For the Throne. |
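If more than one field is needed, the selected pieces can also be combined by hand. This is a minimal sketch on a shortened copy of the data. The column names tagline and votes are chosen here for illustration and do not come from the original data:

```python
import pandas as pd

# Shortened copy of the example data, for illustration only.
data = {'title': 'Game of Thrones',
        'taglines': ['Winter is coming.', 'For the Throne.'],
        'number of votes': {10: 1210366, 9: 444762}}

# A scalar is broadcast against the list: one row per tagline
taglines = pd.DataFrame({'title': data['title'],
                         'tagline': data['taglines']})

# A dict on its own converts cleanly to a Series: keys become the index
votes = pd.Series(data['number of votes'], name='votes')

print(taglines)
print(votes)
```

Scalars and lists can share a DataFrame constructor call without trouble. The error only appears once a dict and a list are passed together.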
If the data is stored in a JSON file, we can load it with Python's json library first:
import json
import pandas as pd

with open('data.json') as f:
    data = json.load(f)
df = pd.DataFrame(data["taglines"])
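One detail to keep in mind when loading from a file: JSON object keys are always strings, so the vote keys 10, 9, ... come back as '10', '9', ... after a round trip. The sketch below writes a shortened copy of the data to a temporary file (the path is created on the fly, purely for illustration), loads it again, and passes the result to json_normalize:

```python
import json
import os
import tempfile

from pandas import json_normalize

# Shortened copy of the example data, for illustration only.
data = {'title': 'Game of Thrones',
        'taglines': ['Winter is coming.', 'For the Throne.'],
        'number of votes': {10: 1210366, 9: 444762}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.json')
    # json.dump converts the integer keys to strings
    with open(path, 'w') as f:
        json.dump(data, f)
    with open(path) as f:
        loaded = json.load(f)

vote_keys = list(loaded['number of votes'])
print(vote_keys)

# The loaded dict can be flattened in the same way as before
df = json_normalize(loaded)
print(df.T)
```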