In this tutorial, we'll see how to solve the Pandas error:
ValueError: All arrays must be of the same length
First, we'll create an example of how to produce it. Next, we'll explain the reason and finally, we'll see how to fix it.
Example
Let's try to create the following DataFrame:
import pandas as pd
data = {'day': [1, 2, 3, 4, 5],
        'numeric': [1, 2, 3, 4, 5, 6]}
df = pd.DataFrame(data)
This raises:
ValueError: All arrays must be of the same length
Reason
The problem is that we are trying to create a DataFrame from arrays of different lengths:
for i in data.values():
    print(len(i))
Result:
5
6
We get this error whenever we try to create a DataFrame from columns of different lengths.
To avoid it, all input arrays must have the same length.
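The length check above can be turned into a small guard that runs before the DataFrame is built. This is only a minimal sketch: the `lengths` dictionary and the printed message are illustrative choices, not part of Pandas.

```python
import pandas as pd

data = {'day': [1, 2, 3, 4, 5],
        'numeric': [1, 2, 3, 4, 5, 6]}

# Map each column name to the length of its list.
lengths = {key: len(values) for key, values in data.items()}
print(lengths)  # {'day': 5, 'numeric': 6}

# Build the DataFrame only when every column has the same length.
if len(set(lengths.values())) == 1:
    df = pd.DataFrame(data)
else:
    print(f"Length mismatch, cannot build DataFrame: {lengths}")
```

Printing the length per key shows which column is the odd one out, which is usually the first thing we want to know.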
API errors
We often get this error when we work with different APIs and try to create DataFrames from their results.
Investigate the input data and correct it if needed.
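As a rough illustration of such an investigation, the sketch below inspects a made-up API response. The `response` dictionary and its field names are hypothetical, and treating the longest list as the expected length is an assumption that may not hold for every API.

```python
# Hypothetical API response: each field is returned as a separate list.
response = {
    'timestamp': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'price': [10.5, 11.0, 10.8],
    'volume': [100, 120],  # one value is missing
}

# Assumption: the longest list has the expected number of records.
expected = max(len(values) for values in response.values())

# Collect every field whose length differs from the expected one.
short_fields = {key: len(values)
                for key, values in response.items()
                if len(values) != expected}
print(short_fields)  # {'volume': 2}
```

Once the incomplete field is known, we can decide whether to request the data again, drop the field, or pad it.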
Solution
To solve the error, we can make the arrays the same length, either by trimming the longer ones or by padding the shorter ones.
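One way to change the array length is to trim every list to the length of the shortest one. This is a minimal sketch that assumes the surplus trailing values can safely be discarded. The names `min_len`, `trimmed` and `df_trimmed` are illustrative.

```python
import pandas as pd

data = {'day': [1, 2, 3, 4, 5],
        'numeric': [1, 2, 3, 4, 5, 6]}

# Length of the shortest list.
min_len = min(len(values) for values in data.values())

# Cut every list down to that length; the extra values are dropped.
trimmed = {key: values[:min_len] for key, values in data.items()}

df_trimmed = pd.DataFrame(trimmed)
print(df_trimmed.shape)  # (5, 2)
```

Trimming loses data, so it only makes sense when the extra values really are surplus.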
A generic solution would be something like:
df = pd.DataFrame.from_dict(data, orient='index')
df = df.transpose()
which results in a DataFrame like:
| | day | numeric |
|---|---|---|
| 0 | 1.0 | 1.0 |
| 1 | 2.0 | 2.0 |
| 2 | 3.0 | 3.0 |
| 3 | 4.0 | 4.0 |
| 4 | 5.0 | 5.0 |
| 5 | NaN | 6.0 |
How does it work
First we create a DataFrame in which each dictionary key becomes a row (index) label:
pd.DataFrame.from_dict(data, orient='index')
This results in:
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| day | 1 | 2 | 3 | 4 | 5 | NaN |
| numeric | 1 | 2 | 3 | 4 | 5 | 6.0 |
Then we transpose the result, so that the keys become columns again.
With from_dict(orient='index') the input arrays can have different sizes: the shorter arrays are padded with NaN.
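A similar padded result can be reached without transposing, by wrapping each list in a pd.Series before building the DataFrame. This is an alternative sketch, not the method described above, and the name `df_padded` is illustrative.

```python
import pandas as pd

data = {'day': [1, 2, 3, 4, 5],
        'numeric': [1, 2, 3, 4, 5, 6]}

# Each list becomes a Series; pandas aligns them on the index
# and fills positions missing from the shorter Series with NaN.
df_padded = pd.DataFrame({key: pd.Series(values) for key, values in data.items()})

print(df_padded.shape)               # (6, 2)
print(df_padded['day'].isna().sum())  # 1
```

With this variant only the padded column is converted to float, while the complete column keeps its integer values.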
Note
Since the error "ValueError: All arrays must be of the same length" suggests a data inconsistency, make sure that the input data is correct.
Check that the data is aligned correctly and can be used.
Conclusion
In this article, we saw how to investigate and solve the error "ValueError: All arrays must be of the same length".