In this post I'll list the most common errors in Pandas and Python, along with their solutions.

The list will grow over time and will be updated frequently.

DateTime

Timezones

  • TypeError: Timestamp subtraction must have the same timezones or no timezones
  • TypeError: Invalid comparison between dtype=datetime64[ns] and DatetimeArray

A quick solution is to remove the timezone information and assign the result back:

df['time_tz'] = df['time_tz'].dt.tz_localize(None)

Example and more details: How to Remove Timezone from a DateTime Column in Pandas
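
To make the fix above concrete, here is a minimal sketch. The sample data and the `time_naive` and `delta` column names are made up for illustration; only `time_tz` comes from the snippet above.

```python
import pandas as pd

# Hypothetical sample data: one timezone-aware column and one naive column.
df = pd.DataFrame({
    "time_tz": pd.to_datetime(["2021-01-01 10:00", "2021-01-02 12:30"]).tz_localize("UTC"),
    "time_naive": pd.to_datetime(["2021-01-01 09:00", "2021-01-02 12:00"]),
})

# Mixing aware and naive values in one subtraction is what triggers the
# TypeError, so drop the timezone first. tz_localize(None) returns a new
# Series, which is why the result is assigned back to the column.
df["time_tz"] = df["time_tz"].dt.tz_localize(None)

# Both columns are naive now, so the subtraction works.
df["delta"] = df["time_tz"] - df["time_naive"]
print(df)
```

Note that `tz_localize(None)` keeps the local wall-clock time, while `tz_convert(None)` converts to UTC before dropping the timezone. Which one fits depends on the data.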

OutOfBoundsDatetime

  • OutOfBoundsDatetime: Out of bounds nanosecond timestamp

The short answer to this error is to turn out-of-range values into NaT (errors='ignore' is deprecated in recent Pandas versions):

pd.to_datetime(df['date'], errors='coerce')

Example and more details: OutOfBoundsDatetime: Out of bounds nanosecond timestamp - Pandas and pd.to_datetime
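
As a rough illustration, the sketch below parses a column that contains a far-future date. The sample values and the `date_parsed` column name are assumptions made for this example.

```python
import pandas as pd

# Hypothetical column: the second value lies outside the range that
# nanosecond timestamps can represent (roughly years 1677 to 2262).
df = pd.DataFrame({"date": ["2021-05-01", "3000-01-01", "2020-12-31"]})

# errors="coerce" replaces values that cannot be represented with NaT
# instead of raising. Pandas versions that do not default to nanosecond
# resolution may be able to parse the far-future date as-is.
df["date_parsed"] = pd.to_datetime(df["date"], errors="coerce")
print(df)
```

The in-range dates are parsed normally either way, so the rest of the column stays usable.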

Wrong dates

  • ParserError: Unknown string format: 1975-02-23T02:58:41.000Z 1975-02-23T02:58:41.000Z

The short answer to this error is:

df['date'] = pd.to_datetime(df['date_str'], format='%d/%m/%Y', errors='coerce')
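
A small sketch of the call above, using made-up values in the `date_str` column. It assumes the dates are written as day/month/year; the format string has to match the real data.

```python
import pandas as pd

# Hypothetical input: two well-formed dates and one broken value.
df = pd.DataFrame({"date_str": ["23/02/1975", "31/12/1999", "not a date"]})

# An explicit format avoids guessing, and errors="coerce" turns values
# that do not match the format into NaT instead of raising.
df["date"] = pd.to_datetime(df["date_str"], format="%d/%m/%Y", errors="coerce")
print(df)
```

Counting the NaT values afterwards (`df["date"].isna().sum()`) is a quick way to see how many rows failed to parse.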

Example and more details:

read_csv

UnicodeDecodeError

  • UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

The short answer to this error is to pass the file's actual encoding:

df = pd.read_csv('../data/csv/file_utf-16.csv', encoding='utf-16')

Example and more details: How to Fix - UnicodeDecodeError: invalid start byte - during read_csv in Pandas
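
The following sketch writes a small UTF-16 file to a temporary folder and reads it back. The file content is invented for the example; only the `encoding='utf-16'` argument mirrors the snippet above.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file_utf-16.csv")

    # Create a hypothetical UTF-16 encoded CSV file.
    source = pd.DataFrame({"city": ["Sofia", "München"], "value": [1, 2]})
    source.to_csv(path, index=False, encoding="utf-16")

    # Reading with the matching encoding avoids the decode error that
    # the default UTF-8 decoding would hit on this file.
    df = pd.read_csv(path, encoding="utf-16")

print(df)
```

If the encoding of a file is unknown, it has to be detected or looked up first; `utf-16` here is just the encoding of this sample file.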

ParserError

  • ParserError: Expected 5 fields in line 5, saw 6. Error could possibly be due to quotes being ignored when a multi-char delimiter is used

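The short answer to this error is to skip the malformed lines:

# Pandas < 1.3
df = pd.read_csv(csv_file, delimiter=';;', engine='python', error_bad_lines=False)
# Pandas >= 1.3 (error_bad_lines was removed in Pandas 2.0)
df = pd.read_csv(csv_file, delimiter=';;', engine='python', on_bad_lines='skip')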

Example and more details: How to Use Multiple Char Separator in read_csv in Pandas
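
Here is a minimal sketch of skipping a malformed row, assuming Pandas 1.3 or newer (where `on_bad_lines` is available). The CSV text is invented for the example.

```python
import io

import pandas as pd

# Hypothetical data with a ';;' separator; the second data row has one
# field too many and would normally raise a ParserError.
csv_text = "a;;b;;c\n1;;2;;3\n4;;5;;6;;7\n8;;9;;10\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    delimiter=";;",        # multi-character separators need the python engine
    engine="python",
    on_bad_lines="skip",   # drop rows with too many fields
)
print(df)
```

Skipped rows are lost silently, so `on_bad_lines='warn'` may be a safer first step when the amount of bad data is unknown.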

to_csv

AttributeError: 'numpy.ndarray'

  • AttributeError: 'numpy.ndarray' object has no attribute 'to_csv'

The short answer to this error is to wrap the array in a Series first:

pd.Series(df['Magnitude Type'].unique()).to_csv('data.csv')

Example and more details: Dump (unique) values to CSV / to_csv in Pandas
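
A short sketch of why the wrapper is needed. The sample values are made up, and the output goes to an in-memory buffer instead of 'data.csv' so the example leaves no file behind.

```python
import io

import pandas as pd

# Hypothetical data for the 'Magnitude Type' column.
df = pd.DataFrame({"Magnitude Type": ["MW", "ML", "MW", "MB"]})

# unique() returns an array, not a Series, so it has no to_csv method.
unique_values = df["Magnitude Type"].unique()

# Wrapping the array in a Series gives access to to_csv again.
buffer = io.StringIO()
pd.Series(unique_values, name="Magnitude Type").to_csv(buffer, index=False)
csv_output = buffer.getvalue()
print(csv_output)
```

An alternative under the same assumptions is `df['Magnitude Type'].drop_duplicates()`, which stays a Series the whole time.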

Index / MultiIndex

MultiIndex

  • ValueError: Cannot remove 1 levels from an index with 1 levels: at least one level must be left.
  • IndexError: Too many levels: Index has only 1 level, not 4

The short answer to this error is to check how many levels the index has and then drop the level from the right axis:

df.index
df.droplevel(level=1)
df.reset_index(level=1)
df.columns.droplevel(level=0)

Example and more details: How to Drop a Level from a MultiIndex in Pandas DataFrame
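
The calls above can be sketched on a small frame. The index names (`group`, `item`) and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical frame with a two-level row index.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["group", "item"]
)
df = pd.DataFrame({"value": [10, 20, 30]}, index=index)

# Check the number of levels before dropping; asking for a level that
# does not exist is what produces the errors listed above.
n_levels = df.index.nlevels

dropped = df.droplevel(level=1)   # discard the 'item' level
moved = df.reset_index(level=1)   # keep 'item' as a regular column

# The same idea applies to MultiIndex columns.
df_cols = pd.DataFrame(
    [[1, 2]],
    columns=pd.MultiIndex.from_tuples([("Depth", "mean"), ("Depth", "max")]),
)
df_cols.columns = df_cols.columns.droplevel(level=0)

print(n_levels, list(dropped.index), list(moved.columns), list(df_cols.columns))
```

`droplevel` throws the level away, while `reset_index` preserves its values, which is usually the safer choice when the level carries data.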

Sort MultiIndex

  • ValueError: The column label 'Depth' is not unique. For a multi-index, the label must be a tuple with elements corresponding to each level.

The short answer to this error is to inspect the column levels and pass the full tuple as the sort key:

df_multi.columns
df_multi.columns.get_level_values(1)
df_multi.sort_values(by=[('Depth', 'mean')], ascending=False)

Example and more details: How to Sort MultiIndex in Pandas
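
As an illustration, the sketch below builds MultiIndex columns with a groupby aggregation and sorts by one of them. The `Type` column and all values are hypothetical; `df_multi` and `('Depth', 'mean')` follow the snippet above.

```python
import pandas as pd

# Hypothetical source data.
df = pd.DataFrame({
    "Type": ["MW", "MW", "ML", "ML"],
    "Depth": [10.0, 30.0, 50.0, 70.0],
})

# Aggregating with several functions produces two-level columns:
# ('Depth', 'mean') and ('Depth', 'max').
df_multi = df.groupby("Type").agg({"Depth": ["mean", "max"]})

# Inspect the second level to see which tuple keys are available.
second_level = list(df_multi.columns.get_level_values(1))

# Sorting by 'Depth' alone is ambiguous; the full tuple is required.
sorted_df = df_multi.sort_values(by=[("Depth", "mean")], ascending=False)
print(sorted_df)
```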

Merge

  • ValueError: Indexes have overlapping values: Index(['A', 'B', 'C', 'D'], dtype='object')

The short answer to this error is to allow the overlapping labels or to add a suffix to them:

pd.concat([df1, df2], axis='columns', verify_integrity=False)
df1.join(df2, lsuffix='_x')

Example and more details: How to Merge Two DataFrames on Index in Pandas
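
A minimal sketch of both options on two made-up frames that share the column names `A` and `B`:

```python
import pandas as pd

# Two hypothetical frames with identical column labels.
df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"A": [5, 6], "B": [7, 8]})

# With verify_integrity=True this concat would raise the ValueError
# because the column labels overlap. With False (the default) the
# duplicated labels are kept as they are.
combined = pd.concat([df1, df2], axis="columns", verify_integrity=False)

# join needs a suffix to tell overlapping columns apart; lsuffix renames
# the columns coming from the left frame.
joined = df1.join(df2, lsuffix="_x")

print(list(combined.columns))
print(list(joined.columns))
```

The `join` variant is often easier to work with afterwards, since every column keeps a unique name.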

String

  • TypeError: can only concatenate str (not "float") to str

The short answer to this error is to convert the numeric column to string before concatenating:

df['Magnitude'].astype(str)

Example and more details: Combine Multiple columns into a single one in Pandas
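
For example, combining a text column with a float column could look like the sketch below. The data and the `label` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one string column and one float column.
df = pd.DataFrame({"Magnitude Type": ["MW", "ML"], "Magnitude": [5.5, 6.0]})

# Adding a str column to a float column directly raises the TypeError,
# so the float column is converted with astype(str) first.
df["label"] = df["Magnitude Type"] + " " + df["Magnitude"].astype(str)
print(df)
```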

Column

  • ValueError: cannot reindex from a duplicate axis

The short answer to this error is:

df = df.sort_index(axis=1)

Example and more details: How to Change the Order of Columns in Pandas DataFrame
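
The sketch below assumes the error comes from reordering columns when some labels are duplicated, which is one common cause and may not match every case. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a duplicated column label 'b'.
df = pd.DataFrame([[1, 2, 3]], columns=["b", "a", "b"])

# Checking for duplicates first shows whether this is the cause.
has_duplicates = bool(df.columns.duplicated().any())

# Reindexing on duplicated labels is what triggers the error, while
# sort_index(axis=1) reorders the columns without reindexing by label.
df_sorted = df.sort_index(axis=1)

print(has_duplicates, list(df_sorted.columns))
```

If the duplicated columns are not needed, `df.loc[:, ~df.columns.duplicated()]` keeps only the first occurrence of each label.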