1. Overview

In this tutorial, we'll learn how to resolve a common warning message in Pandas:

/tmp/ipykernel_4904/714243365.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

Several different reasons can cause this warning message. We are going to cover most of them and their solutions.

If you face warnings in Pandas, try to understand and resolve them. Ignoring or suppressing them might result in unexpected behaviour.

2. Setup

For this example we are going to use a dummy DataFrame created by the helper function makeMixedDataFrame:

from pandas.util.testing import makeMixedDataFrame

df = makeMixedDataFrame()

data:

     A    B     C          D
0  0.0  0.0  foo1 2009-01-01
1  1.0  1.0  foo2 2009-01-02
2  2.0  0.0  foo3 2009-01-05
3  3.0  1.0  foo4 2009-01-06
4  4.0  0.0  foo5 2009-01-07
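
The helper makeMixedDataFrame belongs to the Pandas testing utilities, and pandas.util.testing has been deprecated, so the import above may fail on recent releases. As a fallback, here is a minimal sketch that builds an equivalent DataFrame by hand. The values are copied from the output above; using pd.bdate_range for column D is an assumption based on the business-day pattern of the dates.

```python
import pandas as pd

# Hand-built stand-in for makeMixedDataFrame(); values taken from the printed data.
df = pd.DataFrame({
    "A": [0.0, 1.0, 2.0, 3.0, 4.0],
    "B": [0.0, 1.0, 0.0, 1.0, 0.0],
    "C": ["foo1", "foo2", "foo3", "foo4", "foo5"],
    # Five consecutive business days starting on 2009-01-01 (assumed pattern).
    "D": pd.bdate_range("2009-01-01", periods=5),
})

print(df)
```

The rest of the tutorial should work with this frame in the same way, since only the column names and values matter for the examples.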

3. What are the reasons for SettingWithCopyWarning

3.1. Is Pandas DataFrame a Copy or a View?

Before jumping to solutions, let's try to answer the question in the title of this section. How can we tell the difference between a copy and a view?

Let's cover this with a few examples showing how to copy a DataFrame or get some part of it:

df_2 = df
df_3 = df.copy()
df_4 = df[:]
df_5 = df.loc[:, :]
df_6 = df.iloc[0:2, :]
df_7 = df['D']

Let's verify which of them are copies and which are views:

print(df._is_view, '|',  hex(id(df)), '|', df._is_copy)
print(df_2._is_view, '|',  hex(id(df_2)), '|', df_2._is_copy)
print(df_3._is_view, '|',  hex(id(df_3)), '|',  df_3._is_copy)
print(df_4._is_view, '|',  hex(id(df_4)), '|',  df_4._is_copy)
print(df_5._is_view, '|',  hex(id(df_5)), '|',  df_5._is_copy)
print(df_6._is_view, '|',  hex(id(df_6)), '|',  df_6._is_copy)
print(df_7._is_view, '|',  hex(id(df_7)), '|',  df_7._is_copy)

The output helps us to understand Copies and Views better:

name  _is_view  hex(id(...))    _is_copy
df    False     0x7fde7ab88eb0  None
df_2  False     0x7fde7ab88eb0  None
df_3  False     0x7fde7abf24f0  None
df_4  False     0x7fdec136f940  <weakref at 0x7fde7ab8a680; to 'DataFrame' at 0x7fde7ab88eb0>
df_5  False     0x7fde7ab88eb0  None
df_6  False     0x7fde7abf2730  <weakref at 0x7fde7ab8a680; to 'DataFrame' at 0x7fde7ab88eb0>
df_7  True      0x7fde7abf2340  None

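So we can see that df['D'] is reported as a view (_is_view is True). df[:] and df.iloc[0:2, :] are slices of df: their _is_copy attribute holds a weak reference to the original DataFrame, which Pandas uses to decide whether a later assignment should trigger SettingWithCopyWarning.

We can also see the addresses of all DataFrames. df_2 has the same address as df because it is only another name for the same object, while df_3, created by copy(), is a separate object.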

One more way to inspect the underlying data of a DataFrame is the base attribute of the NumPy array returned by values:

df_2.values.base

It shows the array that the values are a view of, so it differs when the underlying data differs:

array([[0.0, 1.0, 2.0, 3.0, 4.0],
       [0.0, 1.0, 0.0, 1.0, 0.0],
       ['foo1', 'foo2', 'foo3', 'foo4', 'foo5'],
       [Timestamp('2009-01-01 00:00:00'),
        Timestamp('2009-01-02 00:00:00'),
        Timestamp('2009-01-05 00:00:00'),
        Timestamp('2009-01-06 00:00:00'),
        Timestamp('2009-01-07 00:00:00')]], dtype=object)

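In some cases using df_2.values might lead to misleading results. For example, with mixed dtypes Pandas has to build a new object array for values, so its base does not necessarily tell us whether two DataFrames share data.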
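
As a complement to the checks above, here is a minimal sketch that compares the memory of a single column with np.shares_memory. The helper shares_column and the variable names alias and deep are hypothetical, introduced only for this illustration. The result for other ways of slicing may depend on the Pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0.0, 1.0, 2.0], "B": [0.0, 1.0, 0.0]})

alias = df        # another name for the same object, like df_2 above
deep = df.copy()  # independent data, like df_3 above


def shares_column(left, right, column):
    """Return True if the arrays behind `column` in both frames overlap in memory."""
    return np.shares_memory(left[column].to_numpy(), right[column].to_numpy())


same_object = alias is df                       # the alias is the very same object
alias_shares = shares_column(df, alias, "A")    # so it also shares the column data
copy_shares = shares_column(df, deep, "A")      # the copy owns its own data

print(same_object, alias_shares, copy_shares)
```

Checking one numeric column at a time avoids the object array that values builds for a frame with mixed dtypes.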

3.2. How data is accessed

Whether a warning is shown depends on how the DataFrame data is accessed.

Let's say that we would like to update values in column C. We can do this by:

df["C"][df["C"]=="foo3"] = "foo33"

but a warning will be produced:

/tmp/ipykernel_9907/1845991504.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df["C"][df["C"]=="foo3"] = "foo33"

To get and set the values without SettingWithCopyWarning we need to use loc:

df.loc[df["C"]=="foo3", "C"] = "foo333"
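
To make the difference more concrete, the sketch below spells out the chained form as the two separate calls that Pandas sees, next to the single loc call. The names scratch and intermediate are hypothetical. Whether the chained form changes the original frame depends on the Pandas version and settings, so the example does not rely on it and silences warnings for that part only.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"C": ["foo1", "foo2", "foo3", "foo4", "foo5"]})
scratch = df.copy()  # separate frame so the chained demo cannot affect df

# Chained form: a "get" followed by a "set" on whatever the "get" returned.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence warnings for this demo only
    intermediate = scratch["C"]                     # step 1: get the column
    intermediate[intermediate == "foo3"] = "foo33"  # step 2: set on the intermediate

# loc form: a single "set" on df itself, with row and column selected together.
df.loc[df["C"] == "foo3", "C"] = "foo333"
updated = df["C"].tolist()

print(updated)
```

With loc there is no intermediate object, so there is no ambiguity about which data receives the new value.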

4. Fix SettingWithCopyWarning by method copy()

The first and simplest solution is to create a copy of the DataFrame and work with it. This can be done with the copy() method.

Let's do a short demo of this problem and the solution. Let's say that we get part of the initial DataFrame by:

df_new = df[['D', 'B']]

Our goal is to work only with this subset of columns and create a new column based on the existing ones:

df_new['E'] = df_new['B'] > 0

This will cause a warning:

/tmp/ipykernel_9907/381168311.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_new['E'] = df_new['B'] > 0

This warning can be avoided by using the copy() method:

df_new = df[['D', 'B']].copy()

Note:

Using copy() is recommended for small and medium-sized DataFrames. For big ones it duplicates the selected data in memory, which can cause performance issues in production solutions.
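
The whole flow of this section can be sketched as follows. The small DataFrame is a stand-in for the one from the setup, and the names caught and copy_warnings are hypothetical; the example records warnings only to show that none of the SettingWithCopy kind is emitted.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    "B": [0.0, 1.0, 0.0, 1.0, 0.0],
    "D": pd.bdate_range("2009-01-01", periods=5),
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning raised in this block
    df_new = df[["D", "B"]].copy()   # explicit, independent copy of the subset
    df_new["E"] = df_new["B"] > 0    # new column on the copy

# Keep only warnings whose category name mentions SettingWithCopy.
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]

print(len(copy_warnings))
print(df_new)
```

The original df is left untouched: the new column E exists only in df_new.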

5. Fix SettingWithCopyWarning by method loc

In this section we will demonstrate the warning when we work with a single DataFrame. In this case the warning is caused by the way we access the data.

For example, if we would like to change all values in column C which are different from foo3, then we might use:

df["C"][df["C"]!="foo3"] = "foo"

This will raise the warning message:

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

To perform the operation without SettingWithCopyWarning we need to use the loc attribute in this way:

df.loc[df["C"]!="foo3", "C"] = "foo"

The operation is completed without the warning.
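
Here is a self-contained sketch of the same fix, with the row condition stored in a variable first. The name mask is hypothetical, and the DataFrame only contains column C from the setup.

```python
import pandas as pd

df = pd.DataFrame({"C": ["foo1", "foo2", "foo3", "foo4", "foo5"]})

# Row indexer: True for every row whose value differs from "foo3".
mask = df["C"] != "foo3"

# Row indexer and column label go into one loc call, so the assignment
# is applied to df directly.
df.loc[mask, "C"] = "foo"
result = df["C"].tolist()

print(result)
```

Keeping the condition in its own variable makes the row indexer and the column label easy to tell apart in the loc call.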

6. Conclusion

In this article, we looked at the reasons and solutions for SettingWithCopyWarning in Pandas.

We focused on solving the original cause of the issue rather than suppressing the message itself.

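Finally, depending on your context, you may run into a slightly different problem, for example when working with a MultiIndex. This case is explained in the Pandas documentation section "Returning a view versus a copy": https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy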