How to Highlight NaN Values in Pandas DataFrame
Here are two ways to highlight NaN values in a Pandas DataFrame:
- highlight NaN values in red, using pd.isna and style.applymap
df.style.applymap(lambda x: 'color: red' if pd.isna(x) else '')
- change the background of NaN values by comparing each value to itself (NaN is never equal to itself, so x==x is False only for NaN)
df.style.applymap(lambda x: '' if x==x else 'background-color: yellow')
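The two tests are not fully interchangeable. As a minimal sketch (the helper names `is_missing_isna` and `is_missing_self_compare` are made up for this illustration), the snippet below runs both checks on a few scalar values:

```python
import numpy as np
import pandas as pd


def is_missing_isna(x):
    """Missing-value test based on pd.isna."""
    return bool(pd.isna(x))


def is_missing_self_compare(x):
    """Missing-value test based on the fact that NaN is not equal to itself."""
    return not (x == x)


# Compare both tests on an ordinary number and on several "missing" markers.
for value in [1.5, np.nan, float("nan"), None, pd.NaT]:
    print(repr(value), is_missing_isna(value), is_missing_self_compare(value))
```

Both tests agree on float NaN, but None is flagged only by pd.isna, because None == None is True. For object columns that may hold None, the pd.isna variant is therefore the safer choice.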
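Let's see several useful examples applying both ways in practice. Note that pandas 2.1 deprecated Styler.applymap in favor of Styler.map, which takes the same arguments; on recent versions, replace applymap with map in the snippets below.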
Step 1: Create sample DataFrame
Let's start with a DataFrame of random integers:
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,100,size=(5, 5)), columns=list(range(0, 5)))
Data:
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12 | 94 | 73 | 13 |
| 1 | 9 | 77 | 24 | 6 | 63 |
| 2 | 63 | 38 | 49 | 86 | 39 |
| 3 | 93 | 98 | 84 | 91 | 8 |
| 4 | 59 | 55 | 10 | 64 | 87 |
Next, let's set some of those cells to NaN by generating random pairs of (row, column) coordinates. Because the values are drawn from 0 to 3, row 4 and column 4 are never selected:
import numpy as np
randoms = np.random.choice(4,(5,2),replace=True)
result:
array([[0, 2],
[3, 2],
[2, 1],
[1, 3],
[1, 1]])
Set those coordinates to NaN with a simple loop and df.loc:
import numpy as np
for pair in randoms:
df.loc[pair[0],pair[1]] = np.nan
This results in the following DataFrame. Columns 1, 2 and 3 are upcast to float because NaN is a floating-point value:
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12.0 | NaN | 73.0 | 13 |
| 1 | 9 | NaN | 24.0 | NaN | 63 |
| 2 | 63 | NaN | 49.0 | 86.0 | 39 |
| 3 | 93 | 98.0 | NaN | 91.0 | 8 |
| 4 | 59 | 55.0 | 10.0 | 64.0 | 87 |
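The setup above produces different data on every run, and recent pandas releases may warn when NaN is written into an integer column. A reproducible variant could look like the sketch below; the seed value and the up-front cast to float are choices made for this illustration, not part of the original steps:

```python
import numpy as np
import pandas as pd

# Seeded generator so the same cells become NaN on every run
# (the seed value 0 is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)

df = pd.DataFrame(rng.integers(0, 100, size=(5, 5)), columns=list(range(5)))

# Cast to float up front: NaN is a float, so integer columns cannot hold it.
df = df.astype(float)

# Five random (row, column) pairs drawn from 0..3, as in the steps above.
pairs = rng.choice(4, size=(5, 2), replace=True)
for row, col in pairs:
    df.loc[row, col] = np.nan

print(df)
print("NaN count:", int(df.isna().sum().sum()))
```

Because the pairs are drawn with replacement, the same cell can be picked twice, so the frame ends up with between one and five NaN values. With this variant every column is float, so the unaffected columns print as 48.0 rather than 48.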
Step 2: Highlight NaN values with lambda and pd.isna
First, let's color all NaN values in the DataFrame red by using a lambda and pd.isna:
df.style.applymap(lambda x: 'color: red' if pd.isna(x) else '')
You can see the result below. All five nan cells are rendered in red text:
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12.000000 | nan | 73.000000 | 13 |
| 1 | 9 | nan | 24.000000 | nan | 63 |
| 2 | 63 | nan | 49.000000 | 86.000000 | 39 |
| 3 | 93 | 98.000000 | nan | 91.000000 | 8 |
| 4 | 59 | 55.000000 | 10.000000 | 64.000000 | 87 |
Step 3: Highlight NaN values by changing background
In this step we change the background of each NaN cell to yellow with applymap:
df.style.applymap(lambda x: '' if x==x else 'background-color: yellow')
result (all five nan cells get a yellow background):
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12.000000 | nan | 73.000000 | 13 |
| 1 | 9 | nan | 24.000000 | nan | 63 |
| 2 | 63 | nan | 49.000000 | 86.000000 | 39 |
| 3 | 93 | 98.000000 | nan | 91.000000 | 8 |
| 4 | 59 | 55.000000 | 10.000000 | 64.000000 | 87 |
Step 4: Highlight NaN values in specific columns
To change the color of NaN values only in selected columns, we can use the subset parameter of style.applymap. It accepts a list of column labels:
df.style.applymap(lambda x: '' if x==x else 'background-color: yellow', subset=[2,3])
result (only the nan cells in columns 2 and 3 are highlighted; the nan values in column 1 are left unstyled):
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12.000000 | nan | 73.000000 | 13 |
| 1 | 9 | nan | 24.000000 | nan | 63 |
| 2 | 63 | nan | 49.000000 | 86.000000 | 39 |
| 3 | 93 | 98.000000 | nan | 91.000000 | 8 |
| 4 | 59 | 55.000000 | 10.000000 | 64.000000 | 87 |
Step 5: Highlight NaN values in specific columns and rows
It's also possible to restrict highlighting to selected rows and columns. For this purpose, pass a 2-tuple of (rows, columns) as subset. The example below uses subset=([0,1,2], slice(None)), which selects rows 0, 1 and 2 and, through slice(None), all columns:
df.style.applymap(lambda x: '' if x==x else 'background-color: yellow', subset=([0,1,2], slice(None)))
result (the nan cells in rows 0, 1 and 2 are highlighted; the nan in row 3 is not):
|   | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 48 | 12.000000 | nan | 73.000000 | 13 |
| 1 | 9 | nan | 24.000000 | nan | 63 |
| 2 | 63 | nan | 49.000000 | 86.000000 | 39 |
| 3 | 93 | 98.000000 | nan | 91.000000 | 8 |
| 4 | 59 | 55.000000 | 10.000000 | 64.000000 | 87 |