In this article, we will cover how to count NaN and non-NaN values in a Pandas DataFrame or a single column.
Missing values in Pandas are represented by NaN (not a number), but they are sometimes also referred to as:
- NA
- None
- null
We will see how to count all of them.
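As a quick illustration of why these names can be treated together, the sketch below builds a small hypothetical Series (not part of the article's data) that mixes several missing-value markers and checks which entries isna() flags. It assumes an object-dtype Series so that each marker is stored unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical Series mixing several missing-value markers with real values
s = pd.Series([None, np.nan, pd.NA, pd.NaT, 'text', 0], dtype=object)

# isna() returns True for each of the missing markers
missing_mask = s.isna()
n_missing = int(missing_mask.sum())

print(missing_mask.tolist())  # [True, True, True, True, False, False]
print(n_missing)              # 4
```

Note that ordinary values such as 0 or a non-empty string are not treated as missing.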
Here is how to count NaN and non-NaN values in Pandas:
(1) Count NA Values in Pandas DataFrame
df.isna().sum()
(2) Count non NA Values in DataFrame
df.count()
(3) Count NA Values in Pandas column
df['col1'].isna().sum()
(4) Count non NA Values in Pandas column
df['col1'].count()
Count Na Values
To count the number of NaN values in a Pandas DataFrame or Series, we can use the .isna() method and then sum the resulting Boolean values (True = 1, False = 0):
DataFrame
To count NaN values in the whole Pandas DataFrame we can call isna() on the DataFrame and sum the result for each column:
import pandas as pd
df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
'col2': [None, 7, 'b', 3, 4]})
na_count = df.isna().sum()
print(na_count)
result:
col1 2
col2 1
dtype: int64
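If a single total for the whole DataFrame is needed rather than one number per column, one option is to sum the per-column counts once more. The sketch below reuses the article's example data; the variable name total_na is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# First sum(): NaN count per column; second sum(): total over all columns
total_na = int(df.isna().sum().sum())

print(total_na)  # 3
```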
Column
To count NaN values in a single Pandas column, we can call isna() on the column and sum the result:
df['col1'].isna().sum()
The result is the number of NaN values in this column: 2.
Count non Na Values
To count the number of non-NaN values in a Pandas DataFrame or Series, we can use the methods:
pandas.DataFrame.count
pandas.Series.count
DataFrame
The count method returns the number of non-NaN values in each column of the DataFrame:
df.count()
result:
col1 3
col2 4
dtype: int64
Note that this counts the non-NaN values column-wise. For row-wise counts, refer to the next section.
Row-wise
We can count non-NaN values in a Pandas DataFrame row-wise by passing the parameter axis=1 to the count method:
df.count(axis=1)
The result is the number of non-NaN values in each row:
0 1
1 1
2 2
3 1
4 2
dtype: int64
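The same axis=1 argument can also be used to count the missing values in each row, by applying it to the sum of the isna() mask. This is a minimal sketch using the article's example data; na_per_row is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# Sum the Boolean mask across columns to get the NaN count per row
na_per_row = df.isna().sum(axis=1)

print(na_per_row.tolist())  # [1, 1, 0, 1, 0]
```

For every row, the NaN count and the non-NaN count from count(axis=1) add up to the number of columns.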
Column
To count non-NaN values in a Pandas column we can use the Series count method:
df['col1'].count()
As output we get the number of non-NaN values in col1: 3.
The code above is equivalent to:
df['col1'].notna().sum()
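A short check of that equivalence on the example data is sketched below. It also shows that the NaN and non-NaN counts together account for every row in the column; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

non_na_count = int(df['col1'].count())        # non-NaN values via count()
non_na_sum = int(df['col1'].notna().sum())    # non-NaN values via notna().sum()
na_count = int(df['col1'].isna().sum())       # NaN values

print(non_na_count, non_na_sum)               # 3 3
print(na_count + non_na_count == len(df))     # True
```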
Count non Na values - describe()
We can also use the Pandas describe method, whose count entry reports the number of non-NaN values in a DataFrame or column:
df['col1'].describe()
result:
count 3
unique 3
top a
freq 1
Name: col1, dtype: object
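When only the number itself is needed, the count entry can be picked out of the describe() result by its label. The sketch below does this for col1 of the example data; since describe() also computes the other statistics, calling count() directly is usually the lighter option.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# describe() returns a Series of statistics; 'count' is the non-NaN count
summary_stats = df['col1'].describe()
non_na_from_describe = int(summary_stats['count'])

print(non_na_from_describe)  # 3
```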
Count Na values - value_counts()
We can also check the number of NaN and non-NaN values by using the value_counts() method. To include NaN values in the result, we need to pass the parameter dropna=False:
df['col1'].value_counts(dropna=False)
result:
None 2
a 1
3 1
5 1
Name: col1, dtype: int64
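To see the effect of dropna, the sketch below compares the totals with and without it on the example column. By default value_counts() skips missing values, so the difference between the two totals is the number of NaN values; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# Default dropna=True: missing values are left out of the counts
total_without_na = int(df['col1'].value_counts().sum())

# dropna=False: missing values get their own entry
total_with_na = int(df['col1'].value_counts(dropna=False).sum())

print(total_without_na)                  # 3
print(total_with_na)                     # 5
print(total_with_na - total_without_na)  # 2
```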
Count percent of missing values
To calculate the percentage of missing values in each column of a Pandas DataFrame we can call isna() and chain the mean() method:
df.isna().mean()
This gives the fraction of NaN values in each column, as a value between 0 and 1:
col1 0.4
col2 0.2
dtype: float64
Multiply by 100 to get a percentage between 0 and 100:
df.isna().mean() * 100
result:
col1 40.0
col2 20.0
dtype: float64
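For wider DataFrames it can help to round the percentages and list the columns with the most missing data first. The following is one possible sketch on the example data; the rounding precision and the name pct_missing are arbitrary choices.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# Percent of missing values per column, rounded and sorted from most to least
pct_missing = (df.isna().mean() * 100).round(1).sort_values(ascending=False)

print(pct_missing)
```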
Count Na and non Na values in column
To count both NaN and non-NaN values in a Pandas column we can use isna() in combination with the value_counts() method:
df['col1'].isna().value_counts()
The result is the number of NaN (True) and non-NaN (False) values in this column:
False 3
True 2
Name: col1, dtype: int64
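To get both counts for every column at once, one option is to combine isna().sum() and count() into a small summary table. This is a sketch on the example data; the column labels missing, non_missing and total are illustrative names, not Pandas conventions.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', None, 3, None, 5],
                   'col2': [None, 7, 'b', 3, 4]})

# One row per original column, with NaN and non-NaN counts side by side
summary = pd.DataFrame({
    'missing': df.isna().sum(),
    'non_missing': df.count(),
})
summary['total'] = len(df)

print(summary)
```

Unlike the value_counts() output, this table still has an explicit 0 in the missing column when a column contains no NaN values.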
Conclusion
In this article we covered how to count the number of NaN and non-NaN values in a Pandas DataFrame. We saw how to count them row-wise and column-wise.
We counted NaN values for the whole DataFrame and for a single column. Finally, we saw how to calculate the percentage of missing values and how to count NaN and non-NaN values in a column.