1. Overview
This short article describes how to create an empty DataFrame in Pandas.
Generally, there are three ways to create an empty DataFrame:
- Empty DataFrame
- Empty DataFrame with columns
- Empty DataFrame with index and columns
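In general, it's not recommended to create an empty DataFrame and append rows to it one at a time, because growing a DataFrame row by row typically copies the existing data on every step.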
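One common alternative, sketched here with made-up sample values, is to collect the rows in a plain Python list and build the DataFrame once at the end. The names `rows` and `df` are illustrative only.

```python
import pandas as pd

# Collect rows as dictionaries first (sample values are illustrative)
rows = []
for rank, score in [('silver', 54), ('gold', 75), ('gold', 87)]:
    rows.append({'Rank': rank, 'score': score})  # list.append, not DataFrame.append

# Build the DataFrame once instead of growing it row by row
df = pd.DataFrame(rows)
print(df.shape)  # (3, 2)
```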
2. Create Empty DataFrame
Let's start by creating a completely empty DataFrame:
- no data
- no index
- no columns
Call the constructor without any parameters:
import pandas as pd
df = pd.DataFrame()
This creates a DataFrame object that has no data, no index and no columns. If you print it, you will get:
Empty DataFrame
Columns: []
Index: []
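Besides printing, the emptiness can also be checked programmatically. The short sketch below uses the `empty` and `shape` attributes of the DataFrame:

```python
import pandas as pd

df = pd.DataFrame()

print(df.empty)         # True
print(df.shape)         # (0, 0)
print(len(df.columns))  # 0
```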
To add columns (and with them, rows) to the empty DataFrame, you can use bracket notation and assign values to it.
The column name goes in the brackets, and the list contains the row values for that column:
df['Rank'] = ['silver', 'gold', 'gold']
df['score'] = [54, 75, 87]
Now the result will be:
| | Rank | score |
|---|---|---|
| 0 | silver | 54 |
| 1 | gold | 75 |
| 2 | gold | 87 |
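One thing to keep in mind with this approach: the first assigned column defines the number of rows, so later columns need the same length. The sketch below uses a deliberately too-short list (an illustrative value, not from the example above) to show what happens otherwise:

```python
import pandas as pd

df = pd.DataFrame()
df['Rank'] = ['silver', 'gold', 'gold']  # first column defines 3 rows

mismatch_error = None
try:
    df['score'] = [54, 75]  # only 2 values for 3 rows
except ValueError as err:
    mismatch_error = err
    print('Could not add column:', err)
```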
3. Create Empty DataFrame with index and columns
What if you would like to make a DataFrame without data, only with an index and columns? The DataFrame constructor needs two parameters, columns and index:
import pandas as pd
df = pd.DataFrame(columns = ['Score', 'Rank'],
index = ['James', 'Jim'])
Our DataFrame without any data looks like:
| | Score | Rank |
|---|---|---|
| James | NaN | NaN |
| Jim | NaN | NaN |
To set data for any of the existing rows we can use the loc property, which is described as:
Access a group of rows and columns by label(s) or a boolean array.
To set values for Jim we can do:
df.loc['Jim'] = [50, 'gold']
Result:
| | Score | Rank |
|---|---|---|
| James | NaN | NaN |
| Jim | 50 | gold |
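The loc property can also address a single cell, and assigning to a label that does not exist yet adds a new row. A minimal sketch, where the values 61 and 70 and the label 'Joe' are illustrative:

```python
import pandas as pd

df = pd.DataFrame(columns=['Score', 'Rank'], index=['James', 'Jim'])

# Set a single cell by row label and column label
df.loc['James', 'Score'] = 61

# Assigning to a label that is not in the index adds a new row
df.loc['Joe'] = [70, 'silver']

print(list(df.index))  # ['James', 'Jim', 'Joe']
```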
4. Create Empty DataFrame with column names
To create a DataFrame which has only column names we can use the parameter columns. We are using the DataFrame constructor to create two columns:
import pandas as pd
df = pd.DataFrame(columns = ['Score', 'Rank'])
print(df)
result:
Empty DataFrame
Columns: [Score, Rank]
Index: []
If you display it, you will get:
| Score | Rank |
|---|---|
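Columns created only by name typically get the generic object dtype. If the column types are known in advance, one possible approach is to build the empty DataFrame from empty Series with explicit dtypes. The dtypes chosen below are assumptions for illustration:

```python
import pandas as pd

# Each empty Series carries the dtype its column should have
df = pd.DataFrame({
    'Score': pd.Series(dtype='int64'),
    'Rank': pd.Series(dtype='object'),
})

print(df.dtypes)
print(df.empty)  # True
```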
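To append a row to this DataFrame in pandas versions before 2.0, we can use the append() method and provide a dictionary with values. Note that append() was deprecated in pandas 1.4 and removed in 2.0, and that it returns a new DataFrame rather than modifying df:
df.append({'Score' : '68', 'Rank' : 'gold'}, ignore_index = True)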
Result is:
| | Score | Rank |
|---|---|---|
| 0 | 68 | gold |
The ignore_index parameter is used in order to avoid the error:
TypeError: Can only append a dict if ignore_index=True
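On pandas versions where append() is not available, the same single-row append can be sketched with pd.concat(). The helper name `new_row` is illustrative:

```python
import pandas as pd

df = pd.DataFrame(columns=['Score', 'Rank'])

# Wrap the single row in a one-row DataFrame, then concatenate
new_row = pd.DataFrame([{'Score': '68', 'Rank': 'gold'}])
df = pd.concat([df, new_row], ignore_index=True)

print(df)
```

As with append(), the result is a new DataFrame, so it is assigned back to df here.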
5. Efficient way to append rows to DataFrame
The approach above is not very efficient for multiple rows. A better way to append rows to a DataFrame is the concat() function.
Suppose we have an empty or populated DataFrame df like:
| | Score | Rank |
|---|---|---|
| James | NaN | NaN |
| Jim | 50 | gold |
To append multiple rows, build a second DataFrame and concatenate the two:
df_a = pd.DataFrame({'Score' : ['68', '5'], 'Rank' : ['gold', 'zero']}, index=['Joe', 'Jery'])
pd.concat([df, df_a])
Result:
| | Score | Rank |
|---|---|---|
| James | NaN | NaN |
| Jim | 50 | gold |
| Joe | 68 | gold |
| Jery | 5 | zero |
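When the rows arrive in several batches, for example inside a loop, a single concat() call over a list of DataFrames avoids repeated copying. The sketch below assumes hypothetical batches with default integer indexes and uses ignore_index=True to renumber the rows:

```python
import pandas as pd

# Hypothetical batches of rows, e.g. produced inside a loop
batches = [
    pd.DataFrame({'Score': [68, 5], 'Rank': ['gold', 'zero']}),
    pd.DataFrame({'Score': [50], 'Rank': ['gold']}),
]

# One concat call; ignore_index=True renumbers the rows from 0
combined = pd.concat(batches, ignore_index=True)
print(list(combined.index))  # [0, 1, 2]
```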
6. Conclusion
Now we're able to create an empty DataFrame in several ways. We looked at how to add data, an index and columns, and covered a more efficient way of appending rows.
Now you know more about DataFrame and how you can add rows and data to it.