To read multiple CSV files into a single Pandas DataFrame, we can use the following syntax:

(1) Pandas read multiple CSV files

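import glob
import pandas as pd

path = r'/home/user/Downloads'
all_files = glob.glob(path + "/*.csv")  # every CSV file in the folder

lst = []

for filename in all_files:
    df = pd.read_csv(filename, index_col=None, header=0)
    lst.append(df)

# stack the frames row-wise and renumber the index from 0
merged_df = pd.concat(lst, axis=0, ignore_index=True)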

(2) Read multiple CSV files - Dask

import dask.dataframe as dd
df = dd.read_csv("~/Downloads/test*.csv")

Pandas Example

Suppose that we would like to read all CSV files:

  • located in the folder /home/user/Downloads
  • matching the pattern /test_*.csv, i.e. starting with test_ and ending with .csv

We can use the following code:

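import glob
import pandas as pd

path = r'/home/user/Downloads'
pattern = "/test_*.csv"
all_files = glob.glob(path + pattern)  # only files matching test_*.csv

lst = []

for filename in all_files:
    # header=0: the first row of each file holds the column names
    df = pd.read_csv(filename, index_col=None, header=0)
    lst.append(df)

# stack the frames row-wise and renumber the index from 0
merged_df = pd.concat(lst, axis=0, ignore_index=True)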
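
After merging, it is often useful to know which file each row came from. The snippet below is a minimal sketch of one way to do that, assuming all files share the same columns. The function name read_csv_folder and the column name source_file are made up for this illustration and are not part of Pandas.

```python
import glob
import os

import pandas as pd


def read_csv_folder(path, pattern="*.csv"):
    """Read every CSV in `path` matching `pattern` into one DataFrame.

    Hypothetical helper for illustration; it adds a `source_file` column
    holding the name of the file each row was read from.
    """
    # sorted() makes the row order reproducible, since glob returns
    # the matching files in arbitrary order
    all_files = sorted(glob.glob(os.path.join(path, pattern)))
    if not all_files:
        # pd.concat raises ValueError on an empty list, so fail early
        raise FileNotFoundError(f"no files match {pattern!r} in {path!r}")

    frames = [
        pd.read_csv(filename).assign(source_file=os.path.basename(filename))
        for filename in all_files
    ]
    return pd.concat(frames, axis=0, ignore_index=True)
```

It could then be called as merged_df = read_csv_folder('/home/user/Downloads', 'test_*.csv').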

Let's say that we have the following files in this folder:

  • other.csv
  • test_1.csv
  • test_2.csv

The final DataFrame, merged_df, will contain only the content of test_1.csv and test_2.csv. The file other.csv does not match the pattern, so it is skipped.
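
To check this behaviour without touching a real Downloads folder, the sketch below recreates the three files in a temporary directory. The file contents and the column names id and value are invented for the demonstration.

```python
import glob
import os
import tempfile

import pandas as pd

# made-up content for the three example files
files = {
    "other.csv": pd.DataFrame({"id": [99], "value": [0]}),
    "test_1.csv": pd.DataFrame({"id": [1, 2], "value": [10, 20]}),
    "test_2.csv": pd.DataFrame({"id": [3, 4], "value": [30, 40]}),
}

with tempfile.TemporaryDirectory() as path:
    for name, frame in files.items():
        frame.to_csv(os.path.join(path, name), index=False)

    # sorted() only keeps the row order predictable for the demo
    all_files = sorted(glob.glob(os.path.join(path, "test_*.csv")))

    lst = [pd.read_csv(f, index_col=None, header=0) for f in all_files]
    merged_df = pd.concat(lst, axis=0, ignore_index=True)

# four rows with ids 1 to 4; the row from other.csv (id 99) is absent
print(merged_df)
```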

Read multiple CSV files with Dask

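As an alternative, we can use the Dask library to read multiple CSV files. To install Dask, see the installation instructions on the Dask website or run: pip install dask (pip install "dask[dataframe]" also pulls in the dependencies needed by dask.dataframe).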

To read multiple files from a folder that match a pattern, we can use:

import dask.dataframe as dd
df = dd.read_csv("~/Downloads/test*.csv")

Resources

For more advanced examples on reading multiple CSV or JSON files with Pandas you can check: