How to Skip First Rows in Pandas read_csv

Do you need to skip rows while reading a CSV file with read_csv in Pandas? If so, this article shows how to skip the first rows when reading a file.

The read_csv method has a skiprows parameter, which can be used as follows:

(1) Skip the first rows when reading a CSV file in Pandas

pd.read_csv(csv_file, skiprows=3, header=None)

(2) Skip rows by index with read_csv

pd.read_csv(csv_file, skiprows=[0,2])

Let's go through several practical examples that cover the main ways of skipping rows while reading a CSV file.

To start, let's say that we have the following CSV file:

!cat '../data/csv/multine_header.csv'

The CSV file has two header rows:

Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
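
To follow along without the original file, the sample can be recreated locally. The sketch below is one way to do it. The temporary directory and the file name multiline_header.csv are assumptions made for illustration, not the path used above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Same five lines as the sample file shown above.
CSV_TEXT = """Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
"""

# Hypothetical location: a fresh temporary directory.
csv_file = Path(tempfile.mkdtemp()) / "multiline_header.csv"
csv_file.write_text(CSV_TEXT)

# Read every line as data to confirm the file has 5 lines and 5 fields.
df_all = pd.read_csv(csv_file, header=None)
print(df_all.shape)  # (5, 5)
```

With csv_file defined this way, the read_csv calls in the steps below can be run as written.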

Step 1: Skip first N rows while reading CSV file

The first example shows how to skip consecutive rows with the Pandas read_csv method.

There are 2 options:

skip the first N rows without using a header
skip the first N rows and use a header for the DataFrame - check Step 2

In this step, read_csv starts reading at the fourth row of the file (the index of this row is 3). The newly created DataFrame will have autogenerated column names:

df = pd.read_csv(csv_file, skiprows=3, header=None)

This will result in:

0	1	2	3	4
2021-09-07	1	8.5	2	7.0
2021-09-08	2	8.0	1	8.1
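
The same call can be checked in a self-contained way. This is a minimal sketch that assumes the sample data is held in memory with io.StringIO instead of being read from csv_file.

```python
import io

import pandas as pd

# In-memory copy of the sample file (assumption: same content as above).
CSV_TEXT = """Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
"""

# Skip the first 3 lines and treat everything that remains as data.
df = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=3, header=None)

print(df.columns.tolist())  # [0, 1, 2, 3, 4]
print(len(df))              # 2
```

Because header=None is passed, none of the remaining lines is consumed as column names, so both data rows are kept.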

Step 2: Skip first N rows and use header

If the header parameter of read_csv is not provided, then the first row will be used as a header. When header and skiprows are combined, the rows are skipped first, and then the first one of the remaining rows is used as a header.

In the example below, 3 rows from the CSV file will be skipped. The fourth one will be used as the header of the new DataFrame.

df = pd.read_csv(csv_file, skiprows=3)

2021-09-07	1	8.5	2	7
2021-09-08	2	8	1	8.1
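
To see which line ends up as the header, the columns can be inspected directly. The sketch below assumes an in-memory copy of the sample data. It also shows header=3 as an alternative that is not used in this article: for this file it should give the same result, since lines before the header row are discarded.

```python
import io

import pandas as pd

# In-memory copy of the sample file (assumption: same content as above).
CSV_TEXT = """Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
"""

# Skip 3 lines; the 4th line becomes the header, leaving one data row.
df = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=3)
print(df.columns.tolist())  # ['2021-09-07', '1', '8.5', '2', '7']
print(len(df))              # 1

# Alternative (not from the article): point header at line index 3.
df_alt = pd.read_csv(io.StringIO(CSV_TEXT), header=3)
print(df.equals(df_alt))
```

Note that a data row is used as the header here, which is rarely what you want for this particular file. Step 3 shows how to keep the real header instead.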

Step 3: Pandas keep the header and skip first rows

What if you need to keep the header and then skip N rows? This can be achieved in several different ways.

The simplest one is to build a list of the rows to be skipped:

rows_to_skip = range(1,3)
df = pd.read_csv(csv_file, skiprows=rows_to_skip)

result:

Date	Company A	Company A.1	Company B	Company B.1
2021-09-07	1	8.5	2	7.0
2021-09-08	2	8.0	1	8.1

As you can see, read_csv keeps the header and skips the first 2 rows after the header.
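
Another of the "several ways" is to pass a function to skiprows. read_csv calls it with each row index and skips the row when it returns True. The sketch below assumes an in-memory copy of the sample data and compares the callable form with an explicit list.

```python
import io

import pandas as pd

# In-memory copy of the sample file (assumption: same content as above).
CSV_TEXT = """Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
"""

# Explicit list: keep line 0 (the header), skip lines 1 and 2.
df_list = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=[1, 2])

# Callable: return True for every row index that should be skipped.
df_callable = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=lambda i: 1 <= i <= 2)

print(df_callable.columns.tolist())
print(df_callable.equals(df_list))
```

The callable form is handy when the rows to skip follow a rule (for example, every second line) rather than a short fixed list.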

Step 4: Skip non consecutive rows with `read_csv` by index

Parameter skiprows is defined as:

Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.

So to skip rows 0 and 2, we can pass a list of values to skiprows:

df = pd.read_csv(csv_file, skiprows=[0,2])

Unnamed: 0	Rank	Points	Rank.1	Points.1
2021-09-07	1	8.5	2	7.0
2021-09-08	2	8.0	1	8.1
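
The first column is labelled "Unnamed: 0" because the header row that was kept starts with an empty field. A minimal sketch, assuming an in-memory copy of the sample data, reproduces this and then renames the column. The rename step is an optional addition and not part of the original example.

```python
import io

import pandas as pd

# In-memory copy of the sample file (assumption: same content as above).
CSV_TEXT = """Date,Company A,Company A,Company B,Company B
,Rank,Points,Rank,Points
2021-09-06,1,7.9,2,6
2021-09-07,1,8.5,2,7
2021-09-08,2,8,1,8.1
"""

# Skip line 0 (first header row) and line 2 (first data row).
df = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=[0, 2])
print(df.columns.tolist())
# ['Unnamed: 0', 'Rank', 'Points', 'Rank.1', 'Points.1']

# Optional: give the unnamed first column a meaningful label.
df_named = df.rename(columns={"Unnamed: 0": "Date"})
print(df_named.columns.tolist())
```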
