In this short how-to guide, you will learn how to exclude rows from sorting in Pandas.

Keeping the original order of specific rows is useful when you deal with:

  • multi-dimensional data
  • filtering out outliers
  • excluding specific observations from the sorting process.

Let's explore how to sort a DataFrame in Pandas while omitting specific rows.

Step 1: Data

First, let's create a sample DataFrame to illustrate the example:

import pandas as pd

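# Sample DataFrame: the first two rows are header-like rows
# (labels and a separator) that should stay in place when sorting
data = {
    'id': ['ean', '-', 1, 2, 3, 4, 5],
    'name': ['full', '-', 'John', 'Alice', 'Bob', 'Charlie', 'Emma'],
    'value': ['$', '-', 10, 20, 30, 40, 50],
    'category': ['A-C', '-', 'A', 'B', 'C', 'B', 'A']
}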

df = pd.DataFrame(data)
df

The output is:

    id     name value category
0  ean     full     $      A-C
1    -        -     -        -
2    1     John    10        A
3    2    Alice    20        B
4    3      Bob    30        C
5    4  Charlie    40        B
6    5     Emma    50        A

Step 2: Define exclude conditions

First, we define the condition that decides which rows are excluded from sorting, and split the DataFrame into the two groups:

condition = (df['id'] == 'ean') | (df['id'] == '-')

excluded = df[condition]
included = df[~condition]
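When there are several marker values, chaining `==` comparisons with `|` gets verbose. A minimal sketch of the same split using `Series.isin()` is shown below; the `markers` list and the shortened sample data are assumptions made for illustration.

```python
import pandas as pd

# Shortened version of the sample data, for illustration only
df = pd.DataFrame({
    "id": ["ean", "-", 1, 2, 3],
    "name": ["full", "-", "John", "Alice", "Bob"],
})

# Hypothetical list of values that mark rows to leave unsorted
markers = ["ean", "-"]

# True for every row whose id is one of the marker values
condition = df["id"].isin(markers)

excluded = df[condition]    # rows kept in their original order
included = df[~condition]   # rows that will be sorted
```

The resulting `excluded` and `included` frames are the same as with the chained comparison; only the way the condition is written changes.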

Step 3: Sorting

Next, use the sort_values() method to sort the included rows by your chosen criteria. Suppose you want to sort by the "name" column in ascending order:

# Named sorted_rows to avoid shadowing Python's built-in sorted()
sorted_rows = included.sort_values(by="name", ascending=True)

Step 4: Merge results

Finally, we combine the excluded (unsorted) rows and the sorted rows, keeping the excluded rows on top:

pd.concat([excluded, sorted_rows])
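The concatenated result keeps the original index labels, so the index is no longer in order. If a fresh 0..n-1 index is preferred, `pd.concat()` accepts `ignore_index=True`. The sketch below uses shortened sample data as an assumption for illustration.

```python
import pandas as pd

# Shortened version of the sample data, for illustration only
df = pd.DataFrame({
    "id": ["ean", "-", 1, 2, 3],
    "name": ["full", "-", "John", "Alice", "Bob"],
})

condition = df["id"].isin(["ean", "-"])
excluded = df[condition]
sorted_rows = df[~condition].sort_values(by="name")

# ignore_index=True discards the old labels and renumbers the rows
result = pd.concat([excluded, sorted_rows], ignore_index=True)
print(result)
```

Whether to renumber depends on the use case: keep the original labels if you need to trace rows back to the source data.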

Full Example Code: Sort and Exclude

The full example below, which uses the df from Step 1, filters out the rows to exclude, sorts the remaining rows, and then concatenates the two parts:

condition = (df['id'] == 'ean') | (df['id'] == '-')

excluded = df[condition]
included = df[~condition]

# Named sorted_rows to avoid shadowing Python's built-in sorted()
sorted_rows = included.sort_values(by="name", ascending=True)

df = pd.concat([excluded, sorted_rows])

The final output keeps the first two rows unchanged and sorts the rest by name:

    id     name value category
0  ean     full     $      A-C
1    -        -     -        -
3    2    Alice    20        B
4    3      Bob    30        C
5    4  Charlie    40        B
6    5     Emma    50        A
2    1     John    10        A

This example keeps the excluded rows at the top of the result. To keep them at the bottom instead, swap the order of the list passed to pd.concat().
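The whole pattern can be wrapped in a small helper that covers both placements. This is only a sketch: the function name `sort_except`, its `excluded_first` parameter, and the sample data with trailing summary rows are hypothetical and not part of Pandas.

```python
import pandas as pd

def sort_except(df, mask, by, excluded_first=True, **sort_kwargs):
    """Sort the rows where mask is False; keep the masked rows unsorted.

    Hypothetical helper: excluded_first controls whether the unsorted
    rows are placed before or after the sorted ones.
    """
    excluded = df[mask]
    sorted_rows = df[~mask].sort_values(by=by, **sort_kwargs)
    parts = [excluded, sorted_rows] if excluded_first else [sorted_rows, excluded]
    return pd.concat(parts)

# Illustrative data where the rows to exclude are at the bottom
df = pd.DataFrame({
    "id": [1, 2, 3, "-", "total"],
    "name": ["John", "Alice", "Bob", "-", "all"],
})

mask = df["id"].isin(["-", "total"])
result = sort_except(df, mask, by="name", excluded_first=False)
print(result)
```

Extra keyword arguments such as `ascending=False` are passed through to `sort_values()`, so the helper does not restrict the sort criteria.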

Conclusion

In this guide, we saw how to sort DataFrames in Pandas while excluding specific rows.

By following the steps above, you can sort your data by your chosen criteria while leaving rows that should not take part in the sort, such as header-like or separator rows, in their original position.
