In this short how-to guide, you will learn how to exclude rows from sorting in Pandas.
Keeping the original order of specific rows is useful when you deal with:
- multi-dimensional data
- filtering out outliers
- excluding specific observations from the sorting process.
Let's explore how to sort a DataFrame in Pandas while omitting specific rows.
Step 1: Data
First, let's create a sample DataFrame to illustrate the example:
import pandas as pd

# Sample DataFrame: the first two rows are header-like rows, not data
data = {
    'id': ['ean', '-', 1, 2, 3, 4, 5],
    'name': ['full', '-', 'John', 'Alice', 'Bob', 'Charlie', 'Emma'],
    'value': ['$', '-', 10, 20, 30, 40, 50],
    'category': ['A-C', '-', 'A', 'B', 'C', 'B', 'A']
}
df = pd.DataFrame(data)
df
Output is:
| | id | name | value | category |
|---|---|---|---|---|
| 0 | ean | full | $ | A-C |
| 1 | - | - | - | - |
| 2 | 1 | John | 10 | A |
| 3 | 2 | Alice | 20 | B |
| 4 | 3 | Bob | 30 | C |
| 5 | 4 | Charlie | 40 | B |
| 6 | 5 | Emma | 50 | A |
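To see why the first two rows need special handling, here is a minimal sketch of what a plain sort does to this data. It rebuilds only the `name` column from the sample; the variable names `names` and `naive` are illustrative.

```python
import pandas as pd

# Same 'name' column as the sample DataFrame above
names = pd.DataFrame(
    {"name": ["full", "-", "John", "Alice", "Bob", "Charlie", "Emma"]}
)

# A plain sort treats the two header-like rows as ordinary data
naive = names.sort_values(by="name")

print(naive["name"].tolist())
# ['-', 'Alice', 'Bob', 'Charlie', 'Emma', 'John', 'full']
```

The `-` row moves to the top and `full` drops to the bottom, because strings are compared by character code and lowercase letters sort after uppercase ones.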
Step 2: Define exclusion conditions
First, define the condition that identifies the rows to exclude from sorting, then use it to split the DataFrame into an excluded part and an included part:
condition = (df['id'] == 'ean') | (df['id'] == '-')
excluded = df[condition]
included = df[~condition]
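When there are several marker values, chaining `==` comparisons with `|` gets long. As a possible alternative, `Series.isin()` builds the same boolean mask from a list. The sketch below uses a trimmed copy of the sample data, and the `markers` list is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["ean", "-", 1, 2, 3, 4, 5],
    "name": ["full", "-", "John", "Alice", "Bob", "Charlie", "Emma"],
})

# Hypothetical list of values that mark rows to keep out of the sort
markers = ["ean", "-"]

# isin() builds the same boolean mask as chaining == comparisons with |
condition = df["id"].isin(markers)

excluded = df[condition]
included = df[~condition]

print(len(excluded), len(included))  # 2 5
```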
Step 3: Sorting
Next, use the sort_values() function to sort the included DataFrame according to your desired criteria. Suppose you want to sort by the "name" column in ascending order:
# Named sorted_df to avoid shadowing Python's built-in sorted()
sorted_df = included.sort_values(by="name", ascending=True)
Step 4: Merge results
Finally, combine the unsorted and sorted data with pd.concat():
pd.concat([excluded, sorted_df])
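Note that the concatenated result keeps the original index labels, so they are no longer in ascending order. If a fresh index is preferred, `pd.concat()` accepts `ignore_index=True`. The following is a minimal sketch on a shortened copy of the sample data; `sorted_rows`, `kept`, and `fresh` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["ean", "-", 1, 2, 3],
    "name": ["full", "-", "John", "Alice", "Bob"],
})

condition = df["id"].isin(["ean", "-"])
excluded = df[condition]
sorted_rows = df[~condition].sort_values(by="name")

# Default: the original index labels are kept
kept = pd.concat([excluded, sorted_rows])
print(kept.index.tolist())   # [0, 1, 3, 4, 2]

# ignore_index=True renumbers the rows from 0
fresh = pd.concat([excluded, sorted_rows], ignore_index=True)
print(fresh.index.tolist())  # [0, 1, 2, 3, 4]
```

Keeping the original labels makes it easy to trace rows back to the source data, while a renumbered index is usually more convenient for display or export.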
Full Example Code: Sort and Exclude
The full example below excludes specific rows from the sorting process by filtering them out before sorting and adding them back afterwards:
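# Rows matching the condition are left out of the sort
condition = (df['id'] == 'ean') | (df['id'] == '-')
excluded = df[condition]
included = df[~condition]
# Named sorted_df to avoid shadowing Python's built-in sorted()
sorted_df = included.sort_values(by="name", ascending=True)
# Excluded rows first, sorted rows after them
df = pd.concat([excluded, sorted_df])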
The final output keeps the first two rows unchanged and sorts the rest by name:
| | id | name | value | category |
|---|---|---|---|---|
| 0 | ean | full | $ | A-C |
| 1 | - | - | - | - |
| 3 | 2 | Alice | 20 | B |
| 4 | 3 | Bob | 30 | C |
| 5 | 4 | Charlie | 40 | B |
| 6 | 5 | Emma | 50 | A |
| 2 | 1 | John | 10 | A |
This example shows how to exclude the first N rows from sorting. To exclude the last N rows instead, adjust the condition to match those rows and reverse the concat order so that the sorted rows come first.
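As a rough sketch of the last-N variant, the rows can also be split by position with `iloc` rather than by a condition. The small DataFrame below is made up for illustration and is not part of the sample data; `n`, `body`, and `tail` are assumed names.

```python
import pandas as pd

# Hypothetical data with two trailing rows that should stay at the bottom
df = pd.DataFrame({
    "name": ["John", "Alice", "Bob", "total", "-"],
    "value": [10, 20, 30, 60, 0],
})

n = 2  # assumed number of trailing rows to leave untouched

# Split by position instead of by condition
body = df.iloc[:-n]
tail = df.iloc[-n:]

# Sorted rows first, untouched rows last
result = pd.concat([body.sort_values(by="name"), tail])

print(result["name"].tolist())
# ['Alice', 'Bob', 'John', 'total', '-']
```

Splitting by position is convenient when the number of rows to skip is known in advance. A condition is safer when the rows to exclude are identified by their content.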
Conclusion
In this guide we saw how to sort DataFrames in Pandas while excluding specific rows.
By following the steps outlined above, you can ensure your data is sorted accurately according to your criteria, while omitting any rows that may not be relevant to your analysis.