How to sort a Pandas DataFrame by one or multiple columns?

In this quick tutorial we’ll learn how to sort rows in a Pandas DataFrame according to specific column or index values.

As we typically do, we’ll quickly import the Pandas library into our Jupyter Notebook / Lab (or any other Python IDE). We’ll then create a simple test DataFrame and populate it with data from a CSV file.

# Import the Pandas library
import pandas as pd

# Import the test data
survey = pd.read_csv('survey.csv')

Here’s our data:

Example of using sort_values() to order rows by column values

We’ll start by showing how to use the DataFrame sort_values() method to order our DataFrame by a single column:

survey.sort_values(by='days_to_hire', ascending=False)

The sort_values() method has some useful parameters. In this case we used the by parameter to define the ordering column, and ascending=False to sort the rows in descending order, from the highest value to the lowest.
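To see the by and ascending parameters at work without the CSV file, here is a minimal sketch. The small DataFrame below is made-up stand-in data (the language column and all values are assumptions for illustration, not the contents of survey.csv).

```python
import pandas as pd

# Hypothetical stand-in data; the real survey.csv is not reproduced here
survey = pd.DataFrame({
    "language": ["Python", "R", "Java", "Go"],
    "days_to_hire": [45, 30, 60, 52],
})

# Sort by a single column, largest values first
by_days_desc = survey.sort_values(by="days_to_hire", ascending=False)
print(by_days_desc)

# sort_values() returns a new DataFrame; the original row order is unchanged
print(survey["days_to_hire"].tolist())  # [45, 30, 60, 52]
```

Omitting ascending (or passing ascending=True, the default) sorts from the lowest value to the highest instead.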

Note: After sorting your data, you might want to slice and subset your Pandas DataFrame by specific conditions.

Resetting the DataFrame index after sorting

After sorting the rows by column value, the original index labels stay attached to their rows. Pass ignore_index=True to renumber the index from 0 in the new row order:

survey.sort_values(by=['num_candidates', 'salary'], ascending=False, ignore_index=True)
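The difference between keeping and resetting the index is easier to see on a tiny example. This is a sketch on made-up values (the numbers are assumptions, not the survey data). It also shows sort_index(), which orders rows by their index labels rather than by column values.

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
survey = pd.DataFrame({
    "num_candidates": [12, 30, 7],
    "salary": [90.0, 75.0, 110.0],
})

# Default behaviour: the original index labels travel with their rows
kept = survey.sort_values(by="num_candidates", ascending=False)
print(kept.index.tolist())        # [1, 0, 2]

# ignore_index=True relabels the sorted rows 0, 1, 2, ...
renumbered = survey.sort_values(by="num_candidates", ascending=False,
                                ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2]

# sort_index() sorts by the index labels, restoring the original order here
restored = kept.sort_index()
print(restored.index.tolist())    # [0, 1, 2]
```

Note that ignore_index is available in pandas 1.0 and later; on older versions, chaining .reset_index(drop=True) after the sort gives the same result.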

Sorting in Pandas by multiple columns

A more realistic use case is sorting your data by multiple columns. In this case we’ll pass a Python list with the column labels to the by parameter of sort_values(). Rows are ordered by the first column in the list, and ties are broken by the next one.

survey.sort_values(by=['num_candidates', 'salary'], ascending=False, ignore_index=True)
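When each column needs its own direction, ascending also accepts a list with one boolean per column in by. The sketch below uses made-up values (an assumption for illustration) to sort by num_candidates in descending order and break ties by salary in ascending order.

```python
import pandas as pd

# Hypothetical stand-in data with ties in num_candidates
survey = pd.DataFrame({
    "num_candidates": [10, 25, 10, 25],
    "salary": [80.0, 95.0, 120.0, 70.0],
})

# One ascending flag per column: num_candidates descending, salary ascending
mixed = survey.sort_values(
    by=["num_candidates", "salary"],
    ascending=[False, True],
    ignore_index=True,
)
print(mixed)
```

The two lists must have the same length, otherwise pandas raises a ValueError.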

Notes:

  1. We can persist the sorted DataFrame either by assigning the result to a new DataFrame or by passing inplace=True to the sort_values() method.
# Method 1 - creating a new DataFrame
sorted_survey = survey.sort_values(by=['num_candidates', 'salary'], ascending=False, ignore_index=True)

# Method 2 - overwrites the original DataFrame in place
survey.sort_values(by=['num_candidates', 'salary'], ascending=False, ignore_index=True, inplace=True)
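One detail worth keeping in mind with inplace=True is that the call returns None, so its result should not be assigned back to a variable. A minimal sketch on made-up values (an assumption for illustration):

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
survey = pd.DataFrame({
    "num_candidates": [10, 25, 15],
    "salary": [80.0, 95.0, 70.0],
})

# With inplace=True the DataFrame itself is modified and None is returned
result = survey.sort_values(by=["num_candidates", "salary"],
                            ascending=False, ignore_index=True, inplace=True)
print(result)  # None
print(survey["num_candidates"].tolist())  # [25, 15, 10]
```

Writing survey = survey.sort_values(..., inplace=True) would therefore replace the DataFrame with None, which is a common slip.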

Sort a Pandas Series with Series.sort_values() and sort_index()

We can easily create a Series from a DataFrame column and sort its values or index as needed. Here’s a simple code example:

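# Select a single column as a Series
salary_s = survey['salary']
# Sort by value (ascending by default); returns a new Series
salary_s.sort_values()
# Sort by index labels; also returns a new Series
salary_s.sort_index()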
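Here is a self-contained version of the same idea that can be run without the CSV file. The salary values are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in data for illustration only
survey = pd.DataFrame({"salary": [95.0, 70.0, 120.0]})
salary_s = survey["salary"]

# Sort by value, highest first; the index labels follow their values
by_value = salary_s.sort_values(ascending=False)
print(by_value.index.tolist())  # [2, 0, 1]

# Sort by index label to get back to the original order
by_label = by_value.sort_index()
print(by_label.tolist())        # [95.0, 70.0, 120.0]
```

Both methods return a new Series, so assign the result to a variable if you want to keep it.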