How to sort pandas DataFrames by column values and dates?

In today’s tutorial you will learn how to order the values of your DataFrame by column values. We will also learn how you can rearrange your DataFrame according to specific column order.

Example Data

First off, we will import the pandas library into our Python development environment and create a sample DataFrame that you can use to follow along this tutorial.

import pandas as pd
revenue_dict = {
              'area' : [ 'Retail', 'Online', 'B2B', 'B2C'] , 
             'indirect' : [324, 324, 561, 345],         
            'direct' : [187, 211, 364, 534],
            'distributors' : [278, 345, 183, 765],
            'time' : pd.date_range(start='1/1/2019', periods=4, freq = 'y')}

revenue_df = pd.DataFrame (revenue_dict)

print(revenue_df)

Here’s is the resulting data:

areaindirectdirectdistributorstime
0Retail3241872782019-12-31
1Online3242113452020-12-31
2B2B5613641832021-12-31
3B2C3455347652022-12-31

Sort pandas DataFrame by column value

We can easily order the DataFrame rows by values of a specific column using the sort_values() DataFrame method.

revenue_df.sort_values(by='area')

The DataFrame rows are sorted by the area value in ascending order (default):

areaindirectdirectdistributorstime
2B2B5613641832021-12-31
3B2C3455347652022-12-31
1Online3242113452020-12-31
0Retail3241872782019-12-31

Sort pandas Date columns

Ordering date values is relatively simple. In this case we’ll re-arrange the values in the time column in descending order:

revenue_df.sort_values(by='time', ascending= False)

Order DataFrame values by multiple columns

We are able to pass a list of columns to the sort() method and order our DataFrame accordingly:

revenue_df.sort_values(by=['direct', 'distributors'])

Sort DataFrame columns by column name

We can rearrange the DataFrame column order very easily.

Let’s get started by printing the current column index:

print ("The DataFrame column sequence is: " + str (revenue_df.columns))

This will return the following result.

The DataFrame column sequence is: Index(['area', 'indirect', 'direct', 'distributors', 'time'], dtype='object')

We’ll now reorder the column sequence alphabetically in an ascending order:

revenue_df.columns = revenue_df.columns.sort_values()

And check again the column index sequence to receive the following result:

print ("The DataFrame column sequence is: " + str (revenue_df.columns))
The DataFrame column sequence is: Index(['area', 'direct', 'distributors', 'indirect', 'time'], dtype='object')

Note that you can rearrange the column sequence in additional ways for example:

order_cols = ['direct', 'time', 'indirect', 'distributors', 'area']
revenue_df.columns = order_cols

Additional Learning

How to calculate the average value in a Pandas DataFrame column?