In today’s Data Wrangling tutorial, we’ll show how to quickly change the column order in a Python Pandas DataFrame. Specifically, we’ll look at moving a given column to the first position of the DataFrame.
Move Pandas column positions
Creating our data
We’ll get started by creating some test data which we’ll use to exemplify the different use cases we’ll be discussing in the tutorial.
import numpy as np
import pandas as pd
np.random.seed(100)
# Define the DataFrame: 3 rows x 4 columns of normally distributed values
rand_df = pd.DataFrame(data=np.random.normal(80, 10, 12).reshape(3, 4).round(2), columns=[1, 2, 3, 4])
Here’s the output. Because the random seed is fixed, re-running the code should reproduce these values:
| | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 0 | 62.50 | 83.43 | 91.53 | 77.48 |
| 1 | 89.81 | 85.14 | 82.21 | 69.30 |
| 2 | 78.11 | 82.55 | 75.42 | 84.35 |
1. Rearrange column to first position
First we’ll extract the column into a Pandas Series, which we’ll later insert into the first position of our DataFrame. We’ll use the DataFrame pop() method, which removes the column from the DataFrame in place and returns it.
last_col = rand_df.pop(4)
As you can see, the column was extracted into a Series object:
type(last_col)
pandas.core.series.Series
Here’s how our DataFrame looks now. Note that it has three columns in addition to the index, not four as shown above.
| | 1 | 2 | 3 |
|---|---|---|---|
| 0 | 62.50 | 83.43 | 91.53 |
| 1 | 89.81 | 85.14 | 82.21 |
| 2 | 78.11 | 82.55 | 75.42 |
Now let’s insert the Series at column position 0 (the first position):
# the loc parameter sets the position of the new column; column sets its label
rand_df.insert(loc=0, column=4, value=last_col)
Let’s see what we just got:
| | 4 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 77.48 | 62.50 | 83.43 | 91.53 |
| 1 | 69.30 | 89.81 | 85.14 | 82.21 |
| 2 | 84.35 | 78.11 | 82.55 | 75.42 |
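The pop-and-insert steps from this section can be wrapped in a small helper. The following is a minimal sketch: the function name `move_to_front` is hypothetical (it is not part of pandas), and it assumes the column label exists and is unique. The sample values are copied from the first two rows of the tables above.

```python
import pandas as pd

def move_to_front(df, label):
    """Move column `label` to position 0, modifying `df` in place (hypothetical helper)."""
    col = df.pop(label)       # removes the column from df and returns it as a Series
    df.insert(0, label, col)  # re-inserts the Series as the first column
    return df

demo_df = pd.DataFrame({1: [62.50, 89.81], 2: [83.43, 85.14],
                        3: [91.53, 82.21], 4: [77.48, 69.30]})
move_to_front(demo_df, 4)
print(list(demo_df.columns))
```

Like the steps above, the helper changes the DataFrame it receives rather than returning a copy.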
2. Move Pandas column to a specific position (in this case the second)
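All we need to do is set the loc parameter accordingly. Since column 4 is already back in the DataFrame after the previous step, we pop it again first; insert() raises a ValueError if the column label already exists.
# pop the column again, then insert it at position 1 (the second position)
rand_df.insert(loc=1, column=4, value=rand_df.pop(4))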
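If the original DataFrame should stay untouched, the same move can be expressed by reordering the list of column labels and selecting with it. This is a sketch of that alternative, not the approach used above; `move_column` is a hypothetical helper name, and it assumes unique column labels.

```python
import pandas as pd

def move_column(df, label, position):
    """Return a new DataFrame with `label` placed at `position` (hypothetical helper)."""
    order = [c for c in df.columns if c != label]  # remaining labels, original order kept
    order.insert(position, label)                  # list.insert, not DataFrame.insert
    return df[order]                               # selecting with a list returns a new DataFrame

sample_df = pd.DataFrame({1: [62.50], 2: [83.43], 3: [91.53], 4: [77.48]})
moved_df = move_column(sample_df, 4, 1)
print(list(moved_df.columns))
print(list(sample_df.columns))
```

Because nothing is popped, this variant never runs into the "already exists" error that `insert()` raises for a label that is still present.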
3. Reorder columns by index
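An alternative solution is to reindex the DataFrame with the desired column order. Unlike pop() and insert(), reindex() returns a new DataFrame and leaves the original unchanged:
new_rand_df = rand_df.reindex(columns=[3, 2, 1, 4])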
new_rand_df
The output looks as follows:
| | 3 | 2 | 1 | 4 |
|---|---|---|---|---|
| 0 | 91.53 | 83.43 | 62.50 | 77.48 |
| 1 | 82.21 | 85.14 | 89.81 | 69.30 |
| 2 | 75.42 | 82.55 | 78.11 | 84.35 |
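One thing to keep in mind with reindex() is how it treats labels that do not exist. The sketch below uses an illustrative label, 99, that is not present in the data: reindex() silently creates a column of NaN for it, whereas selecting with a list of labels raises a KeyError. The DataFrame name and values are made up for the example.

```python
import pandas as pd

small_df = pd.DataFrame({1: [62.50], 2: [83.43], 3: [91.53]})

# reindex() accepts a label that does not exist and fills that column with NaN
reindexed_df = small_df.reindex(columns=[3, 2, 1, 99])
print(reindexed_df[99].isna().all())

# list-based selection rejects the same missing label with a KeyError
try:
    small_df[[3, 2, 1, 99]]
    missing_label_rejected = False
except KeyError:
    missing_label_rejected = True
print(missing_label_rejected)
```

If a typo in a column label should fail loudly rather than produce an empty column, list-based selection may be the safer choice.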
4. Sort columns alphabetically
Sorting the DataFrame’s column index will do the trick 🙂 With the integer labels used here the columns are sorted numerically; string labels are sorted alphabetically.
new_rand_df = rand_df.reindex(columns=rand_df.columns.sort_values())
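As a possible shortcut, pandas also offers `sort_index(axis=1)`, which sorts the columns by label directly. The sketch below uses a small DataFrame with string column names so the alphabetical ordering is visible; the names and values are invented for illustration.

```python
import pandas as pd

named_df = pd.DataFrame({"price": [9.5], "city": ["Oslo"], "area": [42]})

# axis=1 sorts the column labels instead of the row index
sorted_df = named_df.sort_index(axis=1)
# ascending=False reverses the order
reversed_df = named_df.sort_index(axis=1, ascending=False)

print(list(sorted_df.columns))
print(list(reversed_df.columns))
```

Both calls return new DataFrames, so `named_df` keeps its original column order.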
You can learn more about DataFrame sorting in this tutorial.