How to insert multiple columns into a Pandas DataFrame?

In today’s data wrangling tutorial we will learn how to use Python and the Pandas library to create multiple columns in a DataFrame. Deriving several columns in a few short statements saves you from repeating the same manual steps for each one.

We’ll start by importing the required Python libraries and creating a random data set using the NumPy library.

Creating random data

import numpy as np
import pandas as pd
np.random.seed(100)

rand_df = pd.DataFrame(data = np.random.randint(70,100, size = (4,3)), columns = ['score_1', 'score_2', 'score_3'])

rand_df

Here’s our dataset. Because we fixed the seed with np.random.seed(100), rerunning the code should give you the same values; if you remove that line, randint() will generate different values on every run.

   score_1  score_2  score_3
0       78       94       73
1       77       93       85
2       86       80       90
3       72       91       72
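To see what the seed does, here is a minimal sketch that builds the same table twice and compares the results. The make_scores() helper is a hypothetical name introduced only for this illustration; it is not part of Pandas or NumPy.

```python
import numpy as np
import pandas as pd


def make_scores(seed=100):
    """Hypothetical helper: build the 4x3 score table from a fixed seed."""
    np.random.seed(seed)  # reset NumPy's legacy global generator
    values = np.random.randint(70, 100, size=(4, 3))  # integers in [70, 100)
    return pd.DataFrame(values, columns=['score_1', 'score_2', 'score_3'])


first = make_scores()
second = make_scores()

# Same seed, same data: the two frames are identical
print(first.equals(second))
```

Calling the helper with a different seed value would give you a different, but equally repeatable, table.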

Insert multiple columns

Adding multiple columns is quite simple. As an example, we’ll show how to calculate the mean and standard deviation and insert those as columns.

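# Compute both statistics from the three score columns only
score_cols = ['score_1', 'score_2', 'score_3']
rand_df['avg_score'] = rand_df[score_cols].mean(axis=1).round(2)
rand_df['std_deviation'] = rand_df[score_cols].std(axis=1).round(2)

rand_df

Note: we used the round() method to round the calculated values to two decimal places. We also restricted both calculations to the three score columns; otherwise the newly added avg_score column would be included when calculating the standard deviation.

Here’s our output:

   score_1  score_2  score_3  avg_score  std_deviation
0       78       94       73      81.67          10.97
1       77       93       85      85.00           8.00
2       86       80       90      85.33           5.03
3       72       91       72      78.33          10.97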
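If you prefer to add both columns in a single statement, the assign() method is one option. The following is a minimal sketch rather than the only way to do it; the names scores and with_stats are chosen just for this example.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
scores = pd.DataFrame(np.random.randint(70, 100, size=(4, 3)),
                      columns=['score_1', 'score_2', 'score_3'])

# Both statistics are computed from the original `scores` frame,
# so neither new column can influence the other.
with_stats = scores.assign(
    avg_score=scores.mean(axis=1).round(2),
    std_deviation=scores.std(axis=1).round(2),
)

print(with_stats)
```

assign() returns a new DataFrame and leaves scores untouched, which makes it convenient inside method chains.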

Inserting empty columns

In a similar fashion you can create empty columns, filled with NaN values, and append them to the DataFrame.

rand_df[['empty1', 'empty2']] = np.nan
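An alternative worth knowing is reindex(), which adds any column names that are missing and fills them with NaN. Here is a small sketch on a hand-made frame; demo_df and new_cols are illustrative names only.

```python
import pandas as pd

demo_df = pd.DataFrame({'score_1': [78, 77], 'score_2': [94, 93]})

# Keep the existing columns and append two new ones;
# reindex() fills columns that did not exist with NaN.
new_cols = ['empty1', 'empty2']
demo_df = demo_df.reindex(columns=[*demo_df.columns, *new_cols])

print(demo_df)
```

This can be handy when the list of new column names is built dynamically.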

Insert columns using the apply() function

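We can also use apply() with a lambda function to perform the same calculations row by row. Note that if you haven’t imported the NumPy library, you’ll receive a NameError, because the name np is not defined. We pass ddof=1 to np.std() so that it returns the sample standard deviation, which is what the Pandas std() method computes by default.

score_cols = ['score_1', 'score_2', 'score_3']
rand_df['avg_score'] = rand_df[score_cols].apply(lambda x: np.mean(x), axis=1)
rand_df['std_deviation'] = rand_df[score_cols].apply(lambda x: np.std(x, ddof=1), axis=1)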
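apply() can also produce several columns in one pass if the function returns a Series. The sketch below assumes a hypothetical row_stats() helper written for this example; the keys of the returned Series become the new column names.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
scores = pd.DataFrame(np.random.randint(70, 100, size=(4, 3)),
                      columns=['score_1', 'score_2', 'score_3'])


def row_stats(row):
    """Hypothetical helper: return two statistics for one row."""
    return pd.Series({'avg_score': row.mean(),
                      'std_deviation': row.std()})  # sample std (ddof=1)


# Each returned Series becomes one row of a two-column DataFrame
stats = scores.apply(row_stats, axis=1).round(2)

# Attach the new columns to the original scores by index
result = scores.join(stats)
print(result)
```

Keep in mind that apply() with axis=1 calls the function once per row, so on large frames the built-in mean() and std() methods are usually faster.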

Sum multiple columns in Pandas

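In the same fashion you can go ahead and sum the score columns. We select them explicitly so that the columns we added earlier are not included in the total:

rand_df['score_sum'] = rand_df[['score_1', 'score_2', 'score_3']].sum(axis=1)

# Or alternatively, using apply()

rand_df['score_sum'] = rand_df[['score_1', 'score_2', 'score_3']].apply(lambda x: np.sum(x), axis=1)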
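One thing to watch when summing row-wise is that sum(axis=1) adds up every numeric column it is given, including derived ones. The following sketch, using a small hand-made frame and illustrative variable names, compares a sum over selected columns with a sum over the whole frame.

```python
import pandas as pd

df = pd.DataFrame({'score_1': [78, 77],
                   'score_2': [94, 93],
                   'score_3': [73, 85]})
score_cols = ['score_1', 'score_2', 'score_3']

# A derived column that should NOT be part of the total
df['avg_score'] = df[score_cols].mean(axis=1)

# Sum only the raw scores
df['score_sum'] = df[score_cols].sum(axis=1)

# Summing everything except score_sum still includes avg_score
naive_sum = df.drop(columns='score_sum').sum(axis=1)

print(df['score_sum'].tolist())  # [245, 255]
print((naive_sum > df['score_sum']).all())  # True
```

Selecting the columns first keeps the total correct no matter how many helper columns the DataFrame has gained along the way.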