In today’s data wrangling tutorial we will learn how to use Python and the Pandas library to create multiple columns at once in a DataFrame. Doing so saves you from adding derived columns one at a time.
We’ll start by importing the required Python libraries and creating a random data set using the NumPy library.
Creating random data
import numpy as np
import pandas as pd
np.random.seed(100)
rand_df = pd.DataFrame(data = np.random.randint(70,100, size = (4,3)), columns = ['score_1', 'score_2', 'score_3'])
rand_df
Here’s our dataset. Because we fixed the seed with np.random.seed(100), rerunning the code should reproduce these values; remove the seed call and randint() will generate different data on each run.
| | score_1 | score_2 | score_3 |
|---|---|---|---|
| 0 | 78 | 94 | 73 |
| 1 | 77 | 93 | 85 |
| 2 | 86 | 80 | 90 |
| 3 | 72 | 91 | 72 |
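To illustrate why the seed matters, here is a minimal sketch. The helper name make_scores is hypothetical and not part of the original tutorial; it simply reseeds NumPy before drawing the same 4x3 grid, so two calls with the same seed should yield identical DataFrames.

```python
import numpy as np
import pandas as pd

def make_scores(seed):
    # Hypothetical helper: reseed NumPy's global generator, then draw a 4x3 grid
    np.random.seed(seed)
    return pd.DataFrame(
        data=np.random.randint(70, 100, size=(4, 3)),
        columns=['score_1', 'score_2', 'score_3'],
    )

first = make_scores(100)
second = make_scores(100)

# Same seed, same draws: the two frames compare equal
same_seed_matches = first.equals(second)
print(same_seed_matches)
```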
Insert multiple columns
Adding multiple columns is quite simple. As an example, we’ll show how to calculate the mean and standard deviation and insert those as columns.
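score_cols = ['score_1', 'score_2', 'score_3']
rand_df['avg_score'] = rand_df[score_cols].mean(axis=1).round(2)
rand_df['std_deviation'] = rand_df[score_cols].std(axis=1).round(2)
rand_df
Note: we select the score columns explicitly so that the newly added avg_score column is not included in the standard deviation, and we use the round() method to round the calculated values to two decimal places.
Here’s our output:
| | score_1 | score_2 | score_3 | avg_score | std_deviation |
|---|---|---|---|---|---|
| 0 | 78 | 94 | 73 | 81.67 | 10.97 |
| 1 | 77 | 93 | 85 | 85.00 | 8.00 |
| 2 | 86 | 80 | 90 | 85.33 | 5.03 |
| 3 | 72 | 91 | 72 | 78.33 | 10.97 |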
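If you prefer to add both columns in a single call, DataFrame.assign() is one option. The sketch below uses a small hand-written frame (illustrative values, not the random data above) so the results are easy to check by hand; the variable names scores and with_stats are assumptions made for this example.

```python
import pandas as pd

# Small hand-written frame (illustrative values only)
scores = pd.DataFrame({
    'score_1': [80, 70],
    'score_2': [90, 75],
    'score_3': [100, 80],
})
score_cols = ['score_1', 'score_2', 'score_3']

# assign() returns a new DataFrame with every keyword argument added as a column
with_stats = scores.assign(
    avg_score=scores[score_cols].mean(axis=1).round(2),
    std_deviation=scores[score_cols].std(axis=1).round(2),
)
print(with_stats)
```

Because assign() returns a copy, the original scores frame is left unchanged, which can be handy when chaining several transformations.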
Inserting empty columns
In a similar fashion, you can add several empty columns to the DataFrame by assigning NaN to a list of new column names:
rand_df[['empty1', 'empty2']] = np.nan
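Another way to get the same result, sketched here on a small illustrative frame, is reindex(): any column label that does not exist yet is created and filled with NaN. The names scores, new_cols and with_empty are assumptions made for this example.

```python
import pandas as pd

scores = pd.DataFrame({'score_1': [80, 70], 'score_2': [90, 75]})

# Hypothetical names for the empty columns
new_cols = ['empty1', 'empty2']

# reindex() keeps the existing columns and fills the new labels with NaN
with_empty = scores.reindex(columns=[*scores.columns, *new_cols])
print(with_empty)
```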
Insert columns using the apply() function
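We can also use apply() with a lambda function to perform the calculation row by row. Note that if you haven’t imported the NumPy library, you’ll receive a NameError. Also note that np.std() computes the population standard deviation (ddof=0) by default, so its results differ from those of DataFrame.std(), which uses ddof=1.
score_cols = ['score_1', 'score_2', 'score_3']
rand_df['avg_score'] = rand_df[score_cols].apply(lambda x: np.mean(x), axis=1)
rand_df['std_var'] = rand_df[score_cols].apply(lambda x: np.std(x), axis=1)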
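When the applied function returns a Series, apply() with axis=1 produces one column per label, so both statistics can be computed in a single pass. This is a minimal sketch on illustrative data; the helper row_stats is hypothetical, and ddof=1 is passed explicitly as an assumption so that the result matches the pandas default.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    'score_1': [80, 70],
    'score_2': [90, 75],
    'score_3': [100, 80],
})
score_cols = ['score_1', 'score_2', 'score_3']

def row_stats(row):
    # Hypothetical helper: returns both statistics for one row as a labelled Series
    values = row.to_numpy()
    return pd.Series({
        'avg_score': np.mean(values),
        'std_var': np.std(values, ddof=1),  # ddof=1: sample standard deviation
    })

# Each returned Series becomes one row of a two-column DataFrame
stats = scores[score_cols].apply(row_stats, axis=1)

# join() aligns on the index and appends both new columns at once
with_stats = scores.join(stats)
print(with_stats)
```

Keep in mind that row-wise apply() is usually slower than the vectorized mean() and std() methods, so it is mostly worth using when the per-row logic cannot be expressed with built-in methods.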
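Sum multiple columns in Pandas
In the same fashion you can sum the score columns. We select them explicitly so that previously added columns, such as avg_score, are not included in the total:
rand_df['score_sum'] = rand_df[['score_1', 'score_2', 'score_3']].sum(axis=1)
# Or alternatively, using apply()
rand_df['score_sum'] = rand_df[['score_1', 'score_2', 'score_3']].apply(lambda x: np.sum(x), axis=1)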
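If the score columns follow a naming pattern, you can select them with filter() instead of listing them by hand. The sketch below uses illustrative data and assumes the columns are named score_ followed by digits; the regular expression is anchored so that derived columns such as avg_score or score_sum are not matched.

```python
import pandas as pd

scores = pd.DataFrame({
    'score_1': [80, 70],
    'score_2': [90, 75],
    'score_3': [100, 80],
})
# A derived column that should not be counted in the total
scores['avg_score'] = scores.mean(axis=1)

# filter(regex=...) keeps only the columns whose names match score_<digits>
raw_scores = scores.filter(regex=r'^score_\d+$')
scores['score_sum'] = raw_scores.sum(axis=1)
print(scores)
```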