In today’s data wrangling tutorial, we will learn how to use Python and the Pandas library to create multiple columns at once in a DataFrame. Adding several columns in one go saves repetitive code and can speed up your workflow.

We’ll start by importing the required Python libraries and creating a random data set using the NumPy library.

### Creating random data

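```
import numpy as np
import pandas as pd
# Seed the generator so the random scores are repeatable between runs
np.random.seed(100)
# 4 rows x 3 columns of random integers from 70 to 99 (the upper bound, 100, is excluded)
rand_df = pd.DataFrame(data=np.random.randint(70, 100, size=(4, 3)), columns=['score_1', 'score_2', 'score_3'])
rand_df
```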

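Here’s our dataset. Note that the values are generated with the randint() function: the call to np.random.seed(100) makes them repeatable between runs, but you may still see different values on another NumPy version or platform, and you will if you change or remove the seed.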

| | score_1 | score_2 | score_3 |
|---|---|---|---|
| 0 | 78 | 94 | 73 |
| 1 | 77 | 93 | 85 |
| 2 | 86 | 80 | 90 |
| 3 | 72 | 91 | 72 |
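As a side note on reproducibility, here is a minimal sketch of the same setup using NumPy's newer `Generator` API (`np.random.default_rng`) instead of the global seed. This is an alternative and not what the rest of this tutorial uses. The name `make_scores` is a hypothetical helper, and the generated values will not match the table above because a different generator is involved.

```python
import numpy as np
import pandas as pd

def make_scores(seed):
    # Hypothetical helper: a local generator avoids touching NumPy's global random state
    rng = np.random.default_rng(seed)
    # integers() draws from 70 up to, but not including, 100 (like randint())
    data = rng.integers(70, 100, size=(4, 3))
    return pd.DataFrame(data, columns=['score_1', 'score_2', 'score_3'])

first = make_scores(100)
second = make_scores(100)
# Two generators created from the same seed produce the same DataFrame
print(first.equals(second))
```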

### Insert multiple columns

Adding multiple columns is quite simple. As an example, we’ll show how to calculate the mean and standard deviation and insert those as columns.

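```
score_cols = ['score_1', 'score_2', 'score_3']
# Compute both statistics from the score columns only, so avg_score is not fed into the std calculation
rand_df['avg_score'] = rand_df[score_cols].mean(axis=1).round(2)
# ddof=0 gives the population standard deviation (the pandas default, ddof=1, is the sample version)
rand_df['std_deviation'] = rand_df[score_cols].std(axis=1, ddof=0).round(2)
rand_df
```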

Note: we used the round() method to round the calculated values to two decimal places.

Here’s our output:

| | score_1 | score_2 | score_3 | avg_score | std_deviation |
|---|---|---|---|---|---|
| 0 | 78 | 94 | 73 | 81.67 | 8.96 |
| 1 | 77 | 93 | 85 | 85.00 | 6.53 |
| 2 | 86 | 80 | 90 | 85.33 | 4.11 |
| 3 | 72 | 91 | 72 | 78.33 | 8.96 |
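If you'd rather add both columns in a single step, one option is `DataFrame.assign`, which returns a new DataFrame with all the given columns added. The sketch below is a minimal example under a few assumptions: it rebuilds the data from the table above instead of using the random generator, the names `scores`, `score_cols` and `with_stats` are ours, and `ddof=0` (population standard deviation) is chosen because it reproduces the values shown in the table.

```python
import pandas as pd

# Values copied from the table above so the example does not depend on the random seed
scores = pd.DataFrame({
    'score_1': [78, 77, 86, 72],
    'score_2': [94, 93, 80, 91],
    'score_3': [73, 85, 90, 72],
})
score_cols = ['score_1', 'score_2', 'score_3']

# assign() adds both columns in one call and leaves `scores` itself unchanged
with_stats = scores.assign(
    avg_score=scores[score_cols].mean(axis=1).round(2),
    std_deviation=scores[score_cols].std(axis=1, ddof=0).round(2),
)
print(with_stats)
```

Because both statistics are computed from `scores[score_cols]`, the new `avg_score` column never leaks into the standard deviation.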

### Inserting empty columns

In a similar fashion, you can create empty columns (filled with NaN) and append them to the DataFrame.

`rand_df[['empty1', 'empty2']] = np.nan`
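Two other ways to get the same result are sketched below: `reindex`, which fills any column name it does not find with NaN, and `assign` with a dictionary of new names. The small DataFrame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'score_1': [78, 77], 'score_2': [94, 93]})
new_cols = ['empty1', 'empty2']

# Option 1: reindex() keeps the existing columns and fills the unknown ones with NaN
via_reindex = df.reindex(columns=[*df.columns, *new_cols])

# Option 2: assign() with a dict mapping each new column name to NaN
via_assign = df.assign(**{name: np.nan for name in new_cols})

print(via_reindex)
```

Both calls return a new DataFrame, so `df` itself keeps only its original columns.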

### Insert columns using the apply() function

We can also use apply() with a lambda function to perform the calculation row by row. Note that the lambdas below call NumPy functions, so if you haven’t imported NumPy as np, you’ll receive a *NameError*.

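```
score_cols = ['score_1', 'score_2', 'score_3']
# Apply row-wise to the score columns only, so the columns added earlier are excluded
rand_df['avg_score'] = rand_df[score_cols].apply(lambda x: np.mean(x), axis=1)
rand_df['std_var'] = rand_df[score_cols].apply(lambda x: np.std(x), axis=1)
```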
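`apply` can also produce several columns in one pass if the function returns a `pd.Series`: pandas then expands the result into one column per key. The following is a minimal sketch, assuming the scores from the table earlier in this post. `row_stats` is a hypothetical helper name, and it uses the Series methods `mean()` and `std(ddof=0)` in place of the NumPy functions.

```python
import pandas as pd

# Values copied from the first table in this post
scores = pd.DataFrame({
    'score_1': [78, 77, 86, 72],
    'score_2': [94, 93, 80, 91],
    'score_3': [73, 85, 90, 72],
})

def row_stats(row):
    # Hypothetical helper: returns both statistics for one row as a labelled Series
    return pd.Series({'avg_score': row.mean(), 'std_var': row.std(ddof=0)})

# axis=1 calls row_stats once per row; the returned Series become the new columns
stats = scores.apply(row_stats, axis=1)
result = scores.join(stats.round(2))
print(result)
```

Row-wise `apply` is convenient, but on large DataFrames it is usually slower than the vectorized `mean(axis=1)` and `std(axis=1)` calls.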

### Sum multiple columns in Pandas

In the same fashion you can go ahead and sum the columns:

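```
# Sum the score columns only, so derived columns such as avg_score are not added in
score_cols = ['score_1', 'score_2', 'score_3']
rand_df['score_sum'] = rand_df[score_cols].sum(axis=1)
# Or alternatively, using apply
rand_df['score_sum'] = rand_df[score_cols].apply(lambda x: np.sum(x), axis=1)
```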
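Once derived columns such as `avg_score` exist, a plain `sum(axis=1)` over the whole DataFrame would add them in as well. One way to guard against that, sketched below, is to select the score columns by name pattern with `DataFrame.filter` before summing. The regular expression and the sample values (taken from the tables earlier in this post) are assumptions for illustration.

```python
import pandas as pd

# Scores plus one derived column, copied from the tables earlier in this post
scores = pd.DataFrame({
    'score_1': [78, 77, 86, 72],
    'score_2': [94, 93, 80, 91],
    'score_3': [73, 85, 90, 72],
    'avg_score': [81.67, 85.00, 85.33, 78.33],
})

# Keep only columns named score_<number>; avg_score and score_sum do not match
score_only = scores.filter(regex=r'^score_\d+$')
scores['score_sum'] = score_only.sum(axis=1)
print(scores)
```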