How to add a new column to a DataFrame in Python Pandas?

In today’s Pandas data wrangling tutorial, we’ll learn how to create columns in a Python Pandas DataFrame.

We’ll cover three main use cases related to inserting columns into Pandas DataFrames:

  1. Creating a new column and populating it with a specific value.
  2. Adding a column based on a calculation over other column values.
  3. Creating a new column whose values depend on specific logic / conditions.

Preparing our DataFrame

We’ll quickly get started by importing our DataFrame data from a CSV file as shown below.

import pandas as pd

# read the csv contents into a DataFrame
hr = pd.read_csv('hr.csv')

Let’s look into our table columns:

hr.head(4)
   language  first_interview  second_interview
1  Scala     73.0             83.0
2  Java      84.0             86.0
3  SQL       81.0             77.0
4  R         81.0             85.0
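
If you don’t have hr.csv at hand, the same four rows can be built inline. This is only a sketch: the values are copied from the table above, and the 1-based index is an assumption made to match the row labels shown there.

```python
import pandas as pd

# Hypothetical stand-in for hr.csv: the four rows shown in the table above.
hr = pd.DataFrame(
    {
        "language": ["Scala", "Java", "SQL", "R"],
        "first_interview": [73.0, 84.0, 81.0, 81.0],
        "second_interview": [83.0, 86.0, 77.0, 85.0],
    },
    index=[1, 2, 3, 4],  # assumed, to match the printed row labels
)

print(hr.head(4))
```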

Creating a column with specific values

Let us quickly create a column, and pre-populate it with some value:

hr['venue'] = 'New York Office'

We can also create an empty column in the same fashion:

hr['venue_2'] = ''

Or fill the column with NaN (missing) values:

import numpy as np
hr['venue_3'] = np.nan
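
To see what these assignments do, here is a small self-contained sketch on a reduced DataFrame. It also shows assign(), which returns a new DataFrame instead of modifying the existing one. The remote column is a made-up name used only for illustration.

```python
import numpy as np
import pandas as pd

hr = pd.DataFrame({"language": ["Scala", "Java", "SQL", "R"]})

# A scalar on the right-hand side is broadcast to every row.
hr["venue"] = "New York Office"
hr["venue_3"] = np.nan

# assign() returns a new DataFrame and leaves hr untouched.
# "remote" is a hypothetical column name.
hr_with_flag = hr.assign(remote=False)

print(hr["venue"].nunique())       # number of distinct values in the column
print(hr["venue_3"].isna().all())  # whether every entry is missing
print("remote" in hr.columns)      # whether hr itself gained the column
```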

Note that a column created this way is appended as the last column. We can look up a column’s index quite easily, insert a new column at a chosen position, or even move a column to the first position in the DataFrame.
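
A minimal sketch of working with column positions, using a reduced DataFrame. The round column is a hypothetical name, and reordering by column labels is just one of several ways to move a column.

```python
import pandas as pd

hr = pd.DataFrame(
    {
        "language": ["Scala", "Java"],
        "first_interview": [73.0, 84.0],
        "venue": ["New York Office", "New York Office"],
    }
)

# Position of an existing column (0-based).
venue_pos = hr.columns.get_loc("venue")

# insert() adds a new column in place at the given position.
# "round" is a hypothetical column name.
hr.insert(1, "round", 1)

# Move "venue" to the first position by reordering the column labels.
cols = ["venue"] + [c for c in hr.columns if c != "venue"]
hr = hr[cols]

print(venue_pos)
print(list(hr.columns))
```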

Creating a column based on a calculation over other columns

Let’s assume that we just want to create a column that shows the average of the two interview scores in each row.

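# average only the numeric interview columns; recent Pandas versions raise an error if text columns are included
hr['avg'] = hr[['first_interview', 'second_interview']].mean(axis=1)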

Voilà:

   language  first_interview  second_interview  venue            avg
1  Scala     73.0             83.0              New York Office  78.0
2  Java      84.0             86.0              New York Office  85.0
3  SQL       81.0             77.0              New York Office  79.0
4  R         81.0             85.0              New York Office  83.0
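
The same calculation can be written as plain column arithmetic, which is handy when the formula is not a simple mean. The sketch below rebuilds the sample rows inline; avg_manual and delta are hypothetical column names used only for illustration.

```python
import pandas as pd

hr = pd.DataFrame(
    {
        "language": ["Scala", "Java", "SQL", "R"],
        "first_interview": [73.0, 84.0, 81.0, 81.0],
        "second_interview": [83.0, 86.0, 77.0, 85.0],
    }
)

score_cols = ["first_interview", "second_interview"]

# Row-wise mean over the score columns only.
hr["avg"] = hr[score_cols].mean(axis=1)

# The same result written as explicit column arithmetic.
hr["avg_manual"] = (hr["first_interview"] + hr["second_interview"]) / 2

# Another derived column: change between the two interviews.
hr["delta"] = hr["second_interview"] - hr["first_interview"]

print(hr[["language", "avg", "avg_manual", "delta"]])
```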

Adding a column derived from logic on other columns (using map() and lambda)

We now want to base each cell’s content on some logic applied to another column. We’ll create the new supply column by passing a custom lambda function to map(), which translates each value in the avg column into a label, as shown below:

hr['supply'] = hr['avg'].map(lambda x: 'high' if x > 80 else 'low')

Here’s our DataFrame:

   language  first_interview  second_interview  venue            avg   supply
1  Scala     73.0             83.0              New York Office  78.0  low
2  Java      84.0             86.0              New York Office  85.0  high
3  SQL       81.0             77.0              New York Office  79.0  low
4  R         81.0             85.0              New York Office  83.0  high
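
As an alternative to map() with a lambda, the same rule can be expressed with NumPy’s where(), which evaluates the condition on the whole column at once. This is a sketch rather than the only way to do it. The improved column is a hypothetical example of a condition that compares two columns.

```python
import numpy as np
import pandas as pd

hr = pd.DataFrame(
    {
        "language": ["Scala", "Java", "SQL", "R"],
        "first_interview": [73.0, 84.0, 81.0, 81.0],
        "second_interview": [83.0, 86.0, 77.0, 85.0],
    }
)
hr["avg"] = hr[["first_interview", "second_interview"]].mean(axis=1)

# Vectorized equivalent of the map() + lambda version.
hr["supply"] = np.where(hr["avg"] > 80, "high", "low")

# Hypothetical rule that compares two columns row by row.
hr["improved"] = hr["second_interview"] > hr["first_interview"]

print(hr[["language", "avg", "supply", "improved"]])
```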