How to add a series as a DataFrame column in Pandas?

In this tutorial we show how to quickly append a new column to a Pandas DataFrame. You might want to follow along with the step-by-step process.

Define example data

We’ll start by defining a few lists of made-up numbers, which we’ll use to build a DataFrame and a Series.

import pandas as pd

week = ['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30',
        '2021-05-31', '2021-06-30', '2021-07-31', '2021-08-31']
salary = [127., 125., 105., 126., 131., 113., 110., 106.]
candidates = [40., 38., 49., 74., 31., 46., 64., 52.]

Let’s quickly create our Pandas DataFrame using the pd.DataFrame constructor as shown below:

hr_df = pd.DataFrame({'week': week,
                      'salary': salary})

Adding a list or series as a new DataFrame column

We’ll show three methods for adding a list or a Series as a new column of the DataFrame.

Assign a Series to the DataFrame

We’ll start by creating a Series from our candidates list:

cand_s = pd.Series(candidates)

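Now we’ll append the Series as a column to the DataFrame using the DataFrame.assign method, which returns a new DataFrame and leaves hr_df unchanged. Assumption: our Series has the same length as the DataFrame and shares its index, because assign matches values by index label.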

new_hr = hr_df.assign(candidates = cand_s)
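
To illustrate why the index matters, here is a minimal sketch using a small hypothetical frame (the names df, s, aligned and positional are ours and not part of the tutorial data). It shows that assign matches a Series by index label, so labels missing from the Series end up as NaN, while passing the raw values assigns them by position.

```python
import pandas as pd

# Small illustrative frame (hypothetical, with the default index 0, 1, 2)
df = pd.DataFrame({'salary': [127., 125., 105.]})

# A Series whose index only partly overlaps the DataFrame's index
s = pd.Series([40., 38., 49.], index=[1, 2, 3])

# assign aligns the Series on index labels, not on position:
# label 0 has no match in s, and label 3 does not exist in df
aligned = df.assign(candidates=s)

# Passing the underlying values instead assigns them by position
positional = df.assign(candidates=s.to_numpy())

print(aligned)
print(positional)
```

If the Series and the DataFrame were built independently, it may be worth checking that their indexes match before relying on label alignment.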

Add a Python list to a DataFrame using Join

Here we first need to convert the list to a DataFrame, then join it to the source DataFrame. We name the new column explicitly; otherwise it would be labeled 0:

cand_df = pd.DataFrame(candidates, columns=['candidates'])
new_hr_2 = hr_df.join(cand_df)
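
As a possible variation, join also accepts a named Series, so the intermediate DataFrame is not strictly required. The sketch below uses a shortened, hypothetical copy of the data (hr, cand and joined are illustrative names) and assumes both objects share the default integer index.

```python
import pandas as pd

# Shortened copy of the tutorial data, for illustration only
hr = pd.DataFrame({'week': ['2021-01-31', '2021-02-28'],
                   'salary': [127., 125.]})

# join needs the Series to have a name; it becomes the column label
cand = pd.Series([40., 38.], name='candidates')

# join matches rows on the index and returns a new DataFrame
joined = hr.join(cand)

print(joined)
```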

Append the list directly to the DataFrame

hr_df['candidates'] = candidates
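
Note that direct assignment modifies the DataFrame in place and expects the list to have exactly one value per row. A small sketch with a hypothetical three-row frame (df and failed are illustrative names) shows what happens when the lengths differ:

```python
import pandas as pd

# Hypothetical three-row frame, for illustration only
df = pd.DataFrame({'salary': [127., 125., 105.]})

failed = False
try:
    # Only two values for three rows: pandas refuses the assignment
    df['candidates'] = [40., 38.]
except ValueError as err:
    failed = True
    print(f"Assignment failed: {err}")

# With a matching length the column is added in place
df['candidates'] = [40., 38., 49.]
print(df)
```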

Adding a column based on another column

We can easily derive a column’s values from the values of other columns. In our example we’ll define a column named weekly_salary.

hr_df['weekly_salary'] = hr_df['salary']/4

An alternative way is to use the DataFrame.assign method. It returns a new DataFrame, so we assign the result back to hr_df to keep it:

hr_df = hr_df.assign(weekly_salary=hr_df['salary'] / 4)
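
The sketch below, on a hypothetical three-row frame (df and with_weekly are illustrative names), shows that assign leaves the original DataFrame untouched. It also uses a lambda, which assign calls with the DataFrame itself; this can be handy when chaining several steps.

```python
import pandas as pd

# Hypothetical three-row frame, for illustration only
df = pd.DataFrame({'salary': [127., 125., 105.]})

# The lambda receives the DataFrame that assign is called on
with_weekly = df.assign(weekly_salary=lambda d: d['salary'] / 4)

print(with_weekly)

# The original frame has not gained the new column
print(df.columns.tolist())
```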

Append a Series as a row

For completeness, this section covers the case of adding a Series as a new row of an existing DataFrame.

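# first we define a Series from a list; at this point hr_df has four columns
# (week, salary, candidates, weekly_salary), so we supply one value for each
new_s = pd.Series(['2021-09-30', 137., 48., 137. / 4], index=hr_df.columns)

# using the loc indexer, append the Series to the end of the DataFrame
hr_df.loc[len(hr_df)] = new_s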
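
Another option, sketched here on a shortened hypothetical copy of the data (df, new_row and extended are illustrative names), is to wrap the new row in a one-row DataFrame and use pd.concat. Unlike the loc approach, this returns a new DataFrame instead of modifying the existing one.

```python
import pandas as pd

# Shortened copy of the tutorial data, for illustration only
df = pd.DataFrame({'week': ['2021-01-31', '2021-02-28'],
                   'salary': [127., 125.],
                   'candidates': [40., 38.]})

# Build a one-row DataFrame from a dict keyed by column name
new_row = pd.DataFrame([{'week': '2021-03-31', 'salary': 105., 'candidates': 49.}])

# ignore_index=True renumbers the rows 0..n-1 in the result
extended = pd.concat([df, new_row], ignore_index=True)

print(extended)
```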
