How to add a series as a DataFrame column in Pandas?

In this tutorial we'll show how to quickly add a new column to a Pandas DataFrame. You can follow along step by step.

Example data to create DataFrame

We'll first define the data for our DataFrame using a few lists of sample values.

import pandas as pd

# define lists containing data
week = ['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30',
        '2021-05-31', '2021-06-30', '2021-07-31', '2021-08-31']
salary = [127., 125., 105., 126., 131., 113., 110., 106.]
candidates = [40., 38., 49., 74., 31., 46., 64., 52.]

Let’s quickly create our Pandas DataFrame using the pd.DataFrame constructor as shown below:

hr_df = pd.DataFrame({'week': week,
                      'salary': salary})

Adding a list or series as a new DataFrame column

We’ll show three methods for adding a Series as a new column to the DataFrame.

Method #1 : Assign a Series to the DataFrame

We’ll start by creating a Series out of our candidates list. We do that using the pd.Series constructor.

cand_s = pd.Series(candidates)

Now we'll add the newly created Series as a column of the DataFrame using the DataFrame.assign() method, which returns a new DataFrame.

new_hr = hr_df.assign(candidates = cand_s)

Note: the Series is aligned with the DataFrame on its index, so if the DataFrame is longer than the Series, the assignment will still work. Rows without a matching value are populated with NaN.
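To make the alignment behavior concrete, here is a minimal sketch. The three-row frame and the names padded and by_label are made up for this illustration and are not part of the tutorial's data.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
df = pd.DataFrame({'salary': [127., 125., 105.]})

# A Series shorter than the frame: it only has labels 0 and 1
short_s = pd.Series([40., 38.])
padded = df.assign(candidates=short_s)

# The same values under labels 1 and 2: they land on rows 1 and 2
shifted_s = pd.Series([40., 38.], index=[1, 2])
by_label = df.assign(candidates=shifted_s)

print(padded)    # row 2 has NaN in 'candidates'
print(by_label)  # row 0 has NaN in 'candidates'
```

The values are matched by index label, not by position, which is why the second Series leaves the first row empty.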

In that case you can use the Series fillna() method to fill those missing values, for example with the mean of the column:

# replace missing values with the column mean
new_hr['candidates'] = new_hr['candidates'].fillna(new_hr['candidates'].mean())

Method #2 : Add a Python list to a DataFrame using join

Here we first need to convert the list to a DataFrame, then join its content to the source DataFrame:

# name the column explicitly; otherwise it gets the default label 0
cand_df = pd.DataFrame({'candidates': candidates})
new_hr_2 = hr_df.join(cand_df)
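As a small sketch of how join behaves, the example below uses two hypothetical frames (left and right are illustrative names). By default, DataFrame.join performs a left join on the index, and it also accepts a Series as long as the Series has a name.

```python
import pandas as pd

# Hypothetical frames, used only for this illustration
left = pd.DataFrame({'salary': [127., 125., 105.]})
right = pd.DataFrame({'candidates': [40., 38., 49.]})

# Left join on the index (the default for DataFrame.join)
joined = left.join(right)

# A named Series can be joined directly; its name becomes the column label
named_s = pd.Series([40., 38., 49.], name='candidates')
joined_from_series = left.join(named_s)

print(joined)
```

Joining a named Series skips the intermediate DataFrame, which may be convenient when only one column is being added.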

Method #3 : Append the list directly to the DataFrame

hr_df['candidates'] = candidates
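One caveat with direct assignment is that a plain list carries no index, so its length has to match the number of rows. The sketch below illustrates this with a hypothetical three-row frame; the variable raised is only there to record what happened.

```python
import pandas as pd

# Hypothetical three-row frame, used only for this illustration
df = pd.DataFrame({'salary': [127., 125., 105.]})

# A list with the wrong number of items is rejected
raised = False
try:
    df['candidates'] = [40., 38.]
except ValueError as err:
    raised = True
    print(f"Assignment failed: {err}")

# A list with one value per row is accepted
df['candidates'] = [40., 38., 49.]
print(df)
```

If the data may be shorter than the DataFrame, wrapping it in a Series first lets Pandas fill the gaps with NaN instead of raising an error.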

Adding a column based on another column

We can easily derive column values based on other column values. In our example we'll define a column named weekly_salary.

hr_df['weekly_salary'] = hr_df['salary']/4

An alternative way is to use the DataFrame.assign() method. It returns a new DataFrame, so we assign the result back to hr_df:

hr_df = hr_df.assign(weekly_salary = hr_df['salary']/4)
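The difference between the two approaches can be sketched as follows, using a hypothetical two-row frame. The lambda variant is an addition not shown in the tutorial: assign also accepts a callable, which receives the DataFrame being assigned to.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({'salary': [127., 125.]})

# assign returns a new DataFrame and leaves df untouched
derived = df.assign(weekly_salary=df['salary'] / 4)

# Callable form: the lambda receives the DataFrame itself
derived_callable = df.assign(weekly_salary=lambda d: d['salary'] / 4)

print(df.columns.tolist())       # the original has no 'weekly_salary'
print(derived.columns.tolist())  # the copy does
```

The callable form is handy in method chains, where the intermediate DataFrame has no variable name to refer to.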

Append a Series as a DataFrame Row

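Another use case to cover is adding a Series as a new row of an existing DataFrame.

# first we define a Series whose labels are DataFrame column names
new_s = pd.Series({'week': '2021-09-30', 'salary': 137, 'candidates': 48})

# using the loc indexer, append the Series to the end of the df
# (len(hr_df) is a new label as long as the index is the default 0..n-1);
# columns missing from the Series, such as weekly_salary, are set to NaN
hr_df.loc[len(hr_df)] = new_s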
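As an alternative sketch, a row can also be appended with pd.concat, which does not modify the original DataFrame. The two-row frame and the names new_row and extended are made up for this illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({'week': ['2021-01-31', '2021-02-28'],
                   'salary': [127., 125.],
                   'candidates': [40., 38.]})

# Build a one-row DataFrame from a dict keyed by column name
new_row = pd.DataFrame([{'week': '2021-09-30', 'salary': 137., 'candidates': 48.}])

# Stack the frames; ignore_index renumbers the rows 0..n-1
extended = pd.concat([df, new_row], ignore_index=True)
print(extended)
```

This approach returns a new DataFrame, so it is a reasonable choice when the source DataFrame should stay as it is.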

Related: insert row at specific index in Pandas

In the previous section, we added a Series as the last row of our DataFrame. That said, what if we would like to insert it at an arbitrary position?

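Let's assume that we would like to add a new row between the 2nd and 3rd rows of our DataFrame:

# insert a row under a label that sorts between the 2nd and 3rd rows (labels 1 and 2);
# a Series keyed by column name is used so that any other columns are set to NaN
hr_df.loc[1.5] = pd.Series({'week': '2021-09-30', 'salary': 137, 'candidates': 48})

# sort the index, then reset it to consecutive integers, discarding the old labels
hr_df = hr_df.sort_index().reset_index(drop=True)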
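The fractional-label trick relies on a default integer index. A position-based alternative can be sketched with iloc slices and pd.concat. The helper insert_row below is hypothetical (it is not a Pandas function), and the three-row frame is made up for this illustration.

```python
import pandas as pd

def insert_row(df, position, row):
    """Hypothetical helper: return a new DataFrame with `row` (a dict keyed
    by column name) inserted before the integer position `position`."""
    new_row = pd.DataFrame([row], columns=df.columns)
    # Stack the rows before the position, the new row, and the remaining rows
    return pd.concat([df.iloc[:position], new_row, df.iloc[position:]],
                     ignore_index=True)

# Hypothetical three-row frame, used only for this illustration
df = pd.DataFrame({'week': ['2021-01-31', '2021-02-28', '2021-03-31'],
                   'salary': [127., 125., 105.]})

# Insert between the 2nd and 3rd rows, i.e. before position 2
result = insert_row(df, 2, {'week': '2021-09-30', 'salary': 137.})
print(result)
```

Because it works on positions rather than labels, this sketch does not depend on what the index looks like, at the cost of building a new DataFrame.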
