How to merge Pandas columns into one?

In this quick tutorial we will go ahead and learn how to combine multiple Pandas columns. We will cover the following use cases, which you will typically encounter in your data preparation process.

  • Concatenate multiple columns into one
  • Merge several columns into a list
  • Merge columns if one of them has NAN values.

Example Data

As we typically do, we will start by importing the Pandas library and creating data:

import pandas as pd
name =  ['Dave', 'John', 'Matt','Horace']
family =  ['Jordan', 'Jones', 'Temp', 'Joyce']
count = [12,15,14,18]

perf_dict = dict(name=name, family=family, count=count)
sales = pd.DataFrame (perf_dict)

print (sales.head())
namefamilycount
0DaveJordan12
1JohnJones15
2MattTemp14
3HoraceJoyce18

Merge multiple Pandas columns into one

We can easily combine two text / string columns into one int two ways:

# Method 1
sales['full_name'] = sales['family']+' , '+ sales['name']

Alternatively, we can use the str.cat function:

sales['full_name'] = sales['family'].str.cat(sales['name'], sep = ' , ')

Note that we have defined commas (,) as the separator between the two objects.

Merge two dataframe columns into a list

In this case, we would like to concatenate the column values, and export them to a list. In this case, we’ll first create the new column, and then go ahead and call the to_list() Series method.


sales['full_name'] = sales['family']+' , '+ sales['name']
sales['full_name'].to_list()

The result will be:

['Jordan , Dave', 'Jones , John', 'Temp , Matt', 'Joyce , Horace']

Combine two column if one is null

When trying to concatenate two values which one of them in null, the result will be a null / NAN / NA value.

Therefore before merging two columns we should first handle the null values using the fillna() DataFrame or Series method.

# Handling NAN values 
sales.fillna(' ', inplace=True)

#combining the two columns
sales['full_name'] = sales['family']+' , '+ sales['name']

Rename the new column in Pandas

Once we create the new column, we can easily rename it:

sales.rename(columns= {'full_name': 'last_first'})