How to append multiple Pandas DataFrames?

In today’s tutorial we’ll learn how to concatenate two or more DataFrames having the same columns in Pandas. This will come very handy in cases that you receive several comma separated value (CSV) files or Excel spreadsheets that you need to merge together before starting your Data Analysis process.

Creating example data

Let’s start by creating several DataFrames that we would like to append to each other.

import pandas as pd

europe_cand = {'area': ['Python', 'R', 'Python'],
 'city': ['London', 'Paris', 'Madrid'],
 'candidates': [24, 65, 82]}

usa_cand = {'area': ['R', 'Python', 'R'],
 'city': ['NYC', 'LA', 'Seattle'],
 'candidates': [72, 110, 123]}

asia_cand = {'area': ['R', 'Python', 'R'],
 'city': ['Delhi', 'Tokyo', 'Seoul'],
 'candidates': [72, 110, 123],
            'salary': [100, 150, 125]}

# create DataFrame from dictionaries

europe_df = pd.DataFrame (europe_cand)
usa_df = pd.DataFrame (usa_cand)
asia_df = pd.DataFrame (asia_cand)

cand_df_lst = [europe_df, usa_df, asia_df]

Concatenate multiple DataFrames

In the first example we would like to use the pd.concat() function to append two or more DataFrames. Let’s see how we make it with Python:

cand = pd.concat([europe_df, usa_df], ignore_index=True)


Note: Watch out for TypeError messages if passing a list of lists or dictionary object to pd.concat. You can pass a list of DataFrames a shown above.

TypeError: cannot concatenate object of type '<class 'list'>'; only Series and DataFrame objs are valid

Here’s the resulting DataFrame values:


Note: The pd.concat function receives an iterable, in our case a list of two DataFrames and simply brings them together. Using this method is specially useful if both DataFrames have the same columns.

We could have reached a similar result if using the append DataFrame method:

cand = europe_df.append(usa_df, ignore_index=True)

Append DataFrames using a for loop

A more interesting example is when we would like to concatenate DataFrame that have different columns. In this example we’ll loop into a list of DataFrames and append their content to an empty DataFrame. Let’s take a look:

cand =pd.DataFrame()

for df in cand_df_lst:
    cand = cand.append(df, ignore_index=True).fillna('')

Note that we replaced NAN values using the fillna() DataFrame method.

We could have accomplish the same with pd.concat:

cand1 = pd.concat(cand_df_lst, ignore_index=True).fillna('')

Append column to DataFrame

We have a thorough tutorial on adding columns to DataFrames, which you can find here.