As part of your data wrangling process, you might need to add new rows and columns to your DataFrame. The source of that data can be Python list objects. In today’s tutorial we’ll write some basic Python code that appends one or more lists to new or existing pandas DataFrames.
Create example DataFrame
Let’s get started by defining a DataFrame that you can use to follow along with this tutorial.
import pandas as pd
period = ['January', 'January', 'March', 'May', 'March']
language = ['R', 'R', 'Java', 'Python', 'Python']
salary = [242.0, 172.0, 242.0, 123.0, 229.0]
salaries_dict = dict(period = period, language = language, salary = salary)
salary_df = pd.DataFrame(data=salaries_dict)
salary_df.head()
Let’s take a look at our data:
| | period | language | salary |
|---|---|---|---|
| 0 | January | R | 242.0 |
| 1 | January | R | 172.0 |
| 2 | March | Java | 242.0 |
| 3 | May | Python | 123.0 |
| 4 | March | Python | 229.0 |
Python list as a new DataFrame column
Our first example will be to add a Python list as a new column of our DataFrame.
Let’s start by defining a list with one value for each row of the DataFrame:
interviews_lst = [14, 18, 20, 13, 16]
Next we’ll assign the list as a new column. Note that we need to give the new column a name, and that the list must contain exactly one item per DataFrame row.
salary_df = salary_df.assign(interviews=interviews_lst)
We can accomplish the same outcome by using bracket notation, which modifies the DataFrame in place:
salary_df['interviews'] = interviews_lst
And here’s our DataFrame:
| | period | language | salary | interviews |
|---|---|---|---|---|
| 0 | January | R | 242.0 | 14 |
| 1 | January | R | 172.0 | 18 |
| 2 | March | Java | 242.0 | 20 |
| 3 | May | Python | 123.0 | 13 |
| 4 | March | Python | 229.0 | 16 |
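To make the difference between the two approaches concrete, here is a minimal sketch that rebuilds the example DataFrame. The names with_interviews, too_short and length_error are only illustrative. It shows that assign returns a new DataFrame and leaves the original untouched, and that a list whose length does not match the number of rows is rejected.

```python
import pandas as pd

salary_df = pd.DataFrame({
    "period": ["January", "January", "March", "May", "March"],
    "language": ["R", "R", "Java", "Python", "Python"],
    "salary": [242.0, 172.0, 242.0, 123.0, 229.0],
})
interviews_lst = [14, 18, 20, 13, 16]

# assign() returns a new DataFrame; salary_df itself is not modified
with_interviews = salary_df.assign(interviews=interviews_lst)

# A list with fewer items than there are rows cannot become a column
too_short = [14, 18, 20]  # illustrative values only
try:
    salary_df.assign(interviews=too_short)
    length_error = None
except ValueError as exc:
    length_error = exc

print(with_interviews.columns.tolist())
print(type(length_error).__name__)
```

If you prefer to keep the original DataFrame unchanged while experimenting, assign is the safer choice; bracket notation is more compact when you do want to modify the DataFrame in place.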
Change the new column name
If needed, we can easily rename the column we just added to the DataFrame:
salary_df.rename(columns={'interviews':'num_interviews'}, inplace=True)
Python list to DataFrame row
We can also add lists as new DataFrame rows.
Let’s start by defining a list representing the new observation we want to add:
new_row = ['April', 'Python', 235, 13]
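If we want to add the new observation to the end of the DataFrame, we can use the loc indexer. This relies on the default integer index (0, 1, 2, ...), for which len(salary_df) is the next unused label: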
salary_df.loc[len(salary_df)] = new_row
The following row was added. Note that we could also add multiple lists or dictionaries as rows.
| | period | language | salary | num_interviews |
|---|---|---|---|---|
| 5 | April | Python | 235.0 | 13 |
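For several new rows at once, one possible approach is to wrap the lists in a temporary DataFrame and concatenate it with the original one. The sketch below uses a shortened copy of the example data, and the values in new_rows are made up for illustration.

```python
import pandas as pd

salary_df = pd.DataFrame({
    "period": ["January", "March"],
    "language": ["R", "Java"],
    "salary": [242.0, 242.0],
    "num_interviews": [14, 20],
})

# Each inner list is one observation, in the same order as the columns
new_rows = [
    ["April", "Python", 235.0, 13],
    ["June", "R", 198.0, 11],  # made-up values for illustration
]

# Build a DataFrame from the lists, reusing the existing column names
new_rows_df = pd.DataFrame(new_rows, columns=salary_df.columns)

# ignore_index=True renumbers the rows 0..n-1 in the combined result
salary_df = pd.concat([salary_df, new_rows_df], ignore_index=True)
print(salary_df)
```

Concatenating once is generally preferable to adding rows one at a time in a loop, because every row-by-row enlargement copies the data.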
Note that the iloc indexer cannot be used here. It only addresses positions that already exist, so the same assignment raises the following error:
IndexError: iloc cannot enlarge its target object
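The contrast between the two indexers can be reproduced with a small sketch. It uses a one-row DataFrame named df and a variable iloc_error, both of which are illustrative names rather than part of the tutorial data.

```python
import pandas as pd

df = pd.DataFrame({"period": ["January"], "language": ["R"], "salary": [242.0]})
new_row = ["April", "Python", 235.0]

# iloc is purely positional: position len(df) does not exist yet
try:
    df.iloc[len(df)] = new_row
    iloc_error = None
except IndexError as exc:
    iloc_error = exc

# loc is label based: assigning to an unused label adds a new row
df.loc[len(df)] = new_row

print(iloc_error)
print(len(df))
```

Keep in mind that loc overwrites an existing row if the label is already in use, which can happen when the index is not the default 0, 1, 2, ... sequence (for example after rows were dropped without resetting the index).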
Add a list as the first DataFrame row
If you want to add your list as the first DataFrame row, you can use the following code. For simplicity, we are adding just a single row to the DataFrame, but we could also pass a list of lists or a list of dictionaries to the DataFrame constructor (pd.DataFrame).
# create a new single-row DataFrame from the list
salary_appnd = pd.DataFrame([new_row], columns=salary_df.columns)
# DataFrame.append was removed in pandas 2.0, so use pd.concat
# to place the rows of the original DataFrame after the new row
salary_appnd = pd.concat([salary_appnd, salary_df], ignore_index=True)
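The same idea extends to several rows supplied as dictionaries. The following sketch assumes a shortened copy of the example data and uses made-up values in first_rows; it relies on pd.concat, which works in current pandas versions.

```python
import pandas as pd

salary_df = pd.DataFrame({
    "period": ["January", "March"],
    "language": ["R", "Java"],
    "salary": [242.0, 242.0],
})

# Each dictionary maps column names to the values of one new row
first_rows = [
    {"period": "April", "language": "Python", "salary": 235.0},
    {"period": "June", "language": "R", "salary": 198.0},  # made-up values
]

# Passing columns= keeps the column order aligned with salary_df
first_rows_df = pd.DataFrame(first_rows, columns=salary_df.columns)

# Putting the new rows first in the list places them at the top
combined = pd.concat([first_rows_df, salary_df], ignore_index=True)
print(combined)
```

Because dictionaries carry their own column names, the order of the keys does not matter, which makes this variant a little more forgiving than plain lists.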