In this tutorial we will learn how to append the contents of a Python list to a pandas DataFrame.
We will start by creating a simple DataFrame and a Python list. You can use both to follow along with this example.
import pandas as pd
stamps = pd.date_range(start='4/1/2023', periods=6, freq='B')
interviews = [261, 183, 232, 271, 267, 275]
# initialize the DataFrame
campaign = pd.DataFrame(dict(interview_date=stamps, num_interviews=interviews))
# create a Python list
domain = ['Python', 'R', 'Javascript'] * 2
# preview the DataFrame
campaign.head()
Assign a Python list to a DataFrame
In my experience, the easiest way to append a list as a column of a pandas DataFrame is to use the DataFrame assign() method.
The syntax is as follows:
campaign = campaign.assign(domain = domain)
This effectively adds a new column to the DataFrame and assigns the list values to it. If we take a look at the DataFrame head we’ll see the following:
campaign.head()
| | interview_date | num_interviews | domain |
|---|---|---|---|
| 0 | 2023-04-03 | 261 | Python |
| 1 | 2023-04-04 | 183 | R |
| 2 | 2023-04-05 | 232 | Javascript |
| 3 | 2023-04-06 | 271 | Python |
| 4 | 2023-04-07 | 267 | R |
| 5 | 2023-04-10 | 275 | Javascript |
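Two properties of assign are worth checking before relying on it: it returns a new DataFrame rather than modifying the original, and the list must have one entry per row. The following is a minimal sketch under simplified assumptions (a smaller DataFrame with only the num_interviews column; the variable names with_domain and length_error are illustrative):

```python
import pandas as pd

campaign = pd.DataFrame({"num_interviews": [261, 183, 232, 271, 267, 275]})
domain = ["Python", "R", "Javascript"] * 2

# assign returns a new DataFrame; the original object is left untouched
with_domain = campaign.assign(domain=domain)
print("domain" in campaign.columns)     # False
print("domain" in with_domain.columns)  # True

# a list whose length differs from the number of rows is rejected
length_error = None
try:
    campaign.assign(domain=["Python", "R"])
except ValueError as err:
    length_error = err
    print(f"ValueError: {err}")
```

This is why the examples in this tutorial write the result back with campaign = campaign.assign(...).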
A common error when using assign is passing the list as a positional argument, for example campaign.assign(domain). The method accepts only keyword arguments of the form column_name=values, so such a call raises the following TypeError exception:
# TypeError: assign() takes 1 positional argument but 2 were given
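To reproduce the exception and see the fix side by side, here is a small sketch. It assumes a reduced version of the campaign DataFrame; the names message and fixed are only illustrative:

```python
import pandas as pd

campaign = pd.DataFrame({"num_interviews": [261, 183, 232]})
domain = ["Python", "R", "Javascript"]

message = None
try:
    # the list is passed positionally, which assign does not accept
    campaign.assign(domain)
except TypeError as err:
    message = str(err)
    print(f"TypeError: {message}")

# passing the list as a keyword argument names the new column and works
fixed = campaign.assign(domain=domain)
print(fixed.columns.tolist())  # ['num_interviews', 'domain']
```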
Replace column with list values
You can also use the assign method to replace the values of an existing column with another list.
Let’s define a new list:
lang = ['Python', 'R'] * 3
We can assign the lang list to the already populated domain column:
campaign = campaign.assign(domain=lang)
Here’s the result – Note the new values in the domain column:
| | interview_date | num_interviews | domain |
|---|---|---|---|
| 0 | 2023-04-03 | 261 | Python |
| 1 | 2023-04-04 | 183 | R |
| 2 | 2023-04-05 | 232 | Python |
| 3 | 2023-04-06 | 271 | R |
| 4 | 2023-04-07 | 267 | Python |
| 5 | 2023-04-10 | 275 | R |
Replace column with random values from list
Another case is replacing the values of a new or existing column / Series with random picks from a list. We will use Python's random module to generate a random list of languages, which we will then assign to our column.
import random
# draw len(campaign) entries without replacement; the list needs at least as many entries as the df has rows
random_lang = random.sample(domain, len(campaign))
# assign returns a new DataFrame; reassign the result to campaign to keep the change
campaign.assign(domain=random_lang)
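Note that random.sample draws without replacement, so it cannot return more entries than the list holds; with a six-item list and six rows it effectively shuffles the list. If the list is shorter than the DataFrame, one possible alternative is random.choices, which draws with replacement. A minimal sketch, assuming a hypothetical three-item languages list:

```python
import random

import pandas as pd

campaign = pd.DataFrame({"num_interviews": [261, 183, 232, 271, 267, 275]})
languages = ["Python", "R", "Javascript"]  # fewer entries than rows

# random.choices samples with replacement, so k may exceed len(languages)
random_lang = random.choices(languages, k=len(campaign))

# write the result back so the new column is kept
campaign = campaign.assign(domain=random_lang)
print(campaign)
```

The output differs on every run unless you fix the seed first with random.seed.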
Export column values to a list of strings
You can export all the values of a column to a Python list with the tolist() Series method:
campaign['domain'].tolist()
This will return a Python list containing the Series values:
['Python', 'R', 'Python', 'R', 'Python', 'R']
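As a quick check that the export yields a plain Python list rather than a pandas object, consider this small sketch (a shortened, assumed version of the DataFrame; domain_values is an illustrative name):

```python
import pandas as pd

campaign = pd.DataFrame({
    "num_interviews": [261, 183, 232],
    "domain": ["Python", "R", "Python"],
})

# tolist converts the Series values into a regular Python list
domain_values = campaign["domain"].tolist()
print(type(domain_values).__name__)  # list
print(domain_values)                 # ['Python', 'R', 'Python']
```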
Fill DataFrame columns with a value
You can also fill every cell of a DataFrame column with a single value:
campaign = campaign.assign(domain = 'Python')
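A short sketch of what happens here, assuming a reduced campaign DataFrame: the single value is repeated for every row, so no list of matching length is needed.

```python
import pandas as pd

campaign = pd.DataFrame({"num_interviews": [261, 183, 232, 271, 267, 275]})

# a scalar is broadcast to all rows of the new column
campaign = campaign.assign(domain="Python")

print(len(campaign["domain"]))       # 6
print(campaign["domain"].nunique())  # 1
```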
Assign list content to a cell
We can use the at accessor to assign the contents of a list to a specific cell in our DataFrame. Note that we are converting the list to a string before inserting it into the cell:
campaign.at[2,'domain'] = str(['Python', 'R'])
Note: Here the list is converted to a Python string before it is stored, because assigning a list object directly to a single cell can fail depending on the column dtype and the accessor used. That said, storing lists in cells is not a recommended modeling practice, and it can lead to performance issues when querying and analyzing the data. When you need to include lists in a tabular format, consider working with dictionaries.
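If you later need the list back, the stored string can be parsed again. The sketch below is one possible approach, not the only one; it assumes a reduced DataFrame and uses ast.literal_eval from the standard library, which parses Python literals only and does not execute arbitrary code:

```python
import ast

import pandas as pd

campaign = pd.DataFrame({"domain": ["Python", "R", "Javascript"]})

# store the list as its string representation in a single cell
campaign.at[2, "domain"] = str(["Python", "R"])

cell = campaign.at[2, "domain"]
print(type(cell).__name__)  # str

# parse the string back into a Python list
restored = ast.literal_eval(cell)
print(restored)  # ['Python', 'R']
```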
Follow up learning:
How to verify if a DataFrame cell contains a specific value?