In this tutorial we will learn how to append the contents of a Python list to a pandas DataFrame.
We will start by creating a simple DataFrame and a Python list. You can use both to follow along with this example.
import pandas as pd
# Six consecutive business days; 4/1/2023 is a Saturday, so the range starts on 2023-04-03
stamps = pd.date_range(start='4/1/2023', periods=6, freq='B')
interviews = [261, 183, 232, 271, 267, 275]
# Initialize the DataFrame
campaign = pd.DataFrame(dict(interview_date=stamps, num_interviews=interviews))
# Create a Python list with one entry per DataFrame row
domain = ['Python', 'R', 'Javascript'] * 2
# Display the first rows of the DataFrame
campaign.head()
Assign a Python list to a DataFrame
Probably the easiest way to append a list as a column of a pandas DataFrame is to use the DataFrame assign() method. The column name is passed as a keyword and the list as its value.
The syntax is as follows:
campaign = campaign.assign(domain = domain)
This effectively adds a new column to the DataFrame and assigns the list values to it. If we take a look at the DataFrame head we’ll see the following:
campaign.head()
| | interview_date | num_interviews | domain |
---|---|---|---|
0 | 2023-04-03 | 261 | Python |
1 | 2023-04-04 | 183 | R |
2 | 2023-04-05 | 232 | Javascript |
3 | 2023-04-06 | 271 | Python |
4 | 2023-04-07 | 267 | R |
5 | 2023-04-10 | 275 | Javascript |
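The list passed to assign needs one value per DataFrame row; that is why domain was built with six entries. The following is a minimal sketch of what happens otherwise, using a small hypothetical frame (the names frame, too_short and matched are illustrative and not part of the tutorial's data):

```python
import pandas as pd

# Hypothetical three-row frame standing in for the campaign DataFrame
frame = pd.DataFrame({"num_interviews": [261, 183, 232]})
too_short = ["Python", "R"]  # only two values for three rows

length_error = None
try:
    frame.assign(domain=too_short)
except ValueError as err:
    # pandas rejects a list whose length differs from the number of rows
    length_error = err

# A list with exactly one value per row is accepted
matched = frame.assign(domain=["Python", "R", "Javascript"])
```

If the list length cannot be guaranteed, checking len(your_list) == len(df) before calling assign is a simple safeguard.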
A common error when using assign is passing the list as a positional argument instead of as a keyword argument (column_name=list). This raises the following TypeError exception:
# TypeError: assign() takes 1 positional argument but 2 were given
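As an illustration, the sketch below reproduces the error on a small hypothetical frame and then shows the keyword form that works. The variable names are illustrative, and the exact wording of the error message may differ slightly between Python versions:

```python
import pandas as pd

# Hypothetical three-row frame standing in for the campaign DataFrame
frame = pd.DataFrame({"num_interviews": [261, 183, 232]})
domain = ["Python", "R", "Javascript"]

error_message = None
try:
    frame.assign(domain)  # list passed positionally, no column name given
except TypeError as err:
    error_message = str(err)

# Correct call: the keyword becomes the column name
frame = frame.assign(domain=domain)
```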
Replace column with list values
You can also use the assign method to replace the values of an existing column with those of another list.
Let’s define a new list:
lang = ['Python', 'R'] * 3
We can assign the lang list to the already populated domain column:
campaign.assign(domain=lang)
Here’s the result. Note the new values in the domain column:
| | interview_date | num_interviews | domain |
---|---|---|---|
0 | 2023-04-03 | 261 | Python |
1 | 2023-04-04 | 183 | R |
2 | 2023-04-05 | 232 | Python |
3 | 2023-04-06 | 271 | R |
4 | 2023-04-07 | 267 | Python |
5 | 2023-04-10 | 275 | R |
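Keep in mind that assign returns a new DataFrame and leaves the one it was called on unchanged, so the replacement only persists if the result is stored. A minimal sketch on a hypothetical two-row frame (the names original and replaced are illustrative):

```python
import pandas as pd

# Hypothetical frame that already has a domain column
original = pd.DataFrame({"num_interviews": [261, 183], "domain": ["Python", "R"]})

# assign returns a modified copy; original keeps its old values
replaced = original.assign(domain=["R", "Python"])

# To keep the change under the same name, reassign the result
original = original.assign(domain=["Javascript", "Javascript"])
```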
Replace column with random values from list
Another case is filling a new or existing column / Series with random picks from a list. We will use Python's random module to generate a random list of languages, which we will then assign to our column.
import random
# Generate a list of random entries, one per DataFrame row.
# random.sample picks without replacement, so domain must have at least len(campaign) entries.
random_lang = random.sample(domain, len(campaign))
# assign to the df column
campaign.assign(domain=random_lang)
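Since random.sample draws without replacement, it raises a ValueError when the list has fewer entries than the DataFrame has rows. One possible alternative, sketched here under the assumption that repeated picks are acceptable, is random.choices, which draws with replacement. The frame and the two-item languages list are hypothetical:

```python
import random

import pandas as pd

# Hypothetical six-row frame and a list shorter than the number of rows
frame = pd.DataFrame({"num_interviews": [261, 183, 232, 271, 267, 275]})
languages = ["Python", "R"]

random.seed(0)  # optional: makes the picks reproducible between runs

# Draw one language per row, with replacement
random_lang = random.choices(languages, k=len(frame))
frame = frame.assign(domain=random_lang)
```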
Follow up learning:
How to verify if a DataFrame cell contains a specific value?