To map dictionary values into a new pandas DataFrame column, use the following code:
your_df['your_new_col'] = your_df['col_to_be_mapped'].map(your_dictionary)
Fill pandas column with dictionary values
There are several cases in which we need to bring dictionary values into a DataFrame:
- When creating a DataFrame from scratch by using key-value pairs from a dictionary.
- When cleaning up a dataset, we map between specific values in a DataFrame column and our dictionary values. This allows us to harmonize erroneous values and fill in missing ones.
- When merging values stored in two dictionary objects into a DataFrame.
- When visualizing data stored in a dictionary using the pandas, Matplotlib or Seaborn libraries.
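As a quick illustration of the first case, here is a minimal sketch of turning the key-value pairs of a dictionary into a two-column DataFrame. The sample dictionary and the column names lang_code and lang_name are assumptions chosen for this example:

```python
import pandas as pd

# Small sample dictionary, used for illustration only
lang_dict = {1: 'R', 2: 'Python', 3: 'Javascript'}

# Each key-value pair becomes one row; the column names are our own choice
lang_df = pd.DataFrame(list(lang_dict.items()), columns=['lang_code', 'lang_name'])
print(lang_df)
```

Passing the items as a list of tuples keeps keys and values in separate columns, which is usually easier to join or plot later on.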
Create sample data
We will start by creating a simple DataFrame and a dictionary.
import pandas as pd
month = ['November', 'July', 'October', 'November', 'December', 'October']
lang_code = [1, 1, 2, 2, 3, 1]
salary = [102.0, 79.0, 150.0, 160.0, 127.0, 165.0]
interviews = dict(month = month, lang_code = lang_code, salary = salary)
hr_df = pd.DataFrame(data=interviews)
hr_df.head()
Here’s our DataFrame content:
| | month | lang_code | salary |
|---|---|---|---|
| 0 | November | 1 | 102.0 |
| 1 | July | 1 | 79.0 |
| 2 | October | 2 | 150.0 |
| 3 | November | 2 | 160.0 |
| 4 | December | 3 | 127.0 |
Next, I will define a simple dictionary that maps each language code to a programming language name:
lang_dict = { 1: 'R', 2: 'Python', 3: 'Javascript'}
Map Dictionary values to DataFrame column
Now the interesting part. We would like to insert a new column into our DataFrame based on the values of our dictionary. We will use the pandas Series.map method (not Python's built-in map function) to look up each value of the lang_code column in the lang_dict dictionary and return the matching language name:
hr_df['lang_name'] = hr_df['lang_code'].map(lang_dict)
Let's look at the first rows of the DataFrame:
hr_df.head()
The lang_name column was appended to the DataFrame and shows up in the rightmost position:
| | month | lang_code | salary | lang_name |
|---|---|---|---|---|
| 0 | November | 1 | 102.0 | R |
| 1 | July | 1 | 79.0 | R |
| 2 | October | 2 | 150.0 | Python |
| 3 | November | 2 | 160.0 | Python |
| 4 | December | 3 | 127.0 | Javascript |
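One thing to keep in mind: when a value has no matching key in the dictionary, map returns NaN for that row. The sketch below shows this with a made-up code 4 that is not in the dictionary, and then fills the gap with fillna. The 'Unknown' placeholder is an arbitrary choice for this example:

```python
import pandas as pd

lang_dict = {1: 'R', 2: 'Python', 3: 'Javascript'}

# 4 is a hypothetical code that has no entry in lang_dict
codes = pd.Series([1, 2, 4])

mapped = codes.map(lang_dict)      # the unmatched code becomes NaN
filled = mapped.fillna('Unknown')  # replace NaN with a placeholder of our choice
print(filled.tolist())
```

If the missing values should stay visible for later cleanup, simply skip the fillna step.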
Fill DataFrame column according to condition
What if I would like to fill only specific values and leave the other cells empty? Here’s a snippet you can use:
# define list of allowed values
allowed_val_lst = [1, 2]
hr_df['lang_name_cond'] = hr_df['lang_code'].apply(lambda c: lang_dict[c] if c in allowed_val_lst else '')
hr_df.head()
This renders the following. Note the lang_name_cond column, which displays a value only if the language code is included in the list of allowed values.
| | month | lang_code | salary | lang_name | lang_name_cond |
|---|---|---|---|---|---|
| 0 | November | 1 | 102.0 | R | R |
| 1 | July | 1 | 79.0 | R | R |
| 2 | October | 2 | 150.0 | Python | Python |
| 3 | November | 2 | 160.0 | Python | Python |
| 4 | December | 3 | 127.0 | Javascript | |
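The same result can probably be reached without apply and a lambda. The sketch below is one possible alternative, not the method used above: it maps every code first and then uses Series.where together with isin to blank out the rows whose code is not allowed. The standalone codes Series stands in for the lang_code column:

```python
import pandas as pd

lang_dict = {1: 'R', 2: 'Python', 3: 'Javascript'}
allowed_val_lst = [1, 2]

# Stand-in for hr_df['lang_code']
codes = pd.Series([1, 1, 2, 2, 3, 1], name='lang_code')

# Map every code, then keep the mapped name only where the code is allowed;
# all other rows get an empty string
lang_name_cond = codes.map(lang_dict).where(codes.isin(allowed_val_lst), '')
print(lang_name_cond.tolist())
```

A side benefit of this variant is that an allowed code missing from the dictionary yields NaN instead of raising a KeyError, which is what lang_dict[c] inside the lambda would do.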
Append list to DataFrame column
A somewhat related use case is when you need to insert a list as a DataFrame column. We have a short tutorial on that which you might want to check out.
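For a quick idea of what that looks like, here is a minimal sketch that assigns a list to a new column. The office column and its values are made up for this example, and the list needs to have exactly one item per DataFrame row:

```python
import pandas as pd

hr_df = pd.DataFrame({'month': ['November', 'July', 'October'],
                      'salary': [102.0, 79.0, 150.0]})

# Hypothetical list; its length must equal the number of rows in hr_df
office = ['Lisbon', 'Berlin', 'Madrid']

# Plain assignment adds the list as the rightmost column
hr_df['office'] = office
print(hr_df)
```

If the list is shorter or longer than the DataFrame, pandas raises a ValueError instead of padding or truncating the values.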