In this short tutorial we describe a simple way to convert a pandas Series, such as a DataFrame column, into a Python list of strings.
We’ll first import the pandas library and then define a Series object using the pd.Series constructor:
import pandas as pd
office_series = pd.Series(['Bangalore', 'Osaka', 'Hong Kong', 'Paris', 'Osaka'])
Note: When creating the Series you might encounter the following type error:
#TypeError: Index(...) must be called with a collection of some kind
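The pd.Series constructor expects its index argument to be a collection, such as a list; if no index is given, pandas creates a default one automatically. The error can appear when the values are passed as separate arguments instead of as a single list, because the second value is then treated as the index. Make sure you pass a list to the Series constructor, or define the index as a collection.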
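As a minimal sketch of how this error can arise, assuming the values were accidentally passed as separate arguments (one likely cause, not necessarily the only one), compare a failing call with two corrected ones. The index labels 'a' and 'b' are illustrative only.

```python
import pandas as pd

# Passing values as separate arguments makes pandas treat the
# second value as the index, which is a scalar, not a collection.
try:
    pd.Series('Bangalore', 'Osaka')
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Fix 1: pass all values as a single list; pandas builds a default index.
office_series = pd.Series(['Bangalore', 'Osaka'])

# Fix 2: pass an explicit index as a collection (labels are illustrative).
labeled_series = pd.Series(['Bangalore', 'Osaka'], index=['a', 'b'])
print(labeled_series.index.to_list())
```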
Convert a pandas Series to a list of strings
Now that we have a pandas Series defined we can easily convert it to a Python list:
office_lst = office_series.to_list()
print(office_lst)
This will print a list of strings, as shown below:
['Bangalore', 'Osaka', 'Hong Kong', 'Paris', 'Osaka']
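The same approach applies to a DataFrame column, since a single column is itself a Series. The sketch below uses a hypothetical DataFrame (the names offices_df, office and headcount are assumptions for illustration) and also shows one way to get strings out of a non-string column by casting with astype(str) first.

```python
import pandas as pd

# Hypothetical DataFrame; the column names are illustrative only.
offices_df = pd.DataFrame({
    'office': ['Bangalore', 'Osaka', 'Hong Kong'],
    'headcount': [120, 85, 60],
})

# A single column is a Series, so to_list() works directly.
office_names = offices_df['office'].to_list()

# Non-string columns can be cast with astype(str) before converting.
headcount_strs = offices_df['headcount'].astype(str).to_list()

print(office_names)
print(headcount_strs)
```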
Pandas Series to unique values
In our example above, we have seen that the value ‘Osaka’ appears twice in our Series. What if we want to ensure that our list contains only unique values?
One solution is to use the Series unique() method, which returns a NumPy array of the distinct values in order of first appearance. We can then call the array's tolist() method to create a list of distinct items:
unique_office_lst = office_series.unique().tolist()
print(unique_office_lst)
This will return:
['Bangalore', 'Osaka', 'Hong Kong', 'Paris']
The second option is to convert our list to a Python set object and then back to a list:
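A minimal sketch of this second option, assuming the office_series defined earlier (it is repeated here so the snippet runs on its own):

```python
import pandas as pd

office_series = pd.Series(['Bangalore', 'Osaka', 'Hong Kong', 'Paris', 'Osaka'])
office_lst = office_series.to_list()

# Converting to a set drops duplicates; list() turns it back into a list.
unique_office_lst = list(set(office_lst))
print(unique_office_lst)
```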
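This will return the unique values, although their order may differ from the one shown here because sets are unordered: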
['Hong Kong', 'Bangalore', 'Paris', 'Osaka']
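Because a set does not keep the original order, the result above may not match the order of the Series. If order matters, one possible alternative (not part of the original example) is dict.fromkeys(), which keeps the first occurrence of each value in insertion order on Python 3.7 and later:

```python
office_lst = ['Bangalore', 'Osaka', 'Hong Kong', 'Paris', 'Osaka']

# dict keys are unique and keep insertion order (Python 3.7+),
# so this removes duplicates while preserving first appearance.
ordered_unique_lst = list(dict.fromkeys(office_lst))
print(ordered_unique_lst)  # ['Bangalore', 'Osaka', 'Hong Kong', 'Paris']
```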
Note: we can use the Python list extend() method to add the contents of the list we created to an existing list. The append() method would instead add the whole list as a single nested element.
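A short sketch of the difference, using hypothetical existing lists named all_offices and nested_offices: extend() adds each item individually, whereas append() adds the whole list as one nested element.

```python
unique_office_lst = ['Bangalore', 'Osaka', 'Hong Kong', 'Paris']

# Hypothetical existing lists to add our offices to.
all_offices = ['London', 'Berlin']
nested_offices = ['London', 'Berlin']

# extend() adds each item of the new list individually.
all_offices.extend(unique_office_lst)

# append() adds the whole list as a single nested element.
nested_offices.append(unique_office_lst)

print(all_offices)
print(len(nested_offices))  # 3
```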
Pandas Series index to list
If we would like to convert the Series index to a list we can use the following code:
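A minimal sketch, assuming the office_series defined earlier with its default integer index (the name index_lst is an assumption):

```python
import pandas as pd

office_series = pd.Series(['Bangalore', 'Osaka', 'Hong Kong', 'Paris', 'Osaka'])

# The index is its own object with a to_list() method.
# With the default index this yields consecutive integers.
index_lst = office_series.index.to_list()
print(index_lst)  # [0, 1, 2, 3, 4]
```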