Find unique values in pandas DataFrame column

To find the unique values in one or multiple columns of your pandas DataFrame, or to count how often each value occurs, use the unique() and value_counts() pandas functions as shown in the snippet below. Replace the placeholder names with your actual DataFrame and column names:

# unique values
uq_values = your_df['your_col_name'].unique()

# count the occurrences of each unique value
uq_count = your_df['your_col_name'].value_counts()

Get unique column cell values in DataFrames

Let’s start by creating some sample data that we will use in the example.

import pandas as pd

language = ['Python', 'Python', 'Java', 'Javascript', 'Python', 'R']
office = ['Paris', 'Toronto', 'Paris', 'Osaka', 'Buenos Aires', 'Bangkok']
salary = [216.0, 123.0, 99.0, 166.0, 145.0, 170.0]
hr_campaign = dict(office = office, language = language, salary = salary)
interviews_df = pd.DataFrame(data=hr_campaign)

Print unique occurrences in one or multiple columns

To list the unique values in a single column, type the following code into your Jupyter Notebook:

uq_values = interviews_df['language'].unique()
print('The unique values are: ', uq_values)

This returns:

The unique values are:  ['Python' 'Java' 'Javascript' 'R']

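Note: for ordinary NumPy-backed columns, the unique() function returns a NumPy array, with values in order of first appearance. Columns that use a pandas extension dtype (for example, categorical) return an extension array instead.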
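As a quick check of the note above, the sketch below inspects the return type on a small stand-in Series. The `salary` Series and its values are assumptions that mirror the sample data, with one duplicate added so that the effect is visible.

```python
import numpy as np
import pandas as pd

# Stand-in for interviews_df['salary'] (hypothetical; 123.0 is duplicated on purpose)
salary = pd.Series([216.0, 123.0, 99.0, 123.0])

uq_values = salary.unique()

# For a float column, unique() returns a NumPy ndarray
is_ndarray = isinstance(uq_values, np.ndarray)
print(type(uq_values), is_ndarray)

# Values keep their order of first appearance; they are not sorted
first_seen_order = uq_values.tolist()
print(first_seen_order)  # [216.0, 123.0, 99.0]
```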

To view uniques in two or more columns we need to loop through the relevant columns first. In this example we will search for uniques in each of our DataFrame columns.

for col in interviews_df:
    print(interviews_df[col].unique())

This prints one NumPy array per column:

['Paris' 'Toronto' 'Osaka' 'Buenos Aires' 'Bangkok']
['Python' 'Java' 'Javascript' 'R']
[216. 123.  99. 166. 145. 170.]
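If the per-column results need to be kept rather than just printed, one option is to collect them in a dictionary keyed by column name. This is a minimal sketch; the `uniques_by_col` name is an assumption introduced for illustration.

```python
import pandas as pd

# Same sample data as in the article
interviews_df = pd.DataFrame({
    'office': ['Paris', 'Toronto', 'Paris', 'Osaka', 'Buenos Aires', 'Bangkok'],
    'language': ['Python', 'Python', 'Java', 'Javascript', 'Python', 'R'],
    'salary': [216.0, 123.0, 99.0, 166.0, 145.0, 170.0],
})

# Map each column name to a list of its unique values (hypothetical helper name)
uniques_by_col = {col: list(interviews_df[col].unique()) for col in interviews_df.columns}

for col, values in uniques_by_col.items():
    print(col, 'has', len(values), 'unique values')
```

A dictionary makes it easy to look up the uniques of a single column later, for example `uniques_by_col['office']`.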

Count occurrences of each unique value in a Series object

We use the value_counts function to get the number of occurrences of each value in a column:

uq_count = interviews_df['language'].value_counts()
print('The value count is:', uq_count, sep='\n')

This will return a count of each unique value in the specific column:

The value count is:
Python        3
Java          1
Javascript    1
R             1
Name: language, dtype: int64
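Note that value_counts() reports how often each value occurs, not how many distinct values there are. When only the number of distinct values is needed, the nunique() method may be a better fit. The sketch below assumes the same sample data; the variable names are illustrative.

```python
import pandas as pd

# Same sample data as in the article
interviews_df = pd.DataFrame({
    'office': ['Paris', 'Toronto', 'Paris', 'Osaka', 'Buenos Aires', 'Bangkok'],
    'language': ['Python', 'Python', 'Java', 'Javascript', 'Python', 'R'],
    'salary': [216.0, 123.0, 99.0, 166.0, 145.0, 170.0],
})

# Number of distinct values in a single column
n_languages = interviews_df['language'].nunique()
print(n_languages)  # 4

# Number of distinct values in every column, returned as a Series
n_per_column = interviews_df.nunique()
print(n_per_column)
```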

Export unique values to a Python list

We have already seen that calling the unique() function on a Series object returns a NumPy array, and that looping over several columns gives one array per column. We can convert such an array to a regular Python list by calling its tolist() method.
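A minimal sketch of the conversion, assuming a stand-in `salary` Series that mirrors the sample data:

```python
import pandas as pd

# Stand-in for interviews_df['salary'] (hypothetical, mirrors the sample data)
salary = pd.Series([216.0, 123.0, 99.0, 166.0, 145.0, 170.0])

# unique() returns a NumPy array; tolist() converts it to a list of Python floats
uq_salary_list = salary.unique().tolist()

print(type(uq_salary_list))
print(uq_salary_list)  # [216.0, 123.0, 99.0, 166.0, 145.0, 170.0]
```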


Write unique values to a CSV file

We can write the array of unique values to a comma-separated text file using the NumPy tofile() method. The values are written on a single line, without a header:

uq_salary = interviews_df['salary'].unique()
uq_salary.tofile('my_salary_values.csv', sep = ',')
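As an alternative, the unique values can be wrapped in a Series and written with the pandas to_csv() method, which produces one value per row under a header. This is a sketch under assumptions: the temporary folder and the `out_path` and `restored` names are placeholders chosen for illustration.

```python
import os
import tempfile

import pandas as pd

# Stand-in for interviews_df['salary'] (hypothetical; 123.0 is duplicated on purpose)
salary = pd.Series([216.0, 123.0, 99.0, 166.0, 145.0, 170.0, 123.0], name='salary')

# Wrap the unique values in a Series so pandas writes one value per row
uq_salary = pd.Series(salary.unique(), name='salary')

# Placeholder location: a fresh temporary folder
out_path = os.path.join(tempfile.mkdtemp(), 'my_salary_values.csv')
uq_salary.to_csv(out_path, index=False)

# Read the file back to confirm the round trip
restored = pd.read_csv(out_path)
print(restored['salary'].tolist())
```

The resulting file opens directly in a spreadsheet application and can be read back with pd.read_csv().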

Get and sort unique values

Once we have the unique values, we can also sort them:

uq_salary = interviews_df['salary'].unique()
uq_salary.sort()  # sort the NumPy array in place, in ascending order
print('The salary values are: ', uq_salary)

This will return the sorted array:

The salary values are:  [ 99. 123. 145. 166. 170. 216.]
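When the original order of the array should be preserved, or a descending order is wanted, a sketch along these lines may help. It assumes a stand-in `salary` Series that mirrors the sample data and uses np.sort(), which returns a sorted copy.

```python
import numpy as np
import pandas as pd

# Stand-in for interviews_df['salary'] (hypothetical, mirrors the sample data)
salary = pd.Series([216.0, 123.0, 99.0, 166.0, 145.0, 170.0])
uq_salary = salary.unique()

# np.sort returns a sorted copy and leaves uq_salary unchanged
ascending = np.sort(uq_salary)

# Reverse the sorted copy to get descending order
descending = ascending[::-1]

print(descending.tolist())  # [216.0, 170.0, 166.0, 145.0, 123.0, 99.0]
```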