How to count rows with conditions in a Pandas DataFrame?

Today we’ll learn how to count rows meeting specific values in a Pandas Dataframe. We’ll look into several cases as listed below:

  1. Count row occurrences based on a specific value.
  2. Count rows meeting multiple conditions.
  3. Finding the number of occurrences of a specific value in a row.
  4. Counting unique values in a column

Create example data

Let us get started with creating a DataFrame that you can use in order to follow along with this tutorial:

import pandas as pd

prog_lang = ['Python', 'Java', 'Javascript', 'Java', 'R']
office = ['Toronto', 'Rome', 'Madrid', 'Rome', 'Madrid']
candidates = [163.0, 145.0, 163.0, 193.0, 146.0]
hr = dict(office = office, prog_lang=prog_lang, candidates = candidates)
hr_df = pd.DataFrame(data=hr)
print( hr_df.head())

Here’s our data:

officeprog_langcandidates
0TorontoPython163.0
1RomeJava145.0
2MadridJavascript163.0
3RomeJava193.0
4MadridR146.0

Count filtered rows based on value

Let’s find which rows meet a certain condition. We’ll start by define a simple filter.

filt = hr_df['prog_lang'].str.contains('Java')

This will return a series of boolean values indicating whether our DataFrame rows meet the criteria we just defined. Now let’s filter our DataFrame rows:

hr_df[filt]

Here’s the result:

officeprog_langcandidates
1RomeJava145.0
2MadridJavascript163.0
3RomeJava193.0

Now we can easily count the relevant rows using the len function or shape method.

len(hr_df[filt])

or

hr_df[filt].shape[0]

Both will return the result 3.

Note: You might want to persist the filtered rows as a DataFrame that you can use in your Data Analysis:

hr_subset = hr_df[filt]

Count rows meeting multiple conditions

In the same fashion we can filter our DataFrame according to multiple complex conditions, concatenated by AND (&) / OR (|) operands.

# multiple conditions
filt = (hr_df['prog_lang'].str.contains('Java')) & (hr_df['candidates'] > 190)

'filter the DataFrame
hr_df[filt]

'Count the rows

hr_df[filt].shape[0]

The result will be 1.

Counting unique values in a Pandas Series

If you want to count unique values in a column, you just use value_counts()

hr_df['candidates'].value_counts()

Count occurrences of a value in a column

What if we want to count occurrences of a specific value in a DataFrame column / Series. In the example below we would like to select all rows in the candidates column which are equal to 145.

hr_df['candidates'].value_counts()[145]

Additional learning