In today’s tutorial we’ll learn how to find the index values of rows in a pandas DataFrame that meet a specific condition. We’ll be looking at several cases: matching a column value, combining conditions, locating the maximum, minimum, or closest value, and finding missing (NaN) values.
We’ll start by importing pandas and NumPy, then create some example data.
import pandas as pd
import numpy as np

# Example data
office = ['Atlanta', 'Boston', 'New York', 'Miami']
candidates = [100, 120, 130, 105]
hired = [34, 43, 32, np.nan]

# Create the DataFrame
hr_dict = dict(office=office, candidates=candidates, hired=hired)
hr = pd.DataFrame(hr_dict)
hr.head()
Here’s our DataFrame:
     office  candidates  hired
0   Atlanta         100   34.0
1    Boston         120   43.0
2  New York         130   32.0
3     Miami         105    NaN
Find index of specific column value
# Keep only the rows whose office is New York, then read their index
val = hr[hr['office'] == 'New York']
val.index.tolist()
Result: [2]
The result is a list containing the index label of each matching row. In our case: [2].
Alternatively, we can build a boolean filter and apply it to the index directly:
filt = hr['office'] == 'New York'
hr.index[filt].tolist()
Result: [2]
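Note that both approaches return index labels, which match row numbers only when the DataFrame uses the default integer index. As a minimal sketch, assuming a hypothetical set of string labels ('a' to 'd') that is not part of the original example, the following shows the difference and uses NumPy's flatnonzero to recover the positional row numbers:

```python
import numpy as np
import pandas as pd

# Same offices as above, but with hypothetical string labels as the index
hr = pd.DataFrame(
    {
        "office": ["Atlanta", "Boston", "New York", "Miami"],
        "candidates": [100, 120, 130, 105],
    },
    index=["a", "b", "c", "d"],
)

filt = hr["office"] == "New York"

# Index labels of the matching rows
labels = hr.index[filt].tolist()

# Zero-based row positions of the matching rows
positions = np.flatnonzero(filt.to_numpy()).tolist()

print(labels, positions)
```

Labels are what you pass to hr.loc, while positions are what hr.iloc expects.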
Index of value meeting a condition
In this case we will be looking for one or more rows that meet a specific condition. We define a set of comparison conditions combined with boolean operators (AND (&) / OR (|)), wrapping each condition in parentheses, as in the example below:
filt = (hr['office'] == 'New York') | (hr['candidates'] < 120)
hr.index[filt].tolist()
Result: [0, 2, 3]
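The example above uses OR. For completeness, here is a short sketch of the AND case on the same data; the thresholds (more than 100 candidates, more than 33 hires) are arbitrary values chosen for illustration:

```python
import numpy as np
import pandas as pd

hr = pd.DataFrame(
    {
        "office": ["Atlanta", "Boston", "New York", "Miami"],
        "candidates": [100, 120, 130, 105],
        "hired": [34, 43, 32, np.nan],
    }
)

# Both conditions must hold. Each comparison needs its own parentheses
# because & binds more tightly than > and <.
both = (hr["candidates"] > 100) & (hr["hired"] > 33)
and_idx = hr.index[both].tolist()

print(and_idx)  # [1]
```

Comparisons against NaN evaluate to False, so the Miami row is excluded even though it has more than 100 candidates.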
Index of maximum value in a column
In this example we are interested in the index of the row holding the maximum value of a specific column.
hr['candidates'].idxmax()
Result: 2
Index of minimum value in a column
This is similar to the example above, but this time we use idxmin() to look for the minimum value.
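A minimal sketch of the minimum case on the same example data, mirroring the idxmax() call from the previous section:

```python
import numpy as np
import pandas as pd

hr = pd.DataFrame(
    {
        "office": ["Atlanta", "Boston", "New York", "Miami"],
        "candidates": [100, 120, 130, 105],
        "hired": [34, 43, 32, np.nan],
    }
)

# Index of the row with the fewest candidates
min_idx = hr["candidates"].idxmin()

# NaN entries are skipped by default, so the missing Miami value is ignored
hired_min_idx = hr["hired"].idxmin()

print(min_idx, hired_min_idx)  # 0 2
```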
Index of row closest to a specific value
In this case, we are looking for the index of the row whose value is closest to an arbitrary value we provide.
val = 115
# Absolute distance of each row from val; idxmin picks the smallest
idx = abs(hr['candidates'] - val).idxmin()
idx
Result: 1
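When two rows are equally close to the target, idxmin() returns only the first of them. The sketch below, which uses a hypothetical target of 102.5 chosen to force a tie, shows one way to collect every tied row instead:

```python
import pandas as pd

hr = pd.DataFrame(
    {
        "office": ["Atlanta", "Boston", "New York", "Miami"],
        "candidates": [100, 120, 130, 105],
    }
)

val = 102.5  # hypothetical target, equally far from 100 and 105

# Absolute distance of every row from the target
dist = (hr["candidates"] - val).abs()

# idxmin() reports only the first of the tied rows
first_idx = dist.idxmin()

# Comparing against the minimum distance keeps all tied rows
all_idx = dist.index[dist == dist.min()].tolist()

print(first_idx, all_idx)  # 0 [0, 3]
```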
Index of rows containing empty (nan) values
filt = hr['hired'].isnull()
hr.index[filt].tolist()
Result: [3]
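The same idea extends from a single column to the whole DataFrame. As a hedged sketch on the example data, the following flags every row that has a missing value in at least one column:

```python
import numpy as np
import pandas as pd

hr = pd.DataFrame(
    {
        "office": ["Atlanta", "Boston", "New York", "Miami"],
        "candidates": [100, 120, 130, 105],
        "hired": [34, 43, 32, np.nan],
    }
)

# isna() marks missing cells; any(axis=1) reduces each row to one boolean
has_nan = hr.isna().any(axis=1)
nan_idx = hr.index[has_nan].tolist()

print(nan_idx)  # [3]
```

Here isna() is used instead of isnull(); the two are aliases in pandas and behave identically.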