How to select rows by index in an R dataframe?

Filter rows by index in R dataframes

You can subset one or more rows of an R DataFrame using the following syntax:

# subset a single row using base R
subset <- your_df[row_number, ]

# choose multiple rows using base R
subset <- your_df[row_number_vector, ]

# subset rows using dplyr
library(dplyr)
subset <- slice(your_df, row_number_vector)

We’ll start by creating the following example DataFrame that you can use to follow along:

# create vectors
month <- c('October', 'December', 'October', 'June', 'June', 'December')
office <- c('Hong Kong', 'Toronto', 'Hong Kong', 'Buenos Aires', 'Toronto', 'Los Angeles')
salary <- c(134.0, 99.0, 234.0, 134.0, 86.0, 186.0)

# initialize DataFrame
hrdf <- data.frame(month = month, office = office, salary = salary)

The resulting DataFrame, hrdf, looks like this when printed:

     month       office salary
1  October    Hong Kong    134
2 December      Toronto     99
3  October    Hong Kong    234
4     June Buenos Aires    134
5     June      Toronto     86
6 December  Los Angeles    186

Subset a specific single row by index

To choose a single row by index, we simply pass the row number using bracket notation, followed by a comma. The following snippet selects the second row of our DataFrame. Note that in R the index count starts at 1, unlike in Python / Pandas, where it starts at 0.

second_row <- hrdf[2,]

Note: The statement hrdf[2] (without the comma) will return the second column of your DataFrame, not the second row.
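
As a minimal sketch of the difference between the two forms, the snippet below rebuilds the example DataFrame and compares hrdf[2, ] with hrdf[2]. The variable name second_col and the dim() checks are illustrative additions, not part of the original example.

```r
# rebuild the example DataFrame so the snippet runs on its own
hrdf <- data.frame(
  month  = c('October', 'December', 'October', 'June', 'June', 'December'),
  office = c('Hong Kong', 'Toronto', 'Hong Kong', 'Buenos Aires', 'Toronto', 'Los Angeles'),
  salary = c(134.0, 99.0, 234.0, 134.0, 86.0, 186.0)
)

# with the comma: row 2, all columns
second_row <- hrdf[2, ]
dim(second_row)   # 1 row, 3 columns

# without the comma: column 2 (office), all rows
second_col <- hrdf[2]
dim(second_col)   # 6 rows, 1 column
```

Checking dim() is a quick way to confirm whether a row or a column was selected.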

Choose multiple rows by index

To select multiple rows, we’ll pass a vector of row numbers:

row_vector <- c(2, 3, 4)
hrdf[row_vector, ]

This will return the second, third and fourth rows.

We could have achieved the same output by passing a range of rows:

hrdf[2:4, ]
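
The following sketch checks that the two approaches return the same rows. It rebuilds the example DataFrame so it can run on its own; the names by_vector and by_range are hypothetical and used only for illustration.

```r
# rebuild the example DataFrame so the snippet runs on its own
hrdf <- data.frame(
  month  = c('October', 'December', 'October', 'June', 'June', 'December'),
  office = c('Hong Kong', 'Toronto', 'Hong Kong', 'Buenos Aires', 'Toronto', 'Los Angeles'),
  salary = c(134.0, 99.0, 234.0, 134.0, 86.0, 186.0)
)

# select rows 2 to 4 with an explicit vector and with a range
by_vector <- hrdf[c(2, 3, 4), ]
by_range  <- hrdf[2:4, ]

# compare the two results
isTRUE(all.equal(by_vector, by_range))

# inspect the row labels of the result
rownames(by_vector)
```

Note that base R keeps the original row labels (2, 3 and 4) in the result rather than renumbering them from 1.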

Filter R DataFrame rows with dplyr

Whenever possible, I try to use the dplyr library, which simplifies data wrangling in R. In this case, we’ll start by loading dplyr (you need to install it first in your R development environment, for example with install.packages("dplyr")). We then call the dplyr slice function, passing the DataFrame and a vector of row numbers (here, the second and fifth rows).

library(dplyr)
subset <- slice(hrdf, c(2, 5))
print(subset)

This will return the following rows:

     month  office salary
1 December Toronto     99
2     June Toronto     86

Select the last row by index

We can use the nrow() function to find the number of rows in the DataFrame, and then use that number as the row index to select the last row.
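
A minimal sketch of this approach, assuming the example DataFrame from earlier (rebuilt here so the snippet is self-contained); n_rows and last_row are illustrative variable names:

```r
# rebuild the example DataFrame so the snippet runs on its own
hrdf <- data.frame(
  month  = c('October', 'December', 'October', 'June', 'June', 'December'),
  office = c('Hong Kong', 'Toronto', 'Hong Kong', 'Buenos Aires', 'Toronto', 'Los Angeles'),
  salary = c(134.0, 99.0, 234.0, 134.0, 86.0, 186.0)
)

# nrow() gives the number of rows, which is also the index of the last row
n_rows <- nrow(hrdf)

# use that number as the row index (note the trailing comma)
last_row <- hrdf[n_rows, ]
print(last_row)
```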


For completeness, you can get the same result with the tail() function by asking it for just one row.
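
As a sketch, tail() takes the DataFrame and the number of trailing rows to return, so passing 1 yields only the last row. The DataFrame is rebuilt here so the snippet runs on its own, and last_row_tail is a hypothetical variable name.

```r
# rebuild the example DataFrame so the snippet runs on its own
hrdf <- data.frame(
  month  = c('October', 'December', 'October', 'June', 'June', 'December'),
  office = c('Hong Kong', 'Toronto', 'Hong Kong', 'Buenos Aires', 'Toronto', 'Los Angeles'),
  salary = c(134.0, 99.0, 234.0, 134.0, 86.0, 186.0)
)

# ask tail() for the last 1 row of the DataFrame
last_row_tail <- tail(hrdf, 1)
print(last_row_tail)
```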