How to check if an R DataFrame column contains a string value?

Use the str_detect function from the stringr package to check whether a specific value appears in an R DataFrame column:

library(stringr)
# str_detect() returns TRUE/FALSE per row; sum() counts the TRUE values
if (sum(str_detect(my_df$my_column, 'string_value')) > 0) {
  # Your code here
}

Note: Make sure to install the stringr library (also included in tidyverse) before utilizing the str_detect function in your RStudio script or Jupyter notebook.

Check if column contains value in R – Practical example

Create a sample DataFrame

Use the following code to create an R DataFrame that we’ll use in this tutorial:

language <- c('JavaScript', 'Python', 'Java', 'R')
interviews <- c(12, 15, 23, 25)
hiring <- data.frame(language = language, interviews = interviews)
print(hiring)

Here’s our DataFrame:

    language interviews
1 JavaScript         12
2     Python         15
3       Java         23
4          R         25

Check if a specific string exists in a column

To verify that a specific value exists in a column we can use the following snippet:

sum(str_detect(hiring$language, '^Java'))

This will return the value 2, as we have one row containing the word Java and another containing the word JavaScript.

Note the use of the regular expression anchor ^, which matches the start of the string we are searching in. Only cells that begin with Java are counted.
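To see why summing works, it can help to look at what str_detect() returns before it is summed. The following is a minimal sketch that rebuilds the sample DataFrame so it runs on its own. Using any() in place of sum(...) > 0 is a suggested alternative here, not something the snippet above requires.

```r
library(stringr)

# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

# str_detect() returns one TRUE/FALSE value per row
matches <- str_detect(hiring$language, '^Java')
print(matches)

# sum() counts the TRUE values; any() answers "is there at least one match?"
match_count <- sum(matches)
has_match <- any(matches)
```

If you only need a yes/no answer, any(matches) states the intent more directly than comparing a count to zero.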

Check if column contains one or more values

The following snippet checks whether a column contains either the string Java or the string Python:

sum(str_detect(hiring$language, 'Java|Python')) 

This will return the value 3, as we have one row containing the string Java, one containing the string JavaScript and one containing the string Python.
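When the values to look for are stored in a vector, the pattern can be assembled rather than typed by hand. The sketch below assumes a hypothetical vector named wanted and joins its elements with the | operator. It also assumes the values contain no regular expression metacharacters; values such as C++ would need escaping first.

```r
library(stringr)

# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

# Hypothetical vector of values to search for
wanted <- c('Java', 'Python')

# Join the values with | to build the pattern 'Java|Python'
pattern <- paste(wanted, collapse = '|')

# Count the rows that match at least one of the values
match_count <- sum(str_detect(hiring$language, pattern))
print(match_count)
```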

Verify that a column contains an exact value

The following snippet counts the cells in our language column that contain exactly the word Java.

sum(str_detect(hiring$language, '^Java$'))

The $ anchor matches the end of the string. Combined with ^, it means the snippet above will return the value 1, as only cells matching the exact word Java are counted.
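For exact matches, a regular expression is not strictly necessary. As a possible alternative in base R, the sketch below compares the column directly with == and tests membership with %in%. It rebuilds the sample DataFrame so it can be run on its own.

```r
# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

# Count the cells that are exactly equal to 'Java'
exact_count <- sum(hiring$language == 'Java')

# Test whether 'Java' appears as a whole value anywhere in the column
has_exact <- 'Java' %in% hiring$language

print(exact_count)
print(has_exact)
```

Because no pattern is involved, this approach avoids having to escape special characters in the search value.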

Adding to a conditional statement in R

You can now include the snippets above in a conditional if/else statement to verify that a specific value appears in one of your column cells.

if (sum(str_detect(hiring$language, '^Java')) > 0) {
  print('The string you specified appears in the column.')
} else {
  print('The string does not exist in the column.')
}
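If the same check is needed in several places, it could be wrapped in a small function. The sketch below uses a hypothetical helper named column_contains, which is not part of stringr. It assumes missing values should be treated as non-matches, hence na.rm = TRUE.

```r
library(stringr)

# Hypothetical helper: TRUE if any cell in `column` matches `pattern`.
# Missing values (NA) are ignored rather than propagated.
column_contains <- function(column, pattern) {
  any(str_detect(column, pattern), na.rm = TRUE)
}

# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

if (column_contains(hiring$language, '^Java')) {
  result <- 'The string you specified appears in the column.'
} else {
  result <- 'The string does not exist in the column.'
}
print(result)
```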

Additional Questions

Can I use the grep() function to find if a column contains a string?

You can use grep() or grepl() to perform either case-sensitive or case-insensitive searches in R DataFrame rows or columns. grep() returns the positions of the matching elements, while grepl() returns a logical vector. Pass the ignore.case = TRUE argument to perform a case-insensitive search.

my_matches <- grep('Java', hiring$language, ignore.case = TRUE)
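The difference between the two functions, and the effect of ignore.case, can be illustrated with a short base R sketch. It searches the language column for the lowercase string java, which is chosen here only to make the case handling visible.

```r
# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

# grep() returns the positions of the matching elements
match_positions <- grep('java', hiring$language, ignore.case = TRUE)

# grepl() returns one TRUE/FALSE value per element
match_flags <- grepl('java', hiring$language, ignore.case = TRUE)

# Without ignore.case, lowercase 'java' matches nothing in this column
case_sensitive_flags <- grepl('java', hiring$language)

print(match_positions)
print(match_flags)
print(any(case_sensitive_flags))
```

The logical vector from grepl() can be passed to any() or sum() in the same way as the result of str_detect().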

Can I extract rows containing a specific string to a DataFrame?

Yes. Use the row positions returned by grep() to subset your DataFrame and keep only the matching rows.

my_matches <- grep('Java', hiring$language, ignore.case = TRUE )
my_matching_values <- hiring[my_matches,]
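The same subset can also be written with grepl(), which produces a logical mask instead of row positions. This is a sketch of one possible variant; the search for Kotlin is a hypothetical value used only to show what happens when nothing matches.

```r
# Rebuild the sample DataFrame so this sketch is self-contained
hiring <- data.frame(
  language = c('JavaScript', 'Python', 'Java', 'R'),
  interviews = c(12, 15, 23, 25)
)

# Keep only the rows whose language matches 'Java', ignoring case
java_rows <- hiring[grepl('Java', hiring$language, ignore.case = TRUE), ]
print(java_rows)

# A pattern with no matches yields a DataFrame with zero rows
no_rows <- hiring[grepl('Kotlin', hiring$language), ]
print(nrow(no_rows))
```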
