Select Rows with Partial String Match in R (2 Examples)

 

In this article you’ll learn how to filter rows where a specific column has a partial string match in the R programming language.

Let’s do this:

 

Creation of Exemplifying Data

First, we’ll have to load some data that we can use in the examples later on. In this tutorial, we are using the iris data set:

data(iris)                                         # Example data
head(iris)                                         # Head of example data
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

The previous output of the RStudio console shows that our example data has five columns, whereby the variable Species contains the species names (stored as a factor in the iris data set). In the examples of this tutorial, we assume that we want to select rows where the variable Species partially matches the character string “virg”. Let’s do this…
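Before filtering, it can be useful to check how the Species column is stored and how many rows we expect to match. The following is a minimal sketch using only base R; the grepl function is not part of the examples in this tutorial and is used here merely as a sanity check, and the object name n_match is made up for illustration.

```r
data(iris)                                         # Example data
class(iris$Species)                                # Check how Species is stored
# [1] "factor"
n_match <- sum(grepl("virg", iris$Species))        # Count rows containing "virg" (base R check)
n_match
# [1] 50
```

So any correct partial-match filter on “virg” should return 50 rows of the iris data.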

 

Example 1: Detect Rows with Partial Match Using stringr Package

This example explains how to extract rows with a partial match using the stringr package. We first need to install and load the stringr package:

install.packages("stringr")                        # Install stringr package
library("stringr")                                 # Load stringr

Now we can subset our data with the str_detect function as shown below:

data1 <- iris[str_detect(iris$Species, "virg"), ]  # Extract matching rows with str_detect
head(data1)
#     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
# 101          6.3         3.3          6.0         2.5 virginica
# 102          5.8         2.7          5.1         1.9 virginica
# 103          7.1         3.0          5.9         2.1 virginica
# 104          6.3         2.9          5.6         1.8 virginica
# 105          6.5         3.0          5.8         2.2 virginica
# 106          7.6         3.0          6.6         2.1 virginica

As you can see, we have extracted only rows where the Species column partially matches the character string “virg”.
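Note that str_detect interprets its pattern as a regular expression by default. As a hedged sketch (assuming the stringr package is installed), the pattern can be wrapped in fixed() to match the text literally, and the negate argument returns the rows that do not match. The object names data1_fixed and data1_rest are made up for illustration.

```r
library("stringr")                                 # Load stringr
data(iris)                                         # Example data

# Literal (non-regex) match of "virg"
data1_fixed <- iris[str_detect(iris$Species, fixed("virg")), ]

# Rows that do NOT contain "virg"
data1_rest <- iris[str_detect(iris$Species, "virg", negate = TRUE), ]

nrow(data1_fixed)                                  # Number of matching rows
nrow(data1_rest)                                   # Number of remaining rows
```

For a simple pattern such as “virg”, the literal and the regular expression match select the same rows; the difference only matters once the pattern contains special characters such as a dot or a bracket.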

 

Example 2: Detect Rows with Partial Match Using data.table Package

In Example 2, I’ll show how to detect rows with a partial match using the data.table package. Again, we need to install and load the package first:

install.packages("data.table")                     # Install data.table package
library("data.table")                              # Load data.table

Now, we can use the %like% operator as follows:

data2 <- iris[iris$Species %like% "virg", ]        # Extract matching rows with %like%
head(data2)
#     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
# 101          6.3         3.3          6.0         2.5 virginica
# 102          5.8         2.7          5.1         1.9 virginica
# 103          7.1         3.0          5.9         2.1 virginica
# 104          6.3         2.9          5.6         1.8 virginica
# 105          6.5         3.0          5.8         2.2 virginica
# 106          7.6         3.0          6.6         2.1 virginica

The result is exactly the same as in Example 1, but this time we obtained it with different R code.
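The %like% operator can also be used inside a data.table object itself. The following is a minimal sketch, assuming the data.table package is installed; the object names iris_dt and data2_dt are hypothetical and not part of the examples above.

```r
library("data.table")                              # Load data.table
data(iris)                                         # Example data

iris_dt <- as.data.table(iris)                     # Convert data frame to data.table
data2_dt <- iris_dt[Species %like% "virg"]         # Filter rows in data.table syntax
nrow(data2_dt)                                     # Number of matching rows
```

Keep in mind that a data.table does not carry row names, so the filtered rows are no longer labeled 101, 102, and so on, as they are in the data frame output.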

 

Video & Further Resources

Would you like to know more about the subsetting of data frames? Then I recommend having a look at the video that I have published on my YouTube channel, in which I explain the R code of this article.

 


 


 

To summarize: This tutorial showed how to extract data frame rows based on a partial match of a character string in R. Let me know in the comments if you have any additional questions.

 
