grep & grepl R Functions (3 Examples) | Match One or Multiple Patterns in Character String

 

This tutorial explains how to search for matches of a certain character pattern in the R programming language.

The article is mainly based on the grep() and grepl() R functions. The basic R syntax and the definitions of the two functions are as follows:

 

Basic R Syntax of grep & grepl:

grep("char", x)
grepl("char", x)

 

Definitions of grep & grepl:

The grep R function searches for matches of a certain character pattern in a vector of character strings and returns the indices of the elements that yielded a match.

The grepl R function searches for matches of a certain character pattern in a vector of character strings and returns a logical vector indicating which elements of the vector contained a match.

 

In the following R programming tutorial, I’ll explain in three examples how to apply grep, grepl, and similar functions in R.

Let’s dive in!

 

Example 1: grep vs. grepl R Functions

In the examples of this tutorial, we will use the following vector of character strings:

x <- c("d", "a", "c", "abba")      # Create example character vector

Let’s apply grep…

grep("a", x)                       # Apply grep function in R
# 2 4

…and grepl to this character vector:

grepl("a", x)                      # Apply grepl function in R
# FALSE  TRUE FALSE  TRUE

As you can see based on the RStudio console output of the two functions, both functions search for matches of the input character “a” within the example vector x.

The grep R function returns the indices of vector elements that contain the character “a” (i.e. the second and the fourth element).

The grepl function, in contrast, returns a logical vector indicating whether a match was found (i.e. TRUE) or not (i.e. FALSE).

Note: Both functions consider the fourth element of our example vector (i.e. “abba”) as a match, since it contains the letter “a”.
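
Both return types are typically used to subset the original vector. The following lines are a minimal sketch of that idea, reusing the example vector x from above. The value argument of grep and the which function are part of base R; the object names idx and hit are chosen here for illustration only.

```r
x <- c("d", "a", "c", "abba")      # Example vector from above

idx <- grep("a", x)                # Integer indices of matching elements
hit <- grepl("a", x)               # Logical vector with the same length as x

x[idx]                             # Subset by index
# "a"    "abba"

x[hit]                             # Subset by logical vector
# "a"    "abba"

grep("a", x, value = TRUE)         # Return the matching elements directly
# "a"    "abba"

identical(idx, which(hit))         # grep gives the same indices as which(grepl())
# TRUE
```

Which of the two is more convenient depends on the next step: indices are handy for positions, while the logical vector always has the same length as x and can therefore be combined with other conditions.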

 

Example 2: Apply grep & grepl with Multiple Patterns

We can also use grep and grepl to check for multiple character patterns in our vector of character strings. We simply need to insert an |-operator between the patterns we want to search for.

Consider the following example for grep…

grep("a|c", x)                     # Apply grep with two patterns
# 2 3 4

…and the following example for grepl:

grepl("a|c", x)                    # Apply grepl with two patterns
# FALSE  TRUE  TRUE  TRUE

As you can see, both functions were searching for multiple patterns in the previous R code (i.e. “a” or “c”).
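
If the patterns are stored in a vector, the search string can be assembled with paste instead of being typed by hand. The sketch below assumes that the patterns contain no special regular expression characters; the names patterns and pattern_string are hypothetical. The last line shows that the | is only treated as an operator in a regular expression: with fixed = TRUE, the text “a|c” is searched for literally.

```r
x <- c("d", "a", "c", "abba")                      # Example vector from above

patterns <- c("a", "c")                            # Hypothetical vector of patterns
pattern_string <- paste(patterns, collapse = "|")  # Combine patterns with |
pattern_string
# "a|c"

grep(pattern_string, x)                            # Same result as grep("a|c", x)
# 2 3 4

grepl("a|c", x, fixed = TRUE)                      # Search for the literal text "a|c"
# FALSE FALSE FALSE FALSE
```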

 

Example 3: Similar Functions: regexpr, gregexpr & regexec

The R programming language provides several functions that are very similar to grep and grepl. Some of the most common alternatives are regexpr, gregexpr, and regexec.

Let’s have a look at the definitions of the functions (as shown in the R help documentation):

 


Figure 1: Excerpt of the R Help Documentation of regexpr, gregexpr & regexec.

 

As you can see, all of the functions are very similar to grep and grepl, but provide additional information about the matches, such as their position and length.

Let’s apply each of the functions in practice. First, we apply regexpr…

regexpr("a", x)                    # Apply regexpr function in R
 
# [1] -1  1 -1  1
# attr(,"match.length")
# [1] -1  1 -1  1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE

…then we use the gregexpr command…

gregexpr("a", x)                   # Apply gregexpr function in R
 
# [[1]]
# [1] -1
# attr(,"match.length")
# [1] -1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[2]]
# [1] 1
# attr(,"match.length")
# [1] 1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[3]]
# [1] -1
# attr(,"match.length")
# [1] -1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[4]]
# [1] 1 4
# attr(,"match.length")
# [1] 1 1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE

…and finally, we apply the regexec R function:

regexec("a", x)                    # Apply regexec function in R
 
# [[1]]
# [1] -1
# attr(,"match.length")
# [1] -1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[2]]
# [1] 1
# attr(,"match.length")
# [1] 1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[3]]
# [1] -1
# attr(,"match.length")
# [1] -1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
# 
# [[4]]
# [1] 1
# attr(,"match.length")
# [1] 1
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE

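The R code for the different functions is very similar. However, each of the functions returns different information to the RStudio console: regexpr reports the position and length of the first match in each element, gregexpr reports all matches in each element, and regexec reports the first match together with any parenthesized sub-expressions of the pattern. Depending on your specific needs, you might therefore prefer one of these alternatives to grep and grepl.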
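
One common way to work with these return values is to pass them to regmatches, a base R function that extracts the matched substrings. The following lines are a minimal sketch using the example vector x; the object names first_hits, all_hits, and group_hits as well as the pattern “a(b+)” are chosen here for illustration only.

```r
x <- c("d", "a", "c", "abba")                       # Example vector from above

first_hits <- regmatches(x, regexpr("a", x))        # First match of each matching element
first_hits
# "a" "a"

all_hits <- regmatches(x, gregexpr("a", x))         # All matches, one list element per string
lengths(all_hits)                                   # Number of matches per element
# 0 1 0 2

group_hits <- regmatches(x, regexec("a(b+)", x))    # Full match plus parenthesized part
group_hits[[4]]                                     # Result for "abba"
# "abb" "bb"
```

Note that regmatches drops non-matching elements when used with regexpr, whereas the list results of gregexpr and regexec keep one entry per element of x.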

 

Further Resources

If you need more examples for grep & grepl I can recommend the following video of Ronak Shah’s YouTube channel. The video provides additional live examples for the application of grep vs. grepl in the R programming language:

 

 


This article showed how to use grep, grepl, regexpr, gregexpr, and regexec in R. If you have any further comments or questions, let me know in the comments section below.

 


