Locate & Extract Regular Expression Match in R (2 Examples)

 

On this page you’ll learn how to locate and return a regular expression match in R.

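The tutorial will contain this information:

1) Creating Example Data
2) Example 1: Locate & Extract Regular Expression Match Using Base R
3) Example 2: Locate & Extract Regular Expression Match Using stringr Package
4) Video & Further Resources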

Here’s the step-by-step process!

 

Creating Example Data

The following data is used as the basis for this R tutorial:

my_string <- "xxxx10yyyy"                # Create example character string
my_string                                # Print example character string
# [1] "xxxx10yyyy"

As you can see based on the previous output of the RStudio console, the example data is a single character string object containing the letters x and y as well as the number 10.

Let’s create a regular expression:

my_expression <- "[0-9]+"                # Create example regular expression
my_expression                            # Print example regular expression
# [1] "[0-9]+"

Our regular expression matches any sequence of one or more consecutive digits, i.e. the numbers in a string.
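To get a feeling for what this pattern matches, here is a small sketch that tests it against a few strings with the base R function grepl. The vector test_strings and the object has_match are made up for illustration and are not part of the example data.

```r
my_expression <- "[0-9]+"                # Same pattern as above

test_strings <- c("xxxx10yyyy",          # Hypothetical test strings
                  "no digits here",
                  "7 at the start")

has_match <- grepl(my_expression,        # TRUE where at least one digit occurs
                   test_strings)
has_match
# [1]  TRUE FALSE  TRUE
```

Only the strings that contain at least one digit are flagged as matches.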

Let’s do this in R!

 

Example 1: Locate & Extract Regular Expression Match Using Base R

This example shows how to locate and extract matches of our regular expression in our character string using the functions of the basic installation of the R programming language.

Let’s first apply the regexpr function to find the location of the regular expression match in our character string:

regexpr(my_expression, my_string)        # Locate regular expression
# [1] 5
# attr(,"match.length")
# [1] 2
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE

The previous RStudio console output shows that our regular expression match starts at the fifth position of our string and has a character length of two.
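The start position and the match.length attribute are enough to cut the match out of the string by hand. The following is a minimal sketch using substr; the object names my_location, my_start, my_length, and my_end are chosen for illustration only.

```r
my_string <- "xxxx10yyyy"                # Example data from above
my_expression <- "[0-9]+"

my_location <- regexpr(my_expression, my_string)  # Locate first match
my_start <- as.integer(my_location)               # Start position (5)
my_length <- attr(my_location, "match.length")    # Number of matched characters (2)
my_end <- my_start + my_length - 1                # End position (6)

substr(my_string, my_start, my_end)      # Cut out the matched characters
# [1] "10"
```

Note that regexpr returns -1 when there is no match, so this sketch assumes that the string actually contains a match.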

We can use the regmatches function in combination with the regexpr function to extract the regular expression match:

regmatches(my_string,                    # Return regular expression
           regexpr(my_expression, my_string))
# [1] "10"

The regular expression match in our string is the number 10.

Note that our match is formatted as a character string and not as a numerical value. However, you may convert this character to numeric as explained in this tutorial.
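As a quick sketch of such a conversion, the as.numeric function can be applied to the extracted string. The object names my_match and my_number are chosen for illustration.

```r
my_string <- "xxxx10yyyy"                # Example data from above
my_expression <- "[0-9]+"

my_match <- regmatches(my_string,        # Extract match as character string
                       regexpr(my_expression, my_string))
my_number <- as.numeric(my_match)        # Convert character to numeric
my_number
# [1] 10
class(my_number)
# [1] "numeric"
```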

 

Example 2: Locate & Extract Regular Expression Match Using stringr Package

As an alternative to the Base R functions, we can also use the stringr package of the tidyverse.

We first need to install and load the stringr package in order to use its functions:

install.packages("stringr")              # Install & load stringr
library("stringr")

We can now locate the regular expression match in our string using the str_locate function…

str_locate(my_string, my_expression)     # Locate regular expression

 

Table 1: Matrix returned by str_locate, containing the start position (5) and the end position (6) of the regular expression match.

 

…and we can return the regular expression match using the str_extract function:

str_extract(my_string, my_expression)    # Return regular expression
# [1] "10"

Whether you want to use the functions of Base R or of the stringr package is a matter of taste. Both sets of functions lead to the same result.
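The comparison can be sketched as follows, assuming the stringr package is installed. The object names base_match, stringr_match, and no_digits are hypothetical. The second half of the sketch shows one case where the two approaches behave differently, namely a string without any match.

```r
library("stringr")                       # Assumes stringr is installed

my_string <- "xxxx10yyyy"                # Example data from above
my_expression <- "[0-9]+"

base_match <- regmatches(my_string,      # Base R result
                         regexpr(my_expression, my_string))
stringr_match <- str_extract(my_string,  # stringr result
                             my_expression)
identical(base_match, stringr_match)     # Compare both results
# [1] TRUE

no_digits <- "abcdef"                    # Hypothetical string without a match
regmatches(no_digits,                    # Base R drops non-matching elements
           regexpr(my_expression, no_digits))
# character(0)
str_extract(no_digits, my_expression)    # stringr returns a missing value
# [1] NA
```

If the output should have the same length as the input (for instance when filling a data frame column), the NA returned by str_extract is often easier to work with.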

 

Video & Further Resources

Do you want to learn more about regular expressions? Then I recommend watching the following video on my YouTube channel. In the video, I show the R programming syntax of this tutorial.

 

 

Furthermore, you may want to read some of the other articles on my homepage.

 

In this R tutorial you have learned how to locate and extract a regular expression match. Let me know in the comments section below if you have any further comments or questions.

 



2 Comments

  • Hey, I have tried the same method to extract the weekday from a data frame column that contains strings, but it did not work.
    Suppose you have a data.frame with 2 columns,
    where x = Personaltrainer and y = preferable_day.

    y contains data like:
    “I want to train Monday 12”
    “Preferred Tuesday at 15”
    “Saturday at 8:00”

    How can I extract the weekday and put it in a third column?

    • Hey Eva,

      Please have a look at the following example code:

      x <- c("I want to train Monday 12",
             "Preferred Tuesday at 15",
             "Saturday at 8:00")
       
      x_match <- regmatches(x, regexpr("Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday", x))
      x_match
      # [1] "Monday"   "Tuesday"  "Saturday"

      Regards,
      Joachim
