Locate & Extract Regular Expression Match in R (2 Examples)
On this page you’ll learn how to locate and return a regular expression match in R.
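The tutorial will contain this information:
- Creating Example Data
- Example 1: Locate & Extract Regular Expression Match Using Base R
- Example 2: Locate & Extract Regular Expression Match Using stringr Package
- Video & Further Resources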
Here’s the step-by-step process!
Creating Example Data
The following data is used as the basis for this R tutorial:
my_string <- "xxxx10yyyy"    # Create example character string
my_string                    # Print example character string
# [1] "xxxx10yyyy"
As you can see based on the previous output of the RStudio console, the example data is a single character string object containing the letters x and y as well as the number 10.
Let’s create a regular expression:
my_expression <- "[0-9]+"    # Create example regular expression
my_expression                # Print example regular expression
# [1] "[0-9]+"
Our regular expression matches one or more consecutive digits, i.e. the number contained in our character string.
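To get a feel for what this pattern matches, a quick check with the base R grepl function can help. The following is only a minimal sketch: the vector test_strings is a made-up example and not part of the tutorial data.

```r
# Hypothetical test strings (not part of the tutorial data)
test_strings <- c("abc", "a1", "2024", "")

# grepl returns TRUE for every element that contains at least one digit
has_digits <- grepl("[0-9]+", test_strings)
has_digits
# [1] FALSE  TRUE  TRUE FALSE
```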
Let’s do this in R!
Example 1: Locate & Extract Regular Expression Match Using Base R
This example shows how to locate and extract matches of our regular expression in our character string using the functions of the basic installation of the R programming language.
Let’s first apply the regexpr function to find the location of the regular expression match in our character string:
regexpr(my_expression, my_string)    # Locate regular expression
# [1] 5
# attr(,"match.length")
# [1] 2
# attr(,"index.type")
# [1] "chars"
# attr(,"useBytes")
# [1] TRUE
The previous RStudio console output shows that our regular expression match starts at the fifth position of our string and has a character length of two.
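The start position and the match length are enough to cut the match out of the string by hand. The sketch below assumes the same example objects as above; the names match_info, match_start, match_length and manual_match are chosen only for illustration.

```r
my_string <- "xxxx10yyyy"
my_expression <- "[0-9]+"

# Locate the match and pull out its start position and length
match_info <- regexpr(my_expression, my_string)
match_start <- as.integer(match_info)               # 5
match_length <- attr(match_info, "match.length")    # 2

# Extract the characters from the start to the end of the match
manual_match <- substr(my_string, match_start, match_start + match_length - 1)
manual_match
# [1] "10"
```

This is essentially the bookkeeping that a dedicated extraction function takes care of.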
We can use the regmatches function in combination with the regexpr function to extract the regular expression match:
regmatches(my_string,                           # Return regular expression
           regexpr(my_expression, my_string))
# [1] "10"
The regular expression match in our string is the number 10.
Note that our match is formatted as a character string and not as a numerical value. However, you may convert this character to numeric as explained in this tutorial.
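As a minimal sketch of such a conversion, the base R as.numeric function can be applied to the extracted match. The object names my_match and my_number are assumptions made for this illustration.

```r
my_string <- "xxxx10yyyy"

# Extract the match as a character string
my_match <- regmatches(my_string, regexpr("[0-9]+", my_string))

# Convert the character string to a numeric value
my_number <- as.numeric(my_match)
my_number
# [1] 10
class(my_number)
# [1] "numeric"
```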
Example 2: Locate & Extract Regular Expression Match Using stringr Package
As an alternative to the base R functions, we can use the stringr package, which is part of the tidyverse.
We first need to install and load the stringr package, in order to use the functions that are included in the package.
install.packages("stringr")    # Install & load stringr
library("stringr")
We can now locate the regular expression match in our string using the str_locate function…
str_locate(my_string, my_expression)    # Locate regular expression
#      start end
# [1,]     5   6
…and we can return the regular expression match using the str_extract function:
str_extract(my_string, my_expression)    # Return regular expression
# [1] "10"
Whether you want to use the functions of Base R or of the stringr package is a matter of taste. Both sets of functions lead to the same result.
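The agreement between the two approaches can be checked directly for our example string. The following sketch assumes that stringr may or may not be installed, so the stringr part is wrapped in a requireNamespace check; the names base_match and stringr_match are illustrative.

```r
my_string <- "xxxx10yyyy"
my_expression <- "[0-9]+"

# Base R extraction
base_match <- regmatches(my_string, regexpr(my_expression, my_string))

# Only run the stringr part if the package is installed
if (requireNamespace("stringr", quietly = TRUE)) {
  stringr_match <- stringr::str_extract(my_string, my_expression)
  print(base_match == stringr_match)
  # [1] TRUE
}
```

One difference worth keeping in mind: if a string contains no match at all, regmatches combined with regexpr returns an empty character vector, whereas str_extract returns NA.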
Video & Further Resources
Do you want to learn more about regular expressions? Then I recommend watching the following video on my YouTube channel. In the video, I show the R programming syntax of this tutorial.
Furthermore, you may want to read some of the other articles on my homepage.
- Match One or Multiple Patterns in Character String
- str_match & str_match_all Functions
- Match Wildcard Pattern and Character String
- Select Rows with Partial String Match
- R Programming Tutorials
In this R tutorial you have learned how to locate and extract a regular expression match. Let me know in the comments section below, in case you have any additional comments and/or questions.
2 Comments
Hey, I have tried the same method to extract the weekday from a column in a data frame that contains strings, but it did not work.
Suppose you have a data.frame with 2 columns,
where x = Personaltrainer and y = preferable_day.
y contains data like:
“I want to train Monday 12”
“Preferred Tuesday at 15”
“Saturday at 8:00”
How can I extract the weekday and put it in a third column?
Hey Eva,
Please have a look at the following example code:
Regards,
Joachim
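For the weekday question in the comment above, one possible approach in base R is sketched below. It is an illustration only and not the code from the reply: the data frame trainer_data, its contents, the pattern weekday_pattern and the new column name weekday are all made up for this example, and it assumes that the weekdays are written out in full in English.

```r
# Hypothetical example data, modelled on the question above
trainer_data <- data.frame(
  x = c("Anna", "Ben", "Cleo", "Dan"),
  y = c("I want to train Monday 12",
        "Preferred Tuesday at 15",
        "Saturday at 8:00",
        "No preference"),
  stringsAsFactors = FALSE
)

# One alternative per weekday, joined with the regex "or" operator
weekday_pattern <- paste(c("Monday", "Tuesday", "Wednesday", "Thursday",
                           "Friday", "Saturday", "Sunday"), collapse = "|")

# Locate the first weekday in each row (-1 means no match)
m <- regexpr(weekday_pattern, trainer_data$y, ignore.case = TRUE)

# regmatches drops rows without a match, so fill only the matching rows
trainer_data$weekday <- NA_character_
trainer_data$weekday[m > 0] <- regmatches(trainer_data$y, m)

trainer_data$weekday
# [1] "Monday"   "Tuesday"  "Saturday" NA
```

Rows without a recognizable weekday keep NA in the new column, which avoids the length mismatch that occurs when the output of regmatches is assigned to a column directly.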