Extract Characters Between Parentheses in R (2 Examples)
This tutorial shows how to return all characters inside parentheses within a character string in the R programming language.
Here’s how to do it:
Introduction of Example Data
The first step is to create some example data:
x <- "aa (bb)cde(f) (ggg) hi"    # Create example character string
x                                # Print example character string
# [1] "aa (bb)cde(f) (ggg) hi"
The previous RStudio console output shows the structure of our example data: It’s a character string containing different letters. Some of these letters are between parentheses.
Example 1: Return All Characters Inside Parentheses Using gsub(), regmatches() & gregexpr()
In Example 1, I’ll demonstrate how to find all characters within parentheses using the basic installation of the R programming language.
For this, we can apply the gsub, regmatches, and gregexpr functions as illustrated in the following R syntax:
x_extract1 <- gsub("[\\(\\)]",             # Extract characters within parentheses
                   "",
                   regmatches(x, gregexpr("\\(.*?\\)", x))[[1]])
x_extract1                                 # Return characters within parentheses
# [1] "bb"  "f"   "ggg"
As you can see, the previous R code has returned a vector containing three elements. Each of these elements corresponds to the characters within one pair of parentheses in our input string.
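To make the nested call in Example 1 easier to follow, the same steps can be written out one at a time. This is only a sketch of the code above; the intermediate object names pos, with_parens, and without_parens are hypothetical and do not appear in the original example.

```r
x <- "aa (bb)cde(f) (ggg) hi"                  # Example string from this tutorial

pos <- gregexpr("\\(.*?\\)", x)                # Locate every non-greedy "(...)" match
with_parens <- regmatches(x, pos)[[1]]         # Pull out the matches, parentheses included
with_parens
# [1] "(bb)"  "(f)"   "(ggg)"

without_parens <- gsub("[\\(\\)]", "", with_parens)  # Strip "(" and ")" from each match
without_parens
# [1] "bb"  "f"   "ggg"
```

The [[1]] is needed because regmatches returns a list with one element per input string. For a vector of several strings, you would keep the whole list instead.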
Example 2: Return All Characters Inside Parentheses Using gsub() & str_extract_all() of stringr Package
In Example 2, I’ll explain how to use the stringr package to extract characters between parentheses.
First, we have to install and load the stringr package:
install.packages("stringr")      # Install stringr package
library("stringr")               # Load stringr package
Next, we can apply the gsub function in combination with the str_extract_all function as shown below:
x_extract2 <- gsub("\\(([^()]+)\\)",       # Extract characters within parentheses
                   "\\1",
                   str_extract_all(x, "\\(([^()]+)\\)")[[1]])
x_extract2                                 # Return characters within parentheses
# [1] "bb"  "f"   "ggg"
The RStudio console output is the same as in Example 1.
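As a possible variation that is not part of the original examples, the gsub step can be avoided by using lookarounds, so that the parentheses are never part of the match. The sketch below uses base R with perl = TRUE so that it runs without additional packages; the object name x_extract3 is hypothetical.

```r
x <- "aa (bb)cde(f) (ggg) hi"                  # Example string from this tutorial

# (?<=\\() requires a "(" directly before the match,
# (?=\\)) requires a ")" directly after it.
x_extract3 <- regmatches(x,
                         gregexpr("(?<=\\()[^()]+(?=\\))", x, perl = TRUE))[[1]]
x_extract3                                     # Return characters within parentheses
# [1] "bb"  "f"   "ggg"
```

Note one difference between the patterns: [^()]+ needs at least one character, so an empty pair "()" is skipped, whereas the .*? pattern of Example 1 would return an empty string for it.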
Video & Further Resources
If you need further info on the R code of this tutorial, I recommend having a look at the following video on the Statistics Globe YouTube channel. I explain the R syntax of this tutorial in the video.
In addition, you may want to have a look at the other tutorials on my homepage. You can find some tutorials below:
- Remove Parentheses in Character String
- Extract Numbers from Character String Vector
- Extract First or Last n Characters from String
- Extract First Entry from Character String Split
- The R Programming Language
To summarize: In this tutorial, you have learned how to return all characters inside parentheses in a string in R. Don’t hesitate to let me know in the comments if you have additional questions. Furthermore, please subscribe to my email newsletter for updates on the newest articles.
4 Comments
As far as I’m aware, there need to be two backslashes before the brackets.
Hey Nevil,
Could you specify which part of the code you mean exactly?
Thanks,
Joachim
Hi Joachim,
Thanks for putting this together! However, there should be two backslashes (i.e. \\) before each round bracket, so the code becomes:
gsub("[\\(\\)]", "", regmatches(x, gregexpr("\\(.*?\\)", x))[[1]])
Hey Cornelius,
Thanks a lot for the hint!
This is weird, I’m pretty sure this worked at the time when I created the tutorial. Anyway, I have just corrected the code.
Thanks again!
Joachim