Extract Characters Between Parentheses in R (2 Examples)

 

This tutorial shows how to return all characters inside parentheses within a character string in the R programming language.

Here’s how to do it:

 

Introduction of Example Data

The first step is to create some example data:

x <- "aa (bb)cde(f) (ggg) hi"           # Create example character string
x                                       # Print example character string
# [1] "aa (bb)cde(f) (ggg) hi"

The previous RStudio console output shows the structure of our example data: It’s a single character string containing different letters, some of which are enclosed in parentheses.

 

Example 1: Return All Characters Inside Parentheses Using gsub(), regmatches() & gregexpr()

In Example 1, I’ll demonstrate how to find all characters within parentheses using only base R, i.e. without any add-on packages.

For this, we can apply the gsub, regmatches, and gregexpr functions as illustrated in the following R syntax:

x_extract1 <- gsub("[\\(\\)]",          # Extract characters within parentheses
                   "",
                   regmatches(x,
                              gregexpr("\\(.*?\\)",
                                       x))[[1]])
x_extract1                              # Return characters within parentheses
# [1] "bb"  "f"   "ggg"

As you can see, the previous R code has returned a character vector with three elements. Each of these elements contains the characters within one pair of parentheses in our input string.
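
To make the nested call easier to follow, the following sketch splits it into its three steps. In addition, it shows an alternative based on lookarounds with perl = TRUE, which does not need the gsub step. The object names x_pos, x_with_parens, x_inner, and x_lookaround are introduced only for illustration and are not part of the example above.

```r
x <- "aa (bb)cde(f) (ggg) hi"           # Same example string as above

x_pos <- gregexpr("\\(.*?\\)", x)       # Step 1: Locate non-greedy matches of "(...)"
x_with_parens <- regmatches(x, x_pos)[[1]]  # Step 2: Extract the matched substrings
x_with_parens
# [1] "(bb)"  "(f)"   "(ggg)"

x_inner <- gsub("[\\(\\)]", "", x_with_parens)  # Step 3: Remove the parentheses
x_inner
# [1] "bb"  "f"   "ggg"

x_lookaround <- regmatches(x,           # Alternative: Lookarounds require perl = TRUE
                           gregexpr("(?<=\\()[^()]+(?=\\))",
                                    x,
                                    perl = TRUE))[[1]]
x_lookaround
# [1] "bb"  "f"   "ggg"
```

Note that every backslash is doubled because the pattern is written as an R string: "\\(" passes the regular expression \( to the regex engine, whereas a single backslash as in "\(" is rejected by R as an unrecognized escape.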

 

Example 2: Return All Characters Inside Parentheses Using gsub() & str_extract_all() of stringr Package

In Example 2, I’ll explain how to use the stringr package to extract characters between parentheses.

First, we have to install and load the stringr package:

install.packages("stringr")             # Install stringr package
library("stringr")                      # Load stringr package

Next, we can apply the gsub function in combination with the str_extract_all function as shown below:

x_extract2 <- gsub("\\(([^()]+)\\)",    # Extract characters within parentheses
                   "\\1",
                   str_extract_all(x,
                                   "\\(([^()]+)\\)")[[1]])
x_extract2                              # Return characters within parentheses
# [1] "bb"  "f"   "ggg"

The RStudio console output is the same as in Example 1.
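
Since the pattern already contains a capture group, the gsub step could arguably be skipped. The str_match_all function of the stringr package returns one matrix per input string, with the full match in the first column and the first capture group in the second column. The following is a minimal sketch, assuming the stringr package is installed; the names x_match and x_extract3 are hypothetical and merely continue the numbering of the examples above.

```r
library("stringr")                      # Load stringr package

x <- "aa (bb)cde(f) (ggg) hi"           # Same example string as above

x_match <- str_match_all(x,             # Matrix with full matches & capture groups
                         "\\(([^()]+)\\)")[[1]]
x_extract3 <- x_match[ , 2]             # Column 2 holds characters within parentheses
x_extract3
# [1] "bb"  "f"   "ggg"
```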

 

Video & Further Resources

If you need further info on the R code of this tutorial, I recommend having a look at the corresponding video on the Statistics Globe YouTube channel. I explain the R syntax of this tutorial in the video.

 

 


To summarize: In this tutorial, you have learned how to return all characters inside parentheses in a string in R. Don’t hesitate to let me know in the comments if you have additional questions.

 



4 Comments

  • As much as I’m aware, there needs to be two backslashes before the brackets.

  • Hi Joachim,

    Thanks for putting this together! However, there should be two backslashes, i.e. \\, before each parenthesis, so the code becomes:

    gsub("[\\(\\)]", "", regmatches(x, gregexpr("\\(.*?\\)", x))[[1]])

    • Hey Cornelius,

      Thanks a lot for the hint!

      This is weird. I’m pretty sure this worked when I created the tutorial. Anyway, I have just corrected the code.

      Thanks again!

      Joachim
