sub & gsub R Functions (2 Examples) | Replace One or Multiple Patterns

 

This article explains how to replace patterns in character strings in the R programming language.

The tutorial is based on the R functions sub() and gsub(). Let’s first have a look at the basic R syntax and the definitions of the two functions:

 

Basic R Syntax of sub & gsub:

sub("old", "new", x)
gsub("old", "new", x)

 

Definitions of sub & gsub:

The sub R function replaces the first match of a pattern (by default a regular expression) in a character string with new characters.

The gsub R function replaces all matches of a pattern in a character string with new characters.
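Both functions are vectorized over x, so "first match" means the first match within each element of a character vector. The following is a minimal sketch of that behavior; the vector `words` is a made-up illustration and not part of the examples of this tutorial:

```r
words <- c("banana", "apple", "cherry")   # Hypothetical example vector

first_only  <- sub("a", "A", words)       # First "a" of each element
all_matches <- gsub("a", "A", words)      # Every "a" of each element

print(first_only)
# "bAnana" "Apple"  "cherry"

print(all_matches)
# "bAnAnA" "Apple"  "cherry"
```

Elements without a match (here "cherry") are returned unchanged.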

 

In the following tutorial, I’ll explain in two examples how to apply sub and gsub in R.

All right. So first I’m going to compare the basic applications of sub vs. gsub…

 

Example 1: sub vs. gsub R Functions

Before we can apply sub and gsub, we need to create an example character string in R:

x <- "aaabbb"              # Example character string

Our example character string contains the letters a and b (each of them three times).

In our example, we are going to replace the character pattern “a” with the new character “c”. Let’s see how this looks in practice:

sub("a", "c", x)           # Apply sub function in R
# "caabbb"

The sub function replaces only the first match with our new character (i.e. the first “a” is replaced by “c”).

gsub("a", "c", x)          # Apply gsub function in R
# "cccbbb"

The gsub function, in contrast, replaces all matches with “c” (i.e. all “a” of our example character string).
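The pattern does not have to be a single character. As a small sketch (the patterns "aa" and "z" are chosen only for illustration), the code below shows that gsub works through the string from left to right without overlapping matches, and that a string without any match is returned as it is:

```r
x <- "aaabbb"                          # Same example string as above

two_chars <- gsub("aa", "c", x)        # Only one non-overlapping "aa" exists
print(two_chars)
# "cabbb"

no_match <- gsub("z", "c", x)          # "z" does not occur in x
print(no_match)
# "aaabbb"
```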

 

Example 2: Replace Multiple Patterns with sub & gsub

In Example 1, we replaced only one character pattern (i.e. “a”). However, sometimes we might want to replace multiple patterns with the same new character.

In this case, we can simply write the | operator (the “or” of regular expressions) between the different patterns that we want to match. Consider the following examples of sub…

sub("a|b", "c", x)         # sub function with multiple patterns
# "caabbb"

…and gsub:

gsub("a|b", "c", x)        # gsub function with multiple patterns
# "cccccc"

As you can see, the RStudio console output of sub didn’t change, because the first match is still the first “a” of our example character string. However, the gsub function replaces all characters with “c”, since each of the characters in our example character string matches “a” or “b”.
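The | operator also works with longer alternatives, and for single characters a character class such as [ab] gives the same result. If each pattern should get a different replacement, one simple option is to nest two gsub calls. The sketch below illustrates these variations; the string `y` is a hypothetical example that does not appear elsewhere in this tutorial:

```r
x <- "aaabbb"                                  # Same example string as above
y <- "cat and dog"                             # Hypothetical second example

class_result <- gsub("[ab]", "c", x)           # Character class instead of a|b
print(class_result)
# "cccccc"

alt_result <- gsub("cat|dog", "pet", y)        # Alternatives with several characters
print(alt_result)
# "pet and pet"

chained <- gsub("b", "y", gsub("a", "x", x))   # Different replacement per pattern
print(chained)
# "xxxyyy"
```

Note that with nested calls the outer gsub also sees the output of the inner one, so the order matters whenever a replacement contains characters that the next pattern matches.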

 

Further Resources

At this point you have learned how to replace one or several character patterns with sub and gsub in R. However, both functions provide further arguments that can be specified. I therefore encourage you to have a look at the R help documentation of sub and gsub:

 


Figure 1: Excerpt of the R Help Documentation of sub & gsub.
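Two of the arguments listed in the help documentation are ignore.case and fixed. The following sketch shows their effect, assuming a made-up example string `z`; it is meant as an illustration rather than a complete overview of all options:

```r
z <- "A.a.A"                                        # Hypothetical example string

any_case <- gsub("a", "c", z, ignore.case = TRUE)   # Match "a" and "A"
print(any_case)
# "c.c.c"

regex_dot <- gsub(".", "-", z)                      # "." is a regex wildcard
print(regex_dot)
# "-----"

literal_dot <- gsub(".", "-", z, fixed = TRUE)      # "." is matched literally
print(literal_dot)
# "A-a-A"
```

Because the pattern is a regular expression by default, fixed = TRUE is a convenient way to replace characters such as "." or "|" that otherwise have a special meaning.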

 

You may also have a look at the YouTube video of Ronak Shah, which provides further examples for sub and gsub.

 

 


In this article, I have shown you how to use the sub and gsub functions of the R programming language. If you have any further questions or comments, let me know in the comments below.

 


