Extract First or Last n Characters from String in R (3 Examples) | Get Leading & Trailing Chars
In this tutorial, I will explain how to extract n leading or trailing characters from a string in R. I will structure the article as follows:
- Creation of Example Data
- Extract the First n Characters from String (Example 1)
- Extract the Last n Characters from String (Example 2)
- Extract the Last n Characters from String with the stringr Package (Example 3)
- Further Resources for the Handling of Characters in R
Let’s move directly to the R syntax…
Create Example Character String
In the examples of this R tutorial, I will use the following character string:
x <- "this is an example" # Create example character string
Our example string contains the sentence “this is an example”. Now let’s see how we can get the first and last n characters of this example string…
Example 1: Extract First n Characters from String in R
In the first example, you will learn how to get the first n characters of a string. For this task, we can use the substr function:
substr(x, 1, 3) # Extract first three characters
# "thi"
As you can see based on the previous R code, the substr function returned thi to the RStudio console (i.e. the first three characters).
In order to extract the first n characters with the substr command, we needed to specify three values within the function:
- The character string (in our case x).
- The first character we want to keep (in our case 1).
- The last character we want to keep (in our case 3, since we extracted the first three characters).
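As a small sketch of the same idea, the number of characters can be stored in a variable, and substr can also be applied to a whole vector of strings at once. The names n_first and fruits are made up for illustration and are not part of the original example:

```r
x <- "this is an example"               # Example string from above

n_first <- 4                            # Hypothetical name: number of leading characters
first_chars <- substr(x, 1, n_first)    # Keep characters 1 to n_first
first_chars
# "this"

fruits <- c("apple", "banana")          # Hypothetical vector of strings
first_fruits <- substr(fruits, 1, 3)    # substr works element by element
first_fruits
# "app" "ban"
```

If the requested number of characters is larger than the string itself, substr simply returns the whole string.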
Looks good! So what if we want to get the last n characters of our example data? That’s what you will learn in the next two examples…
Example 2: Extract Last n Characters from String in Base R
In the second example, I will also use the substr function (as in Example 1). However, this time the code is getting a bit more complex:
n_last <- 3 # Specify number of characters to extract
substr(x, nchar(x) - n_last + 1, nchar(x)) # Extract last three characters
# "ple"
As you can see based on the previous output, we have extracted the characters ple, i.e. the last three characters of our example string.
In order to do that, we had to specify three inputs for the substr function again:
- The character string (in our case x).
- The first character we want to keep. For this part, we first had to use the nchar function to find out the length of our string (i.e. nchar(x)). Then we had to subtract the number of characters we want to extract from the length of our string (i.e. nchar(x) - n_last). Since substr needs the position of the first character to keep, we had to add the value 1 to our calculation (i.e. nchar(x) - n_last + 1).
- The last character we want to keep (in our case the last character of our string, i.e. nchar(x)).
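The calculation described in the list above can be followed step by step. The sketch below also wraps it in a small helper function; the name last_n is hypothetical and not a base R function, and the pmax call is an added safeguard (my assumption, not part of the original code) for the case that n is larger than the string length:

```r
x <- "this is an example"               # Example string from above
n_last <- 3                             # Number of trailing characters

len <- nchar(x)                         # Length of the string: 18
start <- len - n_last + 1               # Position of the first character to keep: 16
substr(x, start, len)
# "ple"

# Hypothetical helper that wraps the same calculation
last_n <- function(s, n) {
  len <- nchar(s)
  start <- pmax(len - n + 1, 1)         # Never start before the first character
  substr(s, start, len)
}

last_n(x, 7)
# "example"
last_n(c("apple", "banana"), 3)         # Works on vectors of strings, too
# "ple" "ana"
```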
Works fine, but admittedly this code is more complicated than in Example 1. For that reason, I’m going to show you an easier solution for extracting the last n characters of a string in the next example…
Example 3: Extract Last n Characters from String in R with the stringr Package
The stringr R package provides an easy way for getting the last n characters of a string. Let’s install and load the package first:
install.packages("stringr") # Install stringr package in R
library("stringr") # Load stringr package
Now we can use the str_sub function of the stringr package as follows:
str_sub(x, -3, -1) # Extract last characters with str_sub
# "ple"
The same output as before with the substr function (i.e. ple), but this time with a much simpler R syntax. To get this output, we had to specify three inputs for the str_sub function:
- The character string (in our case x).
- The first character we want to keep (in our case -3). Note: The str_sub function allows counting backward from the end of the string by putting a minus sign in front of the position.
- The last character we want to keep (in our case -1, i.e. the last character of x).
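The following sketch shows a few variations of the negative counting described above. It assumes that the stringr package is installed; the requireNamespace check is only there so that the code does not fail otherwise. Because the end argument of str_sub defaults to -1, it can be left out when extracting trailing characters:

```r
x <- "this is an example"                     # Example string from above

if (requireNamespace("stringr", quietly = TRUE)) {
  last_three <- stringr::str_sub(x, -3)       # end defaults to -1 (last character)
  print(last_three)
  # "ple"

  last_seven <- stringr::str_sub(x, -7, -1)   # Last seven characters
  print(last_seven)
  # "example"

  first_three <- stringr::str_sub(x, 1, 3)    # Positive positions work like substr
  print(first_three)
  # "thi"
}
```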
For further explanations, you might also have a look at the R documentation of the str_sub command:
Figure 1: str_sub Function Help Documentation.
Much easier than the substr base R solution, if you ask me!
Tutorial Video & Further Resources on the Handling of Strings
I have also published a video tutorial on this topic, so if you are still struggling with the code, watch the following video on my YouTube channel:
Dealing with strings in the R programming language can be tricky. If you want to improve your knowledge about the handling of strings in R, I can recommend the following video of the Tutorials Point YouTube channel. The speaker gives general advice on how to deal with strings in R.
Furthermore, you could also have a look at the following articles and overviews of this website:
- substr & substring Functions in R
- nth, first & last Functions of dplyr Package
- str_sub R Function (stringr Package)
- The nchar R Function
- parse, deparse & expression Functions in R
- The R eval Function
- Get Last Value of Vector in R
- The R Programming Language
In this tutorial, you should have learned how to extract the first and the last n characters of a string in R. However, if you have further questions, don’t hesitate to let me know in the comments below!