Extract Numbers from Character String Vector in R (2 Examples)
In this tutorial you’ll learn how to return numeric values from a vector of alphanumeric character strings in the R programming language.
The post consists of example data, two examples, and a section with further resources.
Let’s dive into it:
Example Data
As a first step, we have to construct some data that we can use in the example syntax later on:
x <- c("aaa3bbb1x12", "5AGAGAGA", "2A3k4GGG5")   # Create character vector
x                                                # Print vector to console
# "aaa3bbb1x12" "5AGAGAGA" "2A3k4GGG5"
The previous output of the RStudio console shows that our example data is a vector containing different character strings. Each of these strings is a mix of numeric values and alphabetic letters.
Example 1: Extract First Number from String Using gsub Function
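Example 1 illustrates how to return the first number of each character string by applying the gsub function. The pattern matches the entire string, captures the first run of digits in a group, and replaces the string with that group (\\1). The lazy quantifier .*? keeps the leading part as short as possible, so the first run of digits is captured rather than the last one. Have a look at the following R code:
as.numeric(gsub(".*?([0-9]+).*", "\\1", x))      # Apply gsub
# 3 5 2
As you can see based on the previous RStudio output, we have extracted a vector of three numeric values. Note that the previous R code only extracted the first number of each character string. For instance, the numbers 1 and 12 were not returned from the first character string element.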
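One caveat worth keeping in mind: if an element contains no digits at all, the gsub pattern does not match, the string is returned unchanged, and as.numeric converts it to NA with a warning. The following is a minimal sketch of one possible alternative based on the base R functions regexpr and regmatches. The vector y and the object name first_num are hypothetical and only serve as an illustration; they are not part of the example data above.

```r
# Hypothetical vector in which the second element contains no digits
y <- c("aaa3bbb1x12", "no digits here", "2A3k4GGG5")

# Position of the first run of digits in each element (-1 if there is none)
m <- regexpr("[0-9]+", y)

# Start with NA everywhere, then fill in the elements that had a match;
# regmatches() returns matches only for the elements where m is not -1
first_num <- rep(NA_real_, length(y))
first_num[m > 0] <- as.numeric(regmatches(y, m))

first_num                                        # Print first numbers
# 3 NA 2
```

This way the result always has the same length as the input, and elements without digits are marked as NA explicitly instead of triggering a coercion warning.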
Example 2: Extract All Numbers from String Using gregexpr & regmatches Functions
Example 2 explains how to return all numeric components from our character string vector using a combination of the gregexpr and regmatches functions. The following R code stores all numeric components in a list:
x_numbers <- regmatches(x, gregexpr("[[:digit:]]+", x))   # Apply gregexpr & regmatches
x_numbers                                                 # Print list with numbers
# [[1]]
# [1] "3"  "1"  "12"
#
# [[2]]
# [1] "5"
#
# [[3]]
# [1] "2" "3" "4" "5"
As you can see based on the previous output, each list element contains the numbers found in the corresponding element of our input vector. Note that these numbers are still stored as character strings. We can convert this list to a vector of numeric values as shown below:
as.numeric(unlist(x_numbers))                    # Convert characters to numeric
# 3 1 12 5 2 3 4 5
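Note that unlist discards the information about which input string each number came from. If that grouping matters, one option is to convert the list element by element instead. The sketch below repeats the example data so that it runs on its own; the object name x_numeric_list is an assumption made for illustration.

```r
x <- c("aaa3bbb1x12", "5AGAGAGA", "2A3k4GGG5")            # Example data from above
x_numbers <- regmatches(x, gregexpr("[[:digit:]]+", x))   # All digit runs per element

# Convert each list element to numeric and keep the list structure
x_numeric_list <- lapply(x_numbers, as.numeric)
names(x_numeric_list) <- x                                # Label by input string

lengths(x_numeric_list)                                   # Count of numbers per string
sapply(x_numeric_list, sum)                               # Sum of numbers per string
```

For the example data, the counts are 3, 1, and 4, and the sums are 16, 5, and 14.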
Looks good!
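The pattern [[:digit:]]+ treats every uninterrupted run of digits as a separate number, so a value such as 3.14 would be split into 3 and 14, and a minus sign would be dropped. In case your strings contain decimals or negative values, a slightly extended pattern may help. The following is only a sketch under assumptions: the vector z is hypothetical, and the pattern treats any hyphen directly in front of a digit as a minus sign, which may not be what you want for strings such as "a-3".

```r
# Hypothetical strings containing decimal and negative numbers
z <- c("price 3.14 and 2", "temp -7.5C")

# Optional minus sign, digits, and an optional decimal part
z_numbers <- regmatches(z, gregexpr("-?[0-9]+(\\.[0-9]+)?", z))

as.numeric(unlist(z_numbers))                    # Convert characters to numeric
# 3.14 2 -7.5
```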
Video & Further Resources
Have a look at the following video on my YouTube channel. In the video, I show how to extract numbers from character strings using the R programming code of this article:
Furthermore, you may want to read the related articles on this website. A selection of tutorials about the manipulation of character strings is listed below:
- sub & gsub R Functions
- Extract Substring Before or After Pattern
- Find Position of Character in String
- The R Programming Language
In this post you learned how to extract numeric values from a vector of alphanumeric character strings in the R programming language. In case you have further questions, don’t hesitate to let me know in the comments section below.