Modify Numbers with Comma as Thousand Separator in R (2 Examples)
This tutorial illustrates how to convert numbers that use a comma as thousand separator to numeric data in the R programming language.
Let’s just jump right in…
Example 1: Convert Vector with Comma as Thousand Separator
In this example, I’ll explain how to convert a character vector containing numbers with comma separators to a numeric vector.
First, we have to create an example vector in R:
my_vec <- c("1234,560", "5,555", "10,101")    # Create example vector
my_vec                                        # Print example vector
# [1] "1234,560" "5,555" "10,101"
As the previous output of the RStudio console shows, our example vector consists of three character elements. Each element is a number with a comma as thousand separator.
Now, we can use the gsub and as.numeric functions to remove the commas and convert our data object to numeric:
my_vec_updated <- as.numeric(gsub(",", "", my_vec))    # Convert example vector
my_vec_updated                                         # Print updated example vector
# [1] 1234560 5555 10101
We have created a new vector object called my_vec_updated. This vector object contains our numbers in numeric data format.
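As a quick check, the following minimal sketch repeats the conversion and compares the classes before and after. The fixed = TRUE argument is an addition that is not part of the example above; it tells gsub to treat the comma as a literal string instead of a regular expression, which gives the same result for this vector.

```r
my_vec <- c("1234,560", "5,555", "10,101")    # Same example vector as above

# Remove all commas (matched literally), then convert to numeric
my_vec_updated <- as.numeric(gsub(",", "", my_vec, fixed = TRUE))

class(my_vec)                                 # Class before the conversion
# [1] "character"

class(my_vec_updated)                         # Class after the conversion
# [1] "numeric"
```

Keep in mind that as.numeric returns NA (with a warning) for any element that still contains non-numeric characters after the commas are removed.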
Example 2: Convert Data Frame Columns with Comma as Thousand Separator
In the previous example, I explained how to change the format of a vector object. In this example, I’ll explain how to adjust all incorrectly formatted columns of a data frame.
First, we have to create some example data:
my_data <- data.frame(x1 = c("6,131", "7,835", "2,222"),    # Create example data
                      x2 = c("5,999", "3,136", "1,501"),
                      x3 = 1001:1003)
my_data                                                     # Print example data
Table 1 illustrates our example data frame. As you can see, the first two columns x1 and x2 contain numbers with commas as thousand separators. The variable x3 is already formatted the correct way.
As the next step, we have to define the columns we want to change (i.e. x1 and x2):
col_conv <- c("x1", "x2") # Define variables to convert
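If many columns are affected, typing the names by hand can become tedious. As a hedged alternative, the sketch below selects all character columns automatically. It assumes that every character column of the data frame holds comma-formatted numbers, which is true for this example data but may not hold for your own data. The object name col_conv_auto is introduced here for illustration only, and stringsAsFactors = FALSE is set explicitly so that the strings stay characters in R versions before 4.0 as well.

```r
my_data <- data.frame(x1 = c("6,131", "7,835", "2,222"),
                      x2 = c("5,999", "3,136", "1,501"),
                      x3 = 1001:1003,
                      stringsAsFactors = FALSE)    # Keep strings as character

# Names of all columns that are stored as character strings
col_conv_auto <- names(my_data)[sapply(my_data, is.character)]
col_conv_auto
# [1] "x1" "x2"
```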
Now, we can create a new data frame without comma separators using the lapply, as.numeric, as.character, and gsub functions:
my_data_updated <- my_data                                              # Duplicate data
my_data_updated[ , col_conv] <- lapply(my_data_updated[ , col_conv],    # Convert data
                                       function(x){ as.numeric(as.character(gsub(",", "", x))) })
my_data_updated                                                         # Print updated data
Table 2 shows the updated data frame that we have created with the previous R code. All commas were removed, and the converted columns x1 and x2 now have the numeric class.
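To verify the result, the following sketch wraps the conversion in a small helper function and prints the class of each column afterwards. The function name remove_commas is hypothetical and not part of base R. The sketch also uses list-style indexing (my_data_updated[col_conv] without the leading comma), an assumption-free variation that keeps working when col_conv contains only a single column name.

```r
my_data <- data.frame(x1 = c("6,131", "7,835", "2,222"),
                      x2 = c("5,999", "3,136", "1,501"),
                      x3 = 1001:1003,
                      stringsAsFactors = FALSE)

remove_commas <- function(x) {                 # Hypothetical helper function
  as.numeric(gsub(",", "", as.character(x)))   # Drop commas, then convert
}

col_conv <- c("x1", "x2")                      # Define variables to convert
my_data_updated <- my_data                     # Duplicate data
my_data_updated[col_conv] <- lapply(my_data_updated[col_conv], remove_commas)

sapply(my_data_updated, class)                 # Check class of each column
#        x1        x2        x3 
# "numeric" "numeric" "integer" 
```

Note that x3 is stored as integer because it was created with 1001:1003. R treats integer as a numeric type too, so is.numeric(my_data_updated$x3) returns TRUE.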
Video, Further Resources & Summary
Would you like to learn more about data types and separators? Then I can recommend having a look at the following video on my YouTube channel. I illustrate the content of this tutorial in the video:
Summary: In this tutorial, you learned how to convert numbers with commas as thousand separators to numeric data objects in R. Let me know in the comments section if you have further questions or comments.
3 Comments
I used your code to remove commas from a data.frame, but I get an error message:
> # remove comma (‘) from numbers
> col_conv indigpop.df[ , col_conv] <- lapply(indigpop.df[ , col_conv],
+ function(x){as.numeric(as.character(gsub(",", "", x)))})
Error in .subset(x, j) : invalid subscript type 'list'
Your help is greatly appreciated!
Sorry, this is the entire code:
# remove comma (‘) from numbers
col_conv <- indigpop.df[,2:27]
indigpop.df[ , col_conv] <- lapply(indigpop.df[ , col_conv],
function(x){as.numeric(as.character(gsub(",", "", x)))})
Error in .subset(x, j) : invalid subscript type 'list'
Hey Ahmad,
Could you show what your data in indigpop.df looks like?
Regards, Joachim
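One possible cause, sketched here without knowledge of the actual data: in the code posted above, col_conv stores a data frame (indigpop.df[,2:27]) instead of column names or positions. A data frame is a list, so it cannot be used as a column subscript. The minimal sketch below uses a small made-up data frame called df_demo (hypothetical, standing in for indigpop.df) and stores the column positions instead.

```r
df_demo <- data.frame(id = 1:3,                            # Hypothetical stand-in data
                      a = c("1,000", "2,500", "3,750"),
                      b = c("10,000", "20,000", "30,000"),
                      stringsAsFactors = FALSE)

col_conv <- 2:3                                            # Column positions, not a data frame

df_demo[col_conv] <- lapply(df_demo[col_conv],             # Convert selected columns
                            function(x) as.numeric(gsub(",", "", x)))
df_demo$a
# [1] 1000 2500 3750
```

For the data in the comment, the corresponding assumption would be col_conv <- 2:27, provided that columns 2 to 27 all contain comma-formatted numbers.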