How to Convert a Character to Numeric in R
Basic R Syntax:
x_num <- as.numeric(x)
Do you need more explanations? In the following article, I’ll provide you with all important information for the conversion of character vectors to numeric in R.
Example: Convert Character to Numeric in R
Before we can dive into the transformation of a character variable to numeric, we need to create an example character vector in R. Consider the following vector:
set.seed(55555)                                               # Set seed
x <- as.character(sample(c(2, 5, 7, 8), 50, replace = TRUE))  # Example character vector
Our character vector consists of the four character values "2", "5", "7", and "8":
x                                                             # Print example vector to R console
Graphic 1: Example Character String Printed to the RStudio Console
Now we can continue with the important part: how do we convert this character vector to numeric?
No Problem:
x_num <- as.numeric(x)                                        # Convert character vector to numeric in R
x_num                                                         # Print converted x to the console
# 8 7 5 8 2 5 2 5 2 7 7 7 7...
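To see what the conversion changes, it can help to compare the classes before and after. The following is a minimal sketch with a short hypothetical vector (not the 50-element example above):

```r
x_small <- c("8", "7", "5")        # Hypothetical short character vector
class(x_small)                     # "character"

x_small_num <- as.numeric(x_small) # Convert to numeric
class(x_small_num)                 # "numeric"

total <- sum(x_small_num)          # Arithmetic works on the numeric version
total                              # 20
```

Calling sum() on the original character vector would stop with an error, which is why the conversion is needed before any calculation.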
That’s basically how to apply the as.numeric function in R. However, if you need more explanation of the conversion of data types, you might have a look at the video on my YouTube channel, in which I explain how to convert characters and factors to numeric in R.
Convert All Characters of a Data Frame to Numeric
As you have seen, converting a vector or variable with the character class to numeric is no problem. However, sometimes it makes sense to change all character columns of a data frame or matrix to numeric.
Consider the following R data.frame:
x1 <- c("5", "2", "7", "5")                     # Character
x2 <- c("77", "23", "84", "11")                 # Another character
x3 <- as.factor(c("4", "1", "1", "8"))          # Factor
x4 <- c(3, 3, 9, 7)                             # Numeric
data <- data.frame(x1, x2, x3, x4,              # Create data frame
                   stringsAsFactors = FALSE)
sapply(data, class)                             # Print classes of all columns
#          x1          x2          x3          x4
# "character" "character"    "factor"   "numeric"
Table 1: Example Data Frame with Different Variable Classes
With the following R code, you can recode all variables of a data frame to numeric, no matter which class they have:
data_num <- as.data.frame(apply(data, 2, as.numeric))  # Convert all variable types to numeric
sapply(data_num, class)                                # Print classes of all columns
#        x1        x2        x3        x4
# "numeric" "numeric" "numeric" "numeric"
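One detail worth checking is how a factor column such as x3 is handled. The apply function first turns the data frame into a character matrix, so the factor labels are converted. Calling as.numeric directly on a factor behaves differently: it returns the internal level codes. A minimal sketch, assuming the same values as in x3:

```r
f <- as.factor(c("4", "1", "1", "8"))       # Same values as x3 above

codes <- as.numeric(f)                      # Internal level codes, not the labels
codes                                       # 2 1 1 3

labels_num <- as.numeric(as.character(f))   # Convert labels via character
labels_num                                  # 4 1 1 8
```

If the numbers shown in the factor are the ones you want, going through as.character first is the safer route.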
However, in many situations it is better to convert only character columns to numeric (i.e. not column x3, since this column should be kept as a factor). You could do that with the following code in R:
char_columns <- sapply(data, is.character)             # Identify character columns
data_chars_as_num <- data                              # Replicate data
data_chars_as_num[ , char_columns] <- as.data.frame(   # Recode characters as numeric
  apply(data_chars_as_num[ , char_columns], 2, as.numeric))
sapply(data_chars_as_num, class)                       # Print classes of all columns
#        x1        x2        x3        x4
# "numeric" "numeric"  "factor" "numeric"
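Note that the apply-based code above relies on at least two character columns being selected; with a single character column, the subset is simplified to a plain vector and apply stops with an error. As an alternative sketch (not the method used in this tutorial), lapply works with any number of character columns. The data frame below is a hypothetical example with only one character column:

```r
# Hypothetical data frame with a single character column
data_one <- data.frame(x1 = c("5", "2", "7", "5"),
                       x3 = as.factor(c("4", "1", "1", "8")),
                       x4 = c(3, 3, 9, 7),
                       stringsAsFactors = FALSE)

char_columns <- sapply(data_one, is.character)   # Identify character columns

# Convert each selected column separately; other columns stay untouched
data_one[char_columns] <- lapply(data_one[char_columns], as.numeric)

classes_one <- sapply(data_one, class)           # Check the resulting classes
classes_one
#        x1        x3        x4
# "numeric"  "factor" "numeric"
```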
Example Video: How to Change Variable Types
Further examples needed? Have a look at the following R Programming tutorial of the YouTube Channel LearnR. The speaker discusses different data transformations from one data class to another.
Further Reading
- How to Convert Factor to Numeric in R
- Convert a Data Frame Column to Numeric
- Convert Factor to Character
- type.convert R Function
- The R Programming Language
26 Comments
I spent all afternoon scrolling through Stack Overflow trying to figure out how to do this. Once I found it on your site, it took 5 minutes. Thank you!!
Wow, that’s amazing feedback! Thanks, Don :)
How can I keep this from converting my rowname chars to numeric?
Hey Don,
You may convert only a subset of your data frame to numeric. Have a look here for more info.
Regards,
Joachim
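As a rough illustration of converting only a subset of columns, the sketch below selects the columns by name and leaves everything else, including the row names, as it is. The data frame, its row names, and the column selection are hypothetical:

```r
# Hypothetical data frame with character row names
data_sub <- data.frame(x1 = c("5", "2", "7"),
                       x2 = c("77", "23", "84"),
                       x3 = c("a", "b", "c"),
                       row.names = c("r1", "r2", "r3"),
                       stringsAsFactors = FALSE)

cols <- c("x1", "x2")                                   # Columns to convert
data_sub[cols] <- lapply(data_sub[cols], as.numeric)    # Convert only these columns

sapply(data_sub, class)                                 # x1 and x2 numeric, x3 still character
rownames(data_sub)                                      # Row names are unchanged
```

The same pattern works for, say, 3 columns out of 17: only the names listed in cols are touched.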
I just had 3 variables in a 17-variable data set to convert from character to numeric. This didn’t help at all I’m afraid.
Hi Larry,
Could you explain why it didn’t help? Maybe we can figure out a solution for your problem :)
Regards,
Joachim
Hi. How to stop getting the “Coerced to NA” warning?
Hi Vaibhav,
This typically happens when your characters are not representing numbers (e.g. “5!”, “1,55”, “seven”). Have you checked how your data is formatted before converting it?
Regards,
Joachim
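To find out which values trigger the warning, one option is to convert a copy of the vector and list the entries that became NA only because of the conversion. A minimal sketch, using hypothetical values like the ones mentioned in the reply above:

```r
x_raw <- c("5", "5!", "1,55", "seven", "3.2")     # Hypothetical input values

x_try <- suppressWarnings(as.numeric(x_raw))      # Convert; warning silenced on purpose

# Entries that were not NA before but are NA after the conversion
bad_values <- x_raw[is.na(x_try) & !is.na(x_raw)]
bad_values                                        # "5!" "1,55" "seven"
```

Once the problematic values are known, they can be cleaned before converting, rather than just hiding the warning.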
This comment saved my day. I spent almost 3 hours trying to sort out a similar problem with my data, and then I found this and solved it. Thank you Joachim
Hey Alexandre,
That’s great to hear! Glad it helped :)
Regards
Joachim
I have factors with levels and labels; how do I convert them to numeric
Hey Busola,
I have another article on converting factors to numeric: https://statisticsglobe.com/how-to-convert-a-factor-to-numeric-in-r/
Let me know if this solved your problem.
Regards,
Joachim
Hello Jo
I have a data frame which is all in text. Recent Kaggle competition (https://www.kaggle.com/c/kaggle-survey-2020)
While converting from Character to Numeric, I am having problem of NA Coercion
So I saw your Youtube Video and then looked up the above tutorial. Here a2 is the original data set of Kaggle imported as Header F.
However, I am still getting NA coercion. Please help
data <- data.frame(a2$V2, a2$V3, a2$V4, a2$V5, stringsAsFactors = FALSE)
sapply(data, class)
a2.V2 a2.V3 a2.V4 a2.V5
"numeric" "character" "character" "character"
data1 sapply(data_num, class)
Error in lapply(X = X, FUN = FUN, …) : object ‘data_num’ not found
> sapply(data1, class)
a2.V2 a2.V3 a2.V4 a2.V5
“numeric” “numeric” “numeric” “numeric”
> head(data1, 6)
a2.V2 a2.V3 a2.V4 a2.V5
1 NA NA NA NA
2 NA NA NA NA
3 NA NA NA NA
4 NA NA NA NA
5 NA NA NA NA
6 NA NA NA NA
>
Hey Shrinivas,
I hope you are doing fine.
Regarding your question, I would have a look at two things:
1) The NA coercion problem usually appears because the character numbers are not formatted properly. More info: https://statisticsglobe.com/warning-message-nas-introduced-by-coercion-in-r
2) It seems like you are trying to use a data frame that does not exist in your workspace. For that reason you get the error “object ‘data_num’ not found”. Please check if this data frame really exists.
I hope that helps!
Joachim
Dear Joachim,
thanks for your great videos and website. This time, however, I wasn’t able to solve my problem, as it’s more specific, but I hope you can shed more light on this:
I have a given column (a CV that I want to test in an ANCOVA) in my data set which contains numbers similar to this “-,038040659585351”, and its structure is character by default. So far, nothing has worked to convert it to numeric. The values were either changed to NAs or to a whole lot of different numbers.
Do you happen to have an idea to convert it into numeric?
Many thanks in advance!
Leo
Hey Leo,
Thanks a lot for the nice feedback! I’m happy to hear that you like my content :)
Regarding your question, I would try the following:
Step 1) Make sure that the column is a character (not a factor) as explained here: https://statisticsglobe.com/convert-factor-to-character-in-r
Step 2) Use the gsub function to replace all non-numeric symbols and letters in your data. This depends a bit on the exact structure of your data, however, you can find a detailed tutorial here: https://statisticsglobe.com/sub-gsub-r-function-example
Step 3) Convert your column to numeric as shown in this tutorial.
Let me know if it worked :)
Joachim
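The three steps above can be sketched as follows. This assumes that the comma in the values is a decimal comma; the first value is the one quoted in the question, the others are hypothetical:

```r
x_comma <- c("-,038040659585351", "1,5", "0,25")    # Character values with decimal comma

x_clean <- sub(",", ".", x_comma, fixed = TRUE)      # Replace the decimal comma by a point
x_clean

x_conv <- as.numeric(x_clean)                        # Convert cleaned strings to numeric
x_conv
```

If a column also contains thousands separators or other symbols, the pattern passed to sub or gsub has to be adapted to that format.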
Hi Joachim,
I really appreciate your examples. I am struggling to figure out how to convert a character field with the values Y or N to numeric, where Y = 1 and N = 0. Similarly, I need to understand how to change education levels, e.g. Sixth grade to 6, 9th grade to 9, High School Diploma to 12, … PhD to 22, etc.
Any help you can provide with this is much appreciated.
Thanks again.
Hey Elisa,
Thank you for the nice feedback!
Regarding your question, please follow these steps:
1) Make sure that your column has the character class: https://statisticsglobe.com/convert-factor-to-character-in-r
2) Replace your values: https://statisticsglobe.com/r-replace-value-of-data-frame-variable-dplyr-package
3) Convert your column to numeric as shown in this tutorial.
Regards,
Joachim
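As one possible way to do the replacement in base R (a sketch, not the dplyr approach from the linked tutorial), a logical comparison handles the Y/N case and a named lookup vector handles the education levels. The example values and the lookup table are assumptions based on the question:

```r
yn <- c("Y", "N", "N", "Y")                    # Hypothetical Y/N column
yn_num <- as.numeric(yn == "Y")                # TRUE/FALSE becomes 1/0
yn_num                                         # 1 0 0 1

edu <- c("Sixth grade", "PhD", "High School Diploma")   # Hypothetical education column
edu_lookup <- c("Sixth grade" = 6,             # Named vector: label -> number
                "9th grade" = 9,
                "High School Diploma" = 12,
                "PhD" = 22)
edu_num <- unname(edu_lookup[edu])             # Look up each label by name
edu_num                                        # 6 22 12
```

Labels that are missing from the lookup vector come back as NA, which makes unexpected categories easy to spot.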
Hi Joachim,
I was trying to convert some character variables into numeric variables according to your recipe. As you can see, I managed to convert them. However, when I checked the entire data set with the str() function, it was clear that the variables are still stored as character variables. They also still behave strangely when I run them in LM. How can I convert every variable containing numbers into a numeric variable in R? Please see below:
> F.BEHAV M.BEHAV OVERALL.SUCCESS i data_num sapply(data_num, class)
M.BEHAV F.BEHAV OVERALL.SUCCESS
“numeric” “numeric” “numeric”
When using the str() function:
> str(GRW_ANALYSIS)
‘data.frame’: 94 obs. of 42 variables:
$ LOCATION : chr “Bager” “Bager” “Kigyos kolut” “Pista” …
$ HABITAT : chr “MP” “MP” “SC1” “MP” …
$ HABITAT.ID : int 1 1 4 1 5 3 5 1 3 5 …
$ YEAR : int 2008 2009 2009 2009 2009 2009 2010 2010 2010 2010 …
$ NO.OF.NESTS : int 17 11 10 5 9 10 8 39 16 14 …
$ NEST.DENSITY…HA : chr “13,0769230769231” “8,46153846153846” “10” “7,142857” …
$ FIRST.FLEDGL.JULIAN : int 45 52 52 44 58 NA 89 49 NA 57 …
$ NEST.HEIGHT : chr “77,1” “123,4” “102,9” “114,848484848485” …
$ WATER.DEPTH : chr “59,9” “7,4” “0” “5,78125” …
$ PROP.MANAG : int 0 85 0 0 0 20 100 0 15 0 …
$ JULIAN.DATE.MANAG : int 0 300 0 0 0 310 290 0 295 0 …
$ PEARCH.AVAIL.10m : int 1 1 2 4 3 4 3 1 4 2 …
$ PRECIP : chr “190,6” “184” “184” “184” …
$ PRECIP.DAY : chr “6,35333333333333” “5,75” “5,75” “5,75” …
$ AVG.MAX.DAILY.PREC : chr “24,6666666666667” “18,9” “18,9” “18,9” …
$ PREC.DAYS…10mm : int 4 5 5 5 5 5 10 10 10 10 …
$ MEAN.TEMP : chr “20,57” “20,3” “20,3” “20,3” …
$ PRECIP.AUG.APR : chr “553,5” “377,9” “377,9” “377,9” …
$ MEAN.TEMP.ANN : chr “12,4” “12,3” “12,3” “12,3” …
$ WIND..6B : int 21 29 29 29 29 29 33 33 33 33 …
$ WIND..8B : int 2 4 4 4 4 4 7 7 7 7 …
$ PROP.PRED.NESTS : chr “0,125” “0” “0,2” “0” …
$ PROP.ABAND.NESTS : chr “0,25” “0,273” “0,2” “0” …
$ PROP.PARA.NESTS : chr “0,059” “0” “0” “0,4” …
$ PROP.SUCC.NESTS : chr “0,588” “0,727” “0,6” “0,6” …
$ MIN.EGGS : int 2 1 3 3 2 1 1 1 2 2 …
$ MAX.EGGS : int 7 5 5 5 5 5 4 6 5 6 …
$ MEAN.EGGS : chr “3,353” “4,09” “4” “4,2” …
$ MIN.FLEDGLING : int 1 2 3 2 2 0 4 1 0 3 …
$ MAX.FLEDGLING : int 5 4 4 5 4 0 4 5 0 4 …
$ MEAN.FLEDGLING : chr “3,1” “3,5” “2” “2,2” …
$ MEAN.EGG.LOSSES : chr “0,176” “0,909” “1” “0,8” …
$ MEAN.NESTLING.LOSSES: chr “0,882” “0,364” “1” “1” …
$ MEAN.UNHATCHED : chr “0,471” “0,273” “0” “0,2” …
$ EGG.DSR : chr “0,9892” “0,9822” “0,9659” “0,9813” …
$ NESTLING.DSR : chr “0,9636” “0,992” “0,9629” “0,97” …
$ Z.VALUE : chr “2,31” “-1,43” “0,14” “0,69” …
$ P.VALUE : chr “0,0211” “0,1528” “0,8911” “0,4851” …
$ HATCHING.RATE : chr “0,851851851851852” “0,914285714285714” “1” “0,941176470588235” …
$ M.BEHAV : chr “” “” “” “” …
$ F.BEHAV : chr “” “” “” “” …
$ OVERALL.SUCCESS : chr “0,479219917494639” “0,669300512211985” “0,418948107828922” “0,520659094344318” …
Thank you, sincerely,
Thomas
Hi Thomas,
It seems like you have stored the numeric variables in a new data.frame called data_num, but then you are running the str() function on your old data.frame GRW_ANALYSIS. You can convert the variables in GRW_ANALYSIS one-by-one using the following R code:
I hope that helps!
Joachim
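The code that belonged to this reply is not preserved here. As a rough sketch of an in-place conversion, and assuming that the commas in the str() output above are decimal commas, the selected columns could be overwritten directly in the original data frame. The small data frame below is a hypothetical stand-in that reuses a few column names and values from the output:

```r
# Hypothetical stand-in for the data frame from the question
GRW_ANALYSIS <- data.frame(LOCATION = c("Bager", "Pista"),
                           NEST.HEIGHT = c("77,1", "123,4"),
                           MEAN.EGGS = c("3,353", "4,09"),
                           stringsAsFactors = FALSE)

num_cols <- c("NEST.HEIGHT", "MEAN.EGGS")      # Columns that contain numbers

# Overwrite the columns in the same data frame: comma -> point, then as.numeric
GRW_ANALYSIS[num_cols] <- lapply(GRW_ANALYSIS[num_cols], function(v) {
  as.numeric(sub(",", ".", v, fixed = TRUE))
})

str(GRW_ANALYSIS)                              # The selected columns are now num
```

Because the result is assigned back to GRW_ANALYSIS, str() and any later model call see the converted columns. If the data come from a text file, the dec argument of read.table may be worth checking as well, since it lets R read decimal commas as numbers at import time.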
Thank you. Your page is really helpful.
Thanks a lot Suju, that’s really great to hear! :)
Hello Joe
Hope you are doing well and keeping safe.
Your website is my frequent stop for any debugging issue. Compliments on these monumental efforts.
I keep encountering the following problem in R.
It occurs whenever a data frame contains a column that is actually numeric but that R has treated as string data.
There are two ways I approach this, and both fail:
If I set stringsAsFactors = TRUE, the string column gets converted to a factor, but if I subsequently try to convert it to numeric, all elements in that column are coerced to NA.
If I set stringsAsFactors = FALSE, the string column remains as it is, but if I subsequently try to convert it to numeric, all elements in that column are coerced to NA.
Please Guide
SHRINIVAS DHARMA
Hey Shrinivas,
Nice to hear from you, I’m good and you? :)
Thanks a lot for the awesome feedback, it’s really nice to see that you are still reading my website!
Regarding your question:
Usually, it should be possible to convert your strings to numeric values using the as.numeric function. However, it seems like your strings are not formatted as proper numbers. Could you share some example values of your data?
Regards
Joachim
Good Afternoon Joachim,
I hope your day is going well. I’m having an issue converting a character column from excel data that I read in. My data column involves negative numbers and when I use the following code it does successfully convert to numeric, but the negative numbers are replaced with NA.
chars <- sapply(prob2, is.character)
as.data.frame(apply(prob2[chars],2,as.numeric))
This is the code I used. I’m new to R and could use any help. I also don’t know why I had to put a 2 before the as.numeric.
Please Help,
Joshua Fallo
Hey Joshua,
Thanks, I’m fine, and you? :)
It seems like your negative numbers are not formatted correctly. Is there maybe a space in those character strings (i.e. “- 1”)? In this case, you would have to remove this space before converting your data to numeric.
I hope that helps!
Joachim
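A minimal sketch of the space problem described above, using hypothetical values: a space between the minus sign and the digits makes as.numeric return NA, and removing all whitespace first avoids this. The second part illustrates the 2 in apply, which is the margin argument and means "apply the function to each column" (1 would mean each row):

```r
x_neg <- c("- 1", "2", " -3.5", "- 0.25")          # Hypothetical values, some with a space after the minus

x_na <- suppressWarnings(as.numeric(x_neg))        # "- 1" and "- 0.25" become NA
x_na

x_fixed <- as.numeric(gsub("[[:space:]]+", "", x_neg))   # Remove whitespace, then convert
x_fixed                                            # -1.00  2.00 -3.50 -0.25

m <- matrix(1:6, nrow = 2)                         # 2 rows, 3 columns
col_sums <- apply(m, 2, sum)                       # Margin 2: one result per column
row_sums <- apply(m, 1, sum)                       # Margin 1: one result per row
```

Data imported from Excel may contain other invisible characters as well, so it can be worth printing a few of the raw strings before deciding what to remove.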