How to Convert a Character to Numeric in R

 

Basic R Syntax:

x_num <- as.numeric(x)

 

Need more explanation? In this article, I’ll provide you with all the important information on converting character vectors to numeric in R.

 

Example: Convert Character to Numeric in R

Before we can dive into the transformation of a character variable to numeric, we need to create an example character vector in R. Consider the following vector:

set.seed(55555)                                              # Set seed
x <- as.character(sample(c(2, 5, 7, 8), 50, replace = TRUE)) # Example character vector

Our vector consists of the four character values "2", "5", "7" & "8":

x                                                            # Print example vector to R console

 


Graphic 1: Example Character String Printed to the RStudio Console

 

Now, we can continue with the important part: how do we convert this character vector to numeric?

No problem:

x_num <- as.numeric(x)                                 # Convert string to numeric in R
x_num                                                  # Print converted x to the console
# 8 7 5 8 2 5 2 5 2 7 7 7 7...

That’s basically how to apply the as.numeric function in R. If you need more explanations, the topics of this post are also covered in a video on my YouTube channel.

 

 

Convert All Characters of a Data Frame to Numeric

As you have seen, converting a vector or variable with the character class to numeric is no problem. However, sometimes it makes sense to change all character columns of a data frame or matrix to numeric.

Consider the following R data.frame:

x1 <- c("5", "2", "7", "5")                            # Character
x2 <- c("77", "23", "84", "11")                        # Another character
x3 <- as.factor(c("4", "1", "1", "8"))                 # Factor
x4 <- c(3, 3, 9, 7)                                    # Numeric
 
data <- data.frame(x1, x2, x3, x4,                     # Create data frame
                   stringsAsFactors = FALSE)
sapply(data, class)                                    # Print classes of all columns
#          x1          x2          x3          x4 
# "character" "character"    "factor"   "numeric"

 


Table 1: Example Data Frame with Different Variable Classes

 

With the following R code, you can recode all variables of a data frame to numeric, no matter which class they have:

data_num <- as.data.frame(apply(data, 2, as.numeric))  # Convert all variable types to numeric
sapply(data_num, class)                                # Print classes of all columns
#        x1        x2        x3        x4 
# "numeric" "numeric" "numeric" "numeric"
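
Note that apply() converts the data frame to a matrix before as.numeric is called on each column. As an alternative, the columns can also be converted one at a time with lapply(). The following is only a minimal sketch: the object name data_num2 is made up for this illustration, and the example data is recreated so that the code runs on its own.

```r
# Recreate the example data frame from above
data <- data.frame(x1 = c("5", "2", "7", "5"),
                   x2 = c("77", "23", "84", "11"),
                   x3 = as.factor(c("4", "1", "1", "8")),
                   x4 = c(3, 3, 9, 7),
                   stringsAsFactors = FALSE)

# Convert every column separately (data_num2 is a hypothetical name)
data_num2 <- data
data_num2[] <- lapply(data, function(col) as.numeric(as.character(col)))

sapply(data_num2, class)                               # All columns are numeric
data_num2$x3                                           # 4 1 1 8
```

The as.character() step is there for the factor column: it makes sure that the factor labels are converted rather than the internal integer codes.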

 

However, in many situations it is better to convert only character columns to numeric (i.e. not column x3, since this column should be kept as a factor). You could do that with the following code in R:

char_columns <- sapply(data, is.character)             # Identify character columns
data_chars_as_num <- data                              # Replicate data
data_chars_as_num[ , char_columns] <- as.data.frame(   # Recode characters as numeric
  apply(data_chars_as_num[ , char_columns], 2, as.numeric))
sapply(data_chars_as_num, class)                       # Print classes of all columns
#        x1        x2        x3        x4 
# "numeric" "numeric"  "factor" "numeric"
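
One thing to watch out for: data[ , char_columns] returns a plain vector instead of a data frame if only one column is a character, and apply() then stops with an error because a vector has no dimensions. A possible variant that also handles this case is sketched below with lapply(). The data frame data_one and its column names are hypothetical.

```r
# Hypothetical data frame with only ONE character column
data_one <- data.frame(id = as.factor(c("a", "b", "c")),
                       conc = c("1.5", "2", "10"),
                       value = c(3, 9, 7),
                       stringsAsFactors = FALSE)

char_columns <- sapply(data_one, is.character)         # Identify character columns
data_one_num <- data_one                               # Replicate data

# List-style indexing (no comma) always returns a data frame,
# so this works for one or many character columns
data_one_num[char_columns] <- lapply(data_one_num[char_columns], as.numeric)

sapply(data_one_num, class)                            # id stays factor, conc is numeric
```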

 

Further Example Videos & Resources: How to Change Variable Types

Are you looking for more explanations on how to convert different data types? Have a look at this video on my YouTube channel:

 

 

Further examples needed? Have a look at the following R Programming tutorial of the YouTube Channel LearnR. The speaker discusses different data transformations from one data class to another.

 

 



42 Comments

  • I spent all afternoon scrolling through Stack Overflow trying to figure out how to do this. Once I found it on your site, it took 5 minutes. Thank you!!

  • How can I keep this from converting my rowname chars to numeric?

  • I just had 3 variables in a 17-variable data set to convert from character to numeric. This didn’t help at all I’m afraid.

  • Hi. How to stop getting the “Coerced to NA” warning?

    • Hi Vaibhav,

      This typically happens when your characters are not representing numbers (e.g. “5!”, “1,55”, “seven”). Have you checked how your data is formatted before converting it?

      Regards,

      Joachim
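
The values that cause this warning can be located before the conversion. Here is a minimal sketch with a made-up vector (the object names x, x_num and bad are assumptions for this illustration):

```r
# Made-up example vector with some values that are not proper numbers
x <- c("5", "5!", "1,55", "seven", "2.5", NA)

# Convert without printing the warning
x_num <- suppressWarnings(as.numeric(x))

# TRUE for values that were not missing before, but are NA after the conversion
bad <- is.na(x_num) & !is.na(x)

x[bad]                                                 # "5!" "1,55" "seven"
```

Once the problematic values are known, they can be cleaned (or set to NA on purpose) before as.numeric is applied.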

      • Alexandre Jordao Chauque
        May 3, 2021 8:23 am

        This comment saved my day. I spent almost 3 hours trying to sort out a similar problem with my data, and then I found this and solved it. Thank you Joachim

      • Good day, my issue is that I am getting (as.numeric(dailyActivity_merged$ActivityDate)) :
        NAs introduced by coercion. I am a beginner; how do I format my data before converting it? I
        want to convert a date column, which is a character, to numeric and also change the format to yyyymmdd.

        • Hello Temitope,

          Are you sure that the class of your Activity Date column is a character? Could you please check it first with class(dailyActivity_merged$ActivityDate)? Besides, what is the reason that you would like to use it in a numeric class, not in a date class?

          Regards,
          Cansu.
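
In case a numeric yyyymmdd value is really needed, one possible route is to parse the character values as dates first and format them afterwards. This is only a sketch: the example values and the month/day/year input format are assumptions, since the actual format of the ActivityDate column is not shown in the comment.

```r
# Made-up date strings; the format month/day/year is an ASSUMPTION
activity_date <- c("4/12/2016", "4/13/2016", "5/1/2016")

# Step 1: character -> Date (the format string must match the real data)
date_parsed <- as.Date(activity_date, format = "%m/%d/%Y")

# Step 2: Date -> character in yyyymmdd layout -> numeric
date_num <- as.numeric(format(date_parsed, "%Y%m%d"))

date_num                                               # 20160412 20160413 20160501
```

If the format string does not match the data, as.Date returns NA, so it is worth checking date_parsed before moving on.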

  • I have factors with levels and labels; how do I convert them to numeric
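
For factors whose labels are numbers, a common pitfall is that as.numeric returns the internal level codes instead of the labels. The small sketch below uses a made-up factor f to show the difference:

```r
# Made-up factor with numeric labels
f <- factor(c("4", "1", "1", "8"))

as.numeric(f)                                          # 2 1 1 3 (internal level codes)
as.numeric(as.character(f))                            # 4 1 1 8 (the labels as numbers)
```

Converting to character first is therefore the safer route whenever the labels themselves are the numbers of interest.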

  • Hello Jo

    I have a data frame which is all in text. Recent Kaggle competition (https://www.kaggle.com/c/kaggle-survey-2020)

    While converting from Character to Numeric, I am having problem of NA Coercion

    So I saw your Youtube Video and then looked up the above tutorial. Here a2 is the original data set of Kaggle imported as Header F.

    However, I am still getting NA coercion. Please help

    data <- data.frame(a2$V2, a2$V3, a2$V4, a2$V5, stringsAsFactors = FALSE)
    sapply(data, class)

    a2.V2 a2.V3 a2.V4 a2.V5
    "numeric" "character" "character" "character"

    data1 sapply(data_num, class)
    Error in lapply(X = X, FUN = FUN, …) : object ‘data_num’ not found

    > sapply(data1, class)

    a2.V2 a2.V3 a2.V4 a2.V5
    “numeric” “numeric” “numeric” “numeric”

    > head(data1, 6)

    a2.V2 a2.V3 a2.V4 a2.V5
    1 NA NA NA NA
    2 NA NA NA NA
    3 NA NA NA NA
    4 NA NA NA NA
    5 NA NA NA NA
    6 NA NA NA NA
    >

    • Hey Shrinivas,

      I hope you are doing fine.

      Regarding your question, I would have a look at two things:

      1) The NA coercion problem usually appears because the character numbers are not formatted properly. More info: https://statisticsglobe.com/warning-message-nas-introduced-by-coercion-in-r

      2) It seems like you are trying to use a data frame that does not exist in your workspace. For that reason you get the error “object ‘data_num’ not found”. Please check if this data frame really exists.

      I hope that helps!

      Joachim

  • Dear Joachim,

    thanks for your great videos and website. This time, however, I wasn’t able to solve my problem as it’s more specific, but I hope you can shed some light on this:
    I have a given column (a CV that I want to test in an ANCOVA) in my data set which contains numbers similar to this “-,038040659585351”, and its structure is character by default. Now nothing has worked to convert this into numeric. It either got changed into NAs or a whole lot of different numbers.

    Do you happen to have an idea to convert it into numeric?

    Many thanks in advance!

    Leo
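
Values such as “-,038040659585351” use a comma as decimal separator, which as.numeric does not understand. A minimal sketch of one possible fix is to replace the comma with a point first (the vector x below is made up, apart from the first value, which is taken from the comment):

```r
# Character values with decimal comma
x <- c("-,038040659585351", "13,0769230769231", "10")

# Replace the decimal comma with a decimal point, then convert
x_num <- as.numeric(sub(",", ".", x, fixed = TRUE))

x_num
```

When the data comes from a file, it is often easier to declare the decimal separator during the import, e.g. with the dec = "," argument of read.table or with read.csv2.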

  • Hi Joachim,

    I really appreciate your examples. I am struggling to figure out how to convert a character field with the values Y or N to a numeric where Y = 1 and N = 0. Similarly, I need to understand how to change education levels, e.g. Sixth grade to 6, 9th grade to 9, High School Diploma to 12, … PhD to 22 etc.

    Any help you can provide with this is much appreciated.

    Thanks again.
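
Recodings like these are not a type conversion but a mapping from labels to numbers. The sketch below shows two base R options; all object names and example values are made up, and the education lookup only contains the levels mentioned in the comment.

```r
# Option 1: two categories with ifelse
answer <- c("Y", "N", "N", "Y")
answer_num <- ifelse(answer == "Y", 1, 0)
answer_num                                             # 1 0 0 1

# Option 2: several categories with a named lookup vector
education <- c("Sixth grade", "High School Diploma", "PhD", "Sixth grade")
edu_lookup <- c("Sixth grade" = 6, "High School Diploma" = 12, "PhD" = 22)
education_num <- unname(edu_lookup[education])
education_num                                          # 6 12 22 6
```

Labels that are missing from the lookup vector end up as NA, which makes unexpected categories easy to spot.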

  • Thomas Oliver Mérő
    March 26, 2021 2:44 pm

    Hi Joachim,

    I was trying to convert some character variables into numeric variables according to your recipe. As you can see, I managed to convert them. However, when I checked the entire data set with the str() function, it is clear that the variables are still stored as character variables. And they still behave strangely when I run them in LM. How can I convert every variable containing numbers into numeric variables in R? Please see below:

    > F.BEHAV M.BEHAV OVERALL.SUCCESS i data_num sapply(data_num, class)
    M.BEHAV F.BEHAV OVERALL.SUCCESS
    “numeric” “numeric” “numeric”

    When using the str() function:

    > str(GRW_ANALYSIS)
    ‘data.frame’: 94 obs. of 42 variables:
    $ LOCATION : chr “Bager” “Bager” “Kigyos kolut” “Pista” …
    $ HABITAT : chr “MP” “MP” “SC1” “MP” …
    $ HABITAT.ID : int 1 1 4 1 5 3 5 1 3 5 …
    $ YEAR : int 2008 2009 2009 2009 2009 2009 2010 2010 2010 2010 …
    $ NO.OF.NESTS : int 17 11 10 5 9 10 8 39 16 14 …
    $ NEST.DENSITY…HA : chr “13,0769230769231” “8,46153846153846” “10” “7,142857” …
    $ FIRST.FLEDGL.JULIAN : int 45 52 52 44 58 NA 89 49 NA 57 …
    $ NEST.HEIGHT : chr “77,1” “123,4” “102,9” “114,848484848485” …
    $ WATER.DEPTH : chr “59,9” “7,4” “0” “5,78125” …
    $ PROP.MANAG : int 0 85 0 0 0 20 100 0 15 0 …
    $ JULIAN.DATE.MANAG : int 0 300 0 0 0 310 290 0 295 0 …
    $ PEARCH.AVAIL.10m : int 1 1 2 4 3 4 3 1 4 2 …
    $ PRECIP : chr “190,6” “184” “184” “184” …
    $ PRECIP.DAY : chr “6,35333333333333” “5,75” “5,75” “5,75” …
    $ AVG.MAX.DAILY.PREC : chr “24,6666666666667” “18,9” “18,9” “18,9” …
    $ PREC.DAYS…10mm : int 4 5 5 5 5 5 10 10 10 10 …
    $ MEAN.TEMP : chr “20,57” “20,3” “20,3” “20,3” …
    $ PRECIP.AUG.APR : chr “553,5” “377,9” “377,9” “377,9” …
    $ MEAN.TEMP.ANN : chr “12,4” “12,3” “12,3” “12,3” …
    $ WIND..6B : int 21 29 29 29 29 29 33 33 33 33 …
    $ WIND..8B : int 2 4 4 4 4 4 7 7 7 7 …
    $ PROP.PRED.NESTS : chr “0,125” “0” “0,2” “0” …
    $ PROP.ABAND.NESTS : chr “0,25” “0,273” “0,2” “0” …
    $ PROP.PARA.NESTS : chr “0,059” “0” “0” “0,4” …
    $ PROP.SUCC.NESTS : chr “0,588” “0,727” “0,6” “0,6” …
    $ MIN.EGGS : int 2 1 3 3 2 1 1 1 2 2 …
    $ MAX.EGGS : int 7 5 5 5 5 5 4 6 5 6 …
    $ MEAN.EGGS : chr “3,353” “4,09” “4” “4,2” …
    $ MIN.FLEDGLING : int 1 2 3 2 2 0 4 1 0 3 …
    $ MAX.FLEDGLING : int 5 4 4 5 4 0 4 5 0 4 …
    $ MEAN.FLEDGLING : chr “3,1” “3,5” “2” “2,2” …
    $ MEAN.EGG.LOSSES : chr “0,176” “0,909” “1” “0,8” …
    $ MEAN.NESTLING.LOSSES: chr “0,882” “0,364” “1” “1” …
    $ MEAN.UNHATCHED : chr “0,471” “0,273” “0” “0,2” …
    $ EGG.DSR : chr “0,9892” “0,9822” “0,9659” “0,9813” …
    $ NESTLING.DSR : chr “0,9636” “0,992” “0,9629” “0,97” …
    $ Z.VALUE : chr “2,31” “-1,43” “0,14” “0,69” …
    $ P.VALUE : chr “0,0211” “0,1528” “0,8911” “0,4851” …
    $ HATCHING.RATE : chr “0,851851851851852” “0,914285714285714” “1” “0,941176470588235” …
    $ M.BEHAV : chr “” “” “” “” …
    $ F.BEHAV : chr “” “” “” “” …
    $ OVERALL.SUCCESS : chr “0,479219917494639” “0,669300512211985” “0,418948107828922” “0,520659094344318” …

    Thank you, sincerely,
    Thomas

    • Hi Thomas,

      It seems like you have stored the numeric variables in a new data.frame called data_num, but then you are running the str() function on your old data.frame GRW_ANALYSIS. You can convert the variables in GRW_ANALYSIS one-by-one using the following R code:

      GRW_ANALYSIS$M.BEHAV <- as.numeric(GRW_ANALYSIS$M.BEHAV)
      GRW_ANALYSIS$F.BEHAV <- as.numeric(GRW_ANALYSIS$F.BEHAV)
      GRW_ANALYSIS$OVERALL.SUCCESS <- as.numeric(GRW_ANALYSIS$OVERALL.SUCCESS)

      I hope that helps!

      Joachim
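
The str() output above shows a comma as decimal separator (e.g. “77,1”), and as.numeric returns NA for such values. A rough sketch of how all convertible character columns of a data frame could be handled at once is shown below. The data frame grw is a small made-up stand-in for GRW_ANALYSIS, and to_numeric_if_possible is a hypothetical helper, not a function of any package.

```r
# Small made-up stand-in for the data frame in the comment
grw <- data.frame(LOCATION = c("Bager", "Pista"),
                  NEST.HEIGHT = c("77,1", "123,4"),
                  NO.OF.NESTS = c(17L, 11L),
                  stringsAsFactors = FALSE)

# Hypothetical helper: convert a character column only if all of its
# non-empty values are valid numbers after replacing the decimal comma
to_numeric_if_possible <- function(col) {
  if (!is.character(col)) return(col)                  # Leave other classes untouched
  converted <- suppressWarnings(as.numeric(sub(",", ".", col, fixed = TRUE)))
  failed <- is.na(converted) & !is.na(col) & col != "" # Values that could not be converted
  if (any(failed)) col else converted
}

grw[] <- lapply(grw, to_numeric_if_possible)           # Apply helper to every column
sapply(grw, class)                                     # "character" "numeric" "integer"
```

Columns that contain only empty strings, such as M.BEHAV in the output above, would become all NA with this sketch.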

  • Thank you. Your page is really helpful.

  • SHRINIVAS DHARMADHIKARI
    June 13, 2021 2:53 am

    Hello Joe
    Hope you are doing well and keeping safe.
    Your website is my frequent stop for any debugging issue. Compliments on these monumental efforts.

    I keep encountering the following problem in R

    Whenever there is a column in a data frame which is actually numeric, R treats it as string data.

    There are two ways I approach this, and both fail:

    If I set stringsAsFactors = TRUE, the string column gets converted to a factor, but if I subsequently try to convert it to numeric, all elements in that column are coerced to NA.

    If I set stringsAsFactors = FALSE, the string column remains as it is, but if I subsequently try to convert it to numeric, all elements in that column are coerced to NA.

    Please Guide

    SHRINIVAS DHARMA

    • Hey Shrinivas,

      Nice to hear from you, I’m good and you? 🙂

      Thanks a lot for the awesome feedback, it’s really nice to see that you are still reading my website!

      Regarding your question:

      Usually, it should be possible to convert your strings to numeric values using the as.numeric function. However, it seems like your strings are not formatted as proper numbers. Could you share some example values of your data?

      Regards

      Joachim

  • Good Afternoon Joachim,
    I hope your day is going well. I’m having an issue converting a character column from excel data that I read in. My data column involves negative numbers and when I use the following code it does successfully convert to numeric, but the negative numbers are replaced with NA.
    chars <- sapply(prob2, is.character)
    as.data.frame(apply(prob2[chars],2,as.numeric))
    this is the code I used. I'm new to R and could use any help. I also don't know why I had to put a 2 before the as.numeric.
    Please Help,
    Joshua Fallo

    • Hey Joshua,

      Thanks, I’m fine, and you? 🙂

      It seems like your negative numbers are not formatted correctly. Is there maybe a space in those character strings (i.e. “- 1”)? In this case, you would have to remove this space before converting your data to numeric.

      I hope that helps!

      Joachim
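
If a space inside the string is indeed the cause, it can be removed before the conversion, as in the following minimal sketch with made-up values. As a side note on the second question: the 2 in apply() is the MARGIN argument and means that the function is applied to each column (1 would mean each row).

```r
# Made-up values, one of them with a space after the minus sign
x <- c("- 1", "-2.5", " 3 ", "4")

is.na(suppressWarnings(as.numeric(x[1])))              # TRUE: "- 1" cannot be converted

# Remove all whitespace characters, then convert
x_clean <- gsub("[[:space:]]", "", x)
x_num <- as.numeric(x_clean)

x_num                                                  # -1.0 -2.5  3.0  4.0
```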

  • hi Joachim,

    Thanks for the tutorial. I am having the issue at third step, will you be able to help please?
    Here are the details:

    > sapply(data2, class) # Print classes of all colums
    ID conc value
    “factor” “character” “numeric”
    > char_columns data_chars_as_num data_chars_as_num[ , char_columns] sapply(data_chars_as_num, class) # Print classes of all colums
    ID conc value
    “factor” “character” “numeric”

    Thanks.

    Regards,
    Clara

    • Hey Clara,

      Thanks for the comment! Could you please explain what your issue exactly is? Is there an unexpected output, or did you get any error messages?

      Regards,
      Joachim

  • Hi Joachim,
    I ran this:
    compare_df_cols(oct21, nov21, dec21, jan22, feb22, mar22, apr22, may22, jun22,
    jul22, sept22, return = “mismatch”)
    and got this result:
    column_name oct21 nov21 dec21 jan22 feb22 mar22 apr22 may22
    1 end_lat numeric numeric numeric numeric numeric character numeric numeric
    2 end_lng numeric numeric numeric numeric numeric character numeric numeric

    What code do I run to change the columns of “mar22” to numeric like the others? I don’t know if this is the reason why my rbind and bind_rows don’t work.

    • Hi Layefa,

      I apologize for the delayed reply. I was on a long holiday, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your syntax?

      Regards,
      Joachim

  • Dear Joachim,

    John here, Hope you are doing good!
    I need some help. I have a dataset with 3000 rows, and some of the columns have values like Simple, Medium and High. Would you be able to help me convert these values so that Simple is 1, Medium is 2 and High is 3? Then I will be able to do ggplot visualizations using the data. Thank you so much for your tutorial, it is really helping me a lot.

    Best Regards,
    John

    • Hello John,

      Sorry for the late response, we were a bit busy with some new content creation. I hope you are doing well 🙂 I am afraid that you cannot convert a string like “Simple”, etc. to a number directly via the method shown in this tutorial. What you can do is a simple data manipulation as follows.

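      data$column[data$column=="Simple"]<-1
      data$column[data$column=="Medium"]<-2
      data$column[data$column=="High"]<-3
      data$column<-as.numeric(data$column)   # The replaced values are still stored as character ("1", "2", "3"), so convert the column
      head(data)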

      For further content on data manipulation, you can check our tutorial about data manipulation!

      Regards,
      Cansu

  • Rafael Alexandrino
    December 12, 2022 7:38 pm

    Hi! Thanks a lot. I just didn’t manage to do one thing. My data just starts at the 4th row, so I wanted to transform to numeric values skipping the first 3 rows. Can I do that? If not, what is the solution?
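
One possible approach is to drop the first three rows before converting. The sketch below uses a made-up data frame raw with a single column v1; both names and all values are assumptions for this illustration.

```r
# Made-up column in which the actual numbers start in the 4th row
raw <- data.frame(v1 = c("Site A", "2021", "units: cm", "5", "2.5", "7"),
                  stringsAsFactors = FALSE)

# Remove rows 1 to 3 with a negative index, then convert the rest
values <- as.numeric(raw$v1[-(1:3)])

values                                                 # 5.0 2.5 7.0
```

If the data is read from a file, the skip argument of read.table and read.csv can be used instead to ignore the first lines during the import.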

  • Hello,
    I could change my factor to numeric, but when I load the data, the program writes NA for all numbers that have decimals; it only reads the numbers that don’t have decimals. Could someone please tell me what I can do?

  • Completely new to R so excuse my lack of understanding

    I have a treatment group with three levels. When imported, the column came up as a character, with the treatments reading “Evit000”, “Evit100” and “Evit200”. It converts to a factor fine and assigns the levels correctly.
    However, when I try to convert it to a number (0, 100 and 200, respectively), it assigns NA to all values and does not allow differentiation between them. Is there a way to do this? Further down the line, I need to multiply the dose by the feed intake, which is in a different column.
    I noticed this page only covers cases where the data is already in a numeric format and has just not been read in correctly.

    many thanks

    • Hello Amy,

      If I understand you correctly, you want to create a variable with numbers 0, 100, and 200. If so, you should apply a different data manipulation. As you have noticed, you can only convert the data type to numeric as long as your data is in a numeric format but has a factor or character data type.

      data<-data.frame(evit=c("Evit000" , "Evit000", "Evit000" , "Evit100", "Evit100", "Evit200","Evit200","Evit200" ))
      data 
      #   evit
      # 1 Evit000
      # 2 Evit000
      # 3 Evit000
      # 4 Evit100
      # 5 Evit100
      # 6 Evit200
      # 7 Evit200
      # 8 Evit200
       
      data$doses[data$evit=="Evit000"]<-0
      data$doses[data$evit=="Evit100"]<-100
      data$doses[data$evit=="Evit200"]<-200
      data
      #      evit doses
      # 1 Evit000     0
      # 2 Evit000     0
      # 3 Evit000     0
      # 4 Evit100   100
      # 5 Evit100   100
      # 6 Evit200   200
      # 7 Evit200   200
      # 8 Evit200   200

      Regards,
      Cansu
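
When the number is part of the label itself, as in “Evit100”, another option is to strip the text prefix and convert what is left. This is only a sketch and assumes that every value starts with the prefix “Evit” followed by digits:

```r
# Treatment labels as in the comment above
evit <- c("Evit000", "Evit100", "Evit200", "Evit100")

# Remove the prefix "Evit" at the start of each string, then convert
doses <- as.numeric(sub("^Evit", "", evit))

doses                                                  # 0 100 200 100
```

This avoids one assignment per treatment level, but it only works as long as the labels follow the assumed pattern.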

  • Hi,
    I am doing a senior project and I have a problem with running variables for multiple linear regression. I want to create a heat map (or compute correlations) before running the ml function. However, I need to convert all 79 variables to numeric. Do you have any advice on this? I think some variables do not make sense to convert to numeric.

    • Hello Nhi,

      I don’t know your dataset well, but if your variables are not compatible with conversion to numeric, maybe you should consider a different method to evaluate the associations. For instance, the chi-square test. See other alternatives on this page.

      Regards,
      Cansu
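
For the correlation part, it may also be enough to restrict the calculation to the columns that are already numeric instead of converting everything. A minimal sketch with a made-up data frame (all names and values are assumptions):

```r
# Made-up data frame with numeric and non-numeric columns
data <- data.frame(age = c(21, 35, 48, 52),
                   income = c(20, 41, 55, 60),
                   city = c("A", "B", "A", "C"),
                   stringsAsFactors = FALSE)

num_columns <- sapply(data, is.numeric)                # Identify numeric columns
cor_matrix <- cor(data[num_columns])                   # Correlations of numeric columns only

cor_matrix
```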

