# Convert Data Frame Column to Numeric in R (2 Examples) | Change Factor, Character & Integer

In this R tutorial, I’ll explain how to convert a data frame column to numeric in R. No matter if you need to change the class of factors, characters, or integers, this tutorial will show you how to do it.

Let’s dive right in!

## Create Example Data

First we need to create some data in R that we can use in the examples later on:

```
data <- data.frame(x1 = c(1, 5, 8, 2),       # Create example data frame
                   x2 = c(3, 2, 5, 2),
                   x3 = c(2, 7, 1, 2))
data$x1 <- as.factor(data$x1)                # First column is a factor
data$x2 <- as.character(data$x2)             # Second column is a character
data$x3 <- as.integer(data$x3)               # Third column is an integer
data                                         # Print data to RStudio console
```

You can see the structure of our example data frame in Table 1. The data contains three columns: a factor variable, a character variable, and an integer variable.

Table 1: Example Data Frame with Factor, Character & Integer Variables.

We can check the class of each column of our data frame with the sapply function:

```
sapply(data, class)                          # Get classes of all columns
#       x1          x2          x3
# "factor" "character"   "integer"
```

The data is set up, so let’s move on to the examples…

## Example 1: Convert One Variable of Data Frame to Numeric

In the first example I’m going to convert only one variable to numeric. For this task, we can use the following R code:

`data$x1 <- as.numeric(as.character(data$x1))  # Convert one variable to numeric`

Note: The previous code converts our factor variable to character first and then it converts the character to numeric. This is important in order to retain the values (i.e. the numbers) of the factor variable. You can learn more about that in this tutorial.

Now, let’s check the classes of our columns again to see how our data has changed:

```
sapply(data, class)                           # Get classes of all columns
#        x1          x2          x3
# "numeric" "character"   "integer"
```

As we wanted: The factor column was converted to numeric.
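To see why the intermediate as.character step matters, here is a minimal sketch on a small stand-alone factor. The object names `f`, `codes`, and `values` are only illustrative and are not part of the example data above. Calling as.numeric directly on a factor returns the internal level codes, not the numbers shown as labels:

```r
# Illustrative factor whose labels are numbers (levels are sorted: 1, 2, 5, 8)
f <- factor(c(1, 5, 8, 2))

codes  <- as.numeric(f)                  # Internal level codes, not the labels
values <- as.numeric(as.character(f))    # Labels converted back to numbers

codes                                    # 1 3 4 2
values                                   # 1 5 8 2
```

The value 5 is the third level of the factor, so the direct conversion returns 3 instead of 5.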

## Example 2: Change Multiple Columns to Numeric

In Example 1 we used the as.numeric and the as.character functions to modify one variable of our example data. However, when we want to change several variables to numeric at once, repeating the approach of Example 1 for every column becomes tedious (i.e. too much code to write). In this example, I’m therefore going to show you how to change as many columns as you want at the same time.

First, we need to specify which columns we want to modify. In this example, we are converting columns 2 and 3 (i.e. the character string and the integer):

`i <- c(2, 3)                                  # Specify columns you want to change`

We can now use the apply function to change columns 2 and 3 to numeric:

```
data[ , i] <- apply(data[ , i], 2,            # Specify own function within apply
                    function(x) as.numeric(as.character(x)))
```

Let’s check the classes of the variables of our data frame:

```
sapply(data, class)                           # Get classes of all columns
#        x1        x2        x3
# "numeric" "numeric" "numeric"
```

The whole data frame was converted to numeric!
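As a possible alternative, the same conversion can be sketched with lapply instead of apply. This is not the approach used in the tutorial, and the data frame `df` and the vector `cols` below are hypothetical names chosen for illustration. The apply function first coerces its input to a matrix and needs at least two selected columns when used as in Example 2, whereas lapply works column by column and also handles a single column:

```r
# Hypothetical data frame with a character and an integer column
df <- data.frame(x2 = c("3", "2", "5", "2"),
                 x3 = c(2L, 7L, 1L, 2L),
                 stringsAsFactors = FALSE)

cols <- c("x2", "x3")                         # Columns to convert (by name)

# Convert each selected column separately and write the results back
df[cols] <- lapply(df[cols], function(x) as.numeric(as.character(x)))

sapply(df, class)                             # Both columns are now "numeric"
```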

## Further Resources

Converting variable classes in R is a complex topic. If you want to learn more about the basic data types in R, I can recommend the videos of the Data Camp YouTube channel. You could also have a look at the other R tutorials on this homepage that deal with the modification of R data classes.

I hope you liked this tutorial! Let me know in the comments if you have any further questions, and of course I am also happy about general feedback.

• Julio Alfonso Chia Wong
September 15, 2019 11:35 pm

Excellent tutorial, it helped me a lot!

• Thank you very much! Nice to hear that 🙂

• Tarequzzaman
April 30, 2020 2:28 pm

data[ , i] <- apply(data[ , i], 2, # Specify own function within apply
function(x) as.numeric(as.character(x)))
What does this "2" mean, and why do we use it? Please explain.

• Hi Tarequzzaman,

Thank you for your question. The 2 within the apply function specifies that we want to use the apply function by column. You may also specify a 1 instead, to use the apply function by row.

Regards,

Joachim
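To make the second argument of apply concrete, here is a minimal sketch on a small matrix. The objects `m`, `row_sums`, and `col_sums` are hypothetical and only serve as an illustration:

```r
# Illustrative 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

row_sums <- apply(m, 1, sum)                  # 1 = apply the function to each row
col_sums <- apply(m, 2, sum)                  # 2 = apply the function to each column

row_sums                                      # 9 12
col_sums                                      # 3 7 11
```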

• That was my question too :). Many thanks, Joachim, for the clear article.

• Hi Mike,

Thank you very much for the kind comment. I’m glad that the answer helped you as well!

Best regards,
Joachim

• You saved me at the night before exam

• That’s great to hear, I hope the exam went well! 🙂

• Best tutorial on R. Please upload some more videos of this kind. Much appreciated, and best wishes.

• Thanks a lot for this awesome feedback Joshy! I’ll definitely upload more videos like that 🙂

Regards

Joachim

• I have breast cancer data from the TCGA. However, when I upload it and try to read it, the data are always read as character instead of numeric. The data set is huge, so how can I solve this? And how can I keep only the genes I am interested in and drop the others?

• Hi Ali,

I’m sorry for the delayed reply. I was on a long vacation, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your syntax?

Regards,
Joachim

• This is working, the variables are numeric now. But I still have a problem: some values are turned to NA.

• Hey Saleh,

Is the warning message “NAs Introduced by Coercion” returned?

If so, please have a look here: https://statisticsglobe.com/warning-message-nas-introduced-by-coercion-in-r

Regards

Joachim

• Thank you! that was helpful.

I used the function gsub to substitute “,” by “.” to overcome the coercion issue.

• Thanks a lot for the kind words Saleh, glad you found a solution! 🙂
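The gsub fix mentioned above might look roughly like the following sketch. The vector `x` is a hypothetical example, and the sketch assumes that the comma is used as a decimal mark. If commas are thousands separators in your data, this replacement would give wrong numbers:

```r
# Hypothetical character values that use a decimal comma
x <- c("1,5", "2,25", "10")

suppressWarnings(as.numeric(x))               # NA NA 10 (the commas cannot be parsed)

# Replace the decimal comma by a point before converting
x_num <- as.numeric(gsub(",", ".", x, fixed = TRUE))

x_num                                         # 1.50 2.25 10.00
```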

• Prateek Singh
October 1, 2021 9:17 pm

Hi Joachim,

Could you please explain what to do in a situation where we need to keep such characters? For example, a column contains values such as “1990-93” and we cannot omit the “-” there.

• Hey Prateek,

In this case, it is not possible to use the numeric class. You would have to use the character or factor class instead.

Regards

Joachim

• I uploaded some files that I found on the internet. It is the historical data of some companies, this is a school project, the project is to optimize the investment portfolio and see how the numbers of the companies develop and which of all is the best option. Sorry for writing so much; but I wanted to make it clear in context.
The columns of these files have the class “character”, which makes it difficult to do anything with them. So I took on the task of changing the class of the columns. I leave you here the code that I used. It happened that many values were deleted. And now I don’t even know how to return the file to how it was before.

• Hey JR,

I cannot see the code or data you have used. Have you maybe forgotten to include it in your comment?

Regards,
Joachim

• what if you had x1 – x2000 , and in that range you had 400 random columns you wanted to convert to numeric. Is there a way to do the conversion without having to manually enter each of the 400 columns in a vector?

• Hey Frank,

Please have a look at this tutorial. It shows how to change data types of columns automatically to the appropriate data type.

Regards,
Joachim

• Hey Frank

How do you convert X1-X2000 columns to numeric at once?

Thanks

G.

• Hey,

I’m not sure who Frank is 😉 However, this should be possible by changing i in Example 2 to 1:2000.

Regards,
Joachim
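If the columns to convert follow a naming rule, the index vector i does not have to be typed by hand. The following is only a sketch under that assumption, using a hypothetical data frame `df` and a hypothetical rule that all columns starting with "x" should be converted:

```r
# Hypothetical data frame: one ID column and three columns to convert
df <- data.frame(id = c("a", "b", "c"),
                 x1 = c("1", "2", "3"),
                 x2 = c(4L, 5L, 6L),
                 x3 = c("7.5", "8", "9"),
                 stringsAsFactors = FALSE)

i <- grep("^x", names(df))                    # Positions of all columns starting with "x"

# Convert the selected columns one by one and write them back
df[i] <- lapply(df[i], function(x) as.numeric(as.character(x)))

sapply(df, class)                             # id stays "character", x1 to x3 are "numeric"
```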

• Ana Cecilia Ramirez Licon
September 5, 2022 8:12 pm

Hello, Joachim.

I really liked this tutorial. Quick question: is there any way I can use a for loop to convert columns in a data.frame to numeric? Here is the code I have been trying to use, based on your data.frame example:

i <- (2,3) #to establish the columns I want to change.

for(i in data[,i]){
if(is.integer(data[,i])
as.numeric(as.character(data[,i]))}

I have been trying with different variations of this code but everything marks an error. If you could tell me what am I doing wrong, I would really appreciate it. Thank you and have a nice day.

• Hey Ana,

Thank you for the kind words, glad you liked the tutorial.

Yes, this is possible. Please have a look at the following example code:

```
for(i in 1:ncol(data)) {                      # Loop over all columns
  if(is.integer(data[ , i])) {                # Only modify integer columns
    data[ , i] <- as.numeric(as.character(data[ , i]))
  }
}
```

Regards,
Joachim

• Ana Cecilia Ramirez Licon
September 6, 2022 4:04 pm

Thank you so much for answering my question. You have helped me a lot. Have a great day.

• This is great to hear, glad it helped! 🙂

Regards,
Joachim

• Hi Joaquim,

thanks for this tutorial 🙂 it worked fine with my data.
one question: all values are rounded (ie: 101.2179 is now 101).
is there a way to keep the original format?

• Hello Costanza,

Sorry for the late response. The numbers shouldn’t be rounded. If you haven’t solved your problem yet, could you please share your code? Then I can check it.

Regards,
Cansu

• Hi Cansu,
Thanks for replying. I realised it had to do with the visualization of the console in R. When I downloaded the file, data maintained the decimals.
Thanks again and I hope you have a great week.
Cheers.

• Hey Constanz,

Perfect! Thank you for the information, it is good to know. Have a good one! 🙂

Regards,
Cansu
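The difference between the printed and the stored value can be illustrated with a small sketch. The object `x` is hypothetical, and the digits argument is used here only to imitate a display that shows few significant digits:

```r
x <- as.numeric("101.2179")                   # Convert a character value to numeric

print(x, digits = 3)                          # Shows 101, but only the display is shortened

format(x, nsmall = 4)                         # "101.2179", the decimals are still stored
```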

• Hello, thank you for the helpful information!
I have a dataset called “a” and variable called “cancer_num”. The variable cancer_num is indexed in column number 113.
However, when I tried to run the following two lines of code, they gave different results:
class(a$cancer_num)
class(a[, 113])
The first one returns “numeric”, and the second returns “tbl_df”.
I am sure that the index number of the cancer_num variable is correct as 113.
I tried to check with other variables as well, and they also gave different results. If I use the first syntax, they return correctly as numeric, factor, etc. However, the second syntax always returns “tbl_df”.
Any idea why they give different results?
Thank you!