# Select Only Numeric Columns from Data Frame in R (Example)

In this tutorial, I’ll explain how to subset only numeric variables from a data frame in the R programming language.

It’s time to dive into the examples.

## Creation of Exemplifying Data

In the examples of this tutorial, we will use the following data frame in R:

```r
data <- data.frame(x1 = 1:5,                         # Create example data frame
                   x2 = LETTERS[1:5],
                   x3 = 2,
                   x4 = factor(c(1, 3, 2, 2, 1)),
                   stringsAsFactors = FALSE)
data                                                 # Print example data to console
# x1 x2 x3 x4
#  1  A  2  1
#  2  B  2  3
#  3  C  2  2
#  4  D  2  2
#  5  E  2  1
```

Our example data contains four columns and five rows.

Let’s use the str() function to have a look at the variable classes of our columns:

```r
str(data)                                            # Inspect variable classes
# 'data.frame':	5 obs. of  4 variables:
#  $ x1: int  1 2 3 4 5
#  $ x2: chr  "A" "B" "C" "D" ...
#  $ x3: num  2 2 2 2 2
#  $ x4: Factor w/ 3 levels "1","2","3": 1 3 2 2 1
```

As you can see based on the output of the RStudio console, the columns x1 (integer) and x3 (double) are numeric. x2 is a character vector and x4 is a factor variable.
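If you prefer a compact overview over the full str() output, the following sketch shows one possible way to list the class of every column. It recreates the example data so that it runs on its own; the object name col_classes is only chosen for illustration.

```r
# Recreate the example data frame so that this snippet runs on its own
data <- data.frame(x1 = 1:5,
                   x2 = LETTERS[1:5],
                   x3 = 2,
                   x4 = factor(c(1, 3, 2, 2, 1)),
                   stringsAsFactors = FALSE)

col_classes <- sapply(data, class)                   # Class of each column
col_classes
#          x1          x2          x3          x4
#   "integer" "character"   "numeric"    "factor"

is.numeric(data$x1)                                  # TRUE: integers count as numeric
is.numeric(data$x4)                                  # FALSE: factors do not count as numeric
```

Note that is.numeric() returns TRUE for integer as well as double columns, but FALSE for factors, even when the factor levels look like numbers.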

Next, I’ll show you how to extract only numeric columns from our data set. Keep on reading!

## Example 1: Extract Numeric Columns from Data Frame [Base R]

In Example 1, I’ll show you how to subset numeric data with the base installation of the R programming language.

First, we need to identify all columns that are numeric. For this task, we can use a combination of the R functions unlist(), lapply(), and is.numeric():

```r
num_cols <- unlist(lapply(data, is.numeric))         # Identify numeric columns
num_cols
#    x1    x2    x3    x4
#  TRUE FALSE  TRUE FALSE
```

As you can see, the previous R code returned a logical vector illustrating which of our variables are numeric.
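As an alternative sketch, the same logical vector could be created with vapply(), which checks that every result is a single logical value. This is not the approach used in this tutorial, just one possible variation; the example data is recreated so that the snippet runs on its own.

```r
# Recreate the example data frame so that this snippet runs on its own
data <- data.frame(x1 = 1:5,
                   x2 = LETTERS[1:5],
                   x3 = 2,
                   x4 = factor(c(1, 3, 2, 2, 1)),
                   stringsAsFactors = FALSE)

num_cols <- vapply(data, is.numeric,                 # Identify numeric columns
                   FUN.VALUE = logical(1))
num_cols
#    x1    x2    x3    x4
#  TRUE FALSE  TRUE FALSE

names(data)[num_cols]                                # Names of the numeric columns
# "x1" "x3"
```

The FUN.VALUE argument makes vapply() stop with an error in case a function call returns anything other than one TRUE or FALSE, which can help to catch mistakes early.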

Now, we can use this logical vector to take a subset of our data frame:

```r
data_num <- data[ , num_cols]                        # Subset numeric columns of data
data_num                                             # Print subset to RStudio console
# x1 x3
#  1  2
#  2  2
#  3  2
#  4  2
#  5  2
```

The remaining subset only contains the numeric columns (i.e. x1 and x3). Looks good!
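One thing to keep in mind: when only a single column is numeric, subsetting with data[ , num_cols] returns a plain vector instead of a data frame. The following sketch illustrates this with a hypothetical data frame called data_one (not part of the example data of this tutorial) and shows how drop = FALSE keeps the data frame structure.

```r
# Hypothetical data frame with only one numeric column
data_one <- data.frame(x1 = 1:5,
                       x2 = LETTERS[1:5],
                       stringsAsFactors = FALSE)

num_cols <- unlist(lapply(data_one, is.numeric))     # Identify numeric columns

dropped <- data_one[ , num_cols]                     # Result is simplified to a vector
is.data.frame(dropped)
# FALSE

kept <- data_one[ , num_cols, drop = FALSE]          # Result stays a data frame
is.data.frame(kept)
# TRUE
```

List-style indexing without the comma, i.e. data_one[num_cols], also returns a data frame in this situation.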

## Example 2: Extract Numeric Columns from Data Frame [dplyr Package]

You might say the previous R code of Example 1 was a bit difficult to remember. Fortunately, the dplyr package provides a much simpler solution for the subsetting of numeric columns from a data frame.

First, we need to install and load the dplyr package in R:

```r
install.packages("dplyr")                            # Install dplyr
library("dplyr")                                     # Load dplyr
```

Now, we can use the select_if function of the dplyr package as shown below:

```r
data_num2 <- select_if(data, is.numeric)             # Subset numeric columns with dplyr
data_num2                                            # Print subset to RStudio console
# x1 x3
#  1  2
#  2  2
#  3  2
#  4  2
#  5  2
```

As you can see, the output is exactly the same, but the R syntax was much easier to apply.
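In more recent versions of dplyr (1.0.0 and later), select_if() still works but is marked as superseded in favor of select() combined with the where() helper. The following sketch assumes such a dplyr version is installed; the object name data_num3 is only chosen for illustration.

```r
library("dplyr")                                     # Load dplyr (assumes version 1.0.0 or later)

# Recreate the example data frame so that this snippet runs on its own
data <- data.frame(x1 = 1:5,
                   x2 = LETTERS[1:5],
                   x3 = 2,
                   x4 = factor(c(1, 3, 2, 2, 1)),
                   stringsAsFactors = FALSE)

data_num3 <- select(data, where(is.numeric))         # Subset numeric columns with where()
names(data_num3)                                     # Column names of the subset
# "x1" "x3"
```

The resulting subset should contain the same columns as the select_if() output shown above.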

## Video, Further Resources & Summary

Do you need further explanations on the R code of this article? Then you could have a look at the video on my YouTube channel, in which I explain the R code of this article.

In addition, I recommend reading the related posts on https://statisticsglobe.com/.

To summarize: On this page, you learned how to remove non-numeric variables from a data frame in the R programming language. In case you have additional comments and/or questions, let me know in the comments section.
