Calculate Correlation Matrix Only for Numeric Columns in R (2 Examples)
In this tutorial, I’ll explain how to apply the cor function only to numeric variables in the R programming language.
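The content is structured as follows:
- Creation of Exemplifying Data
- Example 1: Error in cor(data) : ‘x’ must be numeric
- Example 2: Applying cor() Function Only to Numeric Variables
- Video & Further Resources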
If you want to know more about these content blocks, keep reading!
Creation of Exemplifying Data
As a first step, I’ll need to create some example data:
set.seed(972634)                           # Create example data
data <- data.frame(x1 = rnorm(100),
                   x2 = runif(100),
                   x3 = letters[1:10],
                   x4 = rpois(100, 3))
head(data)                                 # Print example data
#           x1         x2 x3 x4
# 1 -0.1690348 0.99026508  a  2
# 2 -0.5750698 0.82617886  b  1
# 3  2.4263886 0.58398940  c  7
# 4  0.3364569 0.07942794  d  4
# 5  0.4823497 0.22178303  e  4
# 6  0.3244318 0.39867537  f  2
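Have a look at the previous output of the RStudio console. It shows that our example data has four columns. The variables x1, x2, and x4 are numeric, but the variable x3 is not: it is a character column in R 4.0.0 and later, and a factor in earlier versions, where data.frame converted strings to factors by default.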
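To confirm the column classes on your own R installation, a quick check along the following lines may help. This is a minimal sketch that recreates the example data from above; the object name col_classes is only an illustrative choice.

```r
set.seed(972634)                           # Recreate example data
data <- data.frame(x1 = rnorm(100),
                   x2 = runif(100),
                   x3 = letters[1:10],
                   x4 = rpois(100, 3))

col_classes <- sapply(data, class)         # Class of each column as named vector
col_classes                                # Print classes

sapply(data, is.numeric)                   # TRUE for columns that cor can handle

str(data)                                  # Compact overview of the data frame
```

Note that is.numeric returns TRUE for both double and integer columns, so the count variable x4 is treated as numeric.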
Example 1: Error in cor(data) : ‘x’ must be numeric
This example shows the problem with non-numeric variables when computing a correlation matrix in R.
Consider the following R code:
cor(data)                                  # Trying to apply cor function
# Error in cor(data) : 'x' must be numeric
As you can see, the error message “‘x’ must be numeric” was returned. The reason for this is that the cor function cannot handle non-numeric columns in the input data.
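Before fixing the problem, it can be useful to find out which columns trigger the error. The following sketch recreates the example data, lists the non-numeric columns, and catches the error with tryCatch so that a script does not stop. The object names non_numeric and result are assumptions made for illustration.

```r
set.seed(972634)                           # Recreate example data
data <- data.frame(x1 = rnorm(100),
                   x2 = runif(100),
                   x3 = letters[1:10],
                   x4 = rpois(100, 3))

non_numeric <- names(data)[!sapply(data, is.numeric)]  # Columns that are not numeric
non_numeric                                # Print names of problematic columns
# [1] "x3"

result <- tryCatch(cor(data),              # Catch the error instead of stopping
                   error = function(e) e)
inherits(result, "error")                  # Check whether cor failed
# [1] TRUE
```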
Let’s fix this!
Example 2: Applying cor() Function Only to Numeric Variables
In this example, I’ll show how to apply the cor function only to numeric columns of a data frame.
For this, we can select a subset of all numeric variables in our data using the unlist, lapply, and is.numeric functions as shown below:
cor(data[, unlist(lapply(data, is.numeric))])  # Properly apply cor
#              x1           x2        x4
# x1  1.000000000 -0.003369401 0.1028944
# x2 -0.003369401  1.000000000 0.1382353
# x4  0.102894402  0.138235274 1.0000000
As you can see, the previously shown correlation matrix contains only the numeric columns of our example data.
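To make the column selection easier to follow, the sketch below splits it into steps and compares it with two base R alternatives, sapply and Filter. It recreates the example data; the object names is_num, cor_1, cor_2, and cor_3 are illustrative assumptions, and the argument drop = FALSE is an addition that is not part of the code shown above.

```r
set.seed(972634)                           # Recreate example data
data <- data.frame(x1 = rnorm(100),
                   x2 = runif(100),
                   x3 = letters[1:10],
                   x4 = rpois(100, 3))

is_num <- unlist(lapply(data, is.numeric)) # Logical indicator for numeric columns
is_num                                     # Print indicator
#    x1    x2    x3    x4 
#  TRUE  TRUE FALSE  TRUE 

cor_1 <- cor(data[, is_num, drop = FALSE]) # Same selection, keeping a data frame

cor_2 <- cor(data[, sapply(data, is.numeric), drop = FALSE])  # sapply alternative

cor_3 <- cor(Filter(is.numeric, data))     # Filter alternative

all.equal(cor_1, cor_2)                    # Compare results
# [1] TRUE
all.equal(cor_1, cor_3)
# [1] TRUE
```

The argument drop = FALSE is a precaution: if a data frame had only a single numeric column, the subset would otherwise be simplified to a vector, which cor cannot process on its own.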
Video & Further Resources
I have recently published a video on my YouTube channel, which illustrates the R programming syntax of this tutorial. You can find the video below.
Besides the video, you may read the other posts on this website. I have released numerous articles about related topics such as extracting data, importing data, and numeric values:
- Only Import Selected Columns of Data in R
- Use apply Function Only for Specific Data Frame Columns
- Correlation Matrix in R
- Select Only Numeric Columns from Data Frame
- R Programming Examples
At this point you should have learned how to compute correlation matrices only for the numeric columns of a data frame in the R programming language. If you have any further questions, let me know in the comments.