R Error – Undefined Columns Selected when Subsetting Data Frame
In this article, I’ll illustrate how to debug the error message “undefined columns selected” in the R programming language.
Let’s jump right into the example R code.
Construction of Example Data
The following data will be used as the basis for this R tutorial:
data <- data.frame(x1 = 1:4,            # Create example data
                   x2 = letters[1:4],
                   x3 = 3)
data                                    # Print example data
#   x1 x2 x3
# 1  1  a  3
# 2  2  b  3
# 3  3  c  3
# 4  4  d  3
The previous output of the RStudio console shows that our example data consists of four rows and three columns.
Example 1: Replicating Error: Undefined Columns Selected
data[data$x1 > 2]                       # Error: undefined columns selected
# Error in `[.data.frame`(data, data$x1 > 2) : undefined columns selected
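As you can see, the previous syntax returned the error message “undefined columns selected”. The reason for this is that we didn’t specify whether we want to select rows or columns. When a data frame is indexed with a single argument and no comma, R applies the index to the columns. Our logical condition has four elements (one per row), but the data frame has only three columns, so the fourth element (TRUE) refers to a column that does not exist.
In the R programming language, we can state whether an index refers to rows or columns by placing a comma within the square brackets. More on that in the next example…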
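Before moving on, it can help to make the cause of the error visible. The following is a minimal sketch that recreates the example data, inspects the logical condition, and catches the error with tryCatch so that the script keeps running. The object names cond and result are chosen for illustration only, and the message text assumes an English R session.

```r
# Recreate the example data of this tutorial
data <- data.frame(x1 = 1:4,
                   x2 = letters[1:4],
                   x3 = 3)

cond <- data$x1 > 2                     # Logical vector with one element per row
cond                                    # Print logical condition
# [1] FALSE FALSE  TRUE  TRUE

length(cond)                            # Number of elements in the condition
# [1] 4
ncol(data)                              # Number of columns in the data frame
# [1] 3

# Without a comma, the condition is applied to the columns.
# The fourth element (TRUE) asks for a fourth column, which does not exist.
result <- tryCatch(data[cond],
                   error = function(e) conditionMessage(e))
result                                  # Print caught error message
# [1] "undefined columns selected"
```

Note that this mistake does not always produce an error. If the data frame had at least as many columns as rows, the same syntax would run and return columns instead of rows, so it is worth checking the comma whenever a subset looks unexpected.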
Example 2: Fixing Error & Properly Subsetting Data Frame
Example 2 explains how to properly extract a data frame subset without getting the error message “undefined columns selected”. Have a look at the following R syntax:
data[data$x1 > 2, ]                     # Comma after logical condition
#   x1 x2 x3
# 3  3  c  3
# 4  4  d  3
Works beautifully! The reason is that we have specified a comma after our logical condition. This tells R that the condition selects rows. If we placed a logical condition after the comma instead, we would select a subset of columns (i.e. variables).
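The column case mentioned above can be sketched as follows. This example is not part of the original tutorial code: it assumes that we want to keep only the numeric columns, and the names num_cols, data_cols, and data_both are hypothetical. The logical vector for the columns needs one element per column, which is why sapply is used here.

```r
# Recreate the example data of this tutorial
data <- data.frame(x1 = 1:4,
                   x2 = letters[1:4],
                   x3 = 3)

num_cols <- sapply(data, is.numeric)    # One logical value per column

data_cols <- data[ , num_cols]          # Condition after comma selects columns
data_cols                               # Print column subset
#   x1 x3
# 1  1  3
# 2  2  3
# 3  3  3
# 4  4  3

data_both <- data[data$x1 > 2, num_cols]  # Rows before comma, columns after
data_both                               # Print row and column subset
#   x1 x3
# 3  3  3
# 4  4  3
```

Row and column conditions can be combined in one call, as long as the row condition has one element per row and the column condition has one element per column.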
Video & Further Resources
In case you need more information on the examples of this tutorial, you may watch the following video on my YouTube channel. In the video, I explain the R programming syntax of this tutorial:
In addition, I recommend having a look at the related posts on this homepage:
- Subset Data Frame Rows by Logical Condition
- Extract Subset of Data Frame Rows Containing NA
- Select First Row of Each Group in Data Frame
- Split Data Frame Randomly
- message, warning & stop Functions in R
- The R Programming Language
In summary: In this article, you learned how to solve the “undefined columns selected” error when subsetting a data frame in R. If you have additional questions or comments, don’t hesitate to let me know in the comments below. Furthermore, please subscribe to my email newsletter in order to receive updates on the newest articles.