R Error – Undefined Columns Selected when Subsetting Data Frame


In this article, I’ll illustrate how to debug the error message “undefined columns selected” in the R programming language.

Let’s jump right to the exemplifying R code.


Construction of Example Data

The following data will be used as the basis for this R tutorial:

data <- data.frame(x1 = 1:4,    # Create example data
                   x2 = letters[1:4],
                   x3 = 3)
data                            # Print example data
#   x1 x2 x3
# 1  1  a  3
# 2  2  b  3
# 3  3  c  3
# 4  4  d  3

Have a look at the previous output of the RStudio console. It shows that our example data consists of four rows and three columns.
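
The dimensions matter for the error discussed below, so it can help to check them directly. A minimal sketch using the base R functions dim(), nrow(), ncol(), and names() on the example data (re-created here so that the snippet runs on its own; the object names n_rows and n_cols are chosen only for illustration):

```r
data <- data.frame(x1 = 1:4,    # Re-create example data
                   x2 = letters[1:4],
                   x3 = 3)

dim(data)                       # Number of rows and columns
# [1] 4 3

n_rows <- nrow(data)            # Number of rows
n_cols <- ncol(data)            # Number of columns
names(data)                     # Column names
# [1] "x1" "x2" "x3"
```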


Example 1: Replicating Error: Undefined Columns Selected

This example illustrates how and why the error message “undefined columns selected” can occur when we try to extract a subset of a data frame. Consider the following R code:

data[data$x1 > 2] # Error: undefined columns selected
# Error in `[.data.frame`(data, data$x1 > 2) : undefined columns selected

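As you can see, the previous syntax returned the error message “undefined columns selected”. The reason is that we didn’t specify whether we want to select rows or columns. Without a comma, R interprets the index as a column selection: our logical vector has four elements (one per row), but the data frame has only three columns, so the fourth element points to a column that does not exist.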
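
To make the length mismatch behind this error visible, the following sketch compares the length of the logical condition with the number of columns. The helper objects idx, two_cols, and msg are names chosen for illustration, and tryCatch() is used only so that the snippet runs to the end instead of stopping at the error:

```r
data <- data.frame(x1 = 1:4,    # Re-create example data
                   x2 = letters[1:4],
                   x3 = 3)

idx <- data$x1 > 2              # Logical vector with one element per row
length(idx)                     # 4 elements ...
ncol(data)                      # ... but only 3 columns

# Without a comma, a single index is matched against the columns
two_cols <- data[c(TRUE, FALSE, TRUE)]
names(two_cols)                 # Columns x1 and x3 are kept
# [1] "x1" "x3"

# Capture the error message instead of stopping the script
msg <- tryCatch(data[idx],
                error = function(e) conditionMessage(e))
msg
```

A logical index of length three works without a comma because it matches the three columns; the index of length four fails because its fourth element has no column to refer to.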

In the R programming language, we specify whether to select rows or columns by placing a comma within the square brackets. More on that in the next example…


Example 2: Fixing Error & Properly Subsetting Data Frame

Example 2 explains how to properly extract a data frame subset without getting the error message “undefined columns selected”. Have a look at the following R syntax:

data[data$x1 > 2, ]             # Comma after logical condition
#   x1 x2 x3
# 3  3  c  3
# 4  4  d  3

Works beautifully! The reason is that we placed a comma after our logical condition, which tells R that the condition selects rows. If we specified an index after the comma instead, we would select a subset of columns (variables); a logical index in that position is matched against the columns, so it needs one element per column.
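
The two positions around the comma can be sketched side by side. This is only an illustration on the example data; the object names rows_only, cols_only, both, and num_cols are made up for this snippet:

```r
data <- data.frame(x1 = 1:4,    # Re-create example data
                   x2 = letters[1:4],
                   x3 = 3)

rows_only <- data[data$x1 > 2, ]               # Condition before comma: rows
cols_only <- data[ , c("x1", "x3")]            # Names after comma: columns
both      <- data[data$x1 > 2, c("x1", "x3")]  # Rows and columns at once
both
#   x1 x3
# 3  3  3
# 4  4  3

# Logical index after the comma: one element per column
num_cols <- data[ , sapply(data, is.numeric)]
names(num_cols)
# [1] "x1" "x3"
```

Note that the logical vector used for column selection is built from the columns themselves (via sapply()), so its length always equals the number of columns.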


Video & Further Resources

In case you need more information on the examples of this tutorial, you may watch the following video on my YouTube channel. In the video, I explain the R programming syntax of this tutorial:


The YouTube video will be added soon.


In summary: In this article you learned how to solve the problem of undefined columns when subsetting data in R. If you have additional questions and/or comments, don’t hesitate to let me know in the comments below. Furthermore, please subscribe to my email newsletter in order to receive updates on the newest articles.


4 Comments

  • I’m new to R and I’m not sure exactly where to place the comma in the code here:

    for (i in 1:length(factors[[1]])){
      alspac.table_ACE[,factors[[1]][i]] <- factor(alspac.table_ACE[,factors[[1]][i]])
    }

  • Hello Joachim, thank you again so much for your last piece of advice; it was very helpful. Today I’ve got another issue and would be glad if you could help me with it.
    I have a data frame containing 8 variables and would like to extract one of them and store the result in another data frame, so that I can do some basic text analysis by summarising the individual words in the PROD_NAME column.

    chr [1:264836] “Natural Chip Compny SeaSalt175g” “CCs Nacho Cheese 175g” “Smiths Crinkle Cut Chips Chicken 170g”

    I tried the following command, but each time I get the same error: “Can’t subset columns that don’t exist”. Here’s the code:

    productWords <- data.table(unlist(strsplit(unique(Transactions[, Transactions$PROD_NAME]), "")))

    Can you help please !!

    • Hey again 🙂

      I think you are looking for this code snippet:

      data1$new_col <- data2$your_col

      data1 is the name of the data frame to which you want to add a new column; new_col is the name of the new column; data2 is the data frame from which you would like to extract a column; and your_col is the name of the column that you would like to extract.


