Determine Classes of All Data Frame Columns in R (2 Examples)

 

In this article you’ll learn how to identify the data types of all variables of a data frame in the R programming language.


Let’s jump right to the exemplifying R syntax!

 

Creation of Example Data

We’ll use the following data frame as the basis for the examples below:

data <- data.frame(x1 = 1:5,                                # Create example data
                   x2 = letters[1:5],
                   x3 = factor(c("g1", "g2", "g1", "g3", "g2")))  # factor() keeps x3 a factor in R >= 4.0
data$x1 <- as.integer(data$x1)
data$x2 <- as.character(data$x2)
data                                                        # Print example data
#   x1 x2 x3
# 1  1  a g1
# 2  2  b g2
# 3  3  c g1
# 4  4  d g3
# 5  5  e g2

As you can see based on the output of the RStudio console, our example data contains five rows and three columns. Each variable stores a different type of data: whole numbers in x1, letters in x2, and group labels in x3.

Now, let’s inspect the classes of all columns of our example data frame!

 

Example 1: Get Data Type with sapply Function

Example 1 illustrates how to determine the classes of our variables with the sapply function:

sapply(data, class)                                         # Inspect classes with sapply
#        x1          x2          x3 
# "integer" "character"    "factor"

The variable x1 is an integer; the variable x2 is a character; and the variable x3 is a factor.
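
One caveat worth keeping in mind: the class function can return more than one value for a single column. An ordered factor, for instance, has the classes "ordered" and "factor". In that case, sapply returns a list instead of a named character vector. The following sketch assumes a hypothetical data frame data2 with an ordered factor column x4 (neither is part of the example data above) and collapses all classes of each column into one string:

```r
data2 <- data.frame(x1 = 1:5,                               # Hypothetical data with ordered factor
                    x4 = factor(c("lo", "hi", "lo", "mid", "hi"),
                                levels = c("lo", "mid", "hi"),
                                ordered = TRUE))
sapply(data2, class)                                        # Returns a list, because x4 has two classes
classes <- vapply(data2,                                    # Collapse classes to one string per column
                  function(x) paste(class(x), collapse = ", "),
                  character(1))
classes                                                     # Named character vector
```

Using vapply with character(1) guarantees a character vector with one entry per column, whatever classes the columns have.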

 

Example 2: Get Data Type with str Function

An alternative way to identify the column classes is the str function:

str(data)                                                   # Inspect classes with str
# 'data.frame':	5 obs. of  3 variables:
# $ x1: int  1 2 3 4 5
# $ x2: chr  "a" "b" "c" "d" ...
# $ x3: Factor w/ 3 levels "g1","g2","g3": 1 2 1 3 2

As you can see, the str function displays more than just the variable types: it also shows the dimensions of the data frame, the first values of each column, and the levels of each factor. Use str for a quick visual overview, and sapply when you only need the class names.
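
Because str prints its summary to the console, the sapply result is usually the more convenient starting point when the classes are needed in later code. The following is a minimal sketch under that assumption. The example data is recreated so that the snippet runs on its own, and the object name col_classes is only illustrative:

```r
data <- data.frame(x1 = 1:5,                                # Recreate example data
                   x2 = letters[1:5],
                   x3 = factor(c("g1", "g2", "g1", "g3", "g2")),
                   stringsAsFactors = FALSE)
col_classes <- sapply(data, class)                          # Store classes in named vector
table(col_classes)                                          # Count columns per class
names(col_classes)[col_classes == "factor"]                 # Names of all factor columns
# [1] "x3"
```

Note that this only works as shown when every column has exactly one class, which is the case for the example data.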

 

Video & Further Resources

Would you like to know more about data types in R? Then you might watch the following video which I have published on my YouTube channel. I explain the R code of this tutorial in the video:

 

 

Furthermore, you may have a look at the related tutorials of my website.

 

In this article, you learned how to find the data classes of multiple columns in the R programming language. Let me know in the comments if you have further questions. Furthermore, don’t forget to subscribe to my email newsletter to receive updates on the newest tutorials.

 



2 Comments

  • How do I make sure that each column in my data frame has been specified
    as the appropriate class?
    note, I am studying linear regression

    • Hey Manal,

      As far as I know, there is no other approach than going through your variables one by one, since the class depends very much on the structure of your specific data and the content of its variables.

      Regards,

      Joachim
