stringsAsFactors Argument of data.frame Function in R (2 Examples)

 

In this tutorial, I’ll explain how to apply the stringsAsFactors argument of the data.frame function in R programming.

Let’s start right away.

 

Example 1: Keep Character Class of Columns when Creating a Data Frame

In Example 1, I’ll explain how to keep the character class for variables of a data frame when creating a new data frame in R.

In this case, we have to set the stringsAsFactors argument to FALSE, as shown in the following R code:

data1 <- data.frame(x1 = 5:1,    # Specifying stringsAsFactors = FALSE
                    x2 = letters[1:5],
                    x3 = letters[9:5],
                    stringsAsFactors = FALSE)
data1                            # Print data frame

 

Table 1: Data frame data1, created with stringsAsFactors = FALSE.

 

Table 1 shows the output of the previous code: we have created a data frame with three columns.

Let’s apply the class() function to check the data types of our variables:

sapply(data1, class)             # Check classes of data frame columns
#          x1          x2          x3 
#   "integer" "character" "character"

The previous R code shows the class of each column, i.e. integer, character, and character.
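As a complementary check, the following minimal sketch picks out the character columns by name with is.character() and prints a compact overview with str(). It recreates data1 so that it runs on its own; the object name char_cols is only an illustrative choice and not part of the original example.

```r
data1 <- data.frame(x1 = 5:1,    # Recreate data frame of Example 1
                    x2 = letters[1:5],
                    x3 = letters[9:5],
                    stringsAsFactors = FALSE)

char_cols <- names(data1)[sapply(data1, is.character)]  # Names of character columns
char_cols                        # Print names of character columns
# [1] "x2" "x3"

str(data1)                       # Compact overview of all column types
```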

Please note that stringsAsFactors = FALSE is the default of the data.frame function in R version 4.0.0 and newer. In older versions, the default was stringsAsFactors = TRUE.

This is also explained in the help documentation of the data.frame function:

stringsAsFactors

logical: should character vectors be converted to factors? The ‘factory-fresh’ default has been TRUE previously but has been changed to FALSE for R 4.0.0. Only as short time workaround, you can revert by setting options(stringsAsFactors = TRUE) which now warns about its deprecation.
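Because the default depends on the R version, code that has to behave the same on old and new installations can set the argument explicitly. The following is a minimal sketch of that idea; the object name data_explicit is hypothetical, and getRversion() is used only to show which default would apply on the current installation.

```r
getRversion() >= "4.0.0"         # TRUE if the default is stringsAsFactors = FALSE

data_explicit <- data.frame(x1 = 5:1,    # Setting the argument explicitly
                            x2 = letters[1:5],
                            stringsAsFactors = FALSE)
sapply(data_explicit, class)     # Same classes regardless of the R version
#          x1          x2 
#   "integer" "character" 
```

Stating the argument explicitly makes the result independent of the installed R version.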

Anyway, let’s see what else we can do with the stringsAsFactors argument…

 

Example 2: Convert Character Columns to Factors when Creating a Data Frame

The following R programming code shows how to automatically convert character columns to factors when creating a new data frame.

For this, we simply need to set the stringsAsFactors argument to TRUE:

data2 <- data.frame(x1 = 5:1,    # Specifying stringsAsFactors = TRUE
                    x2 = letters[1:5],
                    x3 = letters[9:5],
                    stringsAsFactors = TRUE)
data2                            # Print data frame

 

Table 2: Data frame data2, created with stringsAsFactors = TRUE.

 

Table 2 shows that we have created another data frame by executing the previous R programming code.

The values of this data frame are exactly the same as in the previous example. However, the classes of the columns differ, as a check of the new data frame shows:

sapply(data2, class)             # Check classes of data frame columns
#        x1        x2        x3 
# "integer"  "factor"  "factor"

The variables x2 and x3 were character strings in the previous example. In this example, however, they have the factor class.
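To see what the factor class changes internally, the following sketch inspects one of the converted columns. It recreates data2 so that it runs on its own. A factor stores a sorted set of levels plus integer codes that point to those levels, and as.character() recovers the original strings.

```r
data2 <- data.frame(x1 = 5:1,    # Recreate data frame of Example 2
                    x2 = letters[1:5],
                    x3 = letters[9:5],
                    stringsAsFactors = TRUE)

levels(data2$x3)                 # Factor levels are sorted alphabetically
# [1] "e" "f" "g" "h" "i"

as.integer(data2$x3)             # Integer codes referring to the levels
# [1] 5 4 3 2 1

as.character(data2$x3)           # Convert factor back to character strings
# [1] "i" "h" "g" "f" "e"
```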

 

Video, Further Resources & Summary


 


 

In addition, you might want to have a look at the related tutorials on https://www.statisticsglobe.com/.

 

This tutorial has explained how to keep character classes when using the data.frame function in the R programming language.

By the way, the stringsAsFactors argument can also be used when importing data into R, e.g. when using the read.table or read.csv functions. The basic principles shown in this article apply in this context as well.
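The import case can be sketched as follows. To keep the example self-contained, it reads a small hypothetical CSV string through the text argument of read.csv instead of a file on disk; the object names csv_text, data_chr, and data_fac are illustrative assumptions.

```r
csv_text <- "x1,x2\n5,a\n4,b\n3,c"       # Hypothetical CSV content

data_chr <- read.csv(text = csv_text,    # Keep strings as characters
                     stringsAsFactors = FALSE)
data_fac <- read.csv(text = csv_text,    # Convert strings to factors
                     stringsAsFactors = TRUE)

sapply(data_chr, class)                  # Check classes of imported columns
#          x1          x2 
#   "integer" "character" 

sapply(data_fac, class)                  # Check classes of imported columns
#        x1        x2 
# "integer"  "factor" 
```

With a real file, the text argument would be replaced by the file path as the first argument.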

In case you have additional questions, don’t hesitate to let me know in the comments section.

 



2 Comments

  • I am new to R. What does changing the data type to a factor do? Why would I do this? Why would I not do this?

    • Hello Melissa,

      What a great question! Converting a variable to a factor specifies that it can only take values from a predefined set (levels/categories). This is important for statistical modeling and visualizations. For instance, if you want to draw a barplot by group, the grouping variable should be a factor to tell R that each distinct value refers to a specific group. I hope this answers your question. If not, let me know!

      Regards,
      Cansu
