case_when & cases Functions in R (2 Examples)

 

This article illustrates how to apply the case_when and cases functions in R.

You’re here for the answer, so let’s get straight to the examples.

 

Creation of Exemplifying Data

As a first step, let’s create some example data. In this example, we’ll use the numeric vector x1…

x1 <- 1:6                                     # Create first vector
x1                                            # Print first vector
# 1 2 3 4 5 6

…and the character vector x2 as a basis:

x2 <- letters[1:6]                            # Create second vector
x2                                            # Print second vector
# "a" "b" "c" "d" "e" "f"

Next, I’ll explain how to create a new vector based on logical conditions that involve the example vectors x1 and x2.

In other words: I’m showing R programming functions that are equivalent to the popular CASE WHEN SQL statement.

 

Example 1: Distinguish between Cases Using case_when() Function of dplyr Package

In this example, I’ll explain how to apply the case_when function of the dplyr package to conditionally create a new vector in R.

First, we have to install and load the dplyr package:

install.packages("dplyr")                     # Install & load dplyr package
library("dplyr")

Now, we can apply the case_when function as shown below:

new_dplyr <- case_when(x1 < 3 ~ "Group 1",    # Applying case_when
                       x2 %in% letters[2:5] ~ "Group 2",
                       TRUE ~ "Group 3")
new_dplyr                                     # Print output
# [1] "Group 1" "Group 1" "Group 2" "Group 2" "Group 2" "Group 3"

Have a look at the previous output of the RStudio console: It shows that our new vector contains three different groups that were assigned depending on the logical conditions that we have specified within the case_when function.

Note that in some cases multiple conditions are TRUE. For instance, the second elements of our two input vectors (i.e. x1 = 2 and x2 = “b”) satisfy all three logical conditions.

In such a case, the case_when function automatically assigns the output of the first condition that is TRUE (i.e. “Group 1”) to the new vector.
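To make the effect of the condition order more tangible, here is a small sketch that uses the same data and conditions as before, but swaps the first two conditions. The object name new_swapped is only chosen for this illustration:

```r
library("dplyr")                              # Load dplyr

x1 <- 1:6                                     # Same example vectors as above
x2 <- letters[1:6]

new_swapped <- case_when(x2 %in% letters[2:5] ~ "Group 2",  # Swapped order
                         x1 < 3 ~ "Group 1",
                         TRUE ~ "Group 3")
new_swapped                                   # Print output
# [1] "Group 1" "Group 2" "Group 2" "Group 2" "Group 2" "Group 3"
```

The second element now falls into “Group 2”, because the condition x2 %in% letters[2:5] is evaluated first.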

Anyway, let’s compare the syntax of the case_when function to the cases function…

 

Example 2: Distinguish between Cases Using cases() Function of memisc Package

In this example, I’ll illustrate how to apply the cases function of the memisc package.

First, we need to install and load the memisc package:

install.packages("memisc")                    # Install memisc package
library("memisc")                             # Load memisc

Now, we can apply the cases command as shown below:

new_memisc <- cases("Group 1" = x1 < 3,       # Applying cases
                    "Group 2" = x2 %in% letters[2:5],
                    "Group 3" = TRUE)
# Warning message:
#   In cases(`Group 1` = x1 < 3, `Group 2` = x2 %in% letters[2:5], `Group 3` = TRUE) :
#   conditions are not mutually exclusive

The RStudio console returns a warning message after running the previous R code. This is due to the overlap of logical conditions that we have discussed in Example 1. Like the case_when function, the cases function automatically uses the first logical condition that is TRUE to define the final output.
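If you want to check where the conditions overlap, the following base R sketch might help. It only looks at the two explicit conditions (the catch-all TRUE overlaps with every element by construction), and the names cond1 and cond2 are just chosen for this illustration:

```r
x1 <- 1:6                                     # Same example vectors as above
x2 <- letters[1:6]

cond1 <- x1 < 3                               # First condition
cond2 <- x2 %in% letters[2:5]                 # Second condition

which(cond1 & cond2)                          # Positions where both are TRUE
# [1] 2
```

Only the second element satisfies both conditions, which matches the overlap described in Example 1.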

Let’s see what the new vector created by the cases function looks like:

new_memisc                                    # Print output
# [1] Group 1 Group 1 Group 2 Group 2 Group 2 Group 3
# Levels: Group 1 Group 2 Group 3

As you can see, the group values are exactly the same as in Example 1. However, the cases function returns its output with the factor class (in contrast to the case_when function, which returned character strings).
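In case you prefer character strings, a factor can be converted with the base R function as.character. The sketch below builds the factor by hand as a stand-in for the output of the cases function, so that it runs without the memisc package. The name new_memisc_chr is only used for this illustration:

```r
new_memisc <- factor(c("Group 1", "Group 1",  # Stand-in for cases() output
                       "Group 2", "Group 2",
                       "Group 2", "Group 3"))
class(new_memisc)                             # Check class
# [1] "factor"

new_memisc_chr <- as.character(new_memisc)    # Convert factor to character
class(new_memisc_chr)                         # Check class
# [1] "character"
new_memisc_chr                                # Print output
# [1] "Group 1" "Group 1" "Group 2" "Group 2" "Group 2" "Group 3"
```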

 

Video & Further Resources

Do you need more explanations on the examples of this article? Then I recommend having a look at the following video that I have published on my YouTube channel. I show the contents of this tutorial in the video:

 

The YouTube video will be added soon.

 

Besides that, you might want to read the other tutorials on my website. Some articles about data manipulation in R can be found here.

 

This tutorial showed how to conditionally specify a new vector in the R programming language. Don’t hesitate to let me know in the comments, in case you have further questions.
