Select Rows if Value in One Column is Smaller Than in Another in R (Examples)
This tutorial shows how to subset the rows of a data frame where the value in one column is smaller than the value in another column in the R programming language.
So now the part you have been waiting for – the examples.
Creation of Example Data
As a first step, we'll need to create some data that we can use in the examples below:
data <- data.frame(x1 = 1:10,    # Creating example data frame
                   x2 = 10:1,
                   x3 = "X")
data                             # Printing example data frame
#    x1 x2 x3
# 1   1 10  X
# 2   2  9  X
# 3   3  8  X
# 4   4  7  X
# 5   5  6  X
# 6   6  5  X
# 7   7  4  X
# 8   8  3  X
# 9   9  2  X
# 10 10  1  X
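Have a look at the previous output of the RStudio console. It reveals that our example data has ten rows and three columns. The variables x1 and x2 are numeric (more precisely, integer), and the variable x3 is a character column. Note that in R versions before 4.0.0, x3 would have been converted to a factor by default.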
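If you want to check the column classes on your own machine, a minimal sketch such as the following may help. The object name data_fac is only an illustrative assumption and does not appear elsewhere in this tutorial.

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:10,
                   x2 = 10:1,
                   x3 = "X")

sapply(data, class)                          # Class of each column

# Hypothetical variant: explicitly request factor conversion,
# independent of the default of the installed R version
data_fac <- data.frame(x1 = 1:10,
                       x2 = 10:1,
                       x3 = "X",
                       stringsAsFactors = TRUE)
class(data_fac$x3)                           # Class of x3 after conversion
```

The stringsAsFactors argument makes the conversion explicit, so the result should not depend on which R version is installed.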
Example 1: Subset Rows of Data Frame Using Square Brackets
The following code shows how to extract only rows where the values in the first column are smaller than in the second column by using a logical condition within square brackets.
data1 <- data[data$x1 < data$x2, ]    # Using square brackets
data1                                 # Printing updated data frame
#   x1 x2 x3
# 1  1 10  X
# 2  2  9  X
# 3  3  8  X
# 4  4  7  X
# 5  5  6  X
Have a look at the previous output of the RStudio console: The new data frame data1 contains only those rows where the values in x1 are smaller than in x2.
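One thing to keep in mind with square brackets is missing values. The example data of this tutorial contains no NA values, so the following is only a sketch based on a small hypothetical data frame called data_na. It illustrates that a comparison involving NA returns NA, and that wrapping the condition in which() is one possible way to drop such rows.

```r
# Hypothetical data with a missing value (not part of the tutorial data)
data_na <- data.frame(x1 = c(1, NA, 5),
                      x2 = c(3, 2, 4))

# The comparison yields TRUE, NA, FALSE;
# the NA is returned as an additional row that consists only of NA
res_bracket <- data_na[data_na$x1 < data_na$x2, ]
res_bracket

# which() keeps only the positions where the condition is TRUE
res_which <- data_na[which(data_na$x1 < data_na$x2), ]
res_which
```

If your data may contain missing values, it can be worth checking the number of rows after subsetting.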
Example 2: Subset Rows of Data Frame Using subset Function
As an alternative to the R code shown in Example 1, we can also use the subset function in R:
data2 <- subset(data, x1 < x2)    # Applying subset function
data2                             # Printing updated data frame
#   x1 x2 x3
# 1  1 10  X
# 2  2  9  X
# 3  3  8  X
# 4  4  7  X
# 5  5  6  X
The output is exactly the same as in Example 1.
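Instead of comparing the two printed outputs by eye, we could also let R do the comparison. The following sketch recreates both data frames and checks them with the all.equal function of base R. The object name same_result is an assumption made for illustration.

```r
# Recreate the example data and both subsets
data <- data.frame(x1 = 1:10,
                   x2 = 10:1,
                   x3 = "X")
data1 <- data[data$x1 < data$x2, ]    # Square brackets (Example 1)
data2 <- subset(data, x1 < x2)        # subset function (Example 2)

# isTRUE() is needed because all.equal() returns a description
# of the differences instead of FALSE when the objects differ
same_result <- isTRUE(all.equal(data1, data2))
same_result
```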
Example 3: Subset Rows of Data Frame Using filter Function of dplyr Package
Another alternative for the subsetting of data is provided by the dplyr package.
We first need to install and load the dplyr package, if we want to use the functions that are contained in the package:
install.packages("dplyr")    # Install & load dplyr package
library("dplyr")
Now, we can use the filter function of dplyr to create the same output as in the previous examples:
data3 <- filter(data, x1 < x2)    # Applying filter function
data3                             # Printing updated data frame
#   x1 x2 x3
# 1  1 10  X
# 2  2  9  X
# 3  3  8  X
# 4  4  7  X
# 5  5  6  X
Video, Further Resources & Summary
If you want to learn more about subsetting data frames, you could have a look at some of the related posts on my website. I have published several posts on topics such as character strings, extracting data, and vectors:
- Sample Random Rows of Data Frame
- Select Rows with Partial String Match in R
- Select Data Frame Rows based on Values in Vector
- Select Row with Maximum or Minimum Value in Each Group
- Introduction to R Programming
You learned on this page how to filter rows based on a comparison of the values in two columns in the R programming language. More precisely, we have extracted the rows where the values in one column are smaller than the values in another column.
Please note that we could apply the same logic in case we would like to select the rows where the values in one column are greater than the values in another column.
For this, we would simply have to change the logical operator from “smaller than” (i.e. <) to “larger than” (i.e. >).
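As a short sketch of this idea, the following code applies the square bracket approach of Example 1 with the > operator. The object name data_larger is an assumption that is not used elsewhere in this tutorial.

```r
# Recreate the example data so that this snippet runs on its own
data <- data.frame(x1 = 1:10,
                   x2 = 10:1,
                   x3 = "X")

# Keep only rows where x1 is larger than x2
data_larger <- data[data$x1 > data$x2, ]
data_larger                           # Printing updated data frame
#    x1 x2 x3
# 6   6  5  X
# 7   7  4  X
# 8   8  3  X
# 9   9  2  X
# 10 10  1  X
```

The operators <= and >= work the same way in case rows with equal values should be included as well.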
In case you have additional questions, please let me know in the comments. Furthermore, please subscribe to my email newsletter for updates on new articles.