filter R Function of dplyr Package (Example)
This article illustrates how to subset rows by logical conditions with the filter function of the dplyr package in R programming.
Table of contents:
- Creation of Example Data
- Example: Extract Rows by Logical Condition with filter Function
- Video & Further Resources
So now the part you have been waiting for – the examples!
Creation of Example Data
First, we need to install and load the dplyr package:
install.packages("dplyr")                       # Install and load dplyr
library("dplyr")
Then, we have to create some example data:
data <- data.frame(x1 = 1:5,                    # Create example data
                   x2 = letters[1:5],
                   group = c("gr1", "gr2", "gr1", "gr3", "gr2"))
data                                            # Print data to RStudio console
#   x1 x2 group
# 1  1  a   gr1
# 2  2  b   gr2
# 3  3  c   gr1
# 4  4  d   gr3
# 5  5  e   gr2
Our example data is a data frame with five rows and three columns. The third column contains a grouping variable with three groups.
Note that we could also apply the following code to a tibble.
Example: Extract Rows by Logical Condition with filter Function
If we want to subset certain rows of our data based on a logical condition, we can apply the filter function of the dplyr package as follows:
filter(data, group == "gr2")                    # Subset data with filter function
#   x1 x2 group
# 1  2  b   gr2
# 2  5  e   gr2
As you can see, we extracted only rows where the grouping variable is equal to gr2.
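As noted earlier, the same call can be applied to a tibble. The following is a minimal sketch under that assumption: it recreates the example data, converts it with as_tibble, and runs the same filter condition. The object names data_tbl, data_tbl_gr2, and data_gr2_x1 are chosen here for illustration only. The last line additionally shows that several conditions separated by commas are combined with a logical AND.

```r
library("dplyr")

data <- data.frame(x1 = 1:5,                         # Recreate example data
                   x2 = letters[1:5],
                   group = c("gr1", "gr2", "gr1", "gr3", "gr2"))

data_tbl <- as_tibble(data)                          # Convert data frame to tibble

data_tbl_gr2 <- filter(data_tbl, group == "gr2")     # Same condition as before
data_tbl_gr2                                         # Print filtered tibble

data_gr2_x1 <- filter(data, group == "gr2", x1 > 2)  # Both conditions must hold
data_gr2_x1                                          # Print filtered data frame
```

The filtered tibble should contain the same two rows as the data frame example, and the result is again a tibble rather than a plain data frame.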
Video & Further Resources
Would you like to learn more about the handling of data frames and tidyverse tibbles in R? Then you might watch the following video on my YouTube channel. In the video, I explain the R syntax of this tutorial:
Furthermore, you might read some of the other tutorials on this website:
- Subset Data Frame Rows by Logical Condition in R
- dplyr Package in R
- R Functions List (+ Examples)
- The R Programming Language
In summary: This article showed how to retain only specific rows of a data frame with the filter function of the dplyr package in the R programming language. Please let me know in the comments if you have any additional questions. Furthermore, don’t forget to subscribe to my email newsletter for updates on new articles.
2 Comments
I have a large data set of 39K rows and 6 columns. I want to select rows in which the difference between the first column and any other column is more than 2 or less than -2. It is a gene expression dataset with normalized log2 values.
Hello Harish,
What comes to my mind is that you create new variables, such as diff1_2, diff1_3, etc. for each pair of differences, then you can filter the data using the method in the tutorial.
Regards,
Cansu
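The suggestion in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame expr and its column names s1 to s6 are hypothetical stand-ins for the six columns of log2 values, and the four rows are made-up numbers used to show the mechanics, not real expression data.

```r
library("dplyr")

# Hypothetical stand-in for the expression data (6 columns of log2 values)
expr <- data.frame(s1 = c(5.0, 7.1, 3.2, 8.0),
                   s2 = c(5.5, 9.6, 3.0, 8.1),
                   s3 = c(4.8, 7.0, 3.1, 5.5),
                   s4 = c(5.1, 7.2, 3.3, 8.2),
                   s5 = c(5.2, 7.3, 2.9, 7.9),
                   s6 = c(4.9, 6.9, 3.4, 8.3))

# Create one difference variable per comparison with the first column
expr_diff <- mutate(expr,
                    diff1_2 = s1 - s2,
                    diff1_3 = s1 - s3,
                    diff1_4 = s1 - s4,
                    diff1_5 = s1 - s5,
                    diff1_6 = s1 - s6)

# Keep rows where at least one difference is above 2 or below -2
expr_sub <- filter(expr_diff,
                   abs(diff1_2) > 2 | abs(diff1_3) > 2 | abs(diff1_4) > 2 |
                     abs(diff1_5) > 2 | abs(diff1_6) > 2)
expr_sub                                         # Print filtered data
```

Using abs() covers both directions at once, since a difference above 2 or below -2 is the same as an absolute difference above 2. With the made-up values, only the second and fourth rows are retained.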