# filter R Function of dplyr Package (Example)

This article illustrates how to subset rows by logical conditions with the filter function of the dplyr package in R programming.

So now the part you have been waiting for – the examples!

## Creation of Example Data

First, we need to install and load the dplyr package:

```r
install.packages("dplyr")                        # Install dplyr (only needed once)
library("dplyr")                                 # Load dplyr
```

Then, we have to create some example data:

```r
data <- data.frame(x1 = 1:5,                     # Create example data
                   x2 = letters[1:5],
                   group = c("gr1", "gr2", "gr1", "gr3", "gr2"))
data                                             # Print data to RStudio console
#   x1 x2 group
# 1  1  a   gr1
# 2  2  b   gr2
# 3  3  c   gr1
# 4  4  d   gr3
# 5  5  e   gr2
```

Our example data is a data frame with five rows and three columns. The third column contains a grouping variable with three groups.

Note that we could also apply the following code to a tibble.
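To illustrate the tibble case, here is a minimal sketch. It assumes that the tibble package is installed (it usually comes along with dplyr), and the object names data_tbl and data_tbl_gr2 are only chosen for this illustration:

```r
library("dplyr")

data <- data.frame(x1 = 1:5,                     # Same example data as above
                   x2 = letters[1:5],
                   group = c("gr1", "gr2", "gr1", "gr3", "gr2"))

data_tbl <- tibble::as_tibble(data)              # Convert data frame to tibble

data_tbl_gr2 <- filter(data_tbl, group == "gr2") # filter works the same way on a tibble
data_tbl_gr2                                     # Print filtered tibble
```

The result should again be a tibble, containing only the two rows of group gr2.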

## Example: Extract Rows by Logical Condition with filter Function

If we want to subset certain rows of our data based on a logical condition, we can apply the filter function of the dplyr package as follows:

```r
filter(data, group == "gr2")                     # Subset data with filter function
#   x1 x2 group
# 1  2  b   gr2
# 2  5  e   gr2
```

As you can see, we extracted only rows where the grouping variable is equal to gr2.
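The same logic extends to combined conditions. The following sketch is not part of the original example. It simply reuses our example data to show how conditions might be combined with the & (and) and | (or) operators, and how %in% can match several groups at once:

```r
library("dplyr")

data <- data.frame(x1 = 1:5,                     # Same example data as above
                   x2 = letters[1:5],
                   group = c("gr1", "gr2", "gr1", "gr3", "gr2"))

filter(data, group == "gr2" & x1 > 2)            # Both conditions must be TRUE
#   x1 x2 group
# 1  5  e   gr2

filter(data, group == "gr1" | x1 == 4)           # At least one condition must be TRUE
#   x1 x2 group
# 1  1  a   gr1
# 2  3  c   gr1
# 3  4  d   gr3

filter(data, group %in% c("gr1", "gr3"))         # Group is one of several values
#   x1 x2 group
# 1  1  a   gr1
# 2  3  c   gr1
# 3  4  d   gr3
```

Note that separating conditions by a comma within filter has the same effect as combining them with &.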

## Video & Further Resources

In summary: This article showed how to retain only specific rows of a data frame with the filter function of the dplyr package in the R programming language. Please let me know in the comments if you have any additional questions.

• harish Sudarsanam
March 5, 2023 4:18 pm

I have a large data set of 39K rows and 6 columns. I want to select rows in which the difference between the first column and any other column is more than 2 or less than -2. It is a gene expression dataset with normalized log2 values.
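
One possible way to approach this question is sketched below. It assumes dplyr 1.0.4 or later (for the if_any function) and uses a small hypothetical data frame called expr with made-up values and column names (ref, s1, s2, s3) in place of the real data set, so it would need to be adapted to the actual columns:

```r
library("dplyr")

# Hypothetical stand-in for the expression data (made-up log2 values)
expr <- data.frame(ref = c(5, 8, 3, 7),
                   s1  = c(5.5, 11, 3.2, 7.1),
                   s2  = c(4.8, 8.2, 3.1, 4.5),
                   s3  = c(5.1, 7.9, 2.5, 7.3))

# Keep rows where any other column differs from the first column (ref)
# by more than 2 in either direction, i.e. difference > 2 or < -2
expr_sub <- filter(expr, if_any(-ref, ~ abs(.x - ref) > 2))
expr_sub                                         # Rows with ref = 8 and ref = 7 are kept
```

Taking the absolute value of the difference covers both cases (more than 2 and less than -2) in a single condition.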