# Use of Tilde ~ in R (Example)

In this R tutorial you’ll learn how to use the tilde symbol (i.e. ~).

## Tilde Symbol Explained

Generally speaking, the tilde symbol is used within formulas of statistical models.

The left side of the tilde symbol specifies the target variable (also called dependent variable or outcome) and the right side of the tilde specifies the predictor variable(s) (also called independent variables).
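To make the left side and right side tangible, here is a minimal sketch that builds a formula on its own, outside of any model function. The object name `f` is just an illustrative choice; the variables named in the formula do not need to exist yet, because a formula is stored unevaluated.

```r
f <- y ~ x1 + x2        # The tilde creates a formula object; nothing is computed yet

class(f)                # Class of the object: "formula"
all.vars(f)             # Variable names used in the formula: "y" "x1" "x2"
f[[2]]                  # Left side of the tilde (target variable): y
f[[3]]                  # Right side of the tilde (predictors): x1 + x2
```

The formula only describes the relationship between the variables. It is the modeling function (e.g. lm) that later looks up the actual values in a data frame.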

Let’s move on to some R code in action. This will make the use of the tilde symbol much easier to understand!

## Introducing Example Data

As a first step, let’s create some example data:

```r
set.seed(9782356)                         # Create example data
x1 <- rnorm(100)
x2 <- rnorm(100) + x1
x3 <- rnorm(100) + x1 + x2
y <- rnorm(100) + x1 + x2 + x3
data <- data.frame(x1, x2, x3, y)
head(data)                                # Print first six rows of example data
#           x1         x2          x3          y
# 1  2.3390681  1.5383124  4.44456236  9.9461754
# 2  0.4751355 -0.4914280  0.68240263  1.0319888
# 3  0.2193361 -0.6160729 -0.04378242 -0.9127214
# 4  0.3338190 -0.4646736 -1.14482831 -1.8428651
# 5 -0.8114951 -0.8709525 -0.83104367 -2.7991183
# 6 -0.4757980  0.3989116  0.19974978 -0.8166479
```

The previous RStudio console output illustrates the structure of our example data: It consists of four numeric columns. The variables x1, x2, and x3 will be used as predictors for the target variable y.

## Example: Using ~ within lm() Function to Estimate Linear Regression Model

This example illustrates how to use the tilde symbol within the lm function to fit a linear regression model in R.

Have a look at the following R code:

`my_model <- lm(y ~ x1 + x2 + x3, data)    # Estimate linear model`

Within the lm function, we have specified our formula (i.e. y ~ x1 + x2 + x3). The ~ symbol separates the target variable y on its left side from the predictors x1, x2, and x3 on its right side.
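As a side note, the right side of the tilde also accepts the shorthand `.`, which stands for all columns of the data frame except the target variable. The following sketch goes beyond the original example: it recreates the example data and compares both ways of writing the formula. The object names `mod_explicit` and `mod_dot` are illustrative choices.

```r
set.seed(9782356)                                    # Recreate example data
x1 <- rnorm(100)
x2 <- rnorm(100) + x1
x3 <- rnorm(100) + x1 + x2
y <- rnorm(100) + x1 + x2 + x3
data <- data.frame(x1, x2, x3, y)

mod_explicit <- lm(y ~ x1 + x2 + x3, data = data)    # Predictors listed explicitly
mod_dot <- lm(y ~ ., data = data)                    # Dot = all other columns of data

all.equal(coef(mod_explicit), coef(mod_dot))         # Both formulas give the same coefficients
```

The dot notation is convenient for data frames with many columns, but listing the predictors explicitly makes it easier to see which variables enter the model.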

We can now use the summary() function to print a summary of our previously estimated linear regression model, including the coefficient estimates, their standard errors, and the overall fit statistics:

```r
summary(my_model)                         # Summary statistics of model
# Call:
# lm(formula = y ~ x1 + x2 + x3, data = data)
#
# Residuals:
#      Min       1Q   Median       3Q      Max
# -2.05220 -0.65160 -0.08235  0.48653  2.58307
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.000633   0.098582   0.006    0.995
# x1          1.261750   0.181533   6.951 4.38e-10 ***
# x2          1.023750   0.165229   6.196 1.44e-08 ***
# x3          0.880829   0.116128   7.585 2.12e-11 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 0.9621 on 96 degrees of freedom
# Multiple R-squared:  0.9555,	Adjusted R-squared:  0.9541
# F-statistic:   687 on 3 and 96 DF,  p-value: < 2.2e-16
```

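Looks good! The estimated coefficients of x1, x2, and x3 are all reasonably close to 1, which matches the way we have created the target variable y.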
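In case you want to work with these values in your code instead of reading them from the console, the following sketch shows one possible way. It recreates the example data and the model so that it runs on its own; the object names `model_coefs` and `model_r2` are illustrative choices.

```r
set.seed(9782356)                              # Recreate example data
x1 <- rnorm(100)
x2 <- rnorm(100) + x1
x3 <- rnorm(100) + x1 + x2
y <- rnorm(100) + x1 + x2 + x3
data <- data.frame(x1, x2, x3, y)

my_model <- lm(y ~ x1 + x2 + x3, data)         # Estimate linear model

model_coefs <- coef(my_model)                  # Named vector of estimated coefficients
model_coefs

model_r2 <- summary(my_model)$r.squared        # Multiple R-squared as a single number
model_r2
```

The names of the coefficient vector correspond to the terms on the right side of the tilde, plus the intercept that lm adds by default.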

## Video & Further Resources

Besides the video, I recommend having a look at the related articles on https://statisticsglobe.com/.

In this tutorial, you have learned how to apply the tilde symbol in R programming. In case you have any further questions, let me know in the comments.
