Split Data Frame in R (3 Examples) | Divide (Randomly) by Row & Column

In this R tutorial you'll learn how to separate a data frame into two different parts.

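The content of the tutorial is structured as follows:

• Creation of Example Data
• Example 1: Splitting Data Frame by Row Using Index Positions
• Example 2: Splitting Data Frame by Row Using Random Sampling
• Example 3: Splitting Data Frame by Column Names

Here's how to do it: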

Creation of Example Data

As a first step, let's create some example data:

```r
data <- data.frame(x1 = 1:10,              # Creating example data
                   x2 = letters[1:10],
                   x3 = 20:11)
data                                       # Show example data in console
#    x1 x2 x3
# 1   1  a 20
# 2   2  b 19
# 3   3  c 18
# 4   4  d 17
# 5   5  e 16
# 6   6  f 15
# 7   7  g 14
# 8   8  h 13
# 9   9  i 12
# 10 10  j 11
```

The previously shown RStudio console output reveals that our example data has ten rows and three columns. Let's split these data!

Example 1: Splitting Data Frame by Row Using Index Positions

In Example 1, I'll explain how to divide a data frame into two different parts by the positions of its rows. The first part contains the first five rows of our example data…

```r
data_1a <- data[1:5, ]                     # Extract first five rows
data_1a                                    # Print top part of data frame
#   x1 x2 x3
# 1  1  a 20
# 2  2  b 19
# 3  3  c 18
# 4  4  d 17
# 5  5  e 16
```

…and the second data frame contains the bottom five rows of our input data:

```r
data_1b <- data[6:10, ]                    # Extract last five rows
data_1b                                    # Print bottom part of data frame
#    x1 x2 x3
# 6   6  f 15
# 7   7  g 14
# 8   8  h 13
# 9   9  i 12
# 10 10  j 11
```
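The row positions above are hard-coded for a ten-row data frame. As a minimal sketch, assuming you would rather derive the split point from the number of rows, the same split could be written as follows. The names `split_at`, `data_top`, and `data_bottom` are hypothetical and not part of the original tutorial:

```r
data <- data.frame(x1 = 1:10,              # Recreate the example data
                   x2 = letters[1:10],
                   x3 = 20:11)

split_at <- floor(nrow(data) / 2)          # Hypothetical split point: half of the rows

data_top <- head(data, split_at)           # First split_at rows
data_bottom <- tail(data, -split_at)       # All rows except the first split_at rows
```

With the ten-row example data, `split_at` is 5, so `data_top` and `data_bottom` hold the same rows as `data_1a` and `data_1b`.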

Example 2: Splitting Data Frame by Row Using Random Sampling

Example 1 explained how to split a data frame by index positions. The following R programming code, in contrast, shows how to divide a data frame randomly.

First, we have to create a random dummy variable as an indicator to split our data into two parts:

```r
set.seed(37645)                            # Set seed for reproducibility
dummy_sep <- rbinom(nrow(data), 1, 0.5)    # Create dummy indicator
```

Now, we can subset our original data based on this dummy indicator. First, we create one data frame…

```r
data_2a <- data[dummy_sep == 0, ]          # Extract data where dummy == 0
data_2a                                    # Print data
#    x1 x2 x3
# 1   1  a 20
# 2   2  b 19
# 3   3  c 18
# 6   6  f 15
# 7   7  g 14
# 10 10  j 11
```

…and then we create the other data frame:

```r
data_2b <- data[dummy_sep == 1, ]          # Extract data where dummy == 1
data_2b                                    # Print data
#   x1 x2 x3
# 4  4  d 17
# 5  5  e 16
# 8  8  h 13
# 9  9  i 12
```
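Because `rbinom()` assigns each row to a group independently, the two parts generally differ in size (six and four rows above). If fixed group sizes are needed, one common alternative is to draw row indices with `sample()`. The following is a sketch under that assumption, not the method used in this tutorial; the seed, the 70/30 proportion, and the names `n_train`, `train_idx`, `data_train`, and `data_test` are hypothetical:

```r
data <- data.frame(x1 = 1:10,                  # Recreate the example data
                   x2 = letters[1:10],
                   x3 = 20:11)

set.seed(123)                                  # Arbitrary seed for reproducibility
n_train <- round(0.7 * nrow(data))             # Hypothetical share: 7 of 10 rows

# Draw n_train distinct row positions (sampling without replacement)
train_idx <- sample(seq_len(nrow(data)), size = n_train)

data_train <- data[train_idx, ]                # Rows that were drawn
data_test <- data[-train_idx, ]                # All remaining rows
```

Every row ends up in exactly one of the two parts, and the sizes are fixed in advance rather than left to chance.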

Example 3: Splitting Data Frame by Column Names

In Example 3, I'll illustrate how to separate a data set by column. More precisely, we use the variable names of our data frame to split the data.

We assign the variables x1 and x3 to the first data frame…

```r
data_3a <- data[ , c("x1", "x3")]          # Select specific data frame columns
data_3a                                    # Print data
#    x1 x3
# 1   1 20
# 2   2 19
# 3   3 18
# 4   4 17
# 5   5 16
# 6   6 15
# 7   7 14
# 8   8 13
# 9   9 12
# 10 10 11
```

…and the variable x2 to the second chunk of data:

```r
data_3b <- data[ , "x2"]                   # Select remaining column
data_3b                                    # Print data
#  [1] a b c d e f g h i j
# Levels: a b c d e f g h i j
```

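Note that the second part of the data was converted to a vector, since we kept only a single variable in this second part. The Levels line appears because x2 was stored as a factor; from R 4.0.0 onward, data.frame() keeps strings as character by default, so the same code prints a character vector instead.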
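If the second part should remain a data frame, the `drop = FALSE` argument of base R's `[` operator prevents the conversion to a vector. The sketch below also derives the remaining columns with `setdiff()` instead of typing them out; the names `cols_a`, `cols_b`, `data_a`, and `data_b` are hypothetical:

```r
data <- data.frame(x1 = 1:10,                  # Recreate the example data
                   x2 = letters[1:10],
                   x3 = 20:11)

cols_a <- c("x1", "x3")                        # Columns for the first part
cols_b <- setdiff(names(data), cols_a)         # Every other column (here only x2)

data_a <- data[ , cols_a, drop = FALSE]        # Data frame with x1 and x3
data_b <- data[ , cols_b, drop = FALSE]        # Stays a data frame with one column

is.data.frame(data_b)                          # Check that no vector was returned
```

Using `setdiff()` keeps the two parts complementary even if columns are later added to `data`.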

• Hi! Thank you so much, this was very helpful. Is there a way to do this more efficiently when you have a lot more columns that you want to split? Say, maybe using a dash and saying split from column “x1” – “x10”?

`data[ , paste0("x", 1:10)]`
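The reply above builds the column names with `paste0()`. For a range given by its first and last column name, two base R options could look as follows. This is a sketch using the three-column example data, so the range x1 to x2 stands in for the x1 to x10 of the question; the names `data_range_1`, `data_range_2`, `first_col`, and `last_col` are hypothetical:

```r
data <- data.frame(x1 = 1:10,                  # Recreate the example data
                   x2 = letters[1:10],
                   x3 = 20:11)

# Option 1: subset() accepts a range of column names in its select argument
data_range_1 <- subset(data, select = x1:x2)

# Option 2: look up the positions of the first and last column, then index by position
first_col <- match("x1", names(data))
last_col <- match("x2", names(data))
data_range_2 <- data[ , first_col:last_col]
```

Both options rely on the column order in the data frame, so they select whatever columns lie between the two names, regardless of how those columns are called.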