# Split Data Frame in R (3 Examples) | Divide (Randomly) by Row & Column

In this R tutorial you’ll learn how to **separate a data frame into two different parts**.

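The content of the tutorial is structured as follows:

- Creation of Example Data
- Example 1: Splitting Data Frame by Row Using Index Positions
- Example 2: Splitting Data Frame by Row Using Random Sampling
- Example 3: Splitting Data Frame by Column Names
- Video & Further Resources

Here’s how to do it: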

## Creation of Example Data

As a first step, let’s create some example data:

```r
data <- data.frame(x1 = 1:10,    # Creating example data
                   x2 = letters[1:10],
                   x3 = 20:11)
data                             # Show example data in console
#    x1 x2 x3
# 1   1  a 20
# 2   2  b 19
# 3   3  c 18
# 4   4  d 17
# 5   5  e 16
# 6   6  f 15
# 7   7  g 14
# 8   8  h 13
# 9   9  i 12
# 10 10  j 11
```

The previously shown RStudio console output reveals that our example data has ten rows and three columns. Let’s split these data!

## Example 1: Splitting Data Frame by Row Using Index Positions

In Example 1, I’ll explain how to divide a data frame into two different parts by the positions of its rows. The first part contains the first five rows of our example data…

```r
data_1a <- data[1:5, ]    # Extract first five rows
data_1a                   # Print top part of data frame
#   x1 x2 x3
# 1  1  a 20
# 2  2  b 19
# 3  3  c 18
# 4  4  d 17
# 5  5  e 16
```

…and the second data frame contains the bottom five rows of our input data:

```r
data_1b <- data[6:10, ]    # Extract last five rows
data_1b                    # Print bottom part of data frame
#    x1 x2 x3
# 6   6  f 15
# 7   7  g 14
# 8   8  h 13
# 9   9  i 12
# 10 10  j 11
```
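The hard-coded positions `1:5` and `6:10` only fit a data frame with exactly ten rows. As a minimal sketch, the split position can be derived from `nrow()` instead. The names `split_point`, `data_top`, and `data_bottom` are illustrative and not part of the original example:

```r
data <- data.frame(x1 = 1:10,                          # Recreate example data
                   x2 = letters[1:10],
                   x3 = 20:11)

split_point <- floor(nrow(data) / 2)                   # Row position of the split

data_top    <- head(data, split_point)                 # First split_point rows
data_bottom <- tail(data, nrow(data) - split_point)    # All remaining rows

nrow(data_top)                                         # Number of rows in top part
nrow(data_bottom)                                      # Number of rows in bottom part
```

With an odd number of rows, `floor()` assigns the extra row to the bottom part; using `ceiling()` instead would assign it to the top part.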

## Example 2: Splitting Data Frame by Row Using Random Sampling

Example 1 explained how to split a data frame by index positions. The following R code, in contrast, shows how to divide a data frame randomly.

First, we have to create a random dummy variable as an indicator to split our data into two parts:

```r
set.seed(37645)                            # Set seed for reproducibility
dummy_sep <- rbinom(nrow(data), 1, 0.5)    # Create dummy indicator
```
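Because `rbinom()` draws the indicator for each row independently, the two parts are not guaranteed to contain the same number of rows. A quick way to check the group sizes before subsetting is sketched below, using the same seed and the illustrative name `group_sizes`:

```r
data <- data.frame(x1 = 1:10,              # Recreate example data
                   x2 = letters[1:10],
                   x3 = 20:11)

set.seed(37645)                            # Same seed as in the example
dummy_sep <- rbinom(nrow(data), 1, 0.5)    # Create dummy indicator

group_sizes <- table(dummy_sep)            # Count rows per indicator value
group_sizes                                # Print group sizes
```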

Now, we can subset our original data based on this dummy indicator. First, we create one data frame…

```r
data_2a <- data[dummy_sep == 0, ]    # Extract data where dummy == 0
data_2a                              # Print data
#    x1 x2 x3
# 1   1  a 20
# 2   2  b 19
# 3   3  c 18
# 6   6  f 15
# 7   7  g 14
# 10 10  j 11
```

…and then we create the other data frame:

```r
data_2b <- data[dummy_sep == 1, ]    # Extract data where dummy == 1
data_2b                              # Print data
#   x1 x2 x3
# 4  4  d 17
# 5  5  e 16
# 8  8  h 13
# 9  9  i 12
```
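If the two parts should have fixed sizes, for example a 70/30 split, one common alternative is to draw row positions with `sample()` instead of using a binomial dummy. The following is only a sketch under that assumption. The seed value, the 70% proportion, and the names `n_first`, `rows_first`, `in_first`, `data_first`, and `data_second` are chosen for illustration and do not appear in the original example:

```r
data <- data.frame(x1 = 1:10,                          # Recreate example data
                   x2 = letters[1:10],
                   x3 = 20:11)

set.seed(123)                                          # Arbitrary seed for reproducibility
n_first    <- round(0.7 * nrow(data))                  # Size of first part
rows_first <- sample(seq_len(nrow(data)), n_first)     # Randomly drawn row positions

in_first    <- seq_len(nrow(data)) %in% rows_first     # Logical indicator per row
data_first  <- data[in_first, ]                        # Rows that were drawn
data_second <- data[!in_first, ]                       # All remaining rows

nrow(data_first)                                       # Number of rows in first part
nrow(data_second)                                      # Number of rows in second part
```

Subsetting with the logical indicator `in_first` keeps the rows of both parts in their original order and also behaves sensibly if one of the parts happens to be empty.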

## Example 3: Splitting Data Frame by Column Names

In Example 3, I’ll illustrate how to separate a data set by column. More precisely, we use the variable names of our data frame to split the data.

We assign the variables x1 and x3 to the first data frame…

```r
data_3a <- data[ , c("x1", "x3")]    # Select specific data frame columns
data_3a                              # Print data
#    x1 x3
# 1   1 20
# 2   2 19
# 3   3 18
# 4   4 17
# 5   5 16
# 6   6 15
# 7   7 14
# 8   8 13
# 9   9 12
# 10 10 11
```

…and the variable x2 to the second chunk of data:

```r
data_3b <- data[ , "x2"]    # Select remaining column
data_3b                     # Print data
# [1] a b c d e f g h i j
# Levels: a b c d e f g h i j
```

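Note that the second part of the data was converted to a vector because we kept only a single variable. The `Levels` line appears in the output because x2 was stored as a factor. Since R 4.0.0, `data.frame()` no longer converts strings to factors by default, so in recent R versions x2 is printed as a character vector instead.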
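To keep a single selected column as a data frame rather than a vector, `drop = FALSE` can be added to the subsetting call. The sketch below also derives the remaining columns with `setdiff()`, so they don't have to be typed by hand. The names `cols_a`, `cols_b`, `data_a`, and `data_b` are illustrative:

```r
data <- data.frame(x1 = 1:10,                    # Recreate example data
                   x2 = letters[1:10],
                   x3 = 20:11)

cols_a <- c("x1", "x3")                          # Columns for first part
cols_b <- setdiff(colnames(data), cols_a)        # Remaining columns

data_a <- data[ , cols_a, drop = FALSE]          # First part
data_b <- data[ , cols_b, drop = FALSE]          # Second part stays a data frame

class(data_b)                                    # Check class of second part
```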

## Video & Further Resources

A YouTube video that walks through the R code of this article will be added soon.

In the meantime, you may have a look at the other tutorials on Statistics Globe:

- Split Data Frame into List of Data Frames Based On ID Column
- Split Data Frame Variable into Multiple Columns
- Convert Data Frame Rows to List
- The R Programming Language

In this R tutorial you learned how to **split a data frame into multiple subsets**. Let me know in the comments below if you have additional questions. Furthermore, please subscribe to my email newsletter for regular updates on the newest articles.
