Get Frequency of Words in Character String in R (Example)

 

In this article you’ll learn how to get the word frequencies in a character string in the R programming language.

The article contains one example, preceded by the creation of some example data.

So let’s get started!

 

Creation of Example Data

First of all, we need to create some example data:

x <- "hello yay hello what is going on going on on"  # Example string
x                                                    # Print string
# [1] "hello yay hello what is going on going on on"

The previous RStudio console output shows the structure of our example data: we have created a character string containing multiple words, each separated by a single space.

 

Example: Create Frequency Table of Words Using strsplit, unlist, table & sort

This section demonstrates how to count how often each word occurs in a character string, a very common task in text mining and text analysis.

For this task, we can use a combination of several Base R functions: strsplit, unlist, table, and sort.
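To see what each of these functions contributes, the steps can be sketched one at a time. This is only an illustration: the intermediate object names (words_list, words_vec, words_tab, words_sorted) are made up for this sketch, and x is recreated so that the snippet runs on its own.

```r
x <- "hello yay hello what is going on going on on"  # Same example string as before

words_list <- strsplit(x, " ")                       # List of length 1 holding the words
words_vec <- unlist(words_list)                      # Flatten list to character vector
words_tab <- table(words_vec)                        # Count occurrences of each word
words_sorted <- sort(words_tab, decreasing = TRUE)   # Most frequent words first

length(words_vec)                                    # Total number of words
# [1] 11
names(words_sorted)[1]                               # Most frequent word
# [1] "on"
```

Note that strsplit returns a list, because it can split several strings at once. That is why unlist is needed before the words are passed to table.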

Check out the following R code:

freq_x <- sort(table(unlist(strsplit(x, " "))),      # Create frequency table
               decreasing = TRUE)
freq_x                                               # Print word frequency
#    on going hello    is  what   yay 
#     3     2     2     1     1     1

As you can see, we have created a table showing the count of each word in our character string. The word “on” occurs most often (i.e. 3 times), the words “going” and “hello” occur twice each, and the words “is”, “what”, and “yay” appear only once.
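Real text often contains capital letters and punctuation, which a split on a single space would leave attached to the words. The following is a minimal sketch of one possible way to handle this with Base R, assuming that only the letters a to z should count as part of a word. The string txt and the object names are hypothetical and not part of the example above.

```r
txt <- "Hello, hello! What is going on? Going on and on."  # Hypothetical messy input

words <- unlist(strsplit(tolower(txt), "[^a-z]+"))   # Lowercase, split on non-letters
words <- words[words != ""]                          # Drop empty strings, if any
freq_txt <- sort(table(words), decreasing = TRUE)    # Frequency table, largest first

freq_txt[["on"]]                                     # Count of the word "on"
# [1] 3
freq_txt[["hello"]]                                  # "Hello," and "hello!" are merged
# [1] 2
```

Keep in mind that the pattern "[^a-z]+" also splits at digits, apostrophes, and accented letters, so it would need to be adapted for texts where these matter.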

 

Video, Further Resources & Summary

Have a look at the following video tutorial on my YouTube channel. In the video, I explain the examples of this post.

 


 

Furthermore, you may want to read the other posts on my homepage. I have published several articles already.

 

In summary: In this R tutorial you have learned how to list the word frequencies in a text element. Don’t hesitate to let me know in the comments section if you have any additional questions.

 
