R Programming Language (Analysis Software for Statistics & Data Science)

R is a programming language and software environment that is becoming increasingly popular in statistics and data science.

R is a dialect of the S programming language. Ross Ihaka and Robert Gentleman began developing it in the early 1990s, it became free software in 1995, and the first stable version (1.0.0) was released in 2000.

The R software is completely free and is developed collaboratively by its community (open source software). Every R user can publish new add-on packages.

The open source approach of R stands in sharp contrast to most traditional statistical software (e.g. SAS, SPSS, or Stata), where development is in the hands of a paid development team.

All R Programming Tutorials on Statistics Globe

In the following, you can find a list of R tutorials on statisticsglobe.com. In the tutorials, I explain statistical concepts and provide reproducible example code in R.

 

 

The Increasing Popularity of R Programming

Because the R programming language provides features for almost all statistical tasks at no cost to the user, it has grown rapidly since its release. Let’s check some numbers…

 

Graphic 1: Google Scholar Search Results for R Programming Filtered by Year

 

Reasons to Learn R

The pros:

+ R is free

+ R’s popularity is growing – more and more people are using it

+ Almost all statistical methods are available in R

+ New methods are implemented in add-on packages quickly

+ Algorithms for packages and functions are publicly available (transparency and reproducibility)

+ R provides a huge variety of graphical outputs

+ R is very flexible – Essentially everything can be modified for your personal needs

+ R is compatible with all major operating systems (e.g. Windows, macOS, or Linux)

+ R has a huge community that helps each other in forums (e.g. Stack Overflow)

+ R is fun 🙂
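
To give a feel for the points on statistical methods and graphics, here is a minimal sketch that uses only base R. It assumes the `mtcars` example data set that ships with R; the choice of variables (`mpg` and `wt`) and the object names `mpg_mean`, `mpg_sd`, and `fit` are only for illustration.

```r
# Load the built-in example data set (ships with base R)
data(mtcars)

# Descriptive statistics for fuel consumption (miles per gallon)
mpg_mean <- mean(mtcars$mpg)
mpg_sd <- sd(mtcars$mpg)

# Linear regression of fuel consumption on car weight
fit <- lm(mpg ~ wt, data = mtcars)
summary(fit)

# Scatterplot of the data with the fitted regression line
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit, col = "#1b98e0", lwd = 2)
```

No add-on package is needed for this example; the same steps could be extended with packages once the base tools are no longer sufficient.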

 

The cons:

– Relatively steep learning curve at the beginning (even though it’s worth it)

– No systematic validation of new packages and functions

– No company in the background that takes responsibility for errors in the code (this is especially important for public institutions)

– R is almost exclusively based on programming (no extensive drop-down menus such as in SPSS)

– R can have problems with computationally intensive tasks (only important for advanced users)
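
Regarding the last point: how fast R code runs often depends on how it is written. The following is a minimal sketch, assuming the slow part of a task is an explicit loop over a vector. The function names `sum_sq_loop` and `sum_sq_vec` are hypothetical and used only for this illustration. Timings depend on the machine and R version, so none are shown here.

```r
set.seed(123)                        # Make the random numbers reproducible
x <- rnorm(1e5)                      # Example data

# Sum of squares computed element by element in an explicit loop
sum_sq_loop <- function(v) {
  total <- 0
  for (value in v) {
    total <- total + value^2
  }
  total
}

# Same computation written with vectorized functions
sum_sq_vec <- function(v) {
  sum(v^2)
}

# Both versions return the same result (up to floating point tolerance)
all.equal(sum_sq_loop(x), sum_sq_vec(x))

# Compare the run times on your own machine
system.time(sum_sq_loop(x))
system.time(sum_sq_vec(x))
```

Vectorized code is usually the first thing to try before reaching for more advanced tools.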

 

Not sure yet whether you should learn the R programming language? In that case, I can recommend the following video from the YouTube channel RenegadeThinking. The speaker gives many reasons why it is advisable to learn R.

 

 

Appendix

Appendix 1: R code for the creation of Graphic 1

library("ggplot2")                                      # Provides ggplot()
library("scales")                                       # Provides comma()

year <- 2018:2000                                       # Years

r_gs <- c(21600 * 2, 43300, 43100, 38100, 33200, 29800, # Google Scholar searches
          28500, 25500, 22400, 19100, 15900, 12000,
          8270, 5930, 3740, 2600, 1980, 1600, 1360)

data <- data.frame(software = rep("R", 19),             # Combine data
                   year = year,
                   searches = r_gs)

ggplot(data) +                                          # Create plot
  geom_point(aes(x = year, y = searches,
                 color = software, shape = software)) +
  geom_line(aes(x = year, y = searches,
                color = software)) +
  theme(legend.title = element_blank(),
        legend.position = "none") +
  ggtitle("Google Scholar Search Results") +
  labs(x = "Year", y = "Search Results") +
  scale_y_continuous(labels = comma)

 

Appendix 2: How to create the header graphic of this page

par(mar = c(0, 0, 0, 0))                           # Remove space around plot
par(bg = "#1b98e0")                                # Set background color
 
set.seed(10293847)                                 # Seed
 
N <- 100000                                        # Sample size
x <- rnorm(N)                                      # X variable
y <- rnorm(N) + x                                  # Correlated Y variable
 
plot(x, y, col = "#353436", pch = 19, cex = 0.1,   # Create plot
     xlim = c(-4, 4), ylim = c(-7, 7))
text(0, 0, "R", col = "#1b98e0", cex = 12)         # Write R
points(0, 0, col = "#1b98e0", cex = 30, lwd = 5)   # Create circles
points(0, 0, col = "#1b98e0", cex = 50, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 70, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 90, lwd = 5)
points(0, 0, col = "#1b98e0", cex = 110, lwd = 5)
box(col = "#1b98e0")                               # Color of box

 

 
