Statistical Methods for Data Analysis | Research Techniques & Applications
“The sexy job in the next 10 years will be statisticians!” Hal Varian, chief economist at Google, could not have been more right when he made this prediction in 2009.
Knowledge of statistical methods for analyzing large data sets is becoming more and more important for a modern curriculum vitae.
On statisticsglobe.com, you can learn the techniques that are currently used in statistics and data science research and, even more importantly, how to apply these methods with modern statistical software such as R or Python.
List of Statistics Tutorials
Below you can find a list of the statistics and data science tutorials that I have published on statisticsglobe.com. At the moment, the tutorials mainly cover the handling of missing data and related topics. However, I will add further topics to the list in the near future. In the tutorials, I explain the theoretical concepts and show some practical applications of the different methods.
The Most Important Methods in Statistics & Data Science
Admittedly, the list of available statistical methods is huge. As a beginner, it therefore makes sense to learn some of the most important techniques first and then move on from there.
If you want to get a first overview of some of the most important statistical concepts, I can recommend the following video tutorial from the YouTube channel The Doctoral Journey. The speaker, Dr. Amanda J. Rockinson-Szapkiw, explains some basic descriptive and inferential methods.
Hi, please can you do a tutorial on machine learning?
Thanks a lot for your comment!
I have been planning to create machine learning tutorials for a very long time. Unfortunately, I have never found the time to do it.
However, this is definitely something that I have planned for the future!
I’ll keep you updated.
How do I get the R software for an HP laptop? I downloaded it, but it does not open on my laptop.
This is difficult to tell without seeing your laptop. Have you also installed RStudio?
How do you install R?
You can find detailed instructions here: https://courses.edx.org/courses/UTAustinX/UT.7.01x/3T2014/56c5437b88fa43cf828bff5371c6a924/
Really a wonderful and useful educational site. Well done!
Thank you very much for the wonderful feedback! Great to hear that you like my tutorials 🙂
I enjoy your very clear tutorials and refer my students to them. I realize that at this stage you are offering basic statistics tutorials, so it will probably take a while until you get to the following topic, but I wonder if you can recommend an R package that performs a non-parametric repeated measures ANOVA with at least three within factors and one between factor. Thanks!
First of all thanks a lot for the very kind feedback, and for sharing my tutorials with your students! This is great to hear. 🙂
Regarding your question: Unfortunately I don’t have such a tutorial yet, and I’m also not an expert on this topic.
However, you may ask your question in the Statistics Globe Facebook group: https://www.facebook.com/groups/statisticsglobe
Many skilled people participate in this group, so I think it’s very likely that somebody has a good suggestion.
By the way, this group might also be a good recommendation for your students in case they have any statistics or R-related questions.
I hope that helps!
Hello Joachim, could you do a tutorial on how to combine two NetCDF files into one continuous file?
I have no experience with NetCDF files myself, but I have found this thread on Stack Overflow, which seems to answer your question: https://stackoverflow.com/questions/45096730/merge-netcdf-files-in-r
Hey Joachim, can you help us with a tutorial about Mixed Models?
Thank you for the topic suggestion! Unfortunately, I’m not an expert on mixed models, but I’ll put it on my to-do list for future tutorials.
I am at the very beginning in learning programming with R.
Your examples are precise, short and clear.
They helped me many times.
Big thanks for that!
Thank you very much for the wonderful feedback, glad to hear that my tutorials are helpful to you!
Thank you very much for spreading this knowledge to everyone. Just a small request: if you prepare some videos on bioinformatics data analysis and on genotypic and phenotypic data analysis with examples, it will be a great treat for life science students and researchers.
Thank you very much for the kind words and the topic suggestion!
I’m not an expert in this field, but I will hopefully be able to do some collaborations on this in the future. 🙂
I hope you are doing great. I’m Dr. Khalid, working in China. I have a small problem regarding a statistical technique called the “Panel Frequency Domain Test” (Croux and Reusens 2013), for a single country and for multiple countries. I do not know how to run it in Stata or R. I hope you can help or guide me in this matter. I have watched lots of videos but could not find the procedure. Thanks.
Looking forward to hearing from you.
Thank you for your comment and your question. Unfortunately, I’m not an expert on this topic. However, I have recently created a Facebook discussion group where people can ask questions about R programming and statistics. Could you post your question there? This way, others can contribute/read as well: https://www.facebook.com/groups/statisticsglobe
Is it possible to draw the points, the regression line, the confidence band, and the identity function (i.e., y = x) in the same graph, so that if the identity function and the regression line coincide, the identity line is in the foreground?
You can basically add as many layers to a plot as you want. I recommend using the ggplot2 package for this. For example, this tutorial explains how to add a regression line to a ggplot2 scatterplot.
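ggplot2 is an R package. As a rough illustration of the same layering idea in Python, the sketch below draws the confidence band, the points, the regression line, and the identity line as separate layers with matplotlib and puts the identity line on top via `zorder`. The function names `fit_with_band` and `plot_layers` and the output file name are hypothetical, and the band uses a normal quantile as a large-sample stand-in for the t quantile, so please treat it as a sketch under these assumptions rather than a definitive implementation.

```python
import statistics
from math import sqrt


def fit_with_band(x, y, level=0.95):
    """Least-squares line plus an approximate pointwise confidence band."""
    n = len(x)
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    # Residual standard error with n - 2 degrees of freedom
    s = sqrt(sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - 2))
    # Assumption: normal quantile instead of the exact t quantile
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half = [z * s * sqrt(1 / n + (xi - mean_x) ** 2 / sxx) for xi in x]
    lower = [f - h for f, h in zip(fitted, half)]
    upper = [f + h for f, h in zip(fitted, half)]
    return slope, intercept, lower, upper


def plot_layers(x, y, path="regression_identity.png"):
    """Draw band, points, regression line, and y = x, identity line on top."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    # Sort by x so the band and the lines are drawn from left to right
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    slope, intercept, lower, upper = fit_with_band(xs, ys)

    fig, ax = plt.subplots()
    # A higher zorder is drawn later, i.e. closer to the foreground
    ax.fill_between(xs, lower, upper, alpha=0.3, zorder=1, label="confidence band")
    ax.scatter(xs, ys, zorder=2, label="data")
    ax.plot(xs, [intercept + slope * v for v in xs], zorder=3, label="regression line")
    ax.plot([xs[0], xs[-1]], [xs[0], xs[-1]], linestyle="--", color="black",
            zorder=4, label="y = x")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_layers(x, y)` with two equally long lists of numbers writes the figure to a PNG file. Because the identity line has the highest `zorder` and the band is semi-transparent, the identity line stays visible wherever it coincides with the regression line or the confidence band.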
Correction: Sorry, I meant if the identity line and the confidence band coincide somewhere, then the identity line will still be visible.
Please have a look at my response to your other comment.
I generated a random sample from a proposed model and applied the maximum likelihood method for type I censored data.
The program outputs the estimate of one parameter, but it does not return the remaining estimates and gives a warning.
Could you please help me solve this error? I am sending you the program and all trials.
I’m sorry for the delayed reply. I was on a long vacation, so unfortunately I wasn’t able to get back to you earlier. Do you still need help with your syntax?
Enjoy! I am waiting for anyone to respond to me.
I’m not an expert on this, so I’m not sure if I’m the right person to ask. However, you may share your code here in the comments, and I could have a look.
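For orientation only, here is a minimal Python sketch of maximum likelihood under type I censoring for the simplest case, an exponential model with a single rate parameter. This is an assumption for illustration and not the proposed model from the comment, which has several parameters and will need its own likelihood. The function names `exponential_mle_type1` and `simulate_type1` are hypothetical. For the exponential model, the estimate has a closed form: the number of uncensored events divided by the total recorded time.

```python
import random


def exponential_mle_type1(times, censor_time):
    """Closed-form ML estimate of an exponential rate under type I censoring.

    `times` holds the recorded values: the event time if it occurred before
    `censor_time`, otherwise `censor_time` itself.
    """
    observed = sum(1 for t in times if t < censor_time)  # uncensored events
    exposure = sum(times)  # total time at risk, censored units included
    if observed == 0:
        raise ValueError("no uncensored observations: the estimate is undefined")
    return observed / exposure


def simulate_type1(rate, n, censor_time, seed=1):
    """Draw n exponential lifetimes and censor them at a fixed time."""
    rng = random.Random(seed)
    return [min(rng.expovariate(rate), censor_time) for _ in range(n)]


# Illustrative values: true rate 0.5, observation stops at time 3.0
sample = simulate_type1(rate=0.5, n=2000, censor_time=3.0)
print(exponential_mle_type1(sample, censor_time=3.0))
```

Comparing the printed estimate with the rate used in the simulation is a quick sanity check that the censoring is handled correctly before moving on to a model with more parameters.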
Can you post a video on how to link a Twitter account with API keys for sentiment analysis?
Thank you for the topic suggestion! I’m not an expert on this, but this might be an interesting topic for the future.