
Image by author | Midjourney and Canva
KDnuggets' sister site, Statology, has a wide range of statistics-related content written by experts, content that has accumulated over just a few short years. We decided to help our readers become aware of this great source of statistics, math, data science, and programming content by organizing and sharing some of its fantastic tutorials with the KDnuggets community.
Learning statistics can be complex. It can be frustrating. And most of all, it can be confusing. That’s why Statology is here to help.
This collection of tutorials covers the incredibly important topic of describing data. Whenever we try to understand our data, it’s essential to be able to describe it in a specific way. These same description tools are useful for sharing the summary aspects of our data with others. Mastering the following common data description methodologies is key to better understanding your data, and to better understanding the rest of the content on Statology.
Measures of Central Tendency: Definition and Examples
A measure of central tendency is a single value that represents the center point of a data set. This value can also be called the “central location” of the data set.
There are three measures of central tendency commonly used in statistics:
- Mean
- Median
- Mode
Each of these measures finds the central location of a data set using a different method. Depending on the type of data being analyzed, one of these three measures may be better to use than the other two.
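As a minimal sketch of how the three measures can differ on the same data, the following uses Python's standard `statistics` module. The data values are hypothetical and chosen only to illustrate the effect of one large value on the mean.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10, 40]

mean = statistics.mean(data)      # arithmetic average; pulled upward by the 40
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the single large value pulls the mean well above the median, which is one reason the median is often preferred when a data set is skewed or contains outliers.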
Dispersion Measures: Definition and Examples
When we analyze a data set, we are often interested in two things:
- Where is the “middle” value? We often measure the “middle” using the mean and median.
- How “spread out” are the values? We measure “spread” using the range, interquartile range, variance, and standard deviation.
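The four measures of spread listed above can be sketched with Python's standard `statistics` module. The data values are hypothetical, and the quartiles rely on the module's default "exclusive" method, so other tools may report a slightly different interquartile range for the same data.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [4, 8, 15, 16, 23, 42]

# Range: distance between the largest and smallest values.
data_range = max(data) - min(data)

# Interquartile range: distance between the third and first quartiles.
# quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Sample variance and standard deviation (both divide by n - 1).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

print(f"range={data_range}, IQR={iqr}, variance={variance}, std dev={std_dev:.2f}")
```

If the data represent an entire population rather than a sample, `statistics.pvariance` and `statistics.pstdev` divide by n instead.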
SOCS: A Helpful Acronym for Describing Distributions
In statistics, we are often interested in understanding how a data set is distributed. In particular, there are four things worth knowing about a distribution:
1. Shape
Is the distribution symmetrical or skewed to one side?
Is the distribution unimodal (one peak) or bimodal (two peaks)?
2. Outliers
Are there any outliers in the distribution?
3. Center
What are the mean, median, and mode of the distribution?
4. Spread
What are the range, interquartile range, standard deviation, and variance of the distribution?
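The four SOCS questions can be roughed out in code. The sketch below is one possible approach, not a Statology method: `socs_summary` is a hypothetical helper, the sign of a moment-based skewness stands in for shape, and outliers are flagged with the common 1.5 × IQR convention, which is an assumption here. It does not detect the number of peaks; a histogram is a better tool for that.

```python
import statistics

def socs_summary(values):
    """Rough SOCS summary of a list of numbers (hypothetical helper)."""
    mean = statistics.mean(values)
    pop_sd = statistics.pstdev(values)

    # Shape: moment-based skewness. Positive suggests a longer right tail,
    # negative a longer left tail, near zero suggests rough symmetry.
    if pop_sd:
        skew = sum((x - mean) ** 3 for x in values) / (len(values) * pop_sd ** 3)
    else:
        skew = 0.0

    # Outliers: assumed 1.5 * IQR rule around the first and third quartiles.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    return {
        "shape_skewness": skew,
        "outliers": [x for x in values if x < low or x > high],
        "center": {
            "mean": mean,
            "median": statistics.median(values),
            "modes": statistics.multimode(values),
        },
        "spread": {
            "range": max(values) - min(values),
            "iqr": iqr,
            "stdev": statistics.stdev(values),
            "variance": statistics.variance(values),
        },
    }

# Hypothetical sample, chosen only for illustration.
summary = socs_summary([2, 3, 3, 5, 7, 10, 40])
for part, result in summary.items():
    print(part, result)
```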
For more content like this, keep an eye on the Statology website and sign up for their weekly newsletter to make sure you don’t miss a thing.
Matthew Mayo (@mattmayo13) holds a Master’s degree in Computer Science and a postgraduate diploma in Data Mining. As Managing Editor of KDnuggets & Statology, and Contributing Editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging artificial intelligence. He is driven by a mission to democratize knowledge within the data science community. Matthew has been coding since he was 6 years old.
