
Image by author | Midjourney and Canva
KDnuggets' sister site, Statology, has a wide range of statistics-related content written by experts, content that has accumulated over a few short years. We decided to help our readers become aware of this great resource for statistics, math, data science, and programming content by organizing and sharing some of its fantastic tutorials with the KDnuggets community.
Learning statistics can be hard. It can be frustrating. And most of all, it can be confusing. That’s why Statology is here to help.
This latest collection of tutorials focuses on data visualization. No data or statistical analysis is complete without data visualization. There are many tools that allow us to better understand our data through visualization, and these tutorials will help us do just that. Learn about these different techniques, then continue reading the Statology archives for more gems.
Box plots
A box plot (also called a box-and-whisker plot) is a chart that shows a five-number summary of a set of data.
The five-number summary includes:
- Minimum
- First quartile
- Median
- Third quartile
- Maximum
A box plot allows us to easily visualize the distribution of values in a data set with one simple graph.
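As a rough illustration of the five numbers a box plot draws, here is a minimal Python sketch. The `scores` sample and the `five_number_summary` helper are made up for this example, and the "inclusive" quartile method is just one of several conventions, so other tools may report slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    # method="inclusive" is an assumption; other quartile conventions
    # can give slightly different Q1 and Q3 values.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical sample, invented for illustration.
scores = [4, 7, 8, 10, 12, 15, 18, 21, 25]
summary = five_number_summary(scores)
print(summary)
```

In a box plot, the box spans Q1 to Q3, the line inside the box marks the median, and the whiskers reach toward the minimum and maximum.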
Stem-and-Leaf Plots: Definition and Examples
A stem-and-leaf plot presents data by dividing each value in the data set into a “stem” and a “leaf.”
This tutorial explains how to create and interpret stem-and-leaf plots.
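To make the stem/leaf split concrete, the sketch below builds a text version of the plot. It assumes non-negative whole numbers, treats the last digit as the leaf and the remaining digits as the stem, and uses a made-up `values` list; the `stem_and_leaf` helper is hypothetical, not something from the tutorial.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot for non-negative integers.

    The stem is every digit except the last; the leaf is the last digit.
    """
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)

    rows = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows


# Hypothetical data, invented for illustration.
values = [12, 15, 21, 24, 24, 28, 40, 43]
for row in stem_and_leaf(values):
    print(row)
```

Reading a row such as `2 | 1 4 4 8` gives back the original values 21, 24, 24, and 28.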
Scatter plots
Scatter plots are used to show the relationship between two variables.
Suppose we have the following dataset showing the weight and height of the players on a basketball team:


The two variables in this data set are height and weight.
To create a scatter plot, we place height on the x-axis and weight on the y-axis. Each player is then represented as a dot on the scatter plot:


Scatter plots help us see the relationship between two variables. In this case, we see that height and weight have a positive relationship. As height increases, weight also tends to increase.
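One way to check a positive relationship numerically is the Pearson correlation coefficient, sketched below. The heights and weights are hypothetical values with assumed units (inches and pounds), not the basketball data set from the tutorial, and the optional `draw_scatter` helper assumes matplotlib is installed.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def draw_scatter(xs, ys):
    """Optional: draw the scatter plot (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt

    plt.scatter(xs, ys)  # one dot per player
    plt.xlabel("Height (inches)")
    plt.ylabel("Weight (pounds)")
    plt.show()


# Hypothetical players, invented for illustration.
heights = [72, 74, 75, 77, 79, 81]
weights = [180, 190, 195, 210, 220, 235]

r = pearson_r(heights, weights)
print(f"r = {r:.2f}")  # a value above 0 indicates a positive relationship
```

Calling `draw_scatter(heights, weights)` would place height on the x-axis and weight on the y-axis, matching the layout described above.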
Relative Frequency Histogram: Definition + Example
Frequently in statistics you will come across tables that display information about frequencies. Frequencies simply tell us how many times a certain event occurred.
For example, the table below shows how many items a particular store sold during the week based on the price of the item:
This type of table is known as a frequency table. In one column we have the “class” and in the other column the frequency of the class.
We often use frequency histograms to visualize the values in a frequency table because it is usually easier to understand the data when we can visualize the numbers.
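A relative frequency is a class's frequency divided by the total of all frequencies, so a relative frequency histogram plots shares instead of raw counts. The sketch below shows the conversion and a rough text histogram; the `items_sold` table is hypothetical and is not the store data from the tutorial.

```python
def relative_frequencies(frequency_table):
    """Convert a {class: count} table into {class: share of the total}."""
    total = sum(frequency_table.values())
    return {label: count / total for label, count in frequency_table.items()}


# Hypothetical frequency table: item price class -> number of items sold.
items_sold = {
    "$1 - $10": 10,
    "$11 - $20": 20,
    "$21 - $30": 12,
    "$31 - $40": 6,
    "$41 - $50": 2,
}

shares = relative_frequencies(items_sold)
for label, share in shares.items():
    bar = "#" * round(share * 50)  # bar length is proportional to the share
    print(f"{label:>9} | {bar} {share:.0%}")
```

Because every share is a fraction of the same total, the bars keep the same shape as a frequency histogram and the shares add up to 1.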
What are Density Curves? (Explanation and Examples)
A density curve is a curve on a graph that shows the distribution of values in a data set. It is useful for three reasons:
- The density curve gives us a good idea of the “shape” of the distribution, including whether the distribution has one or more “peaks” of frequently occurring values, and whether the distribution is skewed to the left or right.
- The density curve allows us to visually see where the mean and median of a distribution lie.
- The density curve allows us to visually see what percentage of observations in a data set fall between different values.
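The third point can be sketched in code: the area under a density curve between two values approximates the share of observations in that range, and the total area is 1. The example below uses a Gaussian kernel density estimate, which is one common way to draw such a curve but is an assumption here, since the tutorial does not name a method. The sample and the hand-picked bandwidth are hypothetical.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a density function estimated from sample with Gaussian kernels."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        kernels = (math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
        return sum(kernels) / norm

    return density


def area_under(density, low, high, steps=2000):
    """Approximate the area under density between low and high (trapezoid rule)."""
    width = (high - low) / steps
    total = 0.5 * (density(low) + density(high))
    total += sum(density(low + i * width) for i in range(1, steps))
    return total * width


# Hypothetical sample, invented for illustration.
sample = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2, 4.4, 5.1, 6.3, 7.8]
density = gaussian_kde(sample, bandwidth=0.8)  # bandwidth chosen by hand

total_area = area_under(density, -5, 15)   # wide range, should be close to 1
share_3_to_5 = area_under(density, 3, 5)   # estimated share between 3 and 5
print(f"total area: {total_area:.3f}")
print(f"estimated share between 3 and 5: {share_3_to_5:.1%}")
```

The estimated share depends on the bandwidth: a smaller bandwidth follows the sample more closely, while a larger one gives a smoother curve.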
For more content like this, keep an eye on the Statology website and sign up for their weekly newsletter to make sure you don’t miss a thing.
Matthew Mayo (@mattmayo13) holds a Master’s degree in Computer Science and a postgraduate diploma in Data Mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge within the data science community. Matthew has been coding since he was 6 years old.

