The “68–95–99.7” Rule of data…

Jagruti Pawashe
3 min read · Nov 18, 2022


Statistics is the branch of mathematics that deals with collecting, organizing, analyzing, interpreting, and presenting data. It also covers drawing conclusions about a population from sample data collected through surveys or experiments.

What is a normal distribution?

A normal distribution is a symmetric distribution with no skew. When plotted on a graph, the data follow a bell-shaped curve: most values cluster around the center, and values become less frequent the further they are from it.

Normal distributions are also called Gaussian distributions or bell curves because of their shape.

Many variables in nature and science are normally or approximately normally distributed, for example height, birth weight, and reading ability.

Because normally distributed variables are so common, many statistical tests are designed for normally distributed populations.

Understanding the properties of normal distributions means you can use inferential statistics to compare different groups and make estimations about populations using samples.

Properties of Normal Distribution:

Normal distributions have some key characteristics:

  1. The mean, median, and mode are exactly the same.
  2. The distribution of data is symmetric around the mean.
  3. The distribution is fully described by its mean and standard deviation.
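The first two properties can be checked with a short sketch using `NormalDist` from Python's standard `statistics` module. The mean of 170 and standard deviation of 10 (heights in cm) are made-up values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical example: heights in cm with mean 170 and standard deviation 10.
heights = NormalDist(mu=170, sigma=10)

# Property 1: the mean, median, and mode coincide.
print(heights.mean, heights.median, heights.mode)

# Property 2: the curve has the same height at equal distances
# on either side of the mean.
left = heights.pdf(170 - 15)
right = heights.pdf(170 + 15)
print(left, right)
```

Property 3 shows up in the constructor itself: the two arguments `mu` and `sigma` are all that is needed to define the distribution.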

The mean is the location parameter while the standard deviation is the scale parameter.

The mean determines where the peak of the curve is centered. Increasing the mean moves the curve right, while decreasing it moves the curve left.

What is the standard deviation?

The standard deviation measures the typical amount of variability in your dataset. It tells you, roughly, how far values lie from the mean on average.

A high standard deviation means that values are generally far from the mean, while a low standard deviation indicates that values are clustered close to the mean.
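To make this concrete, here is a minimal sketch that computes the standard deviation step by step and compares it with the standard library. The data values are invented for illustration, and the sketch assumes the population formula (divide by n); `statistics.stdev` would give the sample version (divide by n - 1) instead.

```python
from math import sqrt
from statistics import mean, pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, not from the article

mu = mean(values)                                   # center of the data
squared_devs = [(x - mu) ** 2 for x in values]      # squared distance of each value from the mean
sd_by_hand = sqrt(sum(squared_devs) / len(values))  # square root of the average squared distance

sd_library = pstdev(values)                         # population standard deviation from the stdlib

print(mu, sd_by_hand, sd_library)
```

Both approaches give a standard deviation of 2 for this data, around a mean of 5.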

In a normal distribution, the standard deviation stretches or squeezes the curve. A small standard deviation results in a narrow curve, while a large standard deviation leads to a wide curve.
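The effect of both parameters can be seen by comparing the height of the curve at its peak. This is a small sketch with arbitrarily chosen parameters, again using `statistics.NormalDist`.

```python
from math import isclose
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # reference curve
shifted = NormalDist(mu=5, sigma=1)   # larger mean: same shape, moved right
wide = NormalDist(mu=0, sigma=2)      # larger standard deviation: flatter and wider

# Changing the mean moves the peak but keeps its height.
same_height = isclose(standard.pdf(0), shifted.pdf(5))

# Doubling the standard deviation halves the height of the peak,
# because the same total area is spread over a wider range.
half_height = isclose(wide.pdf(0), standard.pdf(0) / 2)

print(same_height, half_height)
```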

Empirical Rule:

The empirical rule, or the 68–95–99.7 rule, tells you where most of the values lie in a normal distribution:

  • Around 68% of values are within 1 standard deviation of the mean.
  • Around 95% of values are within 2 standard deviations of the mean.
  • Around 99.7% of values are within 3 standard deviations of the mean.
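These percentages can be reproduced from the cumulative distribution function of the standard normal distribution. The sketch below uses `statistics.NormalDist`; the variable names are just for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Share of values within k standard deviations of the mean:
# area under the curve between -k and +k.
shares = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD: {share:.2%}")
```

The printed shares are about 68.27%, 95.45%, and 99.73%, which the rule rounds to 68, 95, and 99.7.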

The empirical rule is a quick way to get an overview of your data and check for any outliers or extreme values that don’t follow this pattern.

The remaining 0.3% of values lie in the two tails of the curve, more than 3 standard deviations from the mean (about 0.15% on each side). Values this far out are often treated as outliers or extreme values.
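One common way to apply this is to flag any value more than 3 standard deviations from the mean. The sketch below assumes a made-up list of readings and the population standard deviation. In small samples an extreme value inflates the standard deviation itself, so treat this as a rough screen rather than a definitive test.

```python
from statistics import mean, pstdev

# Hypothetical readings: twenty values near 50 plus one extreme value.
readings = [48, 49, 50, 51, 52] * 4 + [100]

mu = mean(readings)
sigma = pstdev(readings)

# Keep the values that fall outside mean ± 3 standard deviations.
outliers = [x for x in readings if abs(x - mu) > 3 * sigma]

print(outliers)  # [100]
```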

Written by Jagruti Pawashe

Senior Analyst at ImarticusLearning.
