Statistics Tutorials: The Normal Distribution and the Central Limit Theorem
There must be a reason why the normal distribution is SO popular. I mean, if we consider that a normal distribution with mean \(\mu\) and variance \({{\sigma }^{2}}\) has a density function like the one shown below
\[f\left( x \right)=\frac{1}{\sqrt{2\pi {{\sigma }^{2}}}}\exp \left( -\frac{{{\left( x-\mu \right)}^{2}}}{2{{\sigma }^{2}}} \right)\]
then one must conclude that its popularity is not exactly due to the simplicity of its density function.
Manipulating the Normal Distribution
Indeed, Stats students dread having to deal with the algebraic manipulation of the normal distribution because, granted, it can be cumbersome. For example, the function \(f\left( x \right)\) presented above is indeed a density, since it can be proven (although it is not elementary to do so) that
\[\int\limits_{-\infty }^{\infty }{\frac{1}{\sqrt{2\pi {{\sigma }^{2}}}}\exp \left( -\frac{{{\left( x-\mu \right)}^{2}}}{2{{\sigma }^{2}}} \right)dx}=1\]
And since \(\mu\) and \({{\sigma }^{2}}\) are the mean and the variance of this density \(f\left( x \right)\), we must then have that
\[\int\limits_{-\infty }^{\infty }{\frac{x}{\sqrt{2\pi {{\sigma }^{2}}}}\exp \left( -\frac{{{\left( x-\mu \right)}^{2}}}{2{{\sigma }^{2}}} \right)dx}=\mu\]
and
\[\int\limits_{-\infty }^{\infty }{\frac{{{x}^{2}}}{\sqrt{2\pi {{\sigma }^{2}}}}\exp \left( -\frac{{{\left( x-\mu \right)}^{2}}}{2{{\sigma }^{2}}} \right)dx}={{\mu }^{2}}+{{\sigma }^{2}}\]
which are not trivial to prove (especially the last one). So, yes, the normal distribution is hard to deal with algebraically. But then, why is it so popular?
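Even if the proofs are not trivial, the three integrals above can be checked numerically. The following is a minimal sketch, not a proof: it uses a simple midpoint rule, and the values of `mu` and `sigma`, the helper names and the 12-sigma integration window are all illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi * sigma ** 2)

def integrate(g, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 72.0, 8.0  # illustrative values (assumption)
# The density is negligible more than 12 standard deviations from the mean,
# so a finite window stands in for (-infinity, infinity).
lo, hi = mu - 12 * sigma, mu + 12 * sigma

total = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
second_moment = integrate(lambda x: x * x * normal_pdf(x, mu, sigma), lo, hi)

print(total)          # close to 1
print(mean)           # close to mu
print(second_moment)  # close to mu**2 + sigma**2
```

A numerical check like this only confirms the identities for the chosen parameters; the general statements still require the analytic argument.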
Standard Normal Distribution and Z-scores
One good reason, which is probably strong enough on its own, is that via a very simple standardization process, we can reduce ANY normal distribution \(N\left( \mu ,{{\sigma }^{2}} \right)\) to the standard normal distribution, which is the normal distribution that has a mean of zero and a standard deviation of 1, or \(N\left( 0,1 \right)\). The standardization consists of converting the original variable X to z-scores using the following expression:
\[Z=\frac{X-\mu }{\sigma }\]
Indeed, it can be proven that if X has a normal distribution with mean \(\mu\) and variance \({{\sigma }^{2}}\), \(N\left( \mu ,{{\sigma }^{2}} \right)\), then \(Z\) defined as
\[Z=\frac{X-\mu }{\sigma}\]
also has a normal distribution, but with mean 0 and standard deviation 1. This little reduction turns out to be EXTREMELY useful, because by using it we can reduce the calculation of ANY normal distribution probability to the calculation of a probability for the standard normal distribution. Have you ever wondered why the back of Stats textbooks comes with normal distribution tables ONLY for the standard normal distribution? It is because all normal distributions can be reduced to the standard normal distribution, via z-scores, and it would be impossible to print out tables for ALL possible normal distributions.
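The reduction can be sketched in a few lines of Python. In this sketch the function names and the parameter values are illustrative assumptions; the only "table" used is the standard normal cumulative probability, written here through the error function `math.erf`. The result is compared with a direct computation from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def standard_normal_cdf(z):
    """Pr(Z < z) for Z ~ N(0, 1), expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """Pr(X < x) for X ~ N(mu, sigma^2), computed only through the z-score."""
    z = (x - mu) / sigma
    return standard_normal_cdf(z)

# Illustrative parameters (assumption, not taken from the text)
mu, sigma, x = 10.0, 3.0, 12.5

via_z_score = normal_cdf(x, mu, sigma)   # standardize, then use N(0, 1)
direct = NormalDist(mu, sigma).cdf(x)    # work with N(mu, sigma^2) directly

print(via_z_score, direct)  # the two values agree up to rounding error
```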
Example: Assume that the mean weight of children in fifth grade is 72 pounds, with a standard deviation of 8 pounds, and that the weights follow a normal distribution. Compute the probability that a randomly chosen child weighs less than 75.5 pounds.
Solution: Observe that the event \(X<75.5\) can be expressed equivalently as
\[X-72<75.5-72\]
Why? Because we simply subtracted 72 from both sides of the inequality, which does not change the solutions of the inequality. Along the same reasoning, I can divide both sides by 8 (a positive number, so the direction of the inequality is preserved) to get an equivalent event
\[\frac{X-72}{8}<\frac{75.5-72}{8}\]
PLEASE, DON'T GET CONFUSED HERE: All we are saying is that if X is a solution of \(X<75.5\), then X is also a solution of \(X-72<75.5-72\), and then X is also a solution of \(\frac{X-72}{8}<\frac{75.5-72}{8}\). And conversely, if X is a solution of \(\frac{X-72}{8}<\frac{75.5-72}{8}\), then X is also a solution of \(X-72<75.5-72\) and X is also a solution of \(X<75.5\). That is what we mean when we say that the events \(\left\{ X<75.5 \right\}\), \(\left\{ X-72<75.5-72 \right\}\) and \(\left\{ \frac{X-72}{8}<\frac{75.5-72}{8} \right\}\) are EQUIVALENT (that is, they define the same set of solutions).
Therefore, in this example, we need to compute the following probability:
\[\Pr \left( X<75.5 \right)=\Pr \left( \frac{X-72}{8}<\frac{75.5-72}{8} \right)=\Pr \left( Z<0.4375 \right)=0.6691\]
As you can see, starting with a certain normal distribution, I made the transformation to get an equivalent event that involves a Z-score, and then I can use any standard normal distribution table (or Excel) to compute the final probability.
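For readers who prefer software to a printed table, the same computation can be reproduced with Python's standard library. This is only a sketch of the example above; `NormalDist()` with no arguments is the standard normal distribution, and the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 72.0, 8.0   # mean and standard deviation from the example
x = 75.5                # weight threshold in pounds

z = (x - mu) / sigma           # z-score of the threshold
p = NormalDist().cdf(z)        # Pr(Z < z) for the standard normal

print(z)            # 0.4375
print(round(p, 4))  # 0.6691
```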
The Central Limit Theorem (CLT)
If the above was not a strong enough reason for you to LOVE the normal distribution (in spite of its cumbersome algebraic shape), I'll give you a reason you cannot resist. It turns out that there are many types of probability distributions (I mean, MANY), which can have completely different properties from the normal distribution. But, if you take repetitions of a random variable, from ANY distribution, and you compute their average, those averages will have a distribution that (guess what?) strikingly resembles a normal distribution, especially when the sample size (number of repetitions) is large.
So then, when we take averages of samples of values coming from ANY probability distribution and analyze the distribution of those averages, we start seeing a normal distribution (when the sample size is large). Somehow, taking averages bends the original shape of the distribution and turns it into normal, REGARDLESS of the underlying distribution. This fact is one of the most amazing discoveries in Statistics, with early versions due to Abraham de Moivre and Pierre-Simon Laplace (the normal distribution itself is often called Gaussian, after Carl Friedrich Gauss). A word of caution: the Central Limit Theorem has a formal statistical formulation, which we won't include here. It requires, among other things, independent repetitions and an underlying distribution with finite variance, and it states that the suitably standardized sample averages CONVERGE to a normal distribution, in a certain probability sense. Without entering into too many technicalities, that means that for most cases, the sample averages have an APPROXIMATE normal distribution for a sufficiently large sample size. It is all too common for instructors to give the wrong interpretation by saying that the distribution of sample averages BECOMES a normal distribution, which is not true in general (actually, it is only true when the underlying original distribution is normal).
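A small simulation can make this tangible. The sketch below is an illustration under stated assumptions, not a formal statement of the theorem: it draws from an exponential distribution with mean 1 (a strongly skewed, clearly non-normal choice made only for this example), and it uses the sample skewness of the averages as a rough measure of "how non-normal" they look, since a normal distribution has skewness 0. The function names, the seed and the sample sizes are all hypothetical choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Average of n independent draws from an exponential distribution with mean 1."""
    return statistics.fmean([random.expovariate(1.0) for _ in range(n)])

def skewness(values):
    """Sample skewness: near 0 for a symmetric shape such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

n_repeats = 5_000   # number of averages computed for each sample size
skew_by_n = {}
for n in (1, 5, 50):
    means = [sample_mean(n) for _ in range(n_repeats)]
    skew_by_n[n] = skewness(means)
    # The skewness of the averages shrinks toward 0 as n grows
    print(n, round(skew_by_n[n], 2))
```

With a sample size of 1 the "averages" are just the original skewed values; as the sample size grows, the skewness moves toward zero, which is the approximate normality described above. The approximation improves with the sample size but is never exact for this underlying distribution.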
So that is why the normal distribution is highly cherished: it has this kind of magic property that if you take averages from any distribution, you will end up with something that looks fairly normal, provided the sample size is large enough.