Statistics Tutorials - Notation in Inferential Statistics

Statistics Tutorials: The Use of Notation in Basic Statistics - Part II

This is a follow-up to the previous section, where the most common notation for descriptive statistics was presented. It is crucial to understand how notation is used, because notation in Math and Statistics works as a shortcut. If you do not understand what a symbol means, you will soon be lost and unable to follow what is being discussed.

In the following paragraphs we continue this series, attempting to clarify the use of notation in Inferential Statistics, where the notation is more abundant and more sophisticated, so it pays to read carefully.

Notation in Inferential Statistics

The following symbols and notations are commonly used when working with Inferential Statistics. You will keep seeing these symbols throughout most of your Statistics class.

· $\mu$: This is the generic symbol that represents the population mean. It is a parameter, because it is a constant that is not constructed from sample information. Sometimes $\mu$ carries a sub-index to indicate which variable's population mean we are talking about. For example, ${{\mu }_{X}}$ refers to the population mean of the random variable $X$. In general terms, if $f\left( x \right)$ is the distribution (density) of the random variable $X$, the population mean is computed with the following expression:

\[{{\mu }_{X}}=\int\limits_{-\infty }^{\infty }{x\,f\left( x \right)dx}\]

in the case of a continuous random variable, or

\[{{\mu }_{X}}=\sum\limits_{k}{{{x}_{k}}f\left( {{x}_{k}} \right)}\]

for the case of a discrete distribution, where $f$ is the probability function of $X$.

One thing to keep in mind: although $\mu$ is the generic symbol for the population mean, certain distributions customarily use different symbols. For example, if $X$ is a Poisson random variable, the tradition is to use $\lambda$ as the symbol for the population mean. The important point is that this is only notation, that is, a CONVENTION.
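To make the two expressions for the population mean concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the Poisson and exponential distributions, the parameter values, the truncation point `k_max`, and the midpoint-rule integration are choices made for this example and are not part of the notation itself.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """f(x_k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def discrete_mean(lam: float, k_max: int = 100) -> float:
    """Approximate mu_X = sum_k x_k f(x_k), truncating the infinite sum at k_max.

    Assumption: k_max is large enough that the remaining tail is negligible.
    """
    return sum(k * poisson_pmf(k, lam) for k in range(k_max + 1))


def exponential_pdf(x: float, rate: float) -> float:
    """Density of an exponential random variable (zero for negative x)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def continuous_mean(rate: float, upper: float = 50.0, steps: int = 200_000) -> float:
    """Approximate mu_X = integral of x f(x) dx with the midpoint rule on [0, upper].

    Assumption: the density is negligible beyond `upper`.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += x * exponential_pdf(x, rate) * h
    return total


mu_poisson = discrete_mean(3.0)        # should be close to lambda = 3
mu_exponential = continuous_mean(2.0)  # should be close to 1 / rate = 0.5
print(mu_poisson, mu_exponential)
```

In both cases the result is a fixed number that depends only on the distribution, not on any sample, which is exactly why $\mu$ is called a parameter.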

· ${{\sigma }^{2}}$: This is the population variance, which for a continuous random variable with density $f\left( x \right)$ is computed as

\[{{\sigma }^{2}}=\int\limits_{-\infty }^{\infty }{{{x}^{2}}\,f\left( x \right)dx}-{{\mu }^{2}}=\int\limits_{-\infty }^{\infty }{{{x}^{2}}\,f\left( x \right)dx}-{{\left( \int\limits_{-\infty }^{\infty }{xf\left( x \right)dx} \right)}^{2}}\]

This is a population parameter, because it is a fixed number (not a random variable) that is not constructed from sample information. As with the population mean, it is customary to add a sub-index to indicate the underlying variable. That is, $\sigma _{X}^{2}$ represents the population variance of the random variable $X$, whereas $\sigma _{Y}^{2}$ represents the population variance of the random variable $Y$.

Again, as in the previous case, this is the most common NOTATION (or shortcut, if you will) for the population variance, but there are cases where the tradition is to use something else. For example, if $X$ has a Poisson distribution, we mentioned before that the population mean is written as $\lambda$, and it turns out that the population variance is equal to $\lambda$ as well. In that case, we would write $\sigma _{X}^{2}=\lambda$. So, please, do not confuse the notation part of $\sigma _{X}^{2}=\lambda$ (the symbol on the left is just a name for the population variance) with the calculation part (the variance of a Poisson variable works out to $\lambda$).
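The claim that a Poisson variable has $\sigma _{X}^{2}=\lambda$ can be checked numerically. The sketch below uses the discrete analogue of the variance formula (sums in place of integrals). The value $\lambda = 4.5$ and the truncation point `k_max` are assumptions chosen only for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) random variable equals k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def poisson_mean_and_variance(lam: float, k_max: int = 100) -> tuple[float, float]:
    """Return (mean, variance) using variance = E[X^2] - mu^2.

    Assumption: the infinite sums are truncated at k_max, which is far
    enough into the tail for moderate values of lam.
    """
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    second_moment = sum(k * k * p for k, p in enumerate(probs))
    return mean, second_moment - mean**2


mean, variance = poisson_mean_and_variance(4.5)
print(mean, variance)  # both should be close to lambda = 4.5
```

The two numbers agree because of a calculation, not because of the notation: $\sigma _{X}^{2}$ and $\lambda$ are different symbols that happen to take the same value for this distribution.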

· $\sigma$: This is the population standard deviation, which is computed by taking the square root of the population variance, or simply by using the formula below,

\[\sigma =\sqrt{\int\limits_{-\infty }^{\infty }{{{x}^{2}}\,f\left( x \right)dx}-{{\left( \int\limits_{-\infty }^{\infty }{xf\left( x \right)dx} \right)}^{2}}}\]

This is a parameter, because it is a fixed number that is not constructed from sample information.
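As a rough illustration of the formula for $\sigma$, the following sketch approximates both integrals with a midpoint rule and then takes the square root. The uniform distribution on $[0, 1]$, the helper names, and the number of steps are assumptions made for this example only.

```python
import math
from typing import Callable


def population_sd(pdf: Callable[[float], float], lower: float, upper: float,
                  steps: int = 10_000) -> float:
    """Approximate sigma = sqrt( int x^2 f(x) dx - (int x f(x) dx)^2 ).

    Assumption: the density is zero (or negligible) outside [lower, upper].
    """
    h = (upper - lower) / steps
    first_moment = 0.0
    second_moment = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h  # midpoint of the i-th sub-interval
        weight = pdf(x) * h
        first_moment += x * weight
        second_moment += x * x * weight
    return math.sqrt(second_moment - first_moment**2)


def uniform_pdf(x: float) -> float:
    """Density of a uniform random variable on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


sigma = population_sd(uniform_pdf, 0.0, 1.0)
print(sigma)  # should be close to sqrt(1/12)
```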

· ${{H}_{0}}$: This is the notation for the null hypothesis. In hypothesis testing, the null hypothesis is the hypothesis of no effect.

· ${{H}_{A}}$: This is the notation for the alternative hypothesis. In hypothesis testing, the alternative hypothesis is the hypothesis that is supported when the sample data would be sufficiently unlikely if the null hypothesis ${{H}_{0}}$ were true.

· $\Theta$: This is a less commonly used symbol, and it represents the set of all possible values for the population parameter. For example, if X is a normally distributed random variable, with a population variance of ${{\sigma }^{2}}=1$, and an unknown population mean $\mu$, the set of all possible values that can be taken by $\mu$ is the whole real line. So, in other words, we would have in that case that $\Theta =\left( -\infty ,\infty \right)$.

· ${{\Theta }_{0}}$: In the context of the above symbol, this symbol represents the possible values taken by a population parameter as stated in the null hypothesis of a hypothesis test. For example, assume that X is a normally distributed random variable, with a population variance of ${{\sigma }^{2}}=1$, and an unknown population mean, and we are interested in testing the following null and alternative hypotheses:

\[\begin{align} & {{H}_{0}}:\mu =0 \\ & {{H}_{A}}:\mu \ne 0 \\ \end{align}\]

In that case, we would have that ${{\Theta }_{0}}=\left\{ 0 \right\}$.

· ${{\Theta }_{A}}$: Along the lines of the previous symbols, this symbol represents the possible values taken by a population parameter as stated in the alternative hypothesis of a hypothesis test. For example, assume that X is a normally distributed random variable, with a population variance of ${{\sigma }^{2}}=1$, and an unknown population mean, and we are interested in testing the following null and alternative hypotheses:

\[\begin{align} & {{H}_{0}}:\mu =0 \\ & {{H}_{A}}:\mu \ne 0 \\ \end{align}\]

In that case, we would have that ${{\Theta }_{A}}=\left( -\infty ,0 \right)\cup \left( 0,\infty \right)$. Notice that by definition, we need to have that $\Theta ={{\Theta }_{0}}\cup {{\Theta }_{A}}$.
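To see how ${{H}_{0}}$, ${{H}_{A}}$, ${{\Theta }_{0}}$ and ${{\Theta }_{A}}$ fit together in practice, here is a small sketch for the example above (normal data, ${{\sigma }^{2}}=1$, ${{H}_{0}}:\mu =0$ against ${{H}_{A}}:\mu \ne 0$). It assumes a standard two-sided z-test, which the text does not spell out. The sample values and the significance level of 0.05 are hypothetical and serve only as an illustration.

```python
import math
from statistics import NormalDist, fmean


def in_theta_0(mu: float) -> bool:
    """Theta_0 = {0}: the value of mu stated by the null hypothesis."""
    return mu == 0.0


def in_theta_a(mu: float) -> bool:
    """Theta_A = (-inf, 0) U (0, inf): every other real value of mu."""
    return mu != 0.0


def two_sided_z_test(sample: list[float], mu_0: float = 0.0, sigma: float = 1.0,
                     alpha: float = 0.05) -> tuple[float, float, bool]:
    """Return (z statistic, two-sided p-value, reject H0?) for known sigma."""
    n = len(sample)
    z = (fmean(sample) - mu_0) / (sigma / math.sqrt(n))
    # Probability, under H0, of a statistic at least as extreme as the one observed.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


# Hypothetical sample, invented for this illustration.
sample = [0.9, 1.4, -0.2, 1.1, 0.7, 1.6, 0.3, 1.2]
z, p_value, reject = two_sided_z_test(sample)
print(z, p_value, reject)
```

Every real number satisfies exactly one of `in_theta_0` and `in_theta_a`, which mirrors $\Theta ={{\Theta }_{0}}\cup {{\Theta }_{A}}$. The test itself asks whether the data would be too unlikely if $\mu$ were in ${{\Theta }_{0}}$.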

· $\rho$: This corresponds to the population correlation between variables X and Y. In order to be more explicit about the variables involved, the notation can be written as $\rho \left( X,Y \right)$ or even ${{\rho }_{X,Y}}$.
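The text does not give a formula for $\rho$, so the sketch below assumes the standard definition, ${{\rho }_{X,Y}}=\operatorname{Cov}\left( X,Y \right)/\left( {{\sigma }_{X}}{{\sigma }_{Y}} \right)$, and applies it to a small joint distribution of two discrete variables. The joint probabilities are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math


def population_correlation(joint: dict[tuple[float, float], float]) -> float:
    """Compute rho(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y) from a joint probability table.

    `joint` maps each pair (x, y) to its probability; the probabilities are
    assumed to sum to 1.
    """
    mu_x = sum(x * p for (x, _), p in joint.items())
    mu_y = sum(y * p for (_, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, _), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (_, y), p in joint.items())
    cov_xy = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    return cov_xy / math.sqrt(var_x * var_y)


# Hypothetical joint distribution of two 0/1 variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
rho = population_correlation(joint)
print(rho)  # close to 0.6 for this table
```

Like $\mu$ and $\sigma$, the value of $\rho$ is computed from the distribution itself and not from a sample, so it is also a parameter.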

· $\pi$: Although not universal, this symbol is used to represent a population proportion. Along those lines, ${{\pi }_{1}}$ represents the population proportion (for some categorical variable) in population 1, and so on. In practice, a plain $p$ is probably the most commonly used notation for a population proportion, but I think that is a bad idea.

· $\sim$: The "tilde" symbol is used to indicate that a certain random variable has a specified distribution. For example, if we see $X\sim \text{Poisson}\left( \lambda \right)$, we interpret it as: "X is a random variable that has a Poisson distribution with mean $\lambda$".
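One way to read a statement such as "X has a Poisson distribution with mean $\lambda$" is as a recipe for generating values of X. The sketch below draws values with Knuth's multiplication method, which is one of several possible sampling algorithms and is used here only as an illustration. The value $\lambda = 3$, the seed, and the number of draws are arbitrary assumptions.

```python
import math
import random


def draw_poisson(lam: float, rng: random.Random) -> int:
    """Draw one value from Poisson(lam) using Knuth's multiplication method.

    Suitable for small lam only; exp(-lam) underflows for very large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Keep multiplying uniform draws until the product falls below exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(0)  # fixed seed so the illustration is repeatable
draws = [draw_poisson(3.0, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # a sample statistic; it should land near the parameter lambda = 3
```

Note the difference between the two numbers involved: $\lambda$ is a fixed parameter of the distribution, while `sample_mean` is computed from sample information and changes from one sample to the next.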
