
Some widely used applications of the popular statistical tables can be categorized as follows:

Conditions for using this table: Test for randomness of the data is needed before using this table. Test for normality condition of the population distribution is also needed if the sample size is small, or it may not be possible to invoke the central limit theorem.

Notes: As you know by now, in tests of hypotheses concerning μ, and in constructing a confidence interval for it, we start with σ known, since the critical value (and the p-value) of the Z-Table distribution can then be used. In the more realistic situation where σ is unknown, the T-Table is used. In both cases, we need to verify the normality condition of the population's distribution; however, if the sample size n is very large, we can in fact switch back to the Z-Table by virtue of the central limit theorem. For perfectly normal populations, the t-distribution corrects for any errors introduced by estimating σ with S when doing inference.

Note also that, in hypothesis testing concerning the parameter of binomial and Poisson distributions for large sample sizes, the standard deviation is known under the null hypotheses. That's why you may use the normal approximations for both of these distributions.

Conditions for using this table: Test for randomness of the data is needed before using this table. Test for normality condition of the population distribution is also needed if the sample size is small, or it may not be possible to invoke the Central Limit Theorem.

Conditions for using this table: The necessary conditions for using this table for all the above tests, except for the last one, can be found at Conditions for the Chi-square Based Tests. The last application requires normality (condition) of the population distribution.

Conditions for using this table: Tests for randomness of the data and normality (condition) of the populations are needed before using this table for ANOVA. Same conditions must be satisfied for the residuals in regression analysis.

The following chart summarizes the application of statistical tables to tests of hypotheses and construction of confidence intervals for the mean μ and variance σ² in one population, or for the comparison of two or more populations.

You may like using Online Statistical Computation in performing most of these tests. The P-values for the Popular Distributions Web site provides P-values useful in major statistical testing. The results are more accurate than those that can be obtained (by interpolation) from the statistical tables of your textbook.

Further Reading:
Balakrishnan N., and V. Nevzorov, A Primer on Statistical Distributions, Wiley, 2003.
Evans M., N. Hastings, and B. Peacock, Statistical Distributions, Wiley, 2000.
Kanji G., 100 Statistical Tests, Sage Publisher, 1995.


Numerical Examples for Statistical Tables

The presentation of statistical tables is not universal. Some textbook authors prefer to tabulate right-tail probabilities, while others prefer left-tail probabilities. Even within each of these groups, tables differ in layout and are never presented in a unified format. This lack of uniformity often confuses students who are learning statistics.

The following presents some numerical examples of common statistical tables with some applications. You may like using The P-values for the Popular Distributions JavaScript.

Binomial Probability

X ~ B(n, p), read: the random variable X has a binomial distribution with n trials and probability of success p on each trial.

Example: Find the probability of at most k = 3 successes from B(n = 7, p = 0.4). Using any Binomial table, one should get:
P[k ≤ 3] = 0.7102.
Using The P-values for the Popular Distributions JavaScript, one gets:

P[k ≤ 3] = 1 - P[k ≥ 4] = 1 - 0.2898 = 0.7102.

Questions for you: Which of the following two events is more likely to happen: getting exactly 6 heads in n = 10 tosses of a fair coin (i.e., p = 1/2), or getting exactly 6 heads in n = 20 tosses? Why?

Application: A traveling salesman has found that the probability of a sale on a single contact is 0.02. If the salesman contacts 200 prospects, find the probability that he will make at least one sale.

P[at least one sale] = 1 - P[no sale] = 1 - (1 - 0.02)^200 = 1 - (0.98)^200 = 98%
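
The two binomial calculations above can be checked without a table. The following is a minimal sketch using only the Python standard library; the function names binom_pmf and binom_cdf are illustrative choices, not part of any tool mentioned in these notes.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P[X <= k], obtained by summing the pmf from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# At most 3 successes from B(n = 7, p = 0.4); the table value is 0.7102.
at_most_3 = binom_cdf(3, 7, 0.4)

# At least one sale in 200 contacts is the complement of "no sale".
at_least_one_sale = 1 - binom_pmf(0, 200, 0.02)

print(round(at_most_3, 4), round(at_least_one_sale, 4))
```

Summing the probability function directly is adequate for small n; for large n a normal or Poisson approximation, as discussed elsewhere in these notes, may be more convenient.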

Normal Density Function

X ~ N(0, 1), read: the random variable X is normally distributed with mean 0 and variance 1.

A Fact: If X ~ N(μ, σ), then

Z = (X - μ) / σ ~ N(0, 1)

Example: Let X ~ N(1, 2), where the second parameter is the standard deviation; compute P(X ≤ 5.21)
P[(X - 1) / 2 ≤ (5.21 - 1) / 2] = P(Z ≤ 2.105) ≈ P(Z ≤ 2.11) = .4826 + .5 = .9826
Notice that P(Z ≤ 0) = .5

Similarly, P(X ≥ 2.1) = P(Z ≥ (2.1 - 1) / 2) = P(Z ≥ .55) = 0.5 - .2088 = .2912
Using The P-values for the Popular Distributions JavaScript, the 2p-value is:

P[| Z | ≥ .55] = 0.582.
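
As a cross-check on the table look-ups, the same probabilities can be computed with the NormalDist class from Python's standard statistics module. This is only a sketch; it assumes, as in the example, that the second parameter of N(1, 2) is the standard deviation.

```python
from statistics import NormalDist

X = NormalDist(mu=1, sigma=2)  # assumption: sigma = 2 is the standard deviation
Z = NormalDist()               # standard normal N(0, 1)

p_le = X.cdf(5.21)                 # P(X <= 5.21)
p_ge = 1 - X.cdf(2.1)              # P(X >= 2.1) = P(Z >= 0.55)
two_sided = 2 * (1 - Z.cdf(0.55))  # two-tailed probability for |Z| >= 0.55

print(round(p_le, 4), round(p_ge, 4), round(two_sided, 3))
```

Small differences from the table values arise because the table rounds z = 2.105 to 2.11 before the look-up.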

Questions for you: Compute P(X ≥ 3), P(1 ≤ X ≤ 4), P(X ≥ -1), and find the value of a such that P(X ≥ a) = 0.4515

Applications:

1. Testing hypotheses on the population's mean, with a known variance, at a given significance level α.

H0: μ = μ0   Ha: μ ≠ μ0

A Fact: Given X ~ N( ? , σ) and a random sample of size n: x1, x2, ..., xn, then

Z = [xbar_n - μ] / (σ / √n) ~ N(0,1).
Notice that in most cases the standard deviation σ is unknown. However, one may use the sample estimate S for σ, provided the sample size is large enough, say, over 30.

Given n = 4 and xbar_4 = 492, test H0: μ = 500 at significance level α = 0.05 if σ = 16.
The Z-statistic is Z = [492 - 500] / [16 / √4] = -1; however, the tabulated critical Z-value is Z_.025 = 1.96.
Conclusion: Since |Z| = 1 < 1.96, there is no reason to reject H0.

Question for you: Given the same sampling information, test H0: μ = 505 vs. Ha: μ ≠ 505.

2. Setting a confidence interval on the mean, variance known.
Given xbar_4 = 492, construct a 95% confidence interval for μ given σ = 16

P[xbar - Z_(α/2) σ / √n ≤ μ ≤ xbar + Z_(α/2) σ / √n] ≥ 1 - α
Plugging in the numerical values, one gets:
P[476.3 ≤ μ ≤ 507.7] ≥ 0.95

Notice the Duality between the test of hypothesis and confidence interval.
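
The duality can be made concrete in a few lines: the two-sided test rejects H0 exactly when the hypothesized mean falls outside the confidence interval. The sketch below uses the standard library only; the function name z_test_and_ci is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def z_test_and_ci(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided Z-test of H0: mu = mu0 and the matching (1 - alpha) interval."""
    se = sigma / sqrt(n)                          # standard error of the mean
    z = (xbar - mu0) / se                         # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    reject = abs(z) > z_crit
    return z, z_crit, ci, reject


z, z_crit, ci, reject = z_test_and_ci(xbar=492, mu0=500, sigma=16, n=4)
print(z, round(z_crit, 2), [round(v, 1) for v in ci], reject)

# Duality: mu0 lies inside the interval exactly when H0 is not rejected.
print((ci[0] <= 500 <= ci[1]) == (not reject))
```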

Question for you: Given the same sampling information, construct a 90% confidence interval for μ.

3. Central Limit Theorem (CLT)

A Fact: If E(X) = μ, Var(X) = σ², then

(xbar - μ) / (σ / √n) ~ N(0,1), approximately,
for large n, say, n ≥ 30

As a strong result, the CLT implies that if the sample size is large enough, then one may relax the normality condition whenever testing hypotheses or constructing a confidence interval for the population's mean (μ).

T-Density Function

A Fact: If X ~ N(μ, ?), then

[xbar - μ] / [S / √n] ~ t_(n-1)

Example: Find t such that P(T_11 > t) = .1 => t = 1.363
Using The P-values for the Popular Distributions JavaScript, the 2p-value is:

P[| T | ≥ 1.363] = 0.2.

Question for you: Find t such that P(T_8 > t) = .01

Applications:

1. Testing hypotheses on the mean, variance unknown
Given xbar_16 = 12.1 and S² = 2.225, test μ = 12.5 vs μ ≠ 12.5, at the α = 0.05 significance level.
The computed statistic is t = (12.1 - 12.5) / (√2.225 / √16) = -1.07, but the critical value from the t-table is t(.025, 15) = 2.131.
Conclusion: There is no reason to reject that μ = 12.5.

Question for you: Given the same sampling information perform the test H0: μ = 11 vs μ ≠ 11 at α = .01.

2. Construction of a confidence interval for μ, variance unknown
Example: Given xbar_16 = 12.1, S² = 2.225, develop a 95% confidence interval for μ

P[xbar - t(α/2, n - 1) S / √n ≤ μ ≤ xbar + t(α/2, n - 1) S / √n] ≥ 1 - α
Therefore P[11.31 ≤ μ ≤ 12.89] ≥ 0.95. Again, notice the Duality between the test of hypothesis and confidence interval.
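
The arithmetic behind the t-statistic and the interval can be sketched as follows. The Python standard library has no t-distribution, so the critical value 2.131 is taken from the t-table as quoted above and passed in as a constant; the function names are illustrative.

```python
from math import sqrt


def t_statistic(xbar, mu0, s2, n):
    """t = (xbar - mu0) / (S / sqrt(n)), with S^2 the sample variance."""
    return (xbar - mu0) / sqrt(s2 / n)


def t_interval(xbar, s2, n, t_crit):
    """xbar -/+ t_crit * S / sqrt(n); t_crit must come from a t-table."""
    half_width = t_crit * sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


T_CRIT = 2.131  # t(.025, 15), the table value quoted in the text

t = t_statistic(xbar=12.1, mu0=12.5, s2=2.225, n=16)
lower, upper = t_interval(xbar=12.1, s2=2.225, n=16, t_crit=T_CRIT)
print(round(t, 2), round(lower, 2), round(upper, 2))
```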

Question for you: Construct a 90% confidence interval for the same problem. Is it wider than the other one? Why or why not?

Notice that the T-density converges to the standard normal N(0, 1) as sample size gets larger. In fact the elements in the last row of t-table are the N(0,1) probabilities.

Chi-square Density Function

A Fact: If X ~ N(?, σ), then the random variable

(n-1)S² / σ² ~ χ²(n - 1)
the parameter (n-1) = ν is called degrees-of-freedom (d.f.).
Example: if d.f. = ν = 15, and the right-tail probability is α = 0.975, find the χ² value. From the χ² table, we get χ² = 6.26
Using The P-values for the Popular Distributions JavaScript, the p-value is:

P[χ² ≥ 6.26] = 0.975.

Applications:

1. Tests of hypotheses on the variance of a normal population.

Given n = 16 and S² = 2.22, test that σ² = 2.0 at α = .05. The sampling statistic is χ²0 = (n-1)S² / σ0² = 15(2.22) / 2.0 = 16.65; however, from the table, the critical values are χ²(15, .025) = 27.4884 and χ²(15, .975) = 6.26
Conclusion: There is no reason to reject that σ² = 2.0

2. Interval estimation of the variance of a normal population

P[(n-1)S² / χ²(ν, α/2) ≤ σ² ≤ (n-1)S² / χ²(ν, 1-α/2)] ≥ 1 - α

Example: Given the same sampling information as above, construct a 95% confidence interval for σ²
Plugging in the given information, you should get:
P[1.21 ≤ σ² ≤ 5.32] ≥ .95

Again, notice the Duality between the test of hypothesis and confidence interval.
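
The variance test and the interval estimate both reduce to simple arithmetic once the critical values are read from the χ² table. The sketch below hard-codes the two tabulated values χ²(15, .025) = 27.4884 and χ²(15, .975) = 6.26, because the Python standard library does not provide χ² quantiles; the function names are illustrative.

```python
def chi2_statistic(s2, sigma0_sq, n):
    """(n - 1) S^2 / sigma0^2 for testing H0: sigma^2 = sigma0^2."""
    return (n - 1) * s2 / sigma0_sq


def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Interval for sigma^2: divide (n - 1) S^2 by the two critical values."""
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


CHI2_15_025 = 27.4884  # right-tail area .025, 15 d.f. (table value)
CHI2_15_975 = 6.26     # right-tail area .975, 15 d.f. (table value)

stat = chi2_statistic(s2=2.22, sigma0_sq=2.0, n=16)
reject = not (CHI2_15_975 <= stat <= CHI2_15_025)
lower, upper = variance_interval(2.22, 16, CHI2_15_025, CHI2_15_975)
print(round(stat, 2), reject, round(lower, 2), round(upper, 2))
```

Note that the larger critical value produces the lower limit of the interval, because it appears in the denominator.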

Question for you: Given the same sampling information, should we reject that σ² = 2.0 at α = .1?
Note that χ²(15, .05) = 25.00, and χ²(15, .95) = 7.26.

F-Density Function

A Fact: Consider two independent samples, one from each of two normal populations with variances σ1² and σ2²; then

(S1² / σ1²) / (S2² / σ2²) ~ F(n1 - 1, n2 - 1)

Example: Find F such that P[F_(8, 7) ≥ F] = .05 => The F value is F = 3.73

Notice: By now, you should have noticed that every statistical table collected at the end of your textbook provides critical values for the right-tail as well as the left-tail probabilities, except the F-Table, which contains critical values for the right-tail probabilities only. However, one may use the following nice property of the F-distribution:

F_(n1, n2, 1-α) = 1 / F_(n2, n1, α)

to obtain the critical values for the left-tail probabilities. Here is a numerical example:

F_(2, 3, 0.9) = 1 / F_(3, 2, 0.1) = 1 / 9.16 = 0.109

You need both tail probabilities for tests of hypotheses and construction of confidence intervals for the ratio of two independent populations' variances.

Example: Find F such that P[F_(8, 7) ≥ F] = .95. We may not be able to get the critical value directly from the table; however, one may utilize the fact that:

F_(n1, n2, 1-α) = 1 / F_(n2, n1, α)

Therefore, F = 1 / F_(7, 8, .05) = 1 / 3.50 = 0.2857
Using The P-values for the Popular Distributions JavaScript, the p-value is: P[ F £ 0.2857] = 0.942 (which is exact).

Applications:

1. Testing of hypotheses on the variances of two normal populations.

Example: Given n1 = n2 = 16, S1² = 34.14, and S2² = 47.32, should we reject that σ1² = σ2² at α = 0.1?
The sampling statistic is F = S1² / S2² = 34.14 / 47.32 = .721, but the critical values are F_(15, 15, .05) = 2.38, and F_(15, 15, .95) = 1 / 2.38 = 0.421.
Conclusion: There is no reason to reject.

Question for you: Given the same sampling information, construct a 90% confidence interval for the variance ratio: σ1² / σ2²


Binomial Probability Function

An important class of decision problems under uncertainty involves situations for which there are only two possible random outcomes.

The binomial probability function gives the probability of an exact number of "successes" in n independent trials, when the probability of success p on a single trial is constant. Each single trial is called a Bernoulli Trial satisfying the following conditions:

  1. Each trial results in one of two possible, mutually exclusive, outcomes. One of the possible outcomes is denoted (arbitrarily) as a success, and the other is denoted a failure.
  2. The probability of a success, denoted by p, remains constant from trial to trial. The probability of a failure, 1-p, is denoted by q.
  3. The trials are independent; that is, the outcome of any particular trial is not affected by the outcome of any other trial.

The probability of getting r successes in n trials is:

P (r successes in n trials) = nCr . p^r . (1 - p)^(n-r)
= n! / [r!(n-r)!] . [p^r . (1 - p)^(n-r)].

The mean and variance of the random variable r are np and npq, respectively, where q = 1 - p. The skewness and kurtosis are (2q - 1) / (npq)^½ and (1 - 6pq) / (npq), respectively. From its skewness, we notice that the distribution is symmetric for p = 1/2 and most skewed when p is close to 0 or 1.

Its mode is within interval [(n+1)p -1, (n+1)p], therefore if (n+1) p is not an integer, then the mode is an integer within the interval. However if (n+1)p is an integer, then its probability function has two but adjacent modes: (n+1)p -1, and (n+1)p.

Determination of probabilities for p over 0.5: The binomial tables in some textbooks are limited to determining the probabilities for values of p up to 0.5. However, these tables can also be used for values of p over 0.5. Recast the problem by replacing p with 1 - p and r with n - r: the probability of obtaining r successes in n trials with success probability p is equal to the probability of obtaining n - r failures in those trials, where each failure occurs with probability 1 - p.
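
The recasting rule can be illustrated with a short sketch. Here table_lookup is a hypothetical stand-in for a printed table that only covers p up to 0.5, and binom_prob applies the recasting whenever p exceeds 0.5; both names are assumptions made for this illustration.

```python
from math import comb, isclose


def table_lookup(r: int, n: int, p: float) -> float:
    """Stand-in for a printed binomial table limited to p <= 0.5."""
    if p > 0.5:
        raise ValueError("this table only covers p <= 0.5")
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binom_prob(r: int, n: int, p: float) -> float:
    """P[r successes in n trials]; recast as n - r failures when p > 0.5."""
    if p > 0.5:
        return table_lookup(n - r, n, 1 - p)
    return table_lookup(r, n, p)


# 7 successes in 10 trials with p = 0.7, computed directly and via recasting.
direct = comb(10, 7) * 0.7**7 * 0.3**3
recast = binom_prob(7, 10, 0.7)
print(isclose(direct, recast))
```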

An Application: A large shipment of purchased parts is received at a warehouse, and a sample of 10 parts is checked for quality. The manufacturer's claim is that at most 5% might be defective. What is the chance that the sample includes one defective?

P (one defective out of ten) = {10! / [(1!)(9!)]} (0.05)^1 (0.95)^9 = 32%.

Know that the binomial distribution must satisfy the following five requirements: (1) each trial can have only two outcomes, or its outcomes can be reduced to two categories, which are called pass and fail, (2) there must be a fixed number of trials, (3) the outcome of each trial must be independent, (4) the probabilities must remain constant, and (5) the outcome of interest is the number of successes.

Normal approximation for binomial: All binomial tables are limited in their scope; therefore it is necessary to use standard normal distribution in computing the binomial probabilities. The following numerical example illustrates how good the approximation could be. This provides an indication for real applications when n is beyond the given values in the available binomial tables.

Numerical Example: A sample of 20 items is taken randomly from a manufacturing process with defective probability p = 0.40. What is the probability of obtaining exactly 5 defectives?

P (5 out of 20) = {20! / [(5!)(15!)]} × (0.40)^5 (0.6)^15 = 7.5%

Since the mean and standard deviation of the distribution are:

μ = np = 8, and σ = (npq)^½ = 2.19,

respectively, the standardized observations for r = 5, using the continuity correction factor (which always enlarges the interval), are:

z1 = [(r - 1/2) - μ] / σ = (4.5 - 8) / 2.19 = -1.60, and

z2 = [(r + 1/2) - μ] / σ = (5.5 - 8) / 2.19 = -1.14.

Therefore, the approximated P (5 out of 20) is P (z being within the interval -1.60, -1.14). Now, by using the standard normal table, we obtain:

P (5 out of 20) = 0.44520 - 0.37286 = 7.2%
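
The comparison between the exact binomial probability and its normal approximation with the continuity correction can be reproduced with the standard library. This is a sketch under the same assumptions as the example (n = 20, p = 0.40, r = 5); the function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_exact(r: int, n: int, p: float) -> float:
    """Exact binomial probability of r successes in n trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binom_normal_approx(r: int, n: int, p: float) -> float:
    """Normal approximation: area between r - 1/2 and r + 1/2."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    dist = NormalDist(mu, sigma)
    return dist.cdf(r + 0.5) - dist.cdf(r - 0.5)


exact = binom_exact(5, 20, 0.40)
approx = binom_normal_approx(5, 20, 0.40)
print(round(exact, 3), round(approx, 3))
```

The small remaining gap between the two values reflects the skewness of the binomial when p is not 1/2 and n is modest.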

Comments: The approximation for binomial distribution is used frequently in quality control, reliability, survey sampling, and other industrial problems.

Poisson approximation for binomial: Notice that, whenever you use Poisson approximation to the binomial distribution with parameters n and p, then the goodness of the approximation is largely determined by the smallness of the p parameter rather than how large is n.

You might like to use Common Discrete Probability Functions to obtain probability and the cumulative probability functions.

You might like to use the Exact Confidence Interval Construction and Test of Hypothesis for Binomial Population , and Binomial Probability Function JavaScript in performing some numerical experimentation for validating the above assertions for a deeper understanding.


Geometric Distribution

In a sequence of independent and identically distributed Bernoulli (p) trials, the number of trials required to get the 1st success has a Geometric(p) distribution.


A Typical Geometric Probability Function

If a single event or trial has two possible outcomes, say Xi can be 0 or 1 with P(Xi = 1) = p, the probability of having to observe k trials until the first "one" appears is given by the geometric distribution.

The probability that the first "one" would appear on the first trial is p.
The probability that the first "one" appears on the second trial is (1-p)p, because the first trial had to have been a zero followed by a one.
By generalizing this procedure, the probability that there will be k-1 failures before the first success is:

P (X = k) = (1 - p)^(k-1) p

This is the geometric distribution.

A geometric distribution has a mean of 1/p and a variance of (1-p)/p².

Application: A manufacturing process is monitored. As each product exits the process line, it is tested for defective versus non-defective. On the first defect, the process is stopped for re-adjustment. The number of products X tested up to and including the first defect follows a Geometric distribution with p = P(product is defective).

The Geometric distribution has the memoryless property. Mathematically, for any non-negative integer s and positive integer t, this property can be written

P(X = s + t | X > s) = P(X = t)
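
The memoryless property and the stated mean and variance can be checked numerically. The sketch below assumes the convention that X counts trials up to and including the first success, so that P(X > s) = (1 - p)^s; the value p = 0.2 and the truncation point of the sums are arbitrary illustrative choices.

```python
from math import isclose


def geom_pmf(k: int, p: float) -> float:
    """P(X = k): k - 1 failures followed by the first success."""
    return (1 - p) ** (k - 1) * p


def geom_tail(s: int, p: float) -> float:
    """P(X > s): the first s trials are all failures."""
    return (1 - p) ** s


p, s, t = 0.2, 3, 4

# Conditional probability P(X = s + t | X > s) versus P(X = t).
conditional = geom_pmf(s + t, p) / geom_tail(s, p)
print(isclose(conditional, geom_pmf(t, p)))

# Mean and variance from a long (truncated) sum, versus 1/p and (1 - p)/p^2.
ks = range(1, 2001)
mean = sum(k * geom_pmf(k, p) for k in ks)
var = sum(k * k * geom_pmf(k, p) for k in ks) - mean**2
print(round(mean, 6), round(var, 6))
```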

Application: Gives probability of requiring exactly x binomial trials before the first success is achieved. Used in quality control, reliability, and other industrial situations.

Example: Determination of the probability of requiring exactly five test firings before the first success is achieved.

The Geometric distribution is the discrete analogue of the Exponential distribution, which models the time needed to get a success.

The Exponential distribution is the continuous analog of the Geometric distribution. Like the Geometric distribution, the Exponential distribution also has the memoryless property.

Mathematically, for any non-negative real numbers s and t, this property can be written

P(X > s + t | X > s ) = P(X > t)

The Exponential distribution is a special case of the Gamma distribution (r = 1). Furthermore, the sum of r independent and identically distributed Exponential (λ) random variables has a Gamma distribution with shape parameter r and rate λ.

In a Poisson (λ) process, the waiting times between consecutive events are distributed as Exponential with mean 1/λ.

You might like to use Common Discrete Probability Functions to obtain probability and the cumulative probability functions.


Negative Binomial Distribution

This is an extension of the geometric distribution, describing the waiting time until r "ones" have appeared. The probability of the rth "one" appearing on the kth trial is given by the negative binomial distribution:

P (X = k) = C(k-1, r-1) p^(r-1) (1 - p)^(k-r) p,   where C(k-1, r-1) = (k-1)! / [(r-1)!(k-r)!],

in other words, the first part is the probability of r-1 successes in the previous k-1 trials as a binomial probability, and the last term is the probability of a success on the kth trial.

The following is a Negative Binomial probability function with parameters (r = 6 , k= 30, p = 0.5):


A Negative Binomial Probability Function

A negative binomial distribution has:

mean = r/p and variance = r(1-p)/p²

Application: Suppose we are at a rifle range with an old gun that misfires 5 out of 6 times. Define "success" as the event that the gun fires, and let X be the number of failures before the third success. Then X has a negative binomial distribution with parameters (3, 1/6). The probability that there are 10 failures before the third success is given by:

P(X = 10) = C(12, 2) (1/6)^3 (5/6)^10 = 5%

Because X here counts only the failures, not all the trials, its expected value and variance are: E(X) = 3(1 - 1/6) / (1/6) = 15, and Var(X) = 3(1 - 1/6) / (1/6)² = 90.
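
The rifle-range figures can be reproduced with a short sketch. It uses the failures-before-the-rth-success form of the negative binomial, P(X = x) = C(x + r - 1, r - 1) p^r (1 - p)^x, for which the mean is r(1 - p)/p; the function name nb_failures_pmf is an illustrative choice.

```python
from math import comb


def nb_failures_pmf(x: int, r: int, p: float) -> float:
    """P(exactly x failures occur before the r-th success)."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x


r, p = 3, 1 / 6  # third success; the gun fires 1 time in 6

prob_10 = nb_failures_pmf(10, r, p)  # C(12, 2) (1/6)^3 (5/6)^10
mean_failures = r * (1 - p) / p
var_failures = r * (1 - p) / p**2
print(round(prob_10, 4), round(mean_failures, 6), round(var_failures, 6))
```

Adding r to the mean number of failures gives r/p, the mean number of trials, which is the form quoted for the distribution in general.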

In a sequence of independent and identically distributed Bernoulli (p) trials, the number of trials required to get the rth success has a Negative Binomial (r,p) distribution.

Example: The number of oil wells that must be drilled to get r productive wells.

Relationships to Other Distributions: A Negative Binomial (r, p) random variable can be thought of as the sum of r independent and identically distributed Geometric(p) random variables. The Geometric (p) is a special case of the Negative Binomial with r=1.

Application: Gives probability similar to Poisson distribution when events do not occur at a constant rate and occurrence rate is a random variable that follows a gamma distribution.

Example: Distribution of number of cavities for a group of dental patients.

Comments: Generalization of Pascal distribution when s is not an integer. Many authors do not distinguish between Pascal and negative binomial distributions.

You might like to use Common Discrete Probability Functions to obtain probability and the cumulative probability functions.


Hypergeometric Distribution

The Hypergeometric (x; n, M, N) Distribution applies when we are sampling n items without replacement from a population of M successes and N-M failures.

The hypergeometric distribution arises when a random selection (without repetition) is made among objects of two distinct types. Typical examples:

Choose a team of 8 from a group of 10 men and 7 women.
Choose a committee of five from the legislature consisting of 52 Democrats and 48 Republicans.

The Concept of Hypergeometric Events

The above Venn diagram depicts choosing a random subset of size n from N items, of which M items belong to a particular category, and the event that x of the selected items belong to that category.

The Binomial distribution looks at n trials "with replacement." The hypergeometric distribution is for the case "without replacement."

Here p changes from one Bernoulli trial to the next. Specifically, we have a population of size N with M out of the N members being "Successes" and the remaining (N-M) being "Failures." We choose a random sample of n (equivalent to taking out n members in succession without replacement).

The probability that X = x is given by:

P (X = x) = C(M, x) C(N-M, n-x) / C(N, n),   where C(a, b) = a! / [b!(a-b)!],

for all integers x between Max [0, n - (N - M)] and Min [n, M].

The expected value and variance of X are given by:

nM / N and nM(N-M)(N-n) / [N²(N-1)],

respectively.

In other words, there is a total number of N chips in the urn and n chips are drawn at random without replacement. Out of these n chips, k chips are red, and the remainder (n - k) are white. So, the formula is the number of ways to choose k chips from r red chips in the urn multiplied by the number of ways to choose n - k chips from white chips. This is divided by the sample space, or the number of ways to select n chips from the total of N chips in the urn.

Application: Gives probability of picking exactly x good units in a sample of n units from a population of N units when there are k bad units in the population. Used in quality control and related applications.

Example: Given a lot with 21 good units and four defective. What is the probability that a sample of five will yield not more than one defective?
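
The lot-sampling example can be worked out directly from the hypergeometric probability function in its standard form, C(M, x) C(N-M, n-x) / C(N, n). The sketch below treats the four defectives as the "successes" (M = 4) in a lot of N = 25 and sums the probabilities of zero and one defective; the function name is illustrative.

```python
from math import comb


def hypergeom_pmf(x: int, n: int, M: int, N: int) -> float:
    """P(x marked items in a sample of n drawn without replacement
    from N items, M of which are marked)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


N, M, n = 25, 4, 5  # 21 good + 4 defective units, sample of five

# "Not more than one defective" means zero or one defective in the sample.
p_at_most_one = sum(hypergeom_pmf(d, n, M, N) for d in (0, 1))
print(round(p_at_most_one, 4))
```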

Example: The number of defective items in a sample of size n from a box containing N items of which k are defective.

Application: A manufacturing process is monitored. As each product exits the process line, it is tested for defective versus non-defective. On the fifth defect, the process is stopped for re-adjustment. The number of products X tested up to and including the fifth defect follows a Negative Binomial distribution with r = 5 and p = P(product is defective).

Relationships to Other Distributions: The Hypergeometric (N, k, n) may be approximated by a Binomial (n, p = k/N) if N is very large relative to n. In this circumstance, replacement and non-replacement tend to become indistinguishable.

By extension, since the Binomial can be approximated by the Poisson, we can also approximate the Hypergeometric by a Poisson if the Binomial approximation is appropriate and n is reasonably large with k/N small.

You might like to use Common Discrete Probability Functions to obtain probability and the cumulative probability functions.


Exponential Density Function

An important class of decision problems under uncertainty concerns the random durations between events. For example, one may ask for the chance that the time until the next breakdown of a machine exceeds a certain interval, such as the copying machine in your office not breaking down during this week.

Exponential distribution gives distribution of time between independent events occurring at a constant rate. Its density function is:

f(t) = λ exp(-λt),   t ≥ 0,

where λ is the average number of events per unit of time, which is a positive number.

The mean and the variance of the random variable t (time between events) are 1/λ and 1/λ², respectively.

Applications include probabilistic assessment of the time between arrivals of patients to the emergency room of a hospital, and time between arrivals of ships at a particular port.

Comments: It is a special case of the Gamma distribution.

You might like to use Exponential Density to perform your computations, and Lilliefors Test for Exponentiality to perform the goodness-of-fit test.


F-Density Function

The F-distribution is the distribution of the ratio of two independent sample estimates of variance (from samples of size n1 and n2, respectively) from standard normal distributions. It is also formed by the ratio of two independent chi-square variables, each divided by its respective degrees of freedom.

Its main applications are in testing equality of two independent population variances based on two independent random samples, ANOVA, and regression analysis.

By now, you should have noticed that every statistical table collected at the end of your textbook provides critical values for the right-tail as well as the left-tail probabilities, except the F-Table, which contains critical values for the right-tail probabilities only. However, one may use the following nice property of the F-distribution:

F_(n1, n2, 1-α) = 1 / F_(n2, n1, α)

to obtain the critical values for the left-tail probabilities. Here is a numerical example:

F_(2, 3, 0.9) = 1 / F_(3, 2, 0.1) = 1 / 9.16 = 0.109

You need both tail probabilities for tests of hypotheses and construction of confidence intervals for the ratio of two independent populations' variances.

You might like to use F-Density Function to obtain its P-values.


Chi-square Density Function

The probability density curve of a Chi-square distribution is an asymmetric curve stretching over the positive side of the line and having a long right tail. The form of the curve depends on the value of a parameter known as the degree of freedom (d.f.).

The expected value of Chi-square statistic is its d.f., its variance is twice of its d.f., and its mode is equal to (d.f.- 2).

Chi-square Distribution relation to Normal Distribution: The Chi-square distribution is related to the sampling distribution of the variance when the sample is from a normal distribution. The sample variance is built from a sum of squares of standard normal variables N (0, 1). In particular, the square of an N (0, 1) random variable is a Chi-square with 1 d.f.

Notice that the Chi-square is related to F-statistics as follows: F = Chi-square/d.f. 1 , where F has (d.f. 1 = d.f. of the Chi-square-table, and d.f. 2 is the largest available in the F-table)

Similar to Normal random variables, the Chi-square has the additive property. For example, the sum of two independent Chi-square variables is also Chi-square, with degrees of freedom equal to the sum of the individual d.f.s. Thus (n-1) times the unbiased sample variance for a sample of size n from N (0, 1) can be written as a sum of n-1 independent Chi-squares, each with d.f. = 1, and is hence Chi-square with d.f. = n-1.

The most widely used applications of Chi-square distribution are:

The Chi-square Test for Association is a non-parametric test; therefore, it can be used for nominal data too. It is a test of statistical significance widely used in bivariate tabular association analysis. Typically, the hypothesis is whether or not two populations are different in some characteristic or aspect of their behavior based on two random samples. This test procedure is also known as the Pearson Chi-square test.

The Chi-square Goodness-of-Fit Test is used to test if an observed distribution conforms to any particular distribution. Calculation of this goodness-of-fit test is by comparison of observed data with data expected based on a particular distribution.

You might like to use Chi-square Density to find its P-values.


Multinomial Probability Function

A multinomial random variable is an extended binomial. However, the difference is that in a multinomial case, there are more than two possible outcomes. There are a fixed number of independent outcomes, with a given probability for each outcome.

The Expected Value (i.e., averages):

Expected Value = μ = Σ (Xi × Pi),     the sum is over all i's.

Expected value is another name for the mean and (arithmetic) average.

It is an important statistic because your customers want to know what to "expect" from your product/service, or, as a purchaser of "raw material" for your product/service, you need to know what you are buying; in other words, what you expect to get.

To read-off the meaning of the above formula, consider computation of the average of the following data

2, 3, 2, 2, 0, 3

The average is obtained by summing up all the numbers and dividing by their count:

(2 + 3 + 2 + 2 + 0 + 3) / 6

This can be grouped and re-written as:

[ 2(3) + 3(2) + 0(1)] / 6 = 2(3/6) + 3(2/6) + 0(1/6)

which is the sum of each distinct observation times its probability. Right?

Expected value is known also as the First Moment, borrowed from Physics, because it is the point of balance where the data and the probabilities are the distances and the weights, respectively.

The Variance is:

Var(X) = E[(X - μ)²] = E[X² - 2Xμ + μ²].

We simplify this using the above rules. First, because the expectation of a sum equals the sum of expectations:

Var(X) = E[X²] - E[2Xμ] + E[μ²].

Then, because constants may be taken out of an expectation:

Var(X) = E[X²] - 2μE[X] + μ²E[1] = E[X²] - 2μ² + μ² = E[X²] - μ².

Finally, notice that E[X²] can be written as E[g(X)] where g(X) = X². From the final fact about expectations, we can calculate this:

E[X²] = Σ x² P(X = x), for all x

Therefore, the Variance is:

Variance = σ² = Σ [Xi² × Pi] - μ²,     the sum is over all i's.

For example, suppose we toss two fair coins and we are interested in determining the expected value and the variance of the number of heads X. The expected value is μ = 0(1/4) + 1(1/2) + 2(1/4) = 1, and

E[X²] = (0)²P(X=0) + (1)²P(X=1) + (2)²P(X=2) = 0(1/4) + 1(1/2) + 4(1/4) = 3/2.

From this, we calculate the variance:

Var(X) = E[X²] - μ² = 3/2 - (1)² = 1/2.
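
The expected value and variance formulas translate directly into code. The sketch below represents a discrete distribution as a dictionary from outcomes to probabilities (an illustrative choice) and uses exact fractions so that the two-coin result comes out as 1 and 1/2 without rounding.

```python
from fractions import Fraction


def mean_and_variance(dist):
    """Return (mu, sigma^2) for a discrete distribution {x: P(X = x)}."""
    mu = sum(x * p for x, p in dist.items())       # sum of x_i * P_i
    ex2 = sum(x * x * p for x, p in dist.items())  # E[X^2]
    return mu, ex2 - mu**2                         # Var = E[X^2] - mu^2


# Number of heads in two tosses of a fair coin.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
mu, var = mean_and_variance(two_coins)
print(mu, var)
```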

Useful Tools for Population's Mean and Variance Estimations: It is not difficult to show that,

E(aX + b) = aE(X) + b, for any constant a and b
Var(aX + b) = a²Var(X), for any constant a and b

Application: Notice that the above two results are among the tools well suited for reducing, or even preventing, round-off errors in statistical computation as well as computer overflow and underflow.

Example: Suppose a random sample of size n = 9, is:

X:   220, 220, 260, 280, 270, 250, 300, 290, 240.
We wish to estimate the mean and the variance of the population based on this sample.

Let a = 10 and b = 22; dividing each observation by a = 10 and then subtracting b = 22 from each value, we obtain a new data set Y:

Y:   0, 0, 4, 6, 5, 3, 8, 7, 2.
Computing the mean and the variance of set Y, we obtain:
Σ yi = 35, Σ yi² = 203
Hence, the estimated mean and variance using the Y data set are 35/9 = 3.89, and [203 - 9(35/9)²] / 8 = 8.36, respectively. However, notice that X = 10(Y + 22) = 10Y + 220; therefore, the estimated mean and variance for the population are E(X) = 10 E(Y) + 220 = 38.89 + 220 = 258.89, and Var(X) = 10² Var(Y) = 836, respectively.
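
The coding transformation can be verified by computing the mean and variance both ways. The sketch below uses the standard statistics module, whose variance function divides by n - 1 as in the example; the variable names are illustrative.

```python
from math import isclose
from statistics import mean, variance

x = [220, 220, 260, 280, 270, 250, 300, 290, 240]
a, b = 10, 22

# Coded data: divide by a, then subtract b, giving small numbers to work with.
y = [xi / a - b for xi in x]

# Undo the coding: X = a * (Y + b), so the mean shifts and scales,
# while the variance only scales by a squared.
mean_x = a * (mean(y) + b)
var_x = a**2 * variance(y)

print(round(mean_x, 2), round(var_x, 1))
print(isclose(mean_x, mean(x)), isclose(var_x, variance(x)))
```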

Notice that, the variance is not expressed in the same units as the expected value. So, the variance is hard to understand and to explain as a result of the squared term in its computation. This can be alleviated by working with the square root of the variance, which is called the Standard (i.e., having the same unit as the data have) Deviation:

Standard Deviation = σ = (Variance)^½

Both variance and standard deviation provide the same information and, therefore, one can always be obtained from the other. In other words, the process of computing standard deviation always involves computing the variance. Since standard deviation is the square root of the variance, it is always expressed in the same units as the expected value.

For the dynamic process, the Volatility as a measure for risk includes the time period over which the standard deviation is computed. The Volatility measure is defined as standard deviation divided by the square root of the time duration.

Coefficient of Variation: The Coefficient of Variation (CV) is the absolute relative deviation with respect to the size of the expected value μ, provided μ is not zero, expressed as a percentage:

CV = 100 |σ/μ| %

Notice that the CV does not depend on the unit in which the expected value is measured. The coefficient of variation demonstrates the relationship between standard deviation and expected value, by expressing the risk as a percentage of the expected value. The inverse of CV (namely 1/CV) is called the Signal-to-Noise Ratio.

You might like to use Multinomial Applet for checking your computation and performing computer-assisted experimentation.

An Application: Consider two investment alternatives, Investment I and Investment II with the characteristics outlined in the following table:

- Two Investments -

Investment I            Investment II
Payoff %    Prob.       Payoff %    Prob.
1           0.25        3           0.33
7           0.50        5           0.33
12          0.25        8           0.34

Performance of Two Investments

To rank these two investments under the Standard Dominance Approach in Finance, we must first compute the mean and standard deviation of each and then analyze the results. Using the Multinomial applet for the calculation, we find that Investment I has mean = 6.75% and standard deviation = 3.9%, while Investment II has mean = 5.36% and standard deviation = 2.06%. Under the usual mean-variance analysis, these two investments cannot be ranked: the first investment has the greater mean, but it also has the greater standard deviation. Therefore, the Standard Dominance Approach is not a useful tool here, and we have to resort to the coefficient of variation (C.V.) as a systematic basis of comparison. The C.V. for Investment I is 57.74% and for Investment II is 38.46%. Therefore, Investment II is preferred over Investment I. Clearly, this approach can be used to rank any number of alternative investments. Notice that less variation in return on investment implies less risk.
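
The figures for the two investments can be reproduced in a few lines. This is a minimal sketch in Python; the helper name summarize is hypothetical and simply applies the mean, variance, and CV formulas given earlier to the payoff table.

```python
from math import sqrt

def summarize(payoffs, probs):
    """Return (mean, standard deviation, CV in percent) of a discrete payoff."""
    mu = sum(x * p for x, p in zip(payoffs, probs))
    second_moment = sum(x * x * p for x, p in zip(payoffs, probs))
    sigma = sqrt(second_moment - mu**2)   # Var = E[X^2] - mu^2
    cv = 100 * abs(sigma / mu)            # assumes mu is not zero
    return mu, sigma, cv

inv1 = summarize([1, 7, 12], [0.25, 0.50, 0.25])
inv2 = summarize([3, 5, 8], [0.33, 0.33, 0.34])

for name, (mu, sigma, cv) in (("Investment I", inv1), ("Investment II", inv2)):
    print(f"{name}: mean={mu:.2f}%  sd={sigma:.2f}%  CV={cv:.2f}%")
```

The investment with the smaller CV carries less risk per unit of expected return, which is the ranking criterion used above.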

Expectation of a sum of a random number of random variables: Suppose that the number of people entering a department store on a given day is a random variable N with mean 50. Suppose further that the amounts of money spent by these customers are independent random variables Xi with a common mean of $80, and that they are also independent of N. What is the expected amount of money spent in the store on a given day?

E(X1 + X2 + ... + XN) = E(N) · E(X)

Hence, the expected amount of money spent in the store is (50)(80) = $4000.
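
The identity can be verified by brute force on a small case. The sketch below, in Python, uses two toy distributions that are purely hypothetical (they are not the department store data) so that every outcome can be enumerated exactly; it assumes the number of customers is independent of the amounts spent.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy distributions, chosen only to keep the enumeration small.
n_dist = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}  # N
x_dist = {10: Fraction(1, 2), 30: Fraction(1, 2)}                    # each X

def expectation(dist):
    """Expected value of a discrete distribution given as {value: prob}."""
    return sum(value * prob for value, prob in dist.items())

# Enumerate every outcome (N = n, X_1, ..., X_n) and weight the total
# spend by its probability, with N independent of the X's.
expected_total = Fraction(0)
for n, p_n in n_dist.items():
    for spends in product(x_dist, repeat=n):
        p = p_n
        for s in spends:
            p *= x_dist[s]
        expected_total += p * sum(spends)

print(expected_total, expectation(n_dist) * expectation(x_dist))
```

Both printed values coincide, as the formula E(N) · E(X) predicts.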

You might like to use this JavaScript in performing some numerical experimentation to:

  1. Show that E[aX + b] = aE(X) + b.
  2. Show that V[aX + b] = a² V(X).
  3. Show that E(X²) = V(X) + (E(X))².
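
As an alternative to the JavaScript tool, the three identities can be checked with a short script. This is a minimal sketch in Python using exact fractions and the two-coin distribution from the earlier example; the constants a and b and the helper names E and V are arbitrary choices made for illustration.

```python
from fractions import Fraction

# Two fair coins: X takes the values 0, 1, 2.
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def E(g):
    """Expected value of g(X) under dist."""
    return sum(g(x) * p for x, p in dist.items())

def V(g):
    """Variance of g(X), computed as E[g(X)^2] - (E[g(X)])^2."""
    return E(lambda x: g(x) ** 2) - E(g) ** 2

a, b = 3, 5  # arbitrary constants

def identity(x):
    return x

def linear(x):
    return a * x + b

print(E(linear) == a * E(identity) + b)                              # identity 1
print(V(linear) == a**2 * V(identity))                               # identity 2
print(E(lambda x: x**2) == V(identity) + E(identity) ** 2)           # identity 3
```

Because the arithmetic is exact, each comparison holds as a strict equality rather than only up to rounding.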

Normal Density Function

In the Descriptive Statistics section of this Web site, we have been concerned with how empirical scores are distributed and how best to describe their distribution. We have discussed several different measures, but the mean μ will be the measure that we use to describe the center of the distribution, and the standard deviation σ will be the measure we use to describe the spread of the distribution. Knowing these two facts gives us ample information to make statements about the probability of observing a certain value within that distribution. If I know, for example, that the average Intelligence Quotient (I.Q.) score is 100 with a standard deviation of σ = 20, then I know that someone with an I.Q. of 140 is very smart. I know this because 140 deviates from the mean μ by twice the standard deviation, that is, by twice the typical deviation of the scores in the distribution. Thus, it is unlikely to see a score as extreme as 140, because most of the I.Q. scores are clustered around 100 and typically deviate only about 20 points from the mean μ.

Many applications arise from the central limit theorem (CLT). The CLT states that the average of n observations approaches a normal distribution as n increases, irrespective of the form of the original distribution, under quite general conditions. Consequently, the normal distribution is an appropriate model for many, but not all, physical phenomena, such as the distribution of physical measurements on living organisms, intelligence test scores, product dimensions, average temperatures, and so on.

Note that the Normal distribution satisfies eight requirements: (1) its graph is a bell-shaped curve; (2) the mean, median, and mode are all equal; (3) the mean, median, and mode are located at the center of the distribution; (4) it has only one mode; (5) it is symmetric about the mean; (6) it is a continuous function; (7) it never touches the x-axis; and (8) the area under the curve equals one.

Many methods of statistical analysis presume normal distribution.

Knowing the mean and variance of a Normal distribution allows us to find probabilities. So, if, for example, you knew some things about the average height of women in the nation, including the fact that heights are distributed normally, you could measure all the women in your extended family and find the average height. This enables you to determine a probability associated with your result. If the probability of getting your result, given your knowledge of women nationwide, is high, then your family's female height cannot be said to be different from average. If that probability is low, then your result is rare (given the knowledge about women nationwide), and you can say your family is different. You have just completed a test of the hypothesis that the average height of women in your family is different from the overall average.

The ratio of two independent observations from the standard normal is distributed as the Cauchy Distribution, which has thicker tails than a normal distribution. Its density function is f(x) = 1/[π(1 + x²)], for all real values x.


An Application: A portfolio manager believes that the overnight loss of his portfolio is distributed normally with mean $0 and standard deviation of $10 000. Find the 5% one-day value at risk for this portfolio.

Let X denote the random portfolio loss, distributed as X ~ N(0, 10 000²). The value at risk v5% is, by definition, a number such that

P(X ≤ v5%) = 0.95.
To find v5% we standardize the random variable on the left-hand side:
X ≤ v5% ⇔ X − 0 ≤ v5% − 0 ⇔ [X − 0] / [10 000] ≤ [v5% − 0] / [10 000].
The transformed variable Z = (X − 0) / 10 000 has the standard normal distribution. Therefore,

P{Z ≤ [v5% − 0] / [10 000]} = 0.95.

If we denote by z95% the 95% quantile of the standard normal distribution, then

[v5%] / [10 000]   =   z95%
z95% can be found in a standard normal table:
z95% = 1.645, v5% = 10 000 z95%  = 16 450

Therefore, the overnight 5% value at risk is $16 450.
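
The same quantile can be obtained without a table. This is a minimal sketch in Python, assuming Python 3.8 or later for statistics.NormalDist; the variable names are illustrative. The result differs from the tabulated answer by about a dollar and a half only because the table rounds z95% to 1.645.

```python
from statistics import NormalDist

# Overnight loss modelled as Normal with mean 0 and sd 10 000 (from the text).
loss = NormalDist(mu=0, sigma=10_000)

# The 5% value at risk is the 95% quantile of the loss distribution.
var_5pct = loss.inv_cdf(0.95)

# Equivalent route through the standard normal quantile.
z95 = NormalDist().inv_cdf(0.95)

print(round(z95, 4), round(var_5pct, 2))
```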

You might like to use Standard Normal JavaScript instead of using tabular values from your textbook, and the well-known Lilliefors' Test for Normality to assess the goodness-of-fit.


Poisson Probability Function

Life is good for only two things, discovering mathematics and teaching mathematics.

-- Simeon Poisson

An important class of decision problems under uncertainty is characterized by the small chance of the occurrence of a particular event, such as an accident. The Poisson probability function gives the probability of exactly x independent occurrences during a given period of time, if events take place independently and at a constant rate. The Poisson probability function can also represent the number of occurrences over constant areas or volumes.

Poisson probabilities are often used; for example in quality control, software and hardware reliability, insurance claim, number of incoming telephone calls, and queuing theory.

Application: Gives the probability of exactly x independent occurrences during a given period of time if events take place independently and at a constant rate. May also represent the number of occurrences over constant areas or volumes. It is used frequently in quality control, reliability, queuing theory, and so on.

Example: Used to represent distribution of number of defects in a piece of material, customer arrivals, insurance claims, incoming telephone calls, alpha particles emitted, and so on.

A process that creates fabric is monitored. If the number of defects (X) per meter of fabric exceeds 5 then the process is stopped for diagnosis. The random variable X follows a Poisson distribution with rate λ = mean number of defects per meter of fabric.

An Application: One of the most useful applications of the Poisson distribution is in the field of queuing theory. In many situations where queues occur it has been shown that the number of people joining the queue in a given time period follows the Poisson model. For example, if the rate of arrivals to an emergency room is λ per unit of time period (say 1 hr), then:

P(n arrivals) = λⁿ e^(−λ) / n!

The mean and variance of the random variable n are both λ. However, a random variable whose mean and variance are numerically equal need not have a Poisson distribution. The mode of the Poisson distribution lies within the interval [λ − 1, λ].

Applications:

P(0 arrivals) = e^(−λ)
P(1 arrival) = λ e^(−λ) / 1!
P(2 arrivals) = λ² e^(−λ) / 2!

and so on. In general:

P(n+1 arrivals) = λ P(n arrivals) / (n+1).
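
The recursion is convenient for computing a whole table of Poisson probabilities without evaluating large powers or factorials. A minimal sketch in Python follows; the function name poisson_pmf_table is hypothetical. Each term is obtained from the previous one through P(n+1) = λ P(n) / (n+1), starting from P(0) = e^(−λ).

```python
from math import exp, factorial

def poisson_pmf_table(lam, n_max):
    """Return [P(0), P(1), ..., P(n_max)] for a Poisson with rate lam."""
    probs = [exp(-lam)]                       # P(0 arrivals)
    for n in range(n_max):
        # P(n+1) = lam * P(n) / (n+1)
        probs.append(probs[-1] * lam / (n + 1))
    return probs

table = poisson_pmf_table(1.0, 5)
print([round(p, 4) for p in table])

# Cross-check against the closed form lam^n * e^(-lam) / n!
direct = [1.0**n * exp(-1.0) / factorial(n) for n in range(6)]
```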

Normal approximation for Poisson: All Poisson tables are limited in their scope; therefore, it is sometimes necessary to use the standard normal distribution in computing Poisson probabilities. The following numerical example illustrates how good the approximation can be.

Numerical Example: Emergency patients arrive at a large hospital at the rate of 0.033 per minute. What is the probability of exactly two arrivals during the next 30 minutes?

The arrival rate during 30 minutes is λ = (30)(0.033) ≈ 1. Therefore,

P(2 arrivals) = [1² / (2!)] e^(−1) = 18%

The mean and standard deviation of the distribution are:

μ = λ = 1, and σ = λ^(1/2) = 1,

respectively; therefore, the standardized observations for n = 2, using the continuity correction (which always widens the interval), are:

z1 = [(n − 1/2) − μ] / σ = (1.5 − 1)/1 = 0.5, and

z2 = [(n + 1/2) − μ] / σ = (2.5 − 1)/1 = 1.5.

Therefore, the approximated P(2 arrivals) is P(z being within the interval 0.5, 1.5). Now, by using the standard normal table, we obtain:

P(2 arrivals) = 0.43319 − 0.19146 = 24%

As you see, the approximation overestimates the exact value (24% versus 18%), so the error is on the safe side. For large values of λ, say over 20, one may use the Normal approximation to calculate Poisson probabilities.
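
The comparison can be repeated for any rate. The sketch below, in Python, computes the exact Poisson probability and its Normal approximation with the continuity correction; the function names are hypothetical, and the second case (rate 25) is an added illustration of the "large rate" regime rather than a figure from the text.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_pmf(n, lam):
    """Exact Poisson probability of n occurrences at rate lam."""
    return lam**n * exp(-lam) / factorial(n)

def normal_approx_pmf(n, lam):
    """Normal approximation: area between n - 1/2 and n + 1/2."""
    approx = NormalDist(mu=lam, sigma=sqrt(lam))
    return approx.cdf(n + 0.5) - approx.cdf(n - 0.5)

# The example from the text: rate 1, exactly 2 arrivals.
exact_small = poisson_pmf(2, 1.0)
approx_small = normal_approx_pmf(2, 1.0)

# An illustrative larger rate, where the approximation is much closer.
exact_large = poisson_pmf(25, 25.0)
approx_large = normal_approx_pmf(25, 25.0)

print(round(exact_small, 4), round(approx_small, 4))
print(round(exact_large, 4), round(approx_large, 4))
```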

Notice that by taking the square root of a Poisson random variable, the transformed variable is more symmetric. This is a useful transformation in regression analysis of Poisson observations.

Poisson approximation for binomial: Notice that, whenever you use the Poisson approximation to the binomial distribution with parameters n and p, the goodness of the approximation is largely determined by the smallness of the p parameter rather than by how large n is.

You might like to use Common Discrete Probability Functions to obtain probability and the cumulative probability functions.

You might like to use Poisson Probability Function JavaScript to perform your computation, and Testing Poisson to perform the goodness-of-fit test.

Further Reading:
Barbour et al., Poisson Approximation, Oxford University Press, 1992.


Student T-Density Function

The t distributions were discovered in 1908 by William Gosset, who was a chemist and a statistician employed by the Guinness brewing company. He considered himself a student still learning statistics, so he signed his papers with the pseudonym "Student". Or, perhaps he used a pseudonym due to "trade secret" restrictions by Guinness.

Note that there are different t-distributions; it is a class of distributions. When we speak of a specific t distribution, we have to specify the degrees of freedom. The t density curves are symmetric and bell-shaped like the normal distribution and have their peak at 0. However, the spread is more than that of the standard normal distribution. The larger the degrees of freedom, the closer the t-density is to the normal density.

The shape of a t-distribution depends on a parameter called "degree-of-freedom". As the degree-of-freedom gets larger, the t-distribution gets closer and closer to the standard normal distribution. For practical purposes, the t-distribution is treated as the standard normal distribution when degree-of-freedom is greater than 30.

Suppose we have two independent random variables: one is Z, distributed as the standard normal distribution, while the other, χ², has a Chi-square distribution with (n−1) d.f.; then the random variable:

Z / [χ² / (n−1)]^½

has a t-distribution with (n−1) d.f. For large sample size (say, n over 30), the new random variable has an expected value equal to zero, and its variance is (n−1)/(n−3), which is close to one.
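
This construction can be explored by simulation. The following is a rough sketch in Python using only the standard library; it builds each t variate as Z divided by the square root of a Chi-square variate over its degrees of freedom, with the Chi-square obtained as a sum of squared standard normals. The sample size n = 12, the number of draws, and the seed are arbitrary choices for illustration.

```python
import random
from math import sqrt

def t_variate(rng, df):
    """One draw from a t distribution with df degrees of freedom."""
    z = rng.gauss(0, 1)
    chi2 = sum(rng.gauss(0, 1) ** 2 for _ in range(df))  # Chi-square, df d.f.
    return z / sqrt(chi2 / df)

rng = random.Random(12345)   # fixed seed so the run is repeatable
n = 12                       # hypothetical sample size
df = n - 1

draws = [t_variate(rng, df) for _ in range(40_000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / (len(draws) - 1)

print(round(sample_mean, 3), round(sample_var, 3), round((n - 1) / (n - 3), 3))
```

The simulated mean should be near zero and the simulated variance near (n−1)/(n−3), up to sampling error.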

Notice that the t-statistic is related to the F-statistic as follows: F = t², where F has (d.f.1 = 1, and d.f.2 = d.f. of the t-table).

You might like to use Student t-Density to obtain its P-values.


Triangular Density Function

The triangular distribution shows the number of successes when you know the minimum, maximum, and most likely values. For example, you could describe the number of intakes seen per week when past intake data show the minimum, maximum, and most likely number of cases seen. It has a continuous probability distribution.

The parameters for the triangular distribution are Minimum (a), Maximum (b), and Likeliest (c). There are three conditions underlying triangular distribution:

  • The minimum number of items is fixed.
  • The maximum number of items is fixed.
  • The most likely number of items falls between the minimum and maximum values.

These three parameters form a triangular-shaped distribution, which shows that values near the minimum and maximum are less apt to occur than those near the most likely value.

The following are the general Triangular density function, together with the expected value and the variance for a Triangular random variable X (a, c, b):

f(x) = 2(x − a) / [(b − a)(c − a)], for a ≤ x ≤ c
f(x) = 2(b − x) / [(b − a)(b − c)], for c ≤ x ≤ b
E(X) = (a + b + c) / 3
Var(X) = (a² + b² + c² − ab − ac − bc) / 18

The following is a Triangular density function with parameters (a = 0, c = 0.25, b = 1):

A Triangular Density Function

Application: Given X is distributed as above, compute the tail probability P(X ≤ 0.1 OR X ≥ 0.9).
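
One way to work this exercise is to integrate the two linear pieces of the density, which gives a piecewise-quadratic cumulative distribution function. The sketch below, in Python, is an illustration under that derivation; the function name triangular_cdf is hypothetical, and the parameters are the ones of the plotted density (minimum 0, likeliest 0.25, maximum 1).

```python
def triangular_cdf(x, a, c, b):
    """P(X <= x) for a triangular distribution with min a, mode c, max b."""
    if x <= a:
        return 0.0
    if x <= c:
        # Area under the rising piece 2(t - a) / [(b - a)(c - a)]
        return (x - a) ** 2 / ((b - a) * (c - a))
    if x < b:
        # One minus the area under the falling piece 2(b - t) / [(b - a)(b - c)]
        return 1.0 - (b - x) ** 2 / ((b - a) * (b - c))
    return 1.0

a, c, b = 0.0, 0.25, 1.0
lower_tail = triangular_cdf(0.1, a, c, b)        # P(X <= 0.1)
upper_tail = 1.0 - triangular_cdf(0.9, a, c, b)  # P(X >= 0.9)
tail = lower_tail + upper_tail

print(round(lower_tail, 4), round(upper_tail, 4), round(tail, 4))
```

With these parameters the lower tail is larger than the upper tail, since the mode sits closer to the minimum.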

Further Reading:
Evans M., Hastings N., and B., Peacock, Triangular Distribution, Ch. 40 in Statistical Distributions, Wiley, pp. 187-188, 2000.


Uniform Density Function

The uniform density function gives the probability that an observation will occur within a particular interval [a, b] when the probability of occurrence within that interval is directly proportional to interval length. Its mean and variance are:
μ = (a + b)/2, σ² = (b − a)² / 12.

Applications: Used to generate random numbers in sampling and Monte Carlo simulation.

Comments: Special case of beta distribution.

You might like to use Goodness-of-Fit Test for Uniform and performing some numerical experimentation for a deeper understanding of the concepts.

Notice that any Uniform distribution has uncountable number of modes having equal density value; therefore it is considered as a homogeneous population.

Discrete Uniform Distribution: The discrete uniform distribution describes the distribution of n equally likely events (labeled with the integers from 1 to n), each with probability 1/n.

If X is a discrete uniform random variable with parameter n, then the mean, and variance are as follows:

E(X) = (n + 1)/2, Var(X) = (n² − 1) / 12
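
These two formulas are easy to confirm by direct summation. A minimal sketch in Python with exact fractions follows; the function name discrete_uniform_moments is hypothetical, and n = 6 (a fair die) is just a convenient example.

```python
from fractions import Fraction

def discrete_uniform_moments(n):
    """Mean and variance of a uniform distribution on 1, 2, ..., n."""
    p = Fraction(1, n)
    mean = sum(k * p for k in range(1, n + 1))
    second_moment = sum(k * k * p for k in range(1, n + 1))
    return mean, second_moment - mean**2

mean6, var6 = discrete_uniform_moments(6)
print(mean6, var6)  # compare with (n + 1)/2 and (n^2 - 1)/12 for n = 6
```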

Further Reading:
Balakrishnan N., and V. Nevzorov, A Primer on Statistical Distributions, Wiley, 2003.


Necessary Conditions for Statistical Decision Making

Introduction to Inferential Data Analysis Necessary Conditions: Do not just learn formulas and number-crunching. Learn about the conditions under which statistical testing procedures apply. The following conditions are common to almost all statistical tests:

lewisyourpred.blogspot.com

Source: http://home.ubalt.edu/ntsbarsh/business-stat/opre504.htm
