Probability theory

Factorial:

n! (n factorial) is the product of the n consecutive positive integers from n down to 1:
n! = n(n-1)(n-2)(n-3)...3.2.1
note: 0! is defined as 1.

Permutation:

each of the ordered subsets which can be formed by selecting some or all of the elements of a set;
the multiplication principle:
- if one operation can be performed in m different ways, and when it has been performed in any of these ways, a second operation can then be performed in n different ways, the number of ways of performing the two operations is m x n
- eg. if there are 3 different main courses & 4 different desserts, you have a choice of 3x4=12 different two course meals
nPr:
- the no. of arrangements of n different objects taken r at a time;
- = n!/(n-r)!;
Arrangements in a circle:
- the no. of ways of arranging n different objects in a circle, regarding clockwise & anticlockwise as different:
- = (n-1)!;
Arrangements of n objects in a row, when not all are different:
- if p alike of one kind, q alike of another kind, etc..
- = n!/(p!q!...);
- eg. in how many ways can the letters of the word mammal be arranged to make different words?
  - n = 6; p = 2 for letter a; q = 1 for letter l; r = 3 for letter m;
  - thus = 6! / (2! x 1! x 3!) = 60
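The counting formulas above can be checked by brute force. This is a minimal Python sketch: the values n = 5, r = 3 and the helper name circular_arrangements are illustrative assumptions, not part of the notes.

```python
import math
from itertools import permutations

# nPr = n!/(n-r)!  (n = 5, r = 3 are arbitrary illustrative values)
n, r = 5, 3
n_p_r = math.factorial(n) // math.factorial(n - r)

def circular_arrangements(count):
    """Count circular arrangements of `count` different objects by listing.

    Fixing object 0 in one seat discounts rotations; the remaining
    count-1 objects are then arranged freely, giving (count-1)!.
    """
    return sum(1 for _ in permutations(range(1, count)))

# n!/(p!q!...) for the letters of "mammal": a x 2, l x 1, m x 3
word = "mammal"
by_formula = math.factorial(6) // (
    math.factorial(2) * math.factorial(1) * math.factorial(3)
)
by_listing = len(set(permutations(word)))  # keep distinct orderings only

print(n_p_r, circular_arrangements(4), by_formula, by_listing)
```

The listing approach is only practical for small n, but it should agree with the formulas: 60 arrangements for 5P3, 6 for four objects in a circle, and 60 for mammal.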
Arrangements with restrictions:
- restriction principle: always fill a restriction first
- eg. number of ways of arranging in a row 6 men and 2 boys:
  - if 2 boys must be together:
    - regard the boys as 1 unit, thus 7 objects not 8
    - thus 7! ways, but as the 2 boys can be arranged 2! ways amongst themselves => 2!7!
  - if 2 boys must NOT be together:
    - number arrangements without restriction = 8!
    - => answer = 8! - number arrangements 2 boys are together => 8! - (2!7!)
  - if there must be at least 3 men separating the boys:
    - calculate the number of ways of arranging the boys alone:
      - boy B must be at least 4 positions away from boy A; with boy A in positions 1 to 8 in turn, boy B has 4, 3, 2, 1, 1, 2, 3, 4 possible positions, which sum to 20
    - use the multiplication principle to include the 6! arrangements of the 6 men in the remaining positions => 20 x 6!
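The three restricted counts for the 6 men and 2 boys can be confirmed by listing all 8! rows. This is a sketch only; the labels B1, B2, M1..M6 are made up for the illustration.

```python
import math
from itertools import permutations

# Hypothetical labels: 2 boys and 6 men
people = ["B1", "B2"] + [f"M{i}" for i in range(1, 7)]

together = apart = separated_by_3 = 0
for row in permutations(people):
    gap = abs(row.index("B1") - row.index("B2"))
    if gap == 1:          # boys stand next to each other
        together += 1
    else:                 # boys are not adjacent
        apart += 1
    if gap >= 4:          # at least 3 men stand between the boys
        separated_by_3 += 1

print(together, apart, separated_by_3)
```

The loop visits 40320 rows, so it runs quickly, and each count can be compared with 2!7!, 8! - 2!7! and 20 x 6! respectively.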

Combination:

each of the subsets which can be formed by selecting some or all of the elements of the set without regard to the order in which the elements appear in the subset;
nCr:
- the no. of combinations of n different objects taken r at a time;
- = nPr/r! = n!/[r!(n-r)!];
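As a small illustration of nCr = nPr/r!, the sketch below compares the formula with a direct listing of subsets. The set ABCDE and r = 2 are arbitrary example values.

```python
import math
from itertools import combinations

n, r = 5, 2  # arbitrary illustrative values
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
from_permutations = math.perm(n, r) // math.factorial(r)  # nPr / r!
by_listing = len(list(combinations("ABCDE", r)))          # unordered subsets

print(by_formula, from_permutations, by_listing)
```

Dividing nPr by r! removes the r! orderings of each chosen subset, which is why all three values agree.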

Mutually exclusive operations:

two operations (or events) are mutually exclusive when the occurrence of one rules out the occurrence of the other;
if two operations are mutually exclusive then the no. of arrangements possible with each are added (not multiplied) to obtain the total no. of possible arrangements;
ie. the intersection of A & B is the null set,
if two events cannot occur at the same time, Pr(A or B) = Pr(A) + Pr(B) (addition principle);

Event: a set of favourable outcomes;

Trial: eg. the tossing of a die;

Sample space: E = all possible outcomes;

Probability of an event A:

assume all sample points equally likely,
Pr(A) = [no. outcomes A]/[total no. possible outcomes];
Pr(A or B) = Pr(A) + Pr(B) - Pr(A&B);
thus, the probability of drawing an Ace or a Heart from a pack of cards = 1/13 + 1/4 - 1/52 = 16/52 = 4/13
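The Ace-or-Heart example can be reproduced by counting sample points in a 52-card pack. In this sketch the helper pr and the card encoding are assumptions chosen for the illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely sample points

def pr(event):
    """Pr(A) = [no. outcomes in A] / [total no. possible outcomes]."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "hearts"

direct = pr(lambda c: is_ace(c) or is_heart(c))
by_rule = pr(is_ace) + pr(is_heart) - pr(lambda c: is_ace(c) and is_heart(c))

print(direct, by_rule)
```

Subtracting Pr(A&B) stops the Ace of Hearts being counted twice, so both routes give 4/13.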

Independent events:

A & B are independent if:
- Pr(A&B) = Pr(A).Pr(B);

Conditional Probability:

Pr(B given A) = Pr(B|A) = Pr(A&B)/Pr(A), provided Pr(A) > 0;
if A,B are independent, then:
- Pr(B|A) = [Pr(A).Pr(B)/Pr(A)] = Pr(B);
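A hedged illustration of conditional probability and independence, using two dice as the sample space. The events chosen here are hypothetical examples, not taken from the notes.

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))  # two dice: 36 equally likely points

def pr(event):
    return Fraction(sum(1 for point in space if event(point)), len(space))

def pr_given(b, a):
    """Pr(B given A) = Pr(A&B)/Pr(A); assumes Pr(A) > 0."""
    return pr(lambda pt: a(pt) and b(pt)) / pr(a)

def first_is_six(pt):
    return pt[0] == 6

def second_is_even(pt):
    return pt[1] % 2 == 0

def total_at_least_10(pt):
    return sum(pt) >= 10

# Independent pair: knowing the first die does not change the second
independent_case = (pr_given(second_is_even, first_is_six), pr(second_is_even))
# Dependent pair: a six on the first die makes a high total more likely
dependent_case = (pr_given(total_at_least_10, first_is_six), pr(total_at_least_10))

print(independent_case, dependent_case)
```

For the independent pair the conditional and unconditional probabilities coincide; for the dependent pair they differ.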

Statistics:

see also: Statistics
2 types of variables:
- continuous (eg. height);
- discrete (eg. no. of peas);
Population: the group of items/individuals;
Sample range: that part of pop. measured;
Class intervals: subdivisions of the sample range into classes;
Class frequency: no. observations in each class;
Mode: the most frequent value of the variable;
Quantile: a value of the variable below which falls a given % of the frequency:
- 25% quantile = lower quartile = Q1;
- 75% quantile = upper quartile = Q3;
Semi-interquartile range(d) = 0.5(Q3-Q1);
Median: 50% quantile;
Arithmetic mean: the average of a set of observations;
- = (Sum of x)/n;
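Variance(s²): the average squared deviation from the mean;
- = (Sum of (x - mean)²)/(n-1);
Standard deviation(s): the square root of the variance;
- = SQR(s²);
Standard score(z): removes the units of measurement by standardising the variable wrt mean & s;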
- = (x - mean)/s;
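The summary measures above can be computed with Python's statistics module. This is a sketch on made-up data; it assumes the sample (n-1) divisor for the variance, and note that quartile conventions vary between textbooks and libraries.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mean = statistics.mean(data)
mode = statistics.mode(data)
variance = statistics.variance(data)  # sample variance, divisor n-1
sd = statistics.stdev(data)           # square root of the variance

# quantiles(n=4) returns the three cut points Q1, median, Q3
# (default "exclusive" method; other conventions give slightly different quartiles)
q1, median, q3 = statistics.quantiles(data, n=4)
semi_interquartile_range = 0.5 * (q3 - q1)

# Standard scores: z = (x - mean)/s
z_scores = [(x - mean) / sd for x in data]

print(mean, mode, variance, sd, median, semi_interquartile_range)
```

After standardising, the z scores have mean 0 and sd 1 whatever the original scale was.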

Correlation coefficient(r):

the degree of assoc. between variables;
r = 1: perfect positive linear relationship;
r = 0: no linear relationship;
r = -1: perfect negative linear relationship;
it does not allow for non-linear associations & is unduly influenced by extreme observations;
- = (1/(N-1))(sum z(x).z(y));
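A minimal sketch of r computed from standard scores, as in the formula above. The function name correlation and the data sets are assumptions for illustration only.

```python
import statistics

def correlation(xs, ys):
    """r = (1/(N-1)) * sum(z(x) * z(y)), using sample sd (divisor N-1)."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ) / (n - 1)

xs = [1, 2, 3, 4, 5]
r_positive = correlation(xs, [2, 4, 6, 8, 10])    # exact increasing line
r_negative = correlation(xs, [10, 8, 6, 4, 2])    # exact decreasing line
# y = x^2 on a symmetric range: strongly related, but not linearly
r_parabola = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_positive, r_negative, r_parabola)
```

The parabola case gives r = 0 even though y is completely determined by x, which illustrates why r = 0 only rules out a linear relationship.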

Rank correlation coefficient(r'):

a better measure of degree of assoc. with less influence from extremes & some measure of non-linear relationship, but need to rank all observations:
- => u = rank(x), v = rank(y), u, v ∈ {1, 2, 3, ...}
- r'(x,y) = r(u,v) = s(u,v)/[SQR(s(uu).s(vv))];
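The rank correlation can be sketched as the ordinary correlation applied to ranks. The helper names and data below are hypothetical, and the ranking function assumes there are no tied observations.

```python
import statistics

def correlation(xs, ys):
    """Ordinary r: covariance divided by the product of the sds."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return s_xy / (statistics.stdev(xs) * statistics.stdev(ys))

def ranks(values):
    """Rank 1 for the smallest value, 2 for the next, etc. (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def rank_correlation(xs, ys):
    return correlation(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [1, 8, 27, 64, 125]  # y = x^3: increasing but not linear

print(correlation(xs, ys), rank_correlation(xs, ys))
```

Because y always increases with x, the ranks agree exactly and r' = 1, while the ordinary r falls below 1 for this curved relationship.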

Sampling Distribution:

if x is the mean of the sample, s' is the sd of the pop., n is the no. items in the sample, the standard error of the sampling distribution
= s'/SQR(n);
u (the mean of pop.) almost certainly lies b/n x +/- 3s'/SQR(n);
u(x) = u, s'(x) = s'/SQR(n);
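The relations u(x) = u and s'(x) = s'/SQR(n) can be verified exactly on a tiny population by listing every possible sample. This sketch assumes sampling with replacement (independent draws); the population and n = 2 are made-up values.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # hypothetical population
n = 2                            # sample size

def mean(values):
    return Fraction(sum(values), len(values))

def variance(values):
    """Population variance (divisor = number of values)."""
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values) / len(values)

# Every equally likely sample of size n, drawn with replacement
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mean_of_means = mean(sample_means)
variance_of_means = variance(sample_means)

print(mean_of_means, variance_of_means, variance(population) / n)
```

The variance of the sample means equals s'^2/n, so the standard error is s'/SQR(n).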

Central Limit Theorem:

the sampling distribution of the sample means becomes more nearly normal the greater the sample size is and the more nearly normal the distribution of the pop. is;

Probability Distribution Curves:

see also: Statistics
curve = f(x),
Integral f(x) from -infinity to infinity = 1,
Normal Distribution:
- f(x) = [1/(s'.SQR(2pi))]e^[-0.5((x-u)/s')^2],
- where:
  - u = mean value of x in pop.
  - s'= stand. dev. of x in pop.
- Standard Normal Curve:
  - z = [1/(SQR(2pi))]e^[-0.5t^2],
  - where t = (x-u)/s'; z = s'f(x);
  - has a mean of zero, sd of 1;
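A short numerical check of the normal curve formulas, offered as a sketch. The values u = 100, s' = 15 and x = 120 are arbitrary, and the area is only approximated by summing thin strips over 10 sd either side of the mean.

```python
import math
from statistics import NormalDist

def normal_pdf(x, u, s):
    """f(x) = [1/(s.SQR(2pi))] e^[-0.5((x-u)/s)^2]"""
    return math.exp(-0.5 * ((x - u) / s) ** 2) / (s * math.sqrt(2 * math.pi))

u, s = 100.0, 15.0  # hypothetical pop. mean and sd
x = 120.0
t = (x - u) / s     # standardised value

library_value = NormalDist(u, s).pdf(x)   # standard-library normal density
standard_height = normal_pdf(t, 0.0, 1.0) # z on the standard normal curve

# Approximate the integral of f(x) with strips of width 0.01
step = 0.01
area = sum(normal_pdf(u + k * step, u, s) * step for k in range(-15000, 15001))

print(normal_pdf(x, u, s), library_value, s * normal_pdf(x, u, s), standard_height, area)
```

The third and fourth printed values agree, which is the relation z = s'f(x), and the area comes out very close to 1.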
Cauchy Distribution:
- f(x) = (1/pi)[a/(a^2+x^2)], x ∈ R;
Exponential Distribution:
- f(x) = ke^(-kx) for x >= 0,
- = 0 for x<0,
Binomial Distribution:
- when to use:
  - when the same trial is repeated several times & there are only 2 possible outcomes in each trial either one of which must occur;
- variables:
  - n = no. of independent trials;
  - p = prob. success;
  - q = prob. failure = 1 - p;
- u = mean = np;
- s'= sd = SQR(npq);
- x = no. successes in n independent trials;
- prob.(X=x) = nCx.(p^x)(q^(n-x)), where nCx = n!/[x!(n-x)!]
  - = prob. exactly x successes;
- Normal approximation:
  - can approx. to normal distrib. using u,s'
  - if n>30, p>0.1;
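The binomial probabilities, mean and sd can be checked directly from the formula. This is a minimal sketch with assumed values n = 10 and p = 0.5; the function name binomial_pmf is not from the notes.

```python
import math

def binomial_pmf(x, n, p):
    """Prob. of exactly x successes in n independent trials."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

n, p = 10, 0.5  # hypothetical: 10 tosses of a fair coin
probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * pr for x, pr in enumerate(probabilities))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(probabilities))

print(binomial_pmf(3, n, p), mean, math.sqrt(variance))
```

Summing over all x recovers u = np and s' = SQR(npq), and the probabilities add up to 1.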
Hypergeometric Distribution:
- when to use:
  - same as for binomial except there is sampling without replacement;
- variables:
  - N = size of pop.;
  - n = size of sample;
  - D = no. of kind A in pop.;
  - N-D = no of kind B in pop.;
  - X = no. of kind A in sample;
- Pr(X=x) = [DCx.(N-D)C(n-x)]/NCn
  - = Pr(sample contains x of kind A),
- u = nD/N
- s'^2 = nD(1-D/N)(N-n)/[N(N-1)],
- if N is very large, & n small, can approx. to binomial distribution;
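A sketch of the hypergeometric formulas using exact fractions. The example (number of aces in a 5-card hand from a 52-card pack) and the function name are assumptions added for illustration.

```python
import math
from fractions import Fraction

def hypergeometric_pmf(x, N, D, n):
    """Pr(sample of size n contains x of kind A), sampling without replacement."""
    return Fraction(math.comb(D, x) * math.comb(N - D, n - x), math.comb(N, n))

N, D, n = 52, 4, 5  # hypothetical: pack size, aces in pack, hand size
probabilities = [hypergeometric_pmf(x, N, D, n) for x in range(n + 1)]

mean = sum(x * pr for x, pr in enumerate(probabilities))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(probabilities))

formula_mean = Fraction(n * D, N)
formula_variance = Fraction(n * D, 1) * (1 - Fraction(D, N)) * (N - n) / (N * (N - 1))

print(float(probabilities[0]), mean, variance)
```

math.comb returns 0 when x exceeds D, so impossible outcomes (such as 5 aces) get probability 0 without special handling.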
Poisson distribution:
- when to use:
  - as an approx. to binomial distrib. when p is very small and n is very large;
  - when the no. of times an event occurs can be counted but there is no upper limit to the no. of times it may occur;
- Equation:
  - Pr(X=x) = e^(-u) * u^x / x!,
- Eg. On average there are 2.5 cars per quarter-hour at a petrol station, what is the prob. that during a particular quarter-hour there will be some cars at the petrol station?
  - u = 2.5, Pr(X>=1) = 1 - Pr(X=0),
  - Pr(X=0) = e^(-2.5) => Pr(X>=1) = 0.92;
- Normal Approx.:
  - u = u, s' = SQR(u),
  - Good approx. if u is large;
  - Eg. On average, there are 20 people asking for an item each week, what is the minimum no. of items the store must have in stock each week to be almost certain of not having to refuse demand for this item?
    - u = 20, s' = SQR(20),
    - Min. no. items = u + 3s' = 34;
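Both Poisson examples can be reproduced in a few lines. This is a sketch only; the function names are assumptions, and the final line adds an exact Poisson check of the normal-approximation stock level.

```python
import math

def poisson_pmf(x, u):
    """Pr(X=x) = e^(-u) * u^x / x!"""
    return math.exp(-u) * u ** x / math.factorial(x)

def poisson_cdf(x, u):
    """Pr(X <= x), summed term by term."""
    return sum(poisson_pmf(k, u) for k in range(x + 1))

# Petrol station: u = 2.5 cars per quarter-hour
pr_some_cars = 1 - poisson_pmf(0, 2.5)

# Store stock: u = 20 requests per week, normal approx. u + 3s'
u = 20
stock = math.ceil(u + 3 * math.sqrt(u))
pr_demand_met = poisson_cdf(stock, u)  # exact Poisson prob. that demand <= stock

print(round(pr_some_cars, 2), stock, pr_demand_met)
```

The rounded probability is 0.92 and the stock level is 34, matching the worked answers; the exact Poisson probability of meeting demand with that stock is above 0.99.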

Hypothesis testing:

Null hypothesis(H0): that there is no effect of one variable on another;
Alternative hypothesis(H1): that there is an effect;
H1 is likely to be true if the results are very unlikely to have been obtained if H0 were true;
p-value (observed significance level):
- p = Pr(obtaining obs. as extreme as the ones obtained if H0 were true),
Type I error:
- a = Pr(deciding to reject H0 when H0 true);
Type II error:
- b = Pr(accepting H0 when H1 is true);
Decision rule:
- if p <= a, then reject H0,
- if p > a, then do not reject (accept) H0,
- usually a is designated 0.05;
Z-test:
- If the distribution under H0 is known (mean u0, sd s'):
  - p = Pr(z <= (x-u0)SQR(n)/s') if x < u0,
  - p = Pr(z >= (x-u0)SQR(n)/s') if x > u0,
  - where u0, s' are the mean & sd of the pop. if H0 is true, n = size of sample, x = mean of sample;
  - the pop. should be approx. normally distributed; if it is skewed, a large sample is necessary for the z-test to be valid;
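A minimal sketch of the one-sided z-test described above, using the standard library's NormalDist for the normal tail area. The function name z_test and the numbers (u0 = 100, s' = 15, n = 36, sample mean 105) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, u0, pop_sd, n):
    """Return (z, p) where p is the tail area beyond z on the side of the sample mean."""
    z = (sample_mean - u0) * sqrt(n) / pop_sd
    standard_normal = NormalDist()  # mean 0, sd 1
    if sample_mean < u0:
        p = standard_normal.cdf(z)        # Pr(Z <= z)
    else:
        p = 1 - standard_normal.cdf(z)    # Pr(Z >= z)
    return z, p

z, p = z_test(sample_mean=105, u0=100, pop_sd=15, n=36)
reject_h0 = p <= 0.05  # a = 0.05

print(z, p, reject_h0)
```

Here z = 2.0 and the tail area is about 0.023, so H0 would be rejected at a = 0.05 in this made-up case.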
t-test:
- used instead of z-test if pop. sd. unknown;
- need to use t-tables;
- as for z-test, but:
  - t = (x-u0)SQR(n)/s, s = sd of sample,
  - no. of degrees of freedom(df) = n-1;
  - t-test of u1=u2 comparing 2 independent samples:
    - eg. treated vs control group;
      - t = (x1-x2)/[sp.SQR((1/n1)+(1/n2))],
      - df = n1 + n2 - 2,
      - sp^2 = [(n1-1)s1^2 + (n2-1)s2^2]/(n1 + n2 - 2) = pooled variance of the 2 samples,
  - t-test of u1=u2 comparing matched pairs:
    - eg. twins;
      - t = d'.SQR(n)/sd,
      - df = n-1, ud = 0 if H0 is true,
      - d' = (1/n)(sum d) = mean of the differences b/n each pair,
      - sd^2 = (1/(n-1))[sum(d^2) - ((sum d)^2)/n],
      - d = difference b/n a pair,
      - sd = stand. dev. of all d,
      - n = no. of matched pairs;
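The three t statistics above can be sketched as follows. The function names and all data are made up for illustration, and the p-value still has to be read from t-tables (or a statistics library) using the returned df.

```python
import statistics
from math import sqrt

def one_sample_t(sample, u0):
    """t = (x - u0)SQR(n)/s with df = n-1."""
    n = len(sample)
    t = (statistics.mean(sample) - u0) * sqrt(n) / statistics.stdev(sample)
    return t, n - 1

def pooled_t(group1, group2):
    """Two independent samples, using the pooled variance."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (
        (n1 - 1) * statistics.variance(group1)
        + (n2 - 1) * statistics.variance(group2)
    ) / (n1 + n2 - 2)
    difference = statistics.mean(group1) - statistics.mean(group2)
    return difference / sqrt(pooled_variance * (1 / n1 + 1 / n2)), n1 + n2 - 2

def paired_t(first, second):
    """Matched pairs: t = d'.SQR(n)/sd on the within-pair differences."""
    d = [a - b for a, b in zip(first, second)]
    n = len(d)
    d_mean = sum(d) / n
    sd_squared = (sum(v * v for v in d) - sum(d) ** 2 / n) / (n - 1)
    return d_mean * sqrt(n) / sqrt(sd_squared), n - 1

t_one, df_one = one_sample_t([5, 7, 9], u0=5)
t_pooled, df_pooled = pooled_t([1, 2, 3], [4, 5, 6])
t_paired, df_paired = paired_t([12, 15, 11, 14, 13], [10, 14, 10, 11, 12])

print((t_one, df_one), (t_pooled, df_pooled), (t_paired, df_paired))
```

In the paired example the differences are 2, 1, 1, 3, 1, giving d' = 1.6, sd^2 = 0.8 and t = 4.0 with 4 df.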
Chi-squared test of goodness of fit:
- used to test hypothesis concerning the proportions of the pop. in each of the categories;
- k = no. of categories in the pop.;
- n = size of the random sample from the pop.;
- Cx = category x;
- Ox = observed freq. of sample in Cx;
- Ex = expected freq. of sample in Cx if H0 true;
- Px = proportion of pop. in Cx;
- Ex = nPx, sum(x=1 to k) Px = 1,
- χo^2 = sum[((Ox-Ex)^2)/Ex] = observed value of Chi-squared,
- if H0 true, then χo^2 small; if H1 true, then χo^2 large;
- p = Pr(χ^2 >= χo^2), with df = k-1;
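A sketch of the goodness-of-fit calculation for a hypothetical die rolled 60 times, with H0 that all six faces are equally likely. The observed counts are invented, and the critical value 11.07 is the tabulated 5% point of Chi-squared with 5 df (k - 1 for k = 6 categories).

```python
from fractions import Fraction

def chi_squared(observed, expected):
    """sum[((Ox-Ex)^2)/Ex] over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [8, 12, 9, 11, 6, 14]        # hypothetical counts, n = 60
n = sum(observed)
proportions = [Fraction(1, 6)] * 6      # Px under H0: a fair die
expected = [n * px for px in proportions]  # Ex = nPx

statistic = chi_squared(observed, expected)
df = len(observed) - 1                  # k - 1 categories free to vary

CRITICAL_5_PERCENT_DF_5 = 11.07         # from Chi-squared tables
reject_h0 = statistic > CRITICAL_5_PERCENT_DF_5

print(float(statistic), df, reject_h0)
```

The statistic is 4.2, well below the critical value, so these invented counts give no reason to reject H0.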
Row by Column contingency tables:
- r = no. of rows, c = no. of columns,
- Ex = expected freq. assuming independence,
  - = row total * column total / grand total,
- df = (r-1)(c-1),
- χo^2 = sum[((Ox-Ex)^2)/Ex], which approx. follows the Chi-squared distribution if H0 is true, provided Ex > 5 for at least 80% of categ.
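The expected frequencies and Chi-squared statistic for a contingency table can be sketched as below. The 2 x 2 table is invented, and 3.841 is the tabulated 5% critical value of Chi-squared with 1 df.

```python
def expected_frequencies(table):
    """Ex = row total * column total / grand total, assuming independence."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

table = [[20, 30],
         [30, 20]]  # hypothetical observed frequencies

expected = expected_frequencies(table)
statistic = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(table, expected)
    for o, e in zip(observed_row, expected_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)  # (r-1)(c-1)

# Check the rule of thumb: Ex > 5 in at least 80% of cells
cells = [e for row in expected for e in row]
share_above_5 = sum(1 for e in cells if e > 5) / len(cells)

CRITICAL_5_PERCENT_DF_1 = 3.841  # from Chi-squared tables
reject_h0 = statistic > CRITICAL_5_PERCENT_DF_1

print(expected, statistic, df, share_above_5, reject_h0)
```

Every expected frequency is 25 here, the statistic is 4.0 with 1 df, and independence would be rejected at the 5% level for this invented table.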