Hypergeometric Distribution

Posted by haifeng on 2020-03-28 11:18:00, last updated 2020-03-28 12:20:48


The assumptions leading to the hypergeometric distribution are as follows:

  1. The population or set to be sampled consists of $N$ individuals, objects, or elements (a finite population).
  2. Each individual can be characterized as a success ($S$) or a failure ($F$), and there are $M$ successes in the population.
  3. A sample of $n$ individuals is selected without replacement in such a way that each subset of size $n$ is equally likely to be chosen.

 


Prop. If $X$ is the number of $S$'s in a completely random sample of size $n$ drawn from a population consisting of $M$ $S$'s and $(N-M)$ $F$'s, then the probability distribution of $X$, called the hypergeometric distribution, is given by

\[
P(X=x)=h(x;n,M,N)=\frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}
\]

for $x$ an integer satisfying $\max(0,n-N+M)\leqslant x\leqslant\min(n,M)$.
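As a minimal sketch of how the pmf above can be evaluated, the following Python snippet computes $h(x;n,M,N)$ directly from the binomial coefficients. The function name `hypergeom_pmf` and the values $N=20$, $M=12$, $n=5$ are illustrative assumptions, not taken from the book; `math.comb` requires Python 3.8 or later.

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """Return h(x; n, M, N) = P(X = x), or 0.0 outside the stated range of x."""
    lower, upper = max(0, n - N + M), min(n, M)
    if not lower <= x <= upper:
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


# Illustrative values (an assumption, not from the source): N = 20, M = 12, n = 5.
probs = [hypergeom_pmf(x, 5, 12, 20) for x in range(6)]
print(probs)
print(sum(probs))  # should equal 1 up to floating-point rounding
```

The probabilities over the admissible values of $x$ should sum to 1, which is a quick sanity check on the formula.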

 

Notation:

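$N$: number of individuals in the population

$M$: number of successes ($S$) in the population

$n$: size of the sample

$x$: number of successes ($S$) in the sample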


 

Prop. The mean and variance of the hypergeometric rv $X$ having pmf $h(x;n,M,N)$ are 

\[
E(X)=n\cdot\frac{M}{N},\quad V(X)=\frac{N-n}{N-1}\cdot n\cdot\frac{M}{N}\cdot\biggl(1-\frac{M}{N}\biggr)
\]
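The two formulas can be checked numerically by summing over the pmf. The sketch below uses exact rational arithmetic (`fractions.Fraction`) so that the comparison is not affected by rounding; the helper name `pmf` and the values $N=20$, $M=12$, $n=5$ are assumptions chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def pmf(x, n, M, N):
    """Exact hypergeometric probability h(x; n, M, N) as a Fraction."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))


# Illustrative values (an assumption, not from the source).
n, M, N = 5, 12, 20
support = range(max(0, n - N + M), min(n, M) + 1)

# Mean and variance computed directly from the definition.
mean = sum(x * pmf(x, n, M, N) for x in support)
second_moment = sum(x * x * pmf(x, n, M, N) for x in support)
variance = second_moment - mean ** 2

# Closed-form expressions from the proposition.
p = Fraction(M, N)
mean_formula = n * p
variance_formula = Fraction(N - n, N - 1) * n * p * (1 - p)

print(mean, mean_formula)
print(variance, variance_formula)
```

For these values both approaches should agree exactly: the mean is $3$ and the variance is $18/19$.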

 

The ratio $\frac{M}{N}$ is the proportion of $S$'s in the population. If we replace $M/N$ by $p$ in $E(X)$ and $V(X)$, we get

\[
\begin{aligned}
E(X)&=np,\\
V(X)&=\frac{N-n}{N-1}\cdot np(1-p)=\frac{N-n}{N-1}\cdot npq
\end{aligned}
\]

This shows that the means of the binomial and hypergeometric rv's are equal, whereas the variances of the two rv's differ by the factor $\frac{N-n}{N-1}$, often called the finite population correction factor.

This factor is less than 1 whenever $n>1$, so the hypergeometric rv has smaller variance than the binomial rv. The correction factor can be written

\[
\dfrac{1-\frac{n}{N}}{1-\frac{1}{N}},
\]

which is approximately 1 when $n$ is small relative to $N$.
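To get a feel for how quickly the correction factor approaches 1, here is a small sketch that evaluates it for a fixed sample size and growing population sizes. The function name `fpc` and the numbers used are illustrative assumptions.

```python
def fpc(n, N):
    """Finite population correction factor (N - n) / (N - 1)."""
    return (N - n) / (N - 1)


# Fixed sample size n = 10, increasingly large populations (illustrative values).
for N in (20, 100, 10_000):
    print(N, fpc(10, N))
```

With $n=10$ the factor is $10/19$ for $N=20$, $10/11$ for $N=100$, and very close to 1 for $N=10000$, where sampling without replacement behaves almost like binomial sampling.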


Remark:

An easy way to remember the range of $x$ is to use the identity

\[
\binom{N-M}{n-x}=\binom{N-M}{N-M-(n-x)}=\binom{N-M}{x-(n-N+M)}.
\]

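For the last binomial coefficient to be meaningful, its lower index must be nonnegative, so $x$ must satisfy $n-N+M\leqslant x$. Together with $x\geqslant 0$, this gives the lower bound $\max(0,n-N+M)\leqslant x$.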
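The range of $x$ can also be confirmed by brute force. The sketch below lists every $x$ in $0,\dots,n$ for which the numerator $\binom{M}{x}\binom{N-M}{n-x}$ is nonzero and compares that list with the bounds $\max(0,n-N+M)$ and $\min(n,M)$. The values $N=10$, $M=7$, $n=5$ are an assumption chosen so that the lower bound is not 0; the check relies on Python's `math.comb` returning 0 when the lower index exceeds the upper one.

```python
from math import comb

N, M, n = 10, 7, 5  # illustrative values, not from the source

# x values in 0..n for which the numerator of the pmf is nonzero
nonzero = [x for x in range(n + 1) if comb(M, x) * comb(N - M, n - x) > 0]

# Bounds stated in the proposition
lower, upper = max(0, n - N + M), min(n, M)

print(nonzero)
print(list(range(lower, upper + 1)))
```

Here only $N-M=3$ failures are available, so a sample of size 5 must contain at least $5-3=2$ successes, which matches $n-N+M=2$.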


 

References:

The propositions are from Chapter 3, Section 5 of:

Jay L. Devore, Probability and Statistics for Engineering and the Sciences, Fifth Edition, p. 129.