[Notes] Some notes on Probability and Statistics
Axioms of Probability:
Suppose $\Omega$ is the sample space and $A$, $B$ are events. Probabilities obey the following rules:
(1) $P(A)\geqslant 0$;
(2) $P(\Omega)=1$;
(3) If $A,B\subset\Omega$ and $A\cap B=\emptyset$, then $P(A\cup B)=P(A)+P(B)$.
Thm. $P(A)\leqslant 1$, $\forall\ A\subset\Omega$.
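Proof. Since $A$ and $A^c$ are disjoint and $A\cup A^c=\Omega$, Axioms (2) and (3) give $1=P(\Omega)=P(A\cup A^c)=P(A)+P(A^c)$. Since $P(A^c)\geqslant 0$ by Axiom (1), we have
\[P(A)=1-P(A^c)\leqslant 1.\]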
Cor. If $A,B,C\subset\Omega$ are pairwise disjoint sets, then
\[
P(A\cup B\cup C)=P(A)+P(B)+P(C)
\]
Proof. Since $A,B,C$ are pairwise disjoint, $A$ and $B\cup C$ are disjoint. Thus, by Axiom (3),
\[
P(A\cup B\cup C)=P(A\cup (B\cup C))=P(A)+P(B\cup C),
\]
and, again by Axiom (3), $P(B\cup C)=P(B)+P(C)$. Substituting this into the equation above, we get
\[
P(A\cup B\cup C)=P(A)+P(B)+P(C).
\]
The same argument extends to any finite number of sets.
Cor. If $A_1,A_2,\ldots,A_n$ are pairwise disjoint sets in $\Omega$, then
\[
P(A_1\cup A_2\cup\cdots\cup A_n)=P(A_1)+P(A_2)+\cdots+P(A_n).
\]
Proof. By mathematical induction.
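A sketch of the induction, for completeness. The base case $n=2$ is Axiom (3) itself. For the inductive step, suppose the formula holds for any $n-1$ pairwise disjoint sets, and write $B=A_1\cup\cdots\cup A_{n-1}$ (the symbol $B$ is introduced here only for this step). Since $A_n$ is disjoint from each $A_i$ with $i<n$, it is disjoint from $B$, so by Axiom (3) and the induction hypothesis
\[
P(A_1\cup\cdots\cup A_n)=P(B\cup A_n)=P(B)+P(A_n)=P(A_1)+\cdots+P(A_{n-1})+P(A_n).
\]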
Cor. If $s_1,\ldots,s_n$ are $n$ distinct points (or outcomes) in $\Omega$, then
\[
P(\{s_1,\ldots,s_n\})=P(\{s_1\})+\cdots+P(\{s_n\})=P(s_1)+\cdots+P(s_n).
\]
Here, we use $P(s_i)$ to denote the probability $P(\{s_i\})$.
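On a finite sample space, this corollary means that an event's probability is determined by the probabilities of its individual outcomes. A minimal Python sketch of that idea follows; the loaded die and its point probabilities are made-up values for illustration, and the names `pmf` and `prob` are hypothetical.

```python
from fractions import Fraction

# Hypothetical point probabilities P(s) for a loaded six-sided die
# (illustrative values only); they must sum to 1.
pmf = {
    1: Fraction(1, 12), 2: Fraction(1, 12), 3: Fraction(1, 6),
    4: Fraction(1, 6), 5: Fraction(1, 4), 6: Fraction(1, 4),
}
assert sum(pmf.values()) == 1  # Axiom (2): P(Omega) = 1


def prob(event, pmf):
    """P(event) as the sum of P(s) over the distinct outcomes s in event."""
    return sum((pmf[s] for s in event), Fraction(0))


low = {1, 2, 3}
p_low = prob(low, pmf)  # P(1) + P(2) + P(3)
```

Using `Fraction` keeps the arithmetic exact, so equalities such as $P(\Omega)=1$ can be checked without rounding error.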
Suppose the sample space $\Omega$ is a finite set with $N$ points, i.e., $N < \infty$, and the event $A$ has $n$ points. If all outcomes are equally likely, then
\[
P(A)=\frac{\text{number of elements of}\ A}{\text{total number of sample points}}=\frac{\# A}{\#\Omega}=\frac{n}{N}.
\]
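The counting formula can be sketched in Python as follows. The fair die is an assumed example, and `uniform_prob` is a hypothetical helper name; the formula applies only under the equally-likely assumption.

```python
from fractions import Fraction


def uniform_prob(event, omega):
    """P(A) = #A / #Omega, valid only when all outcomes are equally likely."""
    event, omega = set(event), set(omega)
    if not event <= omega:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(omega))


omega = {1, 2, 3, 4, 5, 6}  # one roll of a fair die, N = 6
even = {2, 4, 6}            # event A with n = 3 points
p_even = uniform_prob(even, omega)  # 3/6 = 1/2
```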
Example. Let $A,B$ be two subsets of $\Omega$, and let $C=A-B=A\cap B^c$, $D=B-A=B\cap A^c$, $E=A\cap B$.
Show that $P(C\cup D)=P(A)+P(B)-2P(E)$.
Solution.
Note that $C$ and $D$ are disjoint, hence we have
\[
P(C\cup D)=P(C)+P(D).\tag{1}
\]
On the other hand, $A=C\cup E$, $B=D\cup E$, and $C\cap E=\emptyset$, $D\cap E=\emptyset$. Thus
\[
\begin{aligned}
P(A)&=P(C)+P(E),\\
P(B)&=P(D)+P(E).
\end{aligned}
\]
It follows that $P(A)+P(B)=P(C)+P(D)+2P(E)$. Combining this with (1), we have
\[
P(C\cup D)=P(A)+P(B)-2P(E).
\]
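As a numerical sanity check of this identity (not a proof), here is a small Python sketch on a fair die. The sets `A` and `B` are arbitrary choices made for illustration.

```python
from fractions import Fraction

omega = set(range(1, 7))  # fair die, assumed for illustration
A = {1, 2, 3, 4}
B = {3, 4, 5}


def P(event):
    """Uniform probability: number of points in the event over #Omega."""
    return Fraction(len(event), len(omega))


C = A - B  # A \ B = {1, 2}
D = B - A  # B \ A = {5}
E = A & B  # A intersect B = {3, 4}

lhs = P(C | D)                # P(C u D)
rhs = P(A) + P(B) - 2 * P(E)  # P(A) + P(B) - 2 P(E)
```

Here `C | D` is the symmetric difference of `A` and `B`, which Python also exposes as `A ^ B`.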
Prop. For any $A,B\subset\Omega$, we have
\[
P(A\cup B)=P(A)+P(B)-P(A\cap B).
\]
Proof. Since $A$ and $B-A=B\cap A^c$ are disjoint and their union is $A\cup B$, Axiom (3) gives
\[
P(A\cup B)=P(A\cup(B-A))=P(A\cup(B\cap A^c))=P(A)+P(B\cap A^c).
\]
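Since $B\cap A^c$ and $B\cap A$ are disjoint with union $B$, we have $P(B)=P(B\cap A^c)+P(B\cap A)$, that is, $P(B\cap A^c)=P(B)-P(A\cap B)$. Substituting this into the formula above, we get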
\[
P(A\cup B)=P(A)+P(B)-P(A\cap B).
\]
Prop. For any $A,B,C\subset\Omega$, we have
\[
P(A\cup B\cup C)=P(A)+P(B)+P(C)-P(A\cap B)-P(A\cap C)-P(B\cap C)+P(A\cap B\cap C).
\]
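One way to derive this, sketched here using only the two-set formula: apply that formula to the sets $A\cup B$ and $C$ to get
\[
P(A\cup B\cup C)=P(A\cup B)+P(C)-P((A\cup B)\cap C).
\]
Since $(A\cup B)\cap C=(A\cap C)\cup(B\cap C)$ and $(A\cap C)\cap(B\cap C)=A\cap B\cap C$, the two-set formula applied to $A\cap C$ and $B\cap C$ gives
\[
P((A\cup B)\cap C)=P(A\cap C)+P(B\cap C)-P(A\cap B\cap C).
\]
Substituting this, together with $P(A\cup B)=P(A)+P(B)-P(A\cap B)$, into the first equation yields the stated formula.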
Exercise. Use the axioms to show that if one event $A$ is contained in another event $B$ (i.e. $A$ is a subset of $B$), then $P(A)\leqslant P(B)$.
For general $A$ and $B$, what does this imply about the relationship among $P(A\cap B)$, $P(A)$ and $P(A\cup B)$?
Solution. Since $A\subset B$, we can write $B=A\cup(B\cap A^c)$, where $A$ and $B\cap A^c$ are disjoint. Thus, by Axioms (3) and (1),
\[
P(B)=P(A)+P(B\cap A^c)\geqslant P(A).
\]
For general $A$ and $B$, we have $A\cap B\subset A\subset A\cup B$. Applying the first part twice gives
\[
P(A\cap B)\leqslant P(A)\leqslant P(A\cup B).
\]
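Both inequalities can be checked exhaustively on a small example. The Python sketch below assumes a hypothetical four-point sample space with equally likely outcomes and tests every pair of events; it illustrates the result and does not replace the proof.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({"a", "b", "c", "d"})  # hypothetical 4-point space


def P(event):
    """Uniform probability on omega (equally likely outcomes assumed)."""
    return Fraction(len(event), len(omega))


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = subsets(omega)  # all 2**4 = 16 events

# Monotonicity: A subset of B implies P(A) <= P(B).
monotone = all(P(A) <= P(B) for A in events for B in events if A <= B)

# Chain for general A and B: P(A n B) <= P(A) <= P(A u B).
chain_holds = all(P(A & B) <= P(A) <= P(A | B)
                  for A in events for B in events)
```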