Skip to Content
Information Theory

2. AEP

2020-12-22Original-language archivelegacy assets may be incomplete

Convergence of random variables

  • In probability (convergence in probability): P(XnXϵ)1P(|X_n-X|\leq\epsilon)\rightarrow 1
  • In mean square: E(XnX)20E(X_n-X)^2\rightarrow 0
  • With probability 1 (almost surely convergence): P(limnXn=X)=1P(\lim_{n\rightarrow\infty}X_n=X)=1
  • (2)\rightarrow(1), (3)\rightarrow(1)
  • Law of Large Numbers: X1,,Xnp(x)X_1,\cdots,X_n\sim p(x)
    • Strong: P(limnXn=E(X1))=1P(\lim_{n\rightarrow\infty} \overline X_n=E(X_1))=1
    • Weak: XnE(X1)\overline X_n\rightarrow E(X_1) in probability

Asymptotic Equipartition Property

  • AEP: X1,X2,XnX_1,X_2,\cdots X_n are i.i.d. p(x)\sim p(x), then 1nlogp(X1,X2,,Xn)H(X)-\frac{1}{n}\log p(X_1,X_2,\cdots, X_n)\rightarrow H(X)
    • 2nH(X)+ϵp(X1,X2,,Xn)2n(H(X)ϵ)2^{-nH(X)+\epsilon}\leq p(X_1,X_2,\cdots,X_n)\leq 2^{-n(H(X)-\epsilon)}
  • typical set Aϵ(n)A_\epsilon^{(n)} is the set of sequences (x1,,xn)Xn(x_1,\cdots, x_n)\in\mathcal{X}^n with property 2nH(X)+ϵp(X1,X2,,Xn)2n(H(X)ϵ)2^{-nH(X)+\epsilon}\leq p(X_1,X_2,\cdots,X_n)\leq 2^{-n(H(X)-\epsilon)}
    • P(Aϵ(n))1ϵP(A_\epsilon^{(n)})\geq 1-\epsilon for nn sufficiently large (The typical set has probability nearly 1)
    • Aϵ(n)2n(H(X)+ϵ)|A_\epsilon^{(n)}|\leq 2^{n(H(X)+\epsilon)} (All elements are nearly qeuiprobable)
    • Aϵ(n)(1ϵ)2n(H(X)ϵ)|A_\epsilon^{(n)}|\geq (1-\epsilon)2^{n(H(X)-\epsilon)} (elements nearly 2nH2^{nH})
  • High Probability Set: Bδ(n)XnB_\delta^{(n)}\subset\mathcal{X}^n be the smallest set with P(Bδ(n))1δP(B_\delta^{(n)})\geq 1-\delta
    • if P(Bδ(n))1δP(B_\delta^{(n)})\geq 1-\delta, then 1nlogBδ(n)>Hδ\frac{1}{n}\log|B_\delta^{(n)}|>H-\delta' for n sufficiently large
  • Data Compression
    • Encode: E(1nl(Xn))H(X)+ϵE(\frac{1}{n}l(X^n))\leq H(X)+\epsilon
      • nlogX+1+1n\log|\mathcal{X}|+1+1 for (Aϵ(n))c(A_\epsilon^{(n)})^c
      • n(H+ϵ)+1+1n(H+\epsilon)+1+1 for Aϵ(n)A_\epsilon^{(n)}
    • For any scheme with rate r<H(X),Pe1r<H(X),P_e\rightarrow 1

Joint Typicality

Jointly typical sequences Aϵ(n)A_\epsilon^{(n)}

  • 1nlogp(xn)H(X)<ϵ|-\frac{1}{n}\log p(x^n)-H(X)|<\epsilon
  • 1nlogp(yn)H(X)<ϵ|-\frac{1}{n}\log p(y^n)-H(X)|<\epsilon
  • 1nlogp(xn,yn)H(X,Y)<ϵ|-\frac{1}{n}\log p(x^n, y^n)-H(X, Y)|<\epsilon

XnAϵ(n),YnAϵ(n)X^n\in A_\epsilon^{(n)}, Y^n\in A_\epsilon^{(n)} cannot imply (Xn,Yn)Aϵ(n)(X^n, Y^n)\in A_\epsilon^{(n)}

Three Properties:

  • Pr((Xn,Yn)Aϵ(n))1Pr((X^n,Y^n)\in A_\epsilon^{(n)})\rightarrow 1 as nn\rightarrow \infty
  • Aϵ(n)2n(H(X,Y)+ϵ)|A_\epsilon^{(n)}|\leq 2^{n(H(X,Y)+\epsilon)}
  • If (X~n,Y~n)p(xn)p(yn)(\widetilde{X}^n, \widetilde{Y}^n)\sim p(x^n)p(y^n), then (1ϵ)2n(I(X,Y)+3ϵ)Pr((X~n,Y~n)Aϵ(n))2n(I(X,Y)3ϵ)(1-\epsilon)2^{-n(I(X,Y)+3\epsilon)} \leq Pr((\widetilde{X}^n, \widetilde{Y}^n)\in A_\epsilon^{(n)})\leq 2^{n(I(X,Y)-3\epsilon)}

The probability of joint AEP: 2nI(X;Y)=2nH(X,Y)2nH(X)2nH(Y)2^{-nI(X;Y)}=\frac{2^{nH(X,Y)}}{2^{nH(X)}2^{nH(Y)}}