Information Theory

5. Channel Capacity

2020-12-24

Channel Capacity

  • $(\mathcal{X},p(y|x),\mathcal{Y})$: send $x$; the receiver gets $y$ with probability $p(y|x)$
  • Channel capacity: $C=\max_{p(x)}I(X;Y)$
  • Strategy to calculate $C$:
    • $I(X;Y)=H(Y)-H(Y|X)$
    • analogy with convex sets: $d_{\min}=\min_{a\in A,b\in B}d(a,b)$
  • Examples:
    • Noiseless binary channel: $C=1$, achieved by $p(x)=(\frac{1}{2},\frac{1}{2})$
    • Noisy channel with non-overlapping outputs: in fact not noisy, $C=1$
    • Noisy typewriter (26 letters, each mapped to itself or its neighbor): $C=\log 26-\log 2=\log 13$
    • Binary symmetric channel: $C=1-H(p)$
    • Binary erasure channel: $C=\max_{p(x)}H(Y)-H(\alpha)=1-\alpha$, where $\alpha$ is the erasure probability
  • Symmetric channel: the rows of the channel transition matrix are permutations of each other, and so are the columns; $C=\log|\mathcal{Y}|-H(\mathbf{r})$, where $\mathbf{r}$ is any row of the matrix
  • Weakly symmetric: the rows are permutations of each other and all column sums are equal; the same formula $C=\log|\mathcal{Y}|-H(\mathbf{r})$ holds
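
The closed-form capacities above are easy to check numerically. A minimal sketch (function names are mine, not from the notes):

```python
from math import log2

def h(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Binary symmetric channel with crossover probability p: C = 1 - H(p)."""
    return 1 - h(p)

def bec_capacity(alpha):
    """Binary erasure channel with erasure probability alpha: C = 1 - alpha."""
    return 1 - alpha

print(bsc_capacity(0.0))    # 1.0 -- the noiseless binary channel
print(bsc_capacity(0.5))    # 0.0 -- pure noise carries nothing
print(bec_capacity(0.25))   # 0.75
print(round(log2(13), 3))   # 3.7 -- the noisy typewriter
```

Note that $C=1-H(p)$ is symmetric in $p\leftrightarrow 1-p$: a channel that flips every bit is as good as a perfect one, since the receiver can flip them back.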

Channel Coding Theorem

  • Message $W$ → (Encoder) → $X^n$ → (Channel) → $Y^n$ → (Decoder) → $\hat W$
  • Goal: maximize the rate $\frac{H(W)}{n}$
  • Message: can be ordered and denoted by an index in $\{1,2,\cdots,M\}$, i.e. $\log M$ bits
  • Discrete memoryless channel (DMC): $p(y_k|x^k,y^{k-1})=p(y_k|x_k)$
  • Without feedback: $p(x_k|x^{k-1},y^{k-1})=p(x_k|x^{k-1})$
  • Memoryless + no feedback: $H(Y^n|X^n)=\sum_{i=1}^n H(Y_i|X_i)$
  • Markov chain: $W\rightarrow X^n\rightarrow Y^n\rightarrow\hat W$
  • DMC capacity: $C=\max_{p(x)}I(X;Y)$
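
For a general DMC the maximization $C=\max_{p(x)}I(X;Y)$ has no closed form, but it can be computed iteratively. A minimal sketch of the Blahut-Arimoto algorithm (not covered in these notes; the function name and iteration count are my own choices):

```python
from math import log2

def blahut_arimoto(W, iters=100):
    """Capacity C = max_{p(x)} I(X;Y), in bits per channel use, of a DMC
    with transition matrix W[x][y] = p(y|x), via Blahut-Arimoto."""
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx  # start from the uniform input distribution
    D = [0.0] * nx
    for _ in range(iters):
        # output distribution q(y) = sum_x p(x) p(y|x)
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # D_x = KL divergence D( p(.|x) || q )
        D = [sum(W[x][y] * log2(W[x][y] / q[y])
                 for y in range(ny) if W[x][y] > 0) for x in range(nx)]
        # multiplicative update: p(x) <- p(x) 2^{D_x} / Z
        Z = sum(p[x] * 2 ** D[x] for x in range(nx))
        p = [p[x] * 2 ** D[x] / Z for x in range(nx)]
    # at the optimum D_x = C for every x in the support of p
    return sum(p[x] * D[x] for x in range(nx))

# BSC(0.1): closed form gives C = 1 - H(0.1) ~ 0.531 bits
print(round(blahut_arimoto([[0.9, 0.1], [0.1, 0.9]]), 3))  # 0.531
```

For symmetric channels the uniform input is already optimal, so the sketch settles immediately; in general the estimate converges to $C$ as the iterations proceed.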

DMC

  • codebook ($f(w)=x^n$) shared between sender and receiver
  • $(M,n)$ code
    • index set $\{1,2,\cdots,M\}$
    • encoding function $f:\{1,2,\cdots,M\}\rightarrow\mathcal{X}^n$, yielding codewords $x^n(1),\cdots,x^n(M)$
    • decoding function $g:\mathcal{Y}^n\rightarrow\{1,2,\cdots,M\}$
  • Conditional probability of error: $\lambda_i=\sum_{y^n}p(y^n|x^n(i))\,\mathbb{1}[g(y^n)\neq i]$
    • maximal probability of error: $\lambda^{(n)}=\max_i \lambda_i$
    • average: $P_e^{(n)}=\frac{1}{M}\sum_i\lambda_i$
  • rate $R$ of an $(M,n)$ code: $R=\frac{\log M}{n}$ bits per transmission
    • achievable $R$: there exists a sequence of $(2^{nR},n)$ codes such that the maximal probability of error $\lambda^{(n)}\rightarrow 0$ as $n\rightarrow\infty$
  • capacity: supremum of all achievable rates
  • Channel coding theorem: $C=\max_{p(x)}I(X;Y)$
    • Achievability: for any $R<C$, there exists a sequence of $(2^{nR},n)$ codes with $\lambda^{(n)}\rightarrow 0$
    • Converse: for any $R>C$, the probability of error is bounded away from zero
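
To see the $(M,n)$-code definitions in action, here is a small sketch (function name is mine) for the simplest code family: the length-$n$ repetition code over a BSC($p$). It has $M=2$, so $R=1/n$, and majority decoding errs iff more than half the bits flip.

```python
from math import comb

def repetition_error(n, p):
    """Exact error probability of an odd-length-n repetition code over a
    BSC(p) with majority decoding: an error occurs iff > n/2 bits flip."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(n // 2 + 1, n + 1))

# A (2, n) repetition code has rate R = log(2)/n = 1/n bits per transmission.
for n in (1, 3, 5, 7):
    print(n, round(1 / n, 3), round(repetition_error(n, 0.1), 5))
```

For $p=0.1$ the error probability falls ($0.1$, $0.028$, $0.00856$, $0.00273$), but only because the rate falls with it. The channel coding theorem is much stronger: any fixed rate below $C=1-H(0.1)\approx 0.531$ is achievable with vanishing error.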

Intuition for channel capacity: each typical input sequence maps to about $2^{nH(Y|X)}$ typical output sequences, and there are $2^{nH(Y)}$ typical outputs in total, so at most $2^{n(H(Y)-H(Y|X))}=2^{nI(X;Y)}$ codewords can have disjoint output "beams" (flashlight analogy).

Proof: converse: $R\leq C$; achievability: $\forall R<C$, $R$ is achievable

Feedback Capacity

  • $(2^{nR},n)$ feedback code: sequence of mappings $x_i(W,Y^{i-1})$
  • $C_{FB}=C=\max_{p(x)}I(X;Y)$ (feedback cannot increase the capacity of a DMC)

Source-Channel Separation

  • data compression: $R>H$
  • data transmission: $R<C$
  • Source-channel coding theorem
    • if $V_1,V_2,\cdots,V_n$ is a finite-alphabet stochastic process that satisfies the AEP and $H(\mathcal{V})<C$, there exists a source-channel code with probability of error $\rightarrow 0$
    • if $H(\mathcal{V})>C$, the probability of error is bounded away from zero
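
A quick numeric check of the condition $H(\mathcal{V})<C$ (the source and channel parameters below are made-up examples, not from the notes):

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

# Example: a Bernoulli(0.1) source sent over a BSC with crossover 0.05.
H_V = h(0.1)      # entropy rate of the source, ~0.469 bits/symbol
C = 1 - h(0.05)   # capacity of the channel,  ~0.714 bits/use
print(H_V < C)    # True: reliable transmission is possible
```

Separation means this can be done in two independent stages: compress the source down to $H(\mathcal{V})$ bits/symbol, then protect those bits with a channel code of rate below $C$.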

Error Correction Code

  • Repetition code
  • Parity check code
  • Hamming code (n,k,d)
  • BCH code
  • Convolutional code
  • LDPC code
  • Polar code
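
As a concrete instance of the list above, a minimal sketch of the $(7,4,3)$ Hamming code, which corrects any single bit error (encoder/decoder and variable names are my own):

```python
# Hamming (7,4,3): 4 data bits, 3 parity bits. Codeword positions are
# 1..7; parity bits sit at the power-of-two positions 1, 2, 4.

def hamming74_encode(d):
    """Encode 4 data bits [d1,d2,d3,d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers data at positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers data at positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers data at positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit, then return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity over positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity over positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s3  # = 1-indexed error position, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

data = [1, 0, 1, 1]
cw = hamming74_encode(data)
cw[4] ^= 1                           # channel flips one bit
print(hamming74_decode(cw) == data)  # True: the single error is corrected
```

The syndrome trick works because each parity-check column equals the binary expansion of its position, so the three checks spell out the error location directly.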