Skip to Content
Information Theory

3. Entropy Rate

2020-12-22Original-language archivelegacy assets may be incomplete

Stationary Process

  • stationary: P(X1=x1,X2=x2,,Xn=xn)=P(X1+l=x1,X2=x2+l,,Xn+l=xn)P(X_1=x_1,X_2=x_2,\cdots,X_n=x_n)=P(X_{1+l}=x_1,X_2=x_{2+l},\cdots,X_{n+l}=x_n)
    • Gaussian process
    • Stationary Markov Chain
  • Stationary Distribution of MC
    • p(Xn+1)=p(Xn)p(X_{n+1})=p(X_n)
    • p(xn+1)=xnp(xn)Pxnxn+1p(x_{n+1})=\sum_{x_n}p(x_n)P_{x_nx_{n+1}}
    • net probability flow across any cut set is zero

Entropy Rate

  • entropy rate of a stochastic process: H(X)=limn1nH(X1,X2,,Xn)H(\mathcal{X})=\lim_{n\rightarrow\infty}\frac{1}{n}H(X_1,X_2,\cdots,X_n)
    • H(Xn,,X1)=i=1nH(XiXi1,,X1)H(X_n,\cdots,X_1)=\sum_{i=1}^nH(X_i|X_{i-1},\cdots,X_1)
  • (a) For a stationary stochastic process, H(XnXn1,,X1)H(X_n|X_{n-1},\cdots,X_1) is nonincreasing and has a limit
    • H(X)=limnH(XnXn1,,X1)H'(X)=\lim_{n\rightarrow\infty}H(X_n|X_{n-1},\cdots,X_1) exists
  • (b) Cesaro Mean: ana,bn=1ni=1nai,bnaa_n\rightarrow a,b_n=\frac{1}{n}\sum_{i=1}^na_i,b_n\rightarrow a
  • For a stationary stochastic process, H(X)=H(X)H(\mathcal{X})=H'(\mathcal{X}) (a,b)
    • Markov Chain: H(X)=H(X2X1)=ijμiPijlogPijH(\mathcal{X})=H(X_2|X_1)=-\sum_{ij}\mu_iP_{ij}\log P_{ij}
  • Some results
    • Second Law of Thermodynamics: model the isolated system as a Morkov chain with transitions obeying the physical laws governing the system
    • D(μnμn)D(\mu_n\|\mu_n') decreases with nn
    • H(XnX1)H(X_n|X_1) increases
    • Shuffles increase entropy: H(TX)H(X)H(TX)\geq H(X)

Functions of Markov Chain

  • X1,,Xn,X_1,\cdots,X_n,\cdots be a stationary Markov chain, Yi=ϕ(Xi)Y_i=\phi(X_i)
  • H(YnYn1,,Y1,X1)H(Y)H(YnYn1,,Y1)H(Y_n|Y_{n-1},\cdots,Y_1,X_1)\leq H(\mathcal{Y})\leq H(Y_n|Y_{n-1},\cdots,Y_1)
  • limH(YnYn1,,Y1,X1)=H(Y)=limH(YnYn1,,Y1)\lim H(Y_n|Y_{n-1},\cdots,Y_1,X_1)= H(\mathcal{Y})= \lim H(Y_n|Y_{n-1},\cdots,Y_1)