UA MATH566 Statistical Theory: Properties of Fisher Information (Part 2)
- The Fisher information of an ancillary statistic is zero
- Fisher information under reparameterization of a distribution family
- Boundedness of the Fisher information of a statistic
This post introduces some commonly used properties of the Fisher information.
The Fisher information of an ancillary statistic is zero
Suppose $A(X)\sim g(a,\theta)$. Its Fisher information is

$$I_{A(X)}(\theta) = E[S(A,\theta)S^T(A,\theta)] = E \left[ \frac{\partial \log g(A,\theta)}{\partial \theta} \frac{\partial \log g(A,\theta)}{\partial \theta^T}\right]$$

so

$$I_{A(X)}(\theta) = 0 \Leftrightarrow E \left[ \frac{\partial \log g(A,\theta)}{\partial \theta} \frac{\partial \log g(A,\theta)}{\partial \theta^T}\right] = 0$$
Assuming the family $g(a,\theta)$ is complete, the condition above implies

$$\frac{\partial \log g(a,\theta)}{\partial \theta} = 0, \quad \forall a$$

which says that $g(a,\theta)$ does not depend on the parameter $\theta$; this is exactly the definition of $A(X)$ being an ancillary statistic. By the same derivation run in reverse, any statistic whose Fisher information is zero must be an ancillary statistic.
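As a quick sanity check of this property (my own sketch, not part of the course notes), the following Monte Carlo estimate compares two statistics of a pair $X_1, X_2 \sim N(\theta,1)$: the sum $X_1+X_2 \sim N(2\theta,2)$, whose distribution depends on $\theta$, and the ancillary difference $X_1-X_2 \sim N(0,2)$, whose density is free of $\theta$ so its score is identically zero.

```python
import random

random.seed(42)
theta, N = 1.3, 100_000

info_sum = info_diff = 0.0
for _ in range(N):
    x1, x2 = random.gauss(theta, 1), random.gauss(theta, 1)
    s = x1 + x2
    # Score of S = X1 + X2 ~ N(2*theta, 2): d/dtheta log-density = s - 2*theta
    info_sum += (s - 2 * theta) ** 2
    # A = X1 - X2 ~ N(0, 2): the density does not involve theta, so the score is 0
    info_diff += 0.0 ** 2

print(info_sum / N)   # ~2.0: the sum keeps the full information of two N(theta,1) draws
print(info_diff / N)  # 0.0: the ancillary statistic carries none
```

The estimate $E[S^2]$ for the sum comes out near $2$, the information of two observations, while the ancillary statistic contributes exactly zero.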
Fisher information under reparameterization of a distribution family
Suppose the family $f(x,\theta)$ has Fisher information $I(\theta)$, and we reparameterize it in terms of $\xi$, with $\theta = \theta(\xi)$. The Fisher information after the transformation is

$$I(\xi) = [D_{\xi}\theta(\xi)]^T I(\theta) D_{\xi}\theta(\xi)$$
If $\theta$ is $n$-dimensional and $\xi$ is $m$-dimensional, this can be written componentwise as

$$I_{ab}(\xi) = \sum_{i,j=1}^n I_{ij}\frac{\partial \theta_i}{\partial \xi_a}\frac{\partial \theta_j}{\partial \xi_b}, \quad a,b = 1,\cdots,m$$
This follows from a direct computation with the definition and the chain rule:

$$I_{ab}(\xi) = E \left[ \frac{\partial \log L}{\partial \xi_a} \frac{\partial \log L}{\partial \xi_b}\right] = E \left[ \left(\sum_{i=1}^n \frac{\partial \log L}{\partial \theta_i} \frac{\partial \theta_i}{\partial \xi_a}\right)\left(\sum_{j=1}^n \frac{\partial \log L}{\partial \theta_j} \frac{\partial \theta_j}{\partial \xi_b}\right)\right]$$

$$= \sum_{i,j=1}^n E \left[ \frac{\partial \log L}{\partial \theta_i} \frac{\partial \log L}{\partial \theta_j}\right]\frac{\partial \theta_i}{\partial \xi_a}\frac{\partial \theta_j}{\partial \xi_b} = \sum_{i,j=1}^n I_{ij}\frac{\partial \theta_i}{\partial \xi_a}\frac{\partial \theta_j}{\partial \xi_b}$$
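A one-dimensional check of this rule (an illustration of mine, not from the notes): for the exponential distribution with rate $\lambda$, $I(\lambda) = 1/\lambda^2$. Reparameterizing by the mean $\mu = 1/\lambda$ gives $d\lambda/d\mu = -1/\mu^2$, so the rule predicts $I(\mu) = (1/\mu^2)^2 \cdot \mu^2 = 1/\mu^2$, matching the direct computation from the density $f(x,\mu) = \mu^{-1}e^{-x/\mu}$.

```python
def fisher_lambda(lam):
    # Exponential(rate lam): log f = log(lam) - lam * x, so I(lambda) = 1 / lambda^2
    return 1.0 / lam ** 2

def fisher_mu_via_rule(mu):
    # Transformation rule: I(mu) = (d lambda / d mu)^2 * I(lambda), with lambda = 1/mu
    dlam_dmu = -1.0 / mu ** 2
    return dlam_dmu ** 2 * fisher_lambda(1.0 / mu)

def fisher_mu_direct(mu):
    # Direct: log f = -log(mu) - x/mu, score = (x - mu)/mu^2,
    # so E[score^2] = Var(X)/mu^4 = 1/mu^2
    return 1.0 / mu ** 2

mu = 2.5
print(fisher_mu_via_rule(mu), fisher_mu_direct(mu))  # both ~0.16 = 1/mu^2
```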
Example. The exponential family in natural-parameter form $f(x,\theta) = h(x)\exp(\theta^T T(x)-b(\theta))$ has Fisher information

$$I(\theta) = b''(\theta) = Var(T(X))$$
Consider the mean parameter $\eta = E_{\theta}(T(X)) = b'(\theta)$. By the inverse function theorem,

$$D_{\eta}\theta(\eta) = [b''(\theta)]^{-1}, \quad \theta = (b')^{-1}(\eta)$$
By the reparameterization property of the Fisher information,

$$I(\eta) = \left([b''(\theta)]^{-1}\right)^{T} I(\theta) [b''(\theta)]^{-1} = [b''(\theta)]^{-1}, \quad \theta = (b')^{-1}(\eta)$$
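To make this concrete (my own sketch), take the Bernoulli family in natural-parameter form: $\theta$ is the log-odds, $b(\theta) = \log(1+e^{\theta})$, $T(x) = x$, and $\eta = b'(\theta) = p$. Then $I(\theta) = b''(\theta) = p(1-p)$, and the formula above gives $I(\eta) = [b''(\theta)]^{-1} = 1/(p(1-p))$, the familiar Fisher information of the Bernoulli mean.

```python
import math

def b_second(theta):
    # b(theta) = log(1 + e^theta)  =>  b''(theta) = p (1 - p), p = sigmoid(theta)
    p = 1.0 / (1.0 + math.exp(-theta))
    return p * (1.0 - p)

p = 0.3
theta = math.log(p / (1.0 - p))    # natural parameter (log-odds)

info_theta = b_second(theta)       # I(theta) = b''(theta) = p (1 - p)
info_eta = 1.0 / b_second(theta)   # I(eta) = b''(theta)^(-1) = 1 / (p (1 - p))

print(info_theta)                  # ~0.21
print(info_eta)                    # ~4.7619
```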
Boundedness of the Fisher information of a statistic
Suppose $X \sim f(x,\theta)$ and $T(X) \sim g(t,\theta)$ is any statistic of $X$. Then

$$0 \le I_{T(X)}(\theta) \le I_X(\theta)$$

The lower bound is attained if and only if $T(X)$ is an ancillary statistic, and the upper bound if and only if $T(X)$ is a sufficient statistic.
Proof
The lower bound follows from $I_{T(X)}(\theta) = Var_{\theta}(S(T,\theta))$: the Fisher information is the variance of the score, which is always nonnegative, and by the first property above equality holds if and only if $T(X)$ is ancillary. For the upper bound, decompose the variance of the score by conditioning on $T$:
$$I_{X}(\theta) = Var (S(X,\theta)) = E[Var (S(X,\theta)|T)] + Var[E(S(X,\theta)|T)]$$
Suppose $X$ is a random variable on the probability space $(\mathcal{X},\mathcal{B}(\mathcal{X}),P_X)$ with $\mathcal{X} \subset \mathbb{R}^n$. The statistic $T(X)$ is defined by a measurable map $T: \mathcal{X} \to \mathcal{T} \subset \mathbb{R}^k$, $k<n$, so that $T(X)$ is a random variable on $(\mathcal{T},\mathcal{B}(\mathcal{T}),P_T)$. Measurability of $T$ means $\forall B \in \mathcal{B}(\mathcal{T})$, $T^{-1}(B) \in \mathcal{B}(\mathcal{X})$, so the induced measure $P_T$ can be written as $P_T(B)=P_X(T^{-1}(B))$. If the measures are parameterized by $\theta$, this induced-measure relation implies
$$\frac{\partial P_T(B)}{\partial \theta} = \frac{\partial P_X(T^{-1}(B))}{\partial \theta}$$
Writing the probability measures as integrals of densities, this becomes
$$\frac{\partial }{\partial \theta} \int_{B} g(t,\theta)\,dt = \frac{\partial }{\partial \theta}\int_{T^{-1}(B)}f(x,\theta)\,dx$$
Rearranging to expose the score functions,
$$\int_{B} g(t,\theta)S(t,\theta)\, dt = \int_{T^{-1}(B)}f(x,\theta)S(x,\theta)\,dx$$
Since this identity holds for every $B \in \mathcal{B}(\mathcal{T})$, it says precisely that $S(T,\theta) = E[S(X,\theta)|T]$, by the defining property of conditional expectation. The second term in the variance decomposition therefore simplifies to
$$Var[E(S(X,\theta)|T)] = Var[S(T,\theta)] = I_T(\theta)$$
Hence

$$I_X(\theta) - I_T(\theta) = E[Var (S(X,\theta)|T)]\ge 0$$
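The gap $E[Var(S(X,\theta)|T)]$ can be seen numerically (a sketch of mine, not from the notes). For $X = (X_1, X_2)$ iid $N(\theta,1)$ and the information-discarding statistic $T = X_1$, we have $I_X = 2$ and $I_T = 1$; conditional on $T$ the score still varies through $X_2$, so the gap is $Var(X_2 - \theta) = 1$.

```python
import random

random.seed(7)
theta, N = 0.5, 100_000

ix = it = gap = 0.0
for _ in range(N):
    x1, x2 = random.gauss(theta, 1), random.gauss(theta, 1)
    sx = (x1 - theta) + (x2 - theta)  # score of the full sample, I_X = 2
    st = x1 - theta                   # score of T = X1 alone, I_T = 1
    ix += sx * sx
    it += st * st
    gap += (sx - st) ** 2             # = (x2 - theta)^2, the conditional-variance part
print(ix / N, it / N, gap / N)        # ~2.0 ~1.0 ~1.0
```

Note that $S(T,\theta) = E[S(X,\theta)|T]$ holds here because $E[X_2 - \theta] = 0$, so the difference $S_X - S_T$ is exactly the part of the score averaged out by conditioning.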
It remains to verify the equality condition for the upper bound. Sufficiency: if $T$ is a sufficient statistic, then by the Fisher-Neyman factorization theorem,

$$f(x,\theta) = g(T(x),\theta)h(x)$$

$$\Rightarrow \log f(x,\theta) = \log g(T(x),\theta) + \log h(x)$$

$$\Rightarrow \frac{\partial }{\partial \theta} \log f(x,\theta) = \frac{\partial }{\partial \theta} \log g(T(x),\theta)$$
and therefore $I_T(\theta) = I_X(\theta)$.
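For a concrete instance of the sufficiency case (my own sketch): with $X_1,\dots,X_n$ iid $N(\theta,1)$ and the sufficient statistic $T = \bar{X} \sim N(\theta, 1/n)$, the two scores coincide, $S(X,\theta) = \sum_i (x_i - \theta) = n(\bar{x} - \theta) = S(T,\theta)$, so Monte Carlo estimates of $I_X$ and $I_T$ both come out at $n$.

```python
import random

random.seed(1)
theta, n, N = 0.0, 5, 50_000

ix = it = 0.0
for _ in range(N):
    xs = [random.gauss(theta, 1) for _ in range(n)]
    xbar = sum(xs) / n
    sx = sum(x - theta for x in xs)  # score of the full sample, I_X = n
    st = n * (xbar - theta)          # Xbar ~ N(theta, 1/n): score = n (xbar - theta)
    ix += sx * sx
    it += st * st
print(ix / N, it / N)  # both ~5.0: the sufficient statistic attains the upper bound
```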
Necessity: compute the quantity

$$E[(S(X,\theta)-S(T,\theta))(S(X,\theta)-S(T,\theta))^T]$$

$$= E[S(X,\theta)S^T(X,\theta)] + E[S(T,\theta)S^T(T,\theta)] - 2E[S(X,\theta)S^T(T,\theta)]$$

$$= I_X(\theta) + I_T(\theta) - 2I_T(\theta) = I_X(\theta) - I_T(\theta)$$
where, conditioning on $T$,

$$E[S(X,\theta)S^T(T,\theta)] = E[E[S(X,\theta)S^T(T,\theta)|T]] = E[E[S(X,\theta)|T]S^T(T,\theta)] = E[S(T,\theta)S^T(T,\theta)] = I_T(\theta)$$
This shows that

$$I_X - I_T = E[(S(X,\theta)-S(T,\theta))(S(X,\theta)-S(T,\theta))^T]$$
If the left-hand side is zero, then $S(T,\theta) = S(X,\theta)$ almost surely, i.e.

$$\frac{\partial }{\partial \theta} \log f(x,\theta) = \frac{\partial }{\partial \theta} \log g(T(x),\theta)$$
so $f(x,\theta)$ and $g(T(x),\theta)$ differ only by a factor that does not depend on $\theta$. By the Neyman-Fisher factorization theorem, $T(X)$ is then a sufficient statistic.