Chapter 5 Statistician's Toolbox
5.1 Matrix algebra
Definition 5.2 (Derivative of a function with respect to a matrix) Let \boldsymbol{X}=\left(\xi_{i j}\right)_{m \times n} and let f(\boldsymbol{X})=f\left(\xi_{11}, \xi_{12}, \cdots, \xi_{1 n}, \xi_{21}, \cdots, \xi_{m n}\right) be a function of the mn entries. The derivative of f(\boldsymbol{X}) with respect to \boldsymbol{X} is defined as
\begin{equation} \frac{\mathrm{d} f}{\mathrm{d} \boldsymbol{X}}=\left(\frac{\partial f}{\partial \xi_{i j}}\right)_{m \times n}= \left[ \begin{array}{ccc} {\frac{\partial f}{\partial \xi_{11}}} & {\cdots} & {\frac{\partial f}{\partial \xi_{1 n}}} \\ {\vdots} & { } & {\vdots} \\ {\frac{\partial f}{\partial \xi_{m 1}}} & {\cdots} & {\frac{\partial f}{\partial \xi_{m n}}} \end{array}\right] \end{equation} From this definition, the following two most commonly used results can be verified directly: \frac{\partial x'a}{\partial x}=a=\frac{\partial a'x}{\partial x}
\frac{\partial x'Ax}{\partial x}=(A+A')x
Proof: Expanding the matrix product \boldsymbol{x}^{\mathrm{T}} \boldsymbol{A} \boldsymbol{x} gives \begin{aligned} f(\boldsymbol{x})=& \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i j} \xi_{i} \xi_{j}\\ =& \xi_{1} \sum_{j=1}^{n} a_{1 j} \xi_{j}+\cdots+\xi_{k} \sum_{j=1}^{n} a_{k j} \xi_{j}+\cdots+\xi_{n} \sum_{j=1}^{n} a_{n j} \xi_{j} \end{aligned} so that \begin{aligned} \frac{\partial f}{\partial \xi_{k}}=& \xi_{1} a_{1 k}+\cdots+\xi_{k-1} a_{k-1, k}+\left(\sum_{j=1}^{n} a_{k j} \xi_{j}+\xi_{k} a_{k k}\right)+\xi_{k+1} a_{k+1, k}+\cdots+\xi_{n} a_{n k}\\ =& \sum_{j=1}^{n} a_{k j} \xi_{j}+\sum_{i=1}^{n} a_{i k} \xi_{i} \end{aligned} Therefore \frac{\mathrm{d} f}{\mathrm{d} \boldsymbol{x}}=\left[ \begin{array}{c}{\frac{\partial f}{\partial \xi_{1}}} \\ {\vdots} \\ {\frac{\partial f}{\partial \xi_{n}}}\end{array}\right]=\left[ \begin{array}{c}{\sum_{j=1}^{n} a_{1 j} \xi_{j}} \\ {\vdots} \\ {\sum_{j=1}^{n} a_{n j} \xi_{j}}\end{array}\right]+\left[ \begin{array}{c}{\sum_{i=1}^{n} a_{i 1} \xi_{i}} \\ {\vdots} \\ {\sum_{i=1}^{n} a_{i n} \xi_{i}}\end{array}\right]=\boldsymbol{A x}+\boldsymbol{A}^{\mathrm{T}} \boldsymbol{x}=\left(\boldsymbol{A}+\boldsymbol{A}^{\mathrm{T}}\right) \boldsymbol{x}
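A quick finite-difference sanity check of these two identities (a minimal NumPy sketch; the helper num_grad, the seed, and the dimension are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))   # a general (not necessarily symmetric) matrix
a = rng.standard_normal(n)
x = rng.standard_normal(n)

def num_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient of a scalar f at x."""
    g = np.zeros_like(x)
    for k in range(len(x)):
        e = np.zeros_like(x)
        e[k] = h
        g[k] = (f(x + e) - f(x - e)) / (2 * h)
    return g

# d(a'x)/dx = a  and  d(x'Ax)/dx = (A + A')x
assert np.allclose(num_grad(lambda v: a @ v, x), a, atol=1e-5)
assert np.allclose(num_grad(lambda v: v @ A @ v, x), (A + A.T) @ x, atol=1e-5)
```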
5.1.1 Block diagonal matrices
\mathbf{A}=\left[ \begin{array}{cccc}{\mathbf{A}_{1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}}\end{array}\right]
Properties: \begin{aligned} \operatorname{det} \mathbf{A} &=\operatorname{det} \mathbf{A}_{1} \times \cdots \times \operatorname{det} \mathbf{A}_{n} \\ \operatorname{tr} \mathbf{A} &=\operatorname{tr} \mathbf{A}_{1}+\cdots+\operatorname{tr} \mathbf{A}_{n} \end{aligned}
Its inverse (assuming each block \mathbf{A}_{i} is invertible):
\left( \begin{array}{cccc}{\mathbf{A}_{1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}}\end{array}\right)^{-1}=\left( \begin{array}{cccc}{\mathbf{A}_{1}^{-1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}^{-1}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}^{-1}}\end{array}\right)
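All three facts are easy to verify numerically (a minimal sketch using scipy.linalg.block_diag; the block sizes and seed are arbitrary, and the last step assumes each block is invertible):

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)
A1, A2 = rng.standard_normal((2, 2)), rng.standard_normal((3, 3))
A = block_diag(A1, A2)               # assemble the block diagonal matrix

# det factors and trace adds over the blocks
assert np.isclose(np.linalg.det(A), np.linalg.det(A1) * np.linalg.det(A2))
assert np.isclose(np.trace(A), np.trace(A1) + np.trace(A2))
# the inverse is block diagonal with the blockwise inverses
assert np.allclose(np.linalg.inv(A),
                   block_diag(np.linalg.inv(A1), np.linalg.inv(A2)))
```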
5.2 Adding two quadratic forms
In vector form (assuming \Sigma_1, \Sigma_2 are symmetric and invertible), we have: \begin{array}{c}{-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{1}\right)^{T} \mathbf{\Sigma}_{1}^{-1}\left(\mathbf{x}-\mathbf{m}_{1}\right)} \\ {-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{2}\right)^{T} \mathbf{\Sigma}_{2}^{-1}\left(\mathbf{x}-\mathbf{m}_{2}\right)} \\ {=-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{c}\right)^{T} \mathbf{\Sigma}_{c}^{-1}\left(\mathbf{x}-\mathbf{m}_{c}\right)+C}\end{array} where \begin{aligned} \mathbf{\Sigma}_{c}^{-1}=& \mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1} \\ \mathbf{m}_{c} &=\left(\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}\right)^{-1}\left(\mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \\ C &=\frac{1}{2}\left(\mathbf{m}_{1}^{T} \mathbf{\Sigma}_{1}^{-1}+\mathbf{m}_{2}^{T} \mathbf{\Sigma}_{2}^{-1}\right)\left(\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}\right)^{-1}\left(\mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \\ &-\frac{1}{2}\left(\mathbf{m}_{1}^{T} \mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{m}_{2}^{T} \mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \end{aligned}
For the case I need, this simplifies by taking m_1 = 0 and m_2 = 0. Then it suffices to take \mathbf{\Sigma}_{c}^{-1}=\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}, with \mathbf{m}_c=\mathbf{0} and hence C=0.
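The whole completing-the-square identity is easy to check numerically (a minimal sketch; random_spd, the seed, and the test dimension are my own arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 4

def random_spd(p):
    """A well-conditioned random symmetric positive definite matrix."""
    B = rng.standard_normal((p, p))
    return B @ B.T + p * np.eye(p)

S1, S2 = random_spd(p), random_spd(p)
m1, m2 = rng.standard_normal(p), rng.standard_normal(p)
x = rng.standard_normal(p)

P1, P2 = np.linalg.inv(S1), np.linalg.inv(S2)
Pc = P1 + P2                                   # Sigma_c^{-1}
mc = np.linalg.solve(Pc, P1 @ m1 + P2 @ m2)    # m_c
C = 0.5 * mc @ Pc @ mc - 0.5 * (m1 @ P1 @ m1 + m2 @ P2 @ m2)

lhs = -0.5 * (x - m1) @ P1 @ (x - m1) - 0.5 * (x - m2) @ P2 @ (x - m2)
rhs = -0.5 * (x - mc) @ Pc @ (x - mc) + C
assert np.isclose(lhs, rhs)
```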
5.3 A fundamental theorem on nonnegative definite matrices
Theorem 1
For a p \times p symmetric matrix \Sigma=\left(\sigma_{i j}\right), the following are equivalent:
\Sigma is nonnegative definite.
every principal submatrix of \Sigma is nonnegative definite; in particular the leading i \times i submatrices \boldsymbol{\Sigma}_{i i}=\left(\begin{array}{ccc} {\sigma_{11}} & {\cdots} & {\sigma_{1 i}} \\ {\vdots} & {\ddots} & {\vdots} \\ {\sigma_{i 1}} & {\cdots} & {\sigma_{i i}} \end{array}\right), i=1, \cdots, p, are nonnegative definite.
all eigenvalues of \Sigma are nonnegative.
there exists a matrix A such that \boldsymbol{\Sigma}=A A^{\prime}
there exists a lower triangular matrix L such that \boldsymbol{\Sigma}=L L^{\prime}
there exist vectors \boldsymbol{u}_{1}, \cdots, \boldsymbol{u}_{p} in R^{p} such that \sigma_{i j}=\boldsymbol{u}_{i}^{\prime} \boldsymbol{u}_{j}
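Several of these equivalences can be illustrated numerically (a minimal sketch; the helper name is_nonneg_definite and its tolerance are my own choices):

```python
import numpy as np

def is_nonneg_definite(S, tol=1e-10):
    """Eigenvalue criterion: symmetric with all eigenvalues >= 0."""
    return np.allclose(S, S.T) and np.linalg.eigvalsh(S).min() >= -tol

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
S = A @ A.T                      # Sigma = AA' is always nonnegative definite
assert is_nonneg_definite(S)

# Sigma = LL' with L lower triangular: the Cholesky factor
# (a random square A makes S strictly positive definite almost surely)
L = np.linalg.cholesky(S)
assert np.allclose(S, L @ L.T)
```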
5.4 Theorems on the spectral decomposition of symmetric matrices
Theorem 4 (The Spectral Decomposition) Let A be a p \times p symmetric matrix with p pairs of eigenvalues and eigenvectors \left(\lambda_{1}, \mathbf{e}_{1}\right),\left(\lambda_{2}, \mathbf{e}_{2}\right), \cdots,\left(\lambda_{p}, \mathbf{e}_{p}\right) Then,
The eigenvalues \lambda_{1}, \ldots, \lambda_{p} are all real and can be ordered from largest to smallest: \lambda_{1} \geq \lambda_{2} \geq \ldots \geq \lambda_{p}
The normalized eigenvectors \mathbf{e}_{1}, \ldots, \mathbf{e}_{p} are mutually orthogonal and the matrix P=\left(\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{p}\right) is an orthogonal matrix, that is, P P^{\prime}=P^{\prime} P=I
The spectral decomposition of A is the expansion A=\lambda_{1} \mathbf{e}_{1} \mathbf{e}_{1}^{\prime}+\lambda_{2} \mathbf{e}_{2} \mathbf{e}_{2}^{\prime}+\cdots+\lambda_{p} \mathbf{e}_{p} \mathbf{e}_{p}^{\prime}=P \Lambda P^{\prime} where P is as above and \Lambda=\operatorname{diag}\left(\lambda_{1}, \cdots, \lambda_{p}\right) is a diagonal matrix with \lambda_{1}, \ldots, \lambda_{p} as its respective diagonal entries.
The matrix A is nonnegative definite if and only if all its eigenvalues are nonnegative.
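This is exactly what numpy.linalg.eigh computes for a symmetric matrix (a small sketch; the random test matrix is arbitrary, and eigh returns eigenvalues in ascending order, so they are flipped to match the theorem's ordering):

```python
import numpy as np

rng = np.random.default_rng(4)
p = 5
B = rng.standard_normal((p, p))
A = (B + B.T) / 2                       # symmetric test matrix

lam, P = np.linalg.eigh(A)              # ascending eigenvalues, orthonormal eigenvectors
lam, P = lam[::-1], P[:, ::-1]          # reorder so lambda_1 >= ... >= lambda_p

assert np.allclose(P @ P.T, np.eye(p))           # P is orthogonal
assert np.allclose(A, P @ np.diag(lam) @ P.T)    # A = P Lambda P'
# rank-one expansion: A = sum_k lambda_k e_k e_k'
assert np.allclose(A, sum(lam[k] * np.outer(P[:, k], P[:, k]) for k in range(p)))
```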
5.5 Covariance Structures
5.5.1 Compound Symmetry Covariance Structure
See p. 54 of Pourahmadi's covariance book.
\boldsymbol{\Sigma}=\sigma^{2} R=\sigma^{2}\left(\begin{array}{cccc} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{array}\right)=\sigma^{2}\left[(1-\rho) I+\rho \mathbf{1}_{p} \mathbf{1}_{p}^{\prime}\right] The eigenvalues of R are 1+(p-1)\rho (once) and 1-\rho (with multiplicity p-1), so \boldsymbol{\Sigma} is nonnegative definite if and only if 1+(p-1) \rho \geq 0 and \rho \leq 1, i.e. -(p-1)^{-1} \leq \rho \leq 1.
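The eigenvalue structure behind this condition is easy to confirm numerically (a small sketch; compound_symmetry is a helper name of my own):

```python
import numpy as np

def compound_symmetry(p, sigma2, rho):
    """sigma^2 [ (1 - rho) I + rho 1 1' ]"""
    return sigma2 * ((1 - rho) * np.eye(p) + rho * np.ones((p, p)))

p = 6
for rho in (-1.0 / (p - 1), 0.3, 1.0):     # both endpoints and an interior value
    eig = np.linalg.eigvalsh(compound_symmetry(p, 1.0, rho))
    # eigenvalues are 1 + (p-1) rho (once) and 1 - rho (p-1 times)
    assert np.isclose(eig.max(), max(1 + (p - 1) * rho, 1 - rho))
    assert eig.min() >= -1e-12             # nonnegative definite on the whole range
```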
5.5.2 Huynh-Feldt Structure
\Sigma=\sigma^{2}\left(\alpha I+a \mathbf{1}_{p}^{\prime}+\mathbf{1}_{p} a^{\prime}\right)
The nonzero eigenvalues of a \mathbf{1}_{p}^{\prime}+\mathbf{1}_{p} a^{\prime} are \mathbf{1}_{p}^{\prime} a \pm \sqrt{p a^{\prime} a}, so \Sigma is nonnegative definite provided that \alpha \geq \sqrt{p a^{\prime} a}-\mathbf{1}_{p}^{\prime} a
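A numerical spot-check of this bound (a sketch relying on the eigenvalue argument above; the sampled a, \sigma^{2}=1, and the 0.1 offset above the bound are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
p, sigma2 = 5, 1.0
a = rng.standard_normal(p)
one = np.ones(p)

# choose alpha just above the bound sqrt(p a'a) - 1'a
alpha = np.sqrt(p * (a @ a)) - one @ a + 0.1
Sigma = sigma2 * (alpha * np.eye(p) + np.outer(a, one) + np.outer(one, a))
assert np.linalg.eigvalsh(Sigma).min() >= -1e-10   # nonnegative definite
```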
5.5.3 The One-Dependent Covariance Structure
\Sigma=\sigma^{2}\left(\begin{array}{ccccc} {a} & {b} & {0} & {\cdots} & {0} \\ {b} & {a} & {b} & {\ddots} & {\vdots} \\ {\vdots} & {\ddots} & {\ddots} & {\ddots} & {0} \\ {\vdots} & {} & {\ddots} & {\ddots} & {b} \\ {0} & {\cdots} & {0} & {b} & {a} \end{array}\right)
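This is a tridiagonal Toeplitz matrix, whose eigenvalues have the standard closed form \sigma^{2}\left(a+2 b \cos \frac{k \pi}{p+1}\right), k=1, \cdots, p. A short numerical check (a sketch with \sigma^{2}=1 and arbitrary a, b):

```python
import numpy as np

p, a, b = 6, 2.0, 0.9
Sigma = a * np.eye(p) + b * (np.eye(p, k=1) + np.eye(p, k=-1))  # sigma^2 = 1

# tridiagonal Toeplitz eigenvalues: a + 2 b cos(k pi / (p + 1)), k = 1..p
k = np.arange(1, p + 1)
closed_form = np.sort(a + 2 * b * np.cos(k * np.pi / (p + 1)))
assert np.allclose(np.linalg.eigvalsh(Sigma), closed_form)
```

In particular, \Sigma is nonnegative definite if and only if a \geq 2|b| \cos \frac{\pi}{p+1}.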
5.5.4 AR(1) Structure (highly popular)
\boldsymbol{\Sigma}=\sigma^{2}\left(\begin{array}{cccc} {1} & {\rho} & {\cdots} & {\rho^{p-1}} \\ {\rho} & {1} & {\cdots} & {\rho^{p-2}} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {\rho^{p-1}} & {\rho^{p-2}} & {\cdots} & {1} \end{array}\right), \quad -1<\rho<1, \qquad \text{i.e. } \sigma_{i j}=\sigma^{2} \rho^{|i-j|}
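A small sketch building this matrix and checking positive definiteness (ar1_cov is a helper name of my own; the final line illustrates the well-known fact that the inverse of an AR(1) covariance matrix is tridiagonal):

```python
import numpy as np

def ar1_cov(p, sigma2, rho):
    """Sigma_{ij} = sigma^2 rho^{|i-j|}"""
    idx = np.arange(p)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

S = ar1_cov(5, 2.0, 0.7)
assert np.linalg.eigvalsh(S).min() > 0   # positive definite for |rho| < 1
print(np.round(np.linalg.inv(S), 3))     # the inverse is tridiagonal
```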