Chapter 5 Statistician's Toolbox
5.1 Matrix algebra
Definition 5.2 (Derivative of a scalar function with respect to a matrix) Let \(\boldsymbol{X}=\left(\xi_{i j}\right)_{m \times n}\) and let \(f(\boldsymbol{X})=f\left(\xi_{11}, \xi_{12}, \cdots, \xi_{1 n}, \xi_{21}, \cdots, \xi_{m n}\right)\) be a function of the \(mn\) entries. The derivative of \(f(\boldsymbol{X})\) with respect to the matrix \(\boldsymbol{X}\) is defined as
\[\begin{equation} \frac{\mathrm{d} f}{\mathrm{d} \boldsymbol{X}}=\left(\frac{\partial f}{\partial \xi_{i j}}\right)_{m \times n}= \left[ \begin{array}{ccc} {\frac{\partial f}{\partial \xi_{11}}} & {\cdots} & {\frac{\partial f}{\partial \xi_{1 n}}} \\ {\vdots} & { } & {\vdots} \\ {\frac{\partial f}{\partial \xi_{m 1}}} & {\cdots} & {\frac{\partial f}{\partial \xi_{m n}}} \end{array}\right] \end{equation}\]From this definition, the two most common results below can be proved directly: \[ \frac{\partial x'a}{\partial x}=a=\frac{\partial a'x}{\partial x} \]
\[ \frac{\partial x'Ax}{\partial x}=(A+A')x \]
Proof: Expanding the matrix product \(x^{\mathrm{T}} A x\) gives \[ \begin{aligned} f(\boldsymbol{x})=& \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i j} \xi_{i} \xi_{j}=\\ & \xi_{1} \sum_{j=1}^{n} a_{1 j} \xi_{j}+\cdots+\xi_{k} \sum_{j=1}^{n} a_{k j} \xi_{j}+\cdots+\xi_{n} \sum_{j=1}^{n} a_{n j} \xi_{j} \end{aligned} \] and hence \[ \begin{align} \frac{\partial f}{\partial \xi_{k}}=\xi_{1} a_{1 k}+\cdots+\xi_{k-1} a_{k-1, k}+\left(\sum_{j=1}^{n} a_{k j} \xi_{j}+\xi_{k} a_{k k}\right)+\\ \xi_{k+1} a_{k+1, k}+\cdots+\xi_{n} a_{n k}=\sum_{j=1}^{n} a_{k j} \xi_{j}+\sum_{i=1}^{n} a_{i k} \xi_{i} \end{align} \] Therefore \[ \frac{d f}{d x}=\left[ \begin{array}{c}{\frac{\partial f}{\partial \xi_{1}}} \\ {\vdots} \\ {\frac{\partial f}{\partial \xi_{n}}}\end{array}\right]=\left[ \begin{array}{c}{\sum_{j=1}^{n} a_{1 j} \xi_{j}} \\ {\vdots} \\ {\sum_{j=1}^{n} a_{n j} \xi_{j}}\end{array}\right]+\left[ \begin{array}{c}{\sum_{i=1}^{n} a_{i 1} \xi_{i}} \\ {\vdots} \\ {\sum_{i=1}^{n} a_{i n} \xi_{i}}\end{array}\right]=\\ \boldsymbol{A x}+\boldsymbol{A}^{\mathrm{T}} \boldsymbol{x}=\left(\boldsymbol{A}+\boldsymbol{A}^{\mathrm{T}}\right) \boldsymbol{x} \]
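As a quick numerical sanity check (not part of the derivation above), the following NumPy sketch compares the closed form \(\left(A+A^{\mathrm{T}}\right) x\) with a central finite-difference approximation; the matrix `A` and point `x` are arbitrary illustrative values:

```python
import numpy as np

# Numerical check of d(x'Ax)/dx = (A + A')x via central finite differences.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # deliberately non-symmetric
x = rng.standard_normal(n)

f = lambda v: v @ A @ v           # f(x) = x'Ax

eps = 1e-6
grad_fd = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)  # central difference along e
    for e in np.eye(n)                             # standard basis vectors
])
grad_formula = (A + A.T) @ x

print(np.allclose(grad_fd, grad_formula, atol=1e-6))  # True
```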
5.1.1 Block diagonal matrices
\[ \mathbf{A}=\left[ \begin{array}{cccc}{\mathbf{A}_{1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}}\end{array}\right] \]
Properties: \[ \begin{aligned} \operatorname{det} \mathbf{A} &=\operatorname{det} \mathbf{A}_{1} \times \cdots \times \operatorname{det} \mathbf{A}_{n} \\ \operatorname{tr} \mathbf{A} &=\operatorname{tr} \mathbf{A}_{1}+\cdots+\operatorname{tr} \mathbf{A}_{n} \end{aligned} \]
Its inverse (provided each \(\mathbf{A}_i\) is invertible):
\[ \left( \begin{array}{cccc}{\mathbf{A}_{1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}}\end{array}\right)^{-1}=\left( \begin{array}{cccc}{\mathbf{A}_{1}^{-1}} & {0} & {\cdots} & {0} \\ {0} & {\mathbf{A}_{2}^{-1}} & {\cdots} & {0} \\ {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {0} & {0} & {\cdots} & {\mathbf{A}_{n}^{-1}}\end{array}\right) \]
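A minimal NumPy/SciPy sketch of the three properties above, using small illustrative blocks (`scipy.linalg.block_diag` stacks its arguments along the diagonal):

```python
import numpy as np
from scipy.linalg import block_diag

# Illustrative blocks; any square invertible matrices work here.
A1 = np.array([[2.0, 1.0], [1.0, 3.0]])
A2 = np.array([[4.0]])
A = block_diag(A1, A2)

# det A = det A1 * det A2,  tr A = tr A1 + tr A2
print(np.isclose(np.linalg.det(A), np.linalg.det(A1) * np.linalg.det(A2)))
print(np.isclose(np.trace(A), np.trace(A1) + np.trace(A2)))

# The inverse is block diagonal with the blockwise inverses.
print(np.allclose(np.linalg.inv(A),
                  block_diag(np.linalg.inv(A1), np.linalg.inv(A2))))
```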
5.2 Sum of two quadratic forms
In vector form (assuming \(\Sigma_1\), \(\Sigma_2\) are symmetric and invertible), we have: \[ \begin{array}{c}{-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{1}\right)^{T} \mathbf{\Sigma}_{1}^{-1}\left(\mathbf{x}-\mathbf{m}_{1}\right)} \\ {-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{2}\right)^{T} \mathbf{\Sigma}_{2}^{-1}\left(\mathbf{x}-\mathbf{m}_{2}\right)} \\ {=-\frac{1}{2}\left(\mathbf{x}-\mathbf{m}_{c}\right)^{T} \mathbf{\Sigma}_{c}^{-1}\left(\mathbf{x}-\mathbf{m}_{c}\right)+C}\end{array} \] where: \[ \begin{aligned} \mathbf{\Sigma}_{c}^{-1}=& \mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1} \\ \mathbf{m}_{c} &=\left(\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}\right)^{-1}\left(\mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \\ C &=\frac{1}{2}\left(\mathbf{m}_{1}^{T} \mathbf{\Sigma}_{1}^{-1}+\mathbf{m}_{2}^{T} \mathbf{\Sigma}_{2}^{-1}\right)\left(\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}\right)^{-1}\left(\mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \\ &-\frac{1}{2}\left(\mathbf{m}_{1}^{T} \mathbf{\Sigma}_{1}^{-1} \mathbf{m}_{1}+\mathbf{m}_{2}^{T} \mathbf{\Sigma}_{2}^{-1} \mathbf{m}_{2}\right) \end{aligned} \]
For the case needed here, this simplifies with \(\mathbf{m}_1=0\) and \(\mathbf{m}_2=0\): then \(\mathbf{\Sigma}_{c}^{-1}=\mathbf{\Sigma}_{1}^{-1}+\mathbf{\Sigma}_{2}^{-1}\) is all that is required, with \(\mathbf{m}_c=0\) and the constant \(C=0\) as well.
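A small check of this zero-mean special case; the SPD matrices \(\Sigma_1\), \(\Sigma_2\) below are constructed arbitrarily for illustration:

```python
import numpy as np

# Zero-mean case: the two quadratic forms combine exactly with
# Sigma_c^{-1} = Sigma_1^{-1} + Sigma_2^{-1}, m_c = 0, C = 0.
rng = np.random.default_rng(1)
p = 3
B1, B2 = rng.standard_normal((p, p)), rng.standard_normal((p, p))
S1 = B1 @ B1.T + p * np.eye(p)    # symmetric positive definite by construction
S2 = B2 @ B2.T + p * np.eye(p)

x = rng.standard_normal(p)
q = lambda v, S: -0.5 * v @ np.linalg.solve(S, v)   # -1/2 v' S^{-1} v

Sc_inv = np.linalg.inv(S1) + np.linalg.inv(S2)
lhs = q(x, S1) + q(x, S2)
rhs = -0.5 * x @ Sc_inv @ x
print(np.isclose(lhs, rhs))  # True
```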
5.3 Fundamental theorem for nonnegative definite matrices
Theorem 1
For a \(p \times p\) symmetric matrix \(\Sigma=\left(\sigma_{i j}\right),\) the following are equivalent:
1. \(\Sigma\) is nonnegative definite.
2. All its leading principal submatrices are nonnegative definite, that is, the \(i \times i\) matrices \[ \boldsymbol{\Sigma}_{i i}=\left(\begin{array}{ccc} {\sigma_{11}} & {\cdots} & {\sigma_{1 i}} \\ {\vdots} & {\ddots} & {\vdots} \\ {\sigma_{i 1}} & {\cdots} & {\sigma_{i i}} \end{array}\right), i=1, \cdots, p \] are nonnegative definite.
3. All eigenvalues of \(\Sigma\) are nonnegative.
4. There exists a matrix \(A\) such that \[ \boldsymbol{\Sigma}=A A^{\prime} \]
5. There exists a lower triangular matrix \(L\) such that \[ \boldsymbol{\Sigma}=L L^{\prime} \] (the Cholesky factorization; see the sketch after this list).
6. There exist vectors \(\boldsymbol{u}_{1}, \cdots, \boldsymbol{u}_{p}\) in \(R^{p}\) such that \(\sigma_{i j}=\boldsymbol{u}_{i}^{\prime} \boldsymbol{u}_{j}\).
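Items 4 and 5 can be checked numerically; a minimal sketch using `np.linalg.cholesky` on an illustrative positive definite \(\Sigma\):

```python
import numpy as np

# Recover a lower-triangular L with Sigma = L L' (Cholesky), and check the
# eigenvalue criterion. The entries of Sigma are illustrative, not from the text.
Sigma = np.array([[4.0, 2.0, 0.6],
                  [2.0, 5.0, 1.0],
                  [0.6, 1.0, 3.0]])

L = np.linalg.cholesky(Sigma)                  # lower triangular factor
print(np.allclose(L @ L.T, Sigma))             # True: Sigma = L L'
print(np.all(np.linalg.eigvalsh(Sigma) >= 0))  # eigenvalue criterion (item 3)
```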
5.4 Spectral decomposition of symmetric matrices
Theorem 4 (The Spectral Decomposition) Let \(A\) be a \(p \times p\) symmetric matrix with \(p\) pairs of eigenvalues and eigenvectors \[ \left(\lambda_{1}, \mathbf{e}_{1}\right),\left(\lambda_{2}, \mathbf{e}_{2}\right), \cdots,\left(\lambda_{p}, \mathbf{e}_{p}\right) \] Then,
1. The eigenvalues \(\lambda_{1}, \ldots, \lambda_{p}\) are all real and can be ordered from largest to smallest: \[ \lambda_{1} \geq \lambda_{2} \geq \ldots \geq \lambda_{p} \]
2. The normalized eigenvectors \(\mathbf{e}_{1}, \ldots, \mathbf{e}_{p}\) are mutually orthogonal, and the matrix \[ P=\left(\mathbf{e}_{1}, \mathbf{e}_{2}, \ldots, \mathbf{e}_{p}\right) \] is an orthogonal matrix, that is, \[ P P^{\prime}=P^{\prime} P=I \]
3. The spectral decomposition of \(A\) is the expansion \[ A=\lambda_{1} \mathbf{e}_{1} \mathbf{e}_{1}^{\prime}+\lambda_{2} \mathbf{e}_{2} \mathbf{e}_{2}^{\prime}+\cdots+\lambda_{p} \mathbf{e}_{p} \mathbf{e}_{p}^{\prime}=P \Lambda P^{\prime} \] where \(P\) is as above and \[ \Lambda=\operatorname{diag}\left(\lambda_{1}, \cdots, \lambda_{p}\right) \] is a diagonal matrix with \(\lambda_{1}, \ldots, \lambda_{p}\) as its respective diagonal entries (see the sketch after this list).
4. The matrix \(A\) is nonnegative definite if and only if all its eigenvalues are nonnegative.
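A sketch of the decomposition with `np.linalg.eigh`, which returns eigenvalues in ascending order, so we flip to match the largest-to-smallest convention above; `A` is an arbitrary symmetric example:

```python
import numpy as np

# Spectral decomposition of a symmetric matrix: A = P Lambda P'.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, P = np.linalg.eigh(A)
lam, P = lam[::-1], P[:, ::-1]           # order lambda_1 >= ... >= lambda_p

print(np.allclose(P @ P.T, np.eye(3)))   # P is orthogonal: P P' = I
print(np.allclose(P @ np.diag(lam) @ P.T, A))              # A = P Lambda P'
# Rank-one expansion: A = sum_k lambda_k e_k e_k'
print(np.allclose(sum(l * np.outer(e, e) for l, e in zip(lam, P.T)), A))
```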
5.5 Covariance Structure
5.5.1 Compound Symmetry Covariance Structure
See p. 54 of Pourahmadi's covariance book.
The eigenvalues of \(R\) are \(1+(p-1) \rho\) (once, with eigenvector \(\mathbf{1}_{p}\)) and \(1-\rho\) (with multiplicity \(p-1\)), so \(\boldsymbol{\Sigma}\) is nonnegative definite if and only if \(1+(p-1) \rho \geq 0\) and \(\rho \leq 1\), that is, \(-(p-1)^{-1} \leq \rho \leq 1\): \[ \boldsymbol{\Sigma}=\sigma^{2} R=\sigma^{2}\left(\begin{array}{cccc} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{array}\right)=\sigma^{2}\left[(1-\rho) I+\rho \mathbf{1}_{p} \mathbf{1}_{p}^{\prime}\right] \]
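A quick verification of the eigenvalue claim at the boundary, inside, and outside the admissible range (the values of \(\rho\) are chosen for illustration):

```python
import numpy as np

# Compound symmetry: R = (1 - rho) I + rho 1 1'. Eigenvalues are
# 1 + (p-1) rho (once) and 1 - rho (p-1 times).
p = 5
for rho in (-1.0 / (p - 1), 0.3, -0.5):   # boundary, interior, outside
    R = (1 - rho) * np.eye(p) + rho * np.ones((p, p))
    eig = np.linalg.eigvalsh(R)
    smallest = min(1 + (p - 1) * rho, 1 - rho)
    print(rho,
          np.isclose(eig.min(), smallest),  # closed-form smallest eigenvalue
          eig.min() >= -1e-12)              # NND holds only for the first two
```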
5.5.2 Huynh-Feldt Structure
\(\Sigma=\sigma^{2}\left(\alpha I+a \mathbf{1}_{p}^{\prime}+\mathbf{1}_{p} a^{\prime}\right)\)
The rank-two part \(a \mathbf{1}_{p}^{\prime}+\mathbf{1}_{p} a^{\prime}\) has nonzero eigenvalues \(\mathbf{1}_{p}^{\prime} a \pm \sqrt{p a^{\prime} a}\), so the smallest eigenvalue of \(\Sigma / \sigma^{2}\) is \(\alpha+\mathbf{1}_{p}^{\prime} a-\sqrt{p a^{\prime} a}\). Hence \(\Sigma\) is nonnegative definite provided that \[ \alpha \geq \sqrt{p a^{\prime} a}-\mathbf{1}_{p}^{\prime} a \] where the threshold is nonnegative since \(\mathbf{1}_{p}^{\prime} a \leq \sqrt{p a^{\prime} a}\) by the Cauchy-Schwarz inequality.
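A numerical check of this bound; the vector `a` is drawn arbitrarily for illustration:

```python
import numpy as np

# Huynh-Feldt: Sigma/sigma^2 = alpha I + a 1' + 1 a'. Smallest eigenvalue
# is alpha + 1'a - sqrt(p a'a), so the critical alpha is sqrt(p a'a) - 1'a.
rng = np.random.default_rng(2)
p = 4
a = rng.standard_normal(p)
one = np.ones(p)

alpha_crit = np.sqrt(p * (a @ a)) - one @ a
for alpha in (alpha_crit, alpha_crit + 0.5, alpha_crit - 0.5):
    S = alpha * np.eye(p) + np.outer(a, one) + np.outer(one, a)
    print(alpha, np.linalg.eigvalsh(S).min() >= -1e-10)  # True, True, False
```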
5.5.3 The One-Dependent Covariance Structure
\[ \Sigma=\sigma^{2}\left(\begin{array}{ccccc} {a} & {b} & {0} & {\cdots} & {0} \\ {b} & {a} & {b} & {\ddots} & {\vdots} \\ {\vdots} & {\ddots} & {\ddots} & {\ddots} & {0} \\ {\vdots} & {} & {\ddots} & {\ddots} & {b} \\ {0} & {\cdots} & {0} & {b} & {a} \end{array}\right) \]
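This is a symmetric tridiagonal Toeplitz matrix, whose eigenvalues have the classical closed form \(a+2 b \cos (k \pi /(p+1))\), \(k=1, \cdots, p\); hence \(\Sigma\) is nonnegative definite iff \(a \geq 2|b| \cos (\pi /(p+1))\). A quick check with illustrative values of \(a\) and \(b\):

```python
import numpy as np

# One-dependent structure: diagonal a, first off-diagonals b, zeros elsewhere.
p, a, b = 6, 1.0, 0.45
S = a * np.eye(p) + b * (np.eye(p, k=1) + np.eye(p, k=-1))

# Compare against the closed-form tridiagonal Toeplitz eigenvalues.
closed_form = a + 2 * b * np.cos(np.arange(1, p + 1) * np.pi / (p + 1))
print(np.allclose(np.sort(np.linalg.eigvalsh(S)), np.sort(closed_form)))
print(np.linalg.eigvalsh(S).min() >= 0)   # NND for these particular a, b
```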
5.5.4 AR(1) Structure (highly popular)
\[ \boldsymbol{\Sigma}=\sigma^{2}\left(\begin{array}{ccccc} {1} & {\rho} & {\rho^{2}} & {\cdots} & {\rho^{p-1}} \\ {\rho} & {1} & {\rho} & {\cdots} & {\rho^{p-2}} \\ {\vdots} & {\vdots} & {\vdots} & {\ddots} & {\vdots} \\ {\rho^{p-1}} & {\rho^{p-2}} & {\rho^{p-3}} & {\cdots} & {1} \end{array}\right), \quad \sigma_{i j}=\sigma^{2} \rho^{|i-j|}, \quad-1<\rho<1 \]
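Two well-known facts about this structure, checked numerically below with illustrative values: the AR(1) matrix is positive definite for \(|\rho|<1\), and its inverse (the precision matrix of an AR(1) process) is tridiagonal:

```python
import numpy as np

# AR(1) structure: Sigma_{ij} = rho^{|i-j|} (take sigma^2 = 1 for simplicity).
p, rho = 6, 0.7
idx = np.arange(p)
Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

print(np.linalg.eigvalsh(Sigma).min() > 0)   # positive definite for |rho| < 1

Q = np.linalg.inv(Sigma)
off = np.abs(idx[:, None] - idx[None, :]) > 1
print(np.allclose(Q[off], 0.0))              # inverse is tridiagonal
```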