はしくれエンジニアもどきのメモ

情報系技術・哲学・デザインなどの勉強メモ・備忘録です。

正規逆ガンマ分布の事後分布のパラメータ更新式と予測分布の導出

正規逆ガンマ分布の事後分布のパラメータ更新式と予測分布の導出

Udemyの「ベイズ推定とグラフィカルモデル:コンピュータビジョン基礎1」(テキスト:"Computer vision: models, learning and inference" by Simon Prince)で, 事後分布のパラメータ式のみ載っていたので,実際に導出できるかのメモ.

正規逆ガンマ分布の事後分布

Normal-inverse-gamma distribution - Wikipedia

尤度を正規分布,事前分布を正規逆ガンマ分布とした場合, 共役関係により事後分布も正規逆ガンマ分布になる.

事後分布 尤度 共役事前分布
正規逆ガンマ分布 正規分布 正規逆ガンマ分布

ベイズの事後分布の式を展開して,正規逆ガンマ分布の式を代入していくと以下の式になる.


\begin{eqnarray}
Pr(\vec{\theta} | \mathbf{d}_{1 \cdots I})
&=& \frac{\Pi_{i=1}^{I} Pr(x_{i} = d_{i} | \vec{\theta}) }{Pr(\mathbf{d}_{1 \cdots I})} Pr(\vec{\theta}) \\\\
Pr(\mu, \sigma^{2} |  \vec{d}_{1 \cdots I}) &=& \frac{\Pi_{i=1}^{I} Pr(\vec{x}_{i}=\vec{d}_{i} | \mu, \sigma^{2})}{Pr(\vec{d}_{1 \cdots I})}Pr(\mu, \sigma^{2}) \\\\
&=& \frac{\Pi_{i=1}^{I} Pr(\vec{x}_{i}=\vec{d}_{i} | \mu, \sigma^{2})}{\int \int \Pi_{i=1}^{I} Pr(\vec{x}_{i}=\vec{d}_{i} | \mu, \sigma^{2})Pr(\mu, \sigma^{2}) d\mu d\sigma^{2}}Pr(\mu, \sigma^{2}) \\\\
&=& \frac{\Pi_{i=1}^{I} \text{Norm}(\vec{x}_{i}=\vec{d}_{i} | \mu, \sigma^{2})}{\int \int \Pi_{i=1}^{I} \text{Norm}(\vec{x}_{i}=\vec{d}_{i} | \mu, \sigma^{2} ) \text{NormInvGamma}(\mu, \sigma^{2} | \alpha, \beta, \delta, \gamma) d\mu d\sigma^{2} } \text{NormInvGamma}(\mu, \sigma^{2} | \alpha, \beta, \delta, \gamma) \\\\
&=& \text{NormInvGamma}(\mu, \sigma^{2} | \tilde{\alpha}, \tilde{\beta}, \tilde{\delta}, \tilde{\gamma}  ) \\\\
\end{eqnarray}

この事後分布の正規逆ガンマ分布のパラメータは以下のようになる.


\begin{eqnarray}
\tilde{\alpha} &=& \alpha + \frac{I}{2} \\\\
\tilde{\beta} &=& \beta + \frac{\sum_{i} x_{i}^{2}}{2}  + \frac{\gamma \delta^2}{2} - \frac{(\gamma\delta + \sum_{i}d_{i})^{2}}{2(\gamma + I)}\\\\
\tilde{\delta} &=& \frac{(\gamma\delta + \sum_{i}d_{i})}{\gamma + I} \\\\
\tilde{\gamma} &=& \gamma + I \\\\
\end{eqnarray}
  • $d_{i}$: (正規分布に従う)観測データ
  • $I$: 観測データ数(標本の大きさ)

パラメータの導出

参考:http://patricklam.org/teaching/conjugacy_print.pdf

事後分布は尤度×事前分布に比例するので


\begin{eqnarray}
Pr(\mu, \sigma^{2} | \vec{x}=\vec{d})
&\propto& \Pi_{i=1}^{I} Pr(x_i=d_i | \mu, \sigma^2)Pr(\mu, \sigma^{2})\\
&=& \Pi_{i=1}^{I} \text{Norm}(x_i=d_i | \mu, \sigma^2) \text{NormInvGamma}(\mu, \sigma^{2}|\alpha, \beta, \delta, \gamma)\\\\
&=& \frac{1}{( 2 \pi \sigma^{2})^{\frac{I}{2}}} \exp{\left[ - 0.5 \sum_{i=1}^{I} \frac{(d_{i} - \mu)^{2}}{\sigma^{2}} \right]} \cdot \frac{\sqrt{\gamma}}{ \sqrt{2 \pi \sigma^{2}}} \frac{\beta^{\alpha}}{\Gamma[ \alpha ]} \left( \frac{1}{\sigma^{2}} \right)^{\alpha + 1} \exp{\left[ - \frac{2 \beta + \gamma (\delta - \mu)^{2}}{2 \sigma^{2}} \right]}\\\\
\end{eqnarray}

 \frac{\sqrt{\gamma}}{ \sqrt{2 \pi \sigma^{2}}} \frac{\beta^{\alpha}}{\Gamma(\alpha)} は正規逆ガンマの係数より除外すると


\begin{eqnarray}
Pr(\mu, \sigma^{2} | \vec{x}=\vec{d})
&\propto& \left( \frac{1}{\sigma^{2}} \right)^{\alpha + \frac{I}{2} + 1} \exp{\left[ - \frac{1}{2}\frac{2 \beta +\sum_{i=1}^{I}( d_{i}-\mu)^{2}  + \gamma (\delta - \mu)^{2}}{ \sigma^{2}} \right]} \\\\
\end{eqnarray}

expの中の分子を展開して平方完成する


\begin{eqnarray}
2 \beta +\sum_{i=1}^{I}( d_{i}-\mu)^{2}  + \gamma (\delta - \mu)^{2}
&=& 2\beta + \sum(d_{i}^{2} - 2d_{i}\mu +\mu^{2}) + \gamma(\delta^{2} - 2\delta\mu + \mu^{2})\\\\
&=& 2\beta + (\sum d_{i}^{2} - 2\mu \sum d_{i} +I\mu^{2}) + (\gamma\delta^{2} - 2\mu\gamma\delta + \gamma\mu^{2})\\\\
&=& 2\beta + \sum d_{i}^{2} + \gamma\delta^{2} - 2\mu (\sum d_{i}+ \gamma\delta)+(I+\gamma)\mu^{2}\\
\end{eqnarray}\\\\

ここで平方完成用に項を追加


\begin{eqnarray}
2 \beta +\sum_{i=1}^{I}( d_{i}-\mu )^{2}  + \gamma (\delta - \mu)^{2}
&=& 2\beta + \sum d_{i}^{2} + \gamma\delta^{2} + \frac{(\sum d_{i}+ \gamma\delta)^{2}}{I+\gamma} - 2\mu (\sum d_{i}+ \gamma\delta)+(I+\gamma)\mu^{2} - \frac{(\sum d_{i}+ \gamma\delta)^{2}}{I+\gamma}\\\\
&=& 2\beta + \sum d_{i}^{2} + \gamma\delta^{2} + (I+\gamma)\left(\frac{(\sum d_{i}+ \gamma\delta)^{2}}{ ( I + \gamma )^{2}} - 2\mu (\frac{\sum d_{i}+ \gamma\delta)}{(I+\gamma)}+\mu^{2}\right) - \frac{(\sum d_{i}+ \gamma\delta)^{2}}{I+\gamma}\\\\
&=& 2\beta + \sum d_{i}^{2} + \gamma\delta^{2} + (I+\gamma)\left(\frac{(\sum d_{i}+ \gamma\delta)}{ (I +\gamma ) } - \mu \right)^{2} - \frac{(\sum d_{i} + \gamma\delta)^{2}}{I+\gamma}\\\\
&=& 2(\beta + \frac{\sum d_{i}^{2}}{2} + \frac{\gamma\delta^{2}}{2}- \frac{(\sum d_{i} + \gamma\delta)^{2}}{2 ( I + \gamma )}) + ( I+\gamma )\left(\frac{\sum d_{i}+ \gamma\delta}{I+\gamma} - \mu \right)^{2} \\\\
\end{eqnarray}

よって,


\begin{eqnarray}
Pr(\mu, \sigma^{2} | \vec{x}=\vec{d})
&\propto& \left( \frac{1}{\sigma^{2}} \right)^{\alpha + \frac{I}{2} + 1} \exp{\left[ - \frac{1}{2}\frac{ 2 ( \beta + \frac{\sum d_{i}^{2}}{2} + \frac{\gamma\delta^{2}}{2}- \frac{(\sum d_{i} + \gamma \delta )^{2}}{2 ( I+\gamma ) } ) + (\gamma+I)\left(\frac{\sum d_{i}+ \gamma\delta}{I+\gamma} - \mu \right)^{2}}{ \sigma^{2}} \right] } \\\\
\end{eqnarray}

ここから,事後分布の新しいパラメータは以下のようになる.


\begin{eqnarray}
\tilde{\alpha} &=& \alpha + \frac{I}{2} \\\\
\tilde{\beta} &=& \beta + \frac{ \sum d_{i}^{2}}{2} + \frac{ \gamma \delta^{2}}{2}- \frac{(\sum d_{i} + \gamma \delta )^{2}}{2 ( I + \gamma ) } \\\\
\tilde{\gamma} &=& \gamma + I \\\\
\tilde{\delta} &=& \frac{\gamma \delta + \sum d_{i}}{\gamma + I} \\\\
\end{eqnarray}

正規逆ガンマ分布の予測分布(predictive density)

正規逆ガンマ分布の事後分布を使った予測分布は以下の積分の式になる.


Pr(x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
= \int \int Pr(x^{*} | \mu, \sigma^{2}) Pr(\mu, \sigma^{2} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu d\sigma

$x^{*}$:正規分布に従う新しいデータ

 Pr(x^{*} | \mu, \sigma^{2}) : 新しいデータの正規分布

 Pr(\mu, \sigma^{2} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I } )

: 正規逆ガンマ分布の事後分布

この予測分布はt分布になることをまず確認する.

t分布になることを確認

補題「正規逆ガンマ分布を分散パラメータで積分するとt分布になる」を利用する.

cartman0.hatenablog.com

導出の流れ:

  1. 新しいデータ$x^{*}$の正規分布と事後分布の正規逆ガンマ分布の積より,正規逆ガンマ分布のもっている正規分布との積が計算でき,新しいデータ$x^{*}$と平均パラメータ$\mu$との2次元正規分布になる.

  2. 2次元正規分布を平均パラメータ$\mu$で積分すると1次元正規分布になる.

  3. 1次元正規分布と逆ガンマ分布の積により新しいデータ$x^{*}$に対する正規逆ガンマ分布になる

  4. 正規逆ガンマ分布を分散パラメータで積分するのでt分布になる.

参考:https://people.eecs.berkeley.edu/~jordan/courses/260-spring10/lectures/lecture5.pdf


\begin{eqnarray}
Pr(x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=& \int \int Pr(x^{*} | \mu, \sigma^{2}) Pr(\mu, \sigma^{2} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
&=& \int \int \text{Norm}( x^{*} | \mu, \sigma^{2}) \text{NormInvGamma}(\mu, \sigma^{2}|\alpha, \beta, \gamma, \delta , \vec{x}_{1 \cdots I } = \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
&=& \int \int \text{Norm}( x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \delta, \frac{\sigma^{2}}{ \gamma}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})\text{InvGamma}( \sigma^{2}|\alpha, \beta, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
\end{eqnarray}

ここで,正規分布の積より,1つの変数が新しいデータ$x^{*}$, もう1つの変数が平均パラメータ$\mu$を持つ2次元正規分布になる.


\begin{eqnarray}
Pr(x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=& \int \text{Norm}(x^{*},\mu | \delta, \gamma,\sigma^{2}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu \int \text{InvGamma}( \sigma^{2}|\alpha, \beta, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\sigma^{2} \\\\
\end{eqnarray}

しかし,平均パラメータ$\mu$で積分するのでが新しいデータ$x^{*}$についての1次元正規分布になる.


\begin{eqnarray}
Pr(x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=& \text{Norm}(x^{*} | \delta, \gamma,\sigma^{2}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) \int \text{InvGamma}( \sigma^{2}|\alpha, \beta, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\sigma^{2} \\\\
&=& \int \text{Norm}(x^{*} | \delta, \gamma,\sigma^{2}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) \text{InvGamma}(\sigma^{2} |\alpha, \beta, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\sigma^{2} \\\\
&=& \text{Student-t}(x^{*} | \alpha, \beta, \delta, \gamma, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) \\\\
\end{eqnarray}

予測分布のt分布のパラメータを導出

参考:https://www.cs.ubc.ca/~murphyk/Papers/bayesGauss.pdf


\begin{eqnarray}
Pr(x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=& \int \int Pr(x^{*} | \mu, \sigma^{2}) Pr(\mu, \sigma^{2} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
&=& \int \int \text{Norm}(x^{*} | \mu, \sigma^{2}) \text{NormInvGamma}(\mu, \sigma^{2}|\tilde{\alpha}, \tilde{\beta}, \tilde{\gamma}, \tilde{\delta} , \vec{x}_{1\cdots I}= \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
&=& \int \int \text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \delta, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})\text{InvGamma}( \sigma^{2}|\tilde{\alpha}, \tilde{\beta}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu d\sigma^{2} \\\\
&=& \int \text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \delta, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu \int\text{InvGamma}( \sigma^{2}|\tilde{\alpha}, \tilde{\beta}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\sigma^{2} \\\\
&=& \int \frac{1}{\sqrt{2\pi \sigma^{2}}} \exp{ ( - \frac{(x^{*} - \mu )^{2}}{2 \sigma^{2}})} \cdot \frac{1}{\sqrt{2\pi \frac{\sigma^{2}}{\tilde{\gamma}} }} \exp{ ( - \frac{ ( \tilde{\delta}-\mu )^{2}}{2 \frac{\sigma^{2}}{\tilde{\gamma}}} ) } d\mu
\int \frac{\tilde{\beta}^{\tilde{\alpha}}}{\Gamma[\tilde{\alpha}] } \left( \frac{1}{\sigma^{2}} \right)^{\tilde{\alpha} + 1}
\exp{ \left[ - \frac{2 \tilde{\beta}}{2 \sigma^{2}} \right]} d\sigma^{2} \\
\end{eqnarray}

正規分布の積は,2次元正規分布になる.


\begin{eqnarray}
\text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \tilde{\delta}, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=&  \frac{1}{\sqrt{2\pi \sigma^{2}}} \exp{(-\frac{(x^{*}-\mu)^{2}}{2 \sigma^{2}})} \cdot \frac{1}{\sqrt{2\pi \frac{\sigma^{2}}{\tilde{\gamma}} }} \exp{(-\frac{(\tilde{\delta}-\mu)^{2}}{2 \frac{\sigma^{2}}{\tilde{\gamma}}})}\\
&=& \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{ \left\{ - \frac{1}{2}\left( \frac{(x^{*}-\mu)^{2}}{\sigma^{2}} + \frac{(\tilde{\delta}-\mu)^{2}}{\frac{\sigma^{2}}{\tilde{\gamma}}} \right) \right\} }
\end{eqnarray}

ガウス積分は,$\int_{-\infty}^{\infty}\exp{(-a(x-b)^{2})} dx = \sqrt{\frac{\pi}{a}}$ より, expの中身を平方完成してガウス積分の形を作る.


\begin{eqnarray}
\frac{1}{\sigma^{2}}\left\{ (x^{*2} -2\mu x^{*} + \mu^{2}) + (\tilde{\gamma} \tilde{\delta}^{2} - 2\mu \gamma\tilde{\delta} + \tilde{\gamma} \mu^{2}) \right\}
&=& \frac{1}{\sigma^{2}}\left\{ (\tilde{\gamma} + 1)\mu^{2} -2 \mu (x^{*} + \tilde{\gamma} \tilde{\delta}) + (x^{*2}+\tilde{\gamma} \tilde{\delta}^{2}) \right\}\\\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{ \mu^{2} -2 \mu \frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1} + \frac{x^{*2}+\tilde{\gamma} \tilde{\delta}^{2}}{\tilde{\gamma} + 1} \right\}\\\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{ \mu^{2} -2 \mu \frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1} + \left(\frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1}\right)^{2} + \frac{x^{*2}+\tilde{\gamma} \tilde{\delta}^{2}}{\tilde{\gamma} + 1} - \left(\frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1}\right)^{2} \right\}\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{ \left( \mu - \frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1}\right)^{2} + \frac{x^{*2}+\tilde{\gamma} \tilde{\delta}^{2}}{\tilde{\gamma} + 1} - \left(\frac{(x^{*} +\gamma \tilde{\delta})}{\tilde{\gamma} + 1}\right)^{2} \right\}\\\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{
\left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2}
+ \frac{1}{\tilde{\gamma} + 1}\left( x^{*2} + \tilde{\gamma}\tilde{\delta}^{2}
- \frac{( x^{*} + \gamma \tilde{\delta} )^{2}}{\tilde{\gamma} + 1 } \right)
\right\} \\\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{
\left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2}
+ \frac{1}{\tilde{\gamma} + 1}
\left( \frac{\tilde{\gamma} x^{*2} + \tilde{\gamma}\tilde{\delta}^{2}+x^{*2}+\tilde{\gamma}^{2}\tilde{\delta}^{2} -( x^{*} + \gamma \tilde{\delta} )^{2}}{\tilde{\gamma} + 1 } \right)
\right\} \\\\
\end{eqnarray}

右の項の分子の式はまとめることができるので,


\begin{eqnarray}
\tilde{\gamma} x^{*2} + \tilde{\gamma}\tilde{\delta}^{2}+x^{*2}+\tilde{\gamma}^{2}\tilde{\delta}^{2} -( x^{*2} +\gamma^{2} \tilde{\delta}^{2}+ 2x^{*}\gamma \tilde{\delta} )
&=&\tilde{\gamma} x^{*2} + \tilde{\gamma}\tilde{\delta}^{2}- 2x^{*}\gamma \tilde{\delta} \\\\
&=&\tilde{\gamma} (x^{*2} - 2x^{*}\tilde{\delta} + \tilde{\delta}^{2} ) \\\\
&=&\tilde{\gamma} (x^{*} - \tilde{\delta} )^{2} \\\\
\end{eqnarray}

よって,expの中身は,パラメータ$\mu$を持った項と定数項に分けることができる.


\begin{eqnarray}
\frac{1}{\sigma^{2}}\left\{ (x^{*2} -2\mu x^{*} + \mu^{2}) + (\tilde{\gamma} \tilde{\delta}^{2} - 2\mu \gamma\tilde{\delta} + \tilde{\gamma} \mu^{2}) \right\}
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)\left\{
\left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2}
+ \frac{1}{\tilde{\gamma} + 1}
\left( \frac{\tilde{\gamma} (x^{*} - \tilde{\delta} )^{2} }{\tilde{\gamma} + 1 } \right)
\right\} \\\\
&=& \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1)
\left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2}
+ \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \\\\
\end{eqnarray}

よって正規分布の積は,


\begin{eqnarray}
\text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \tilde{\delta}, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I})
&=& \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{ \left\{ - \frac{1}{2}\left( \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1) \left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2}
+ \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \right) \right\} }\\\\
&=& \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \right\} }
\exp{ \left\{ - \frac{1}{2} \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1) \left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2} \right\} }\\\\
\end{eqnarray}

平均パラメータ$\mu$で積分すると,ガウス積分になる.


\begin{eqnarray}
\int_{-\infty}^{\infty} \text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \tilde{\delta}, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu
&=& \int_{-\infty}^{\infty} \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \right\} }
\exp{ \left\{ - \frac{1}{2} \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1) \left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2} \right\} } d\mu \\\\
&=& \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \right\} }
\int_{-\infty}^{\infty} \exp{ \left\{ - \frac{1}{2} \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1) \left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2} \right\} } d\mu \\\\
\end{eqnarray}

ガウス積分より,


\int_{-\infty}^{\infty} \exp{ \left\{ - \frac{1}{2} \frac{1}{\sigma^{2}}(\tilde{\gamma} + 1) \left( \mu - \frac{( x^{*} + \gamma \tilde{\delta}) }{\tilde{\gamma} + 1}\right)^{2} \right\} } d\mu = \sqrt{\frac{2\pi \sigma^{2}}{\tilde{\gamma} + 1}}

よって,1次元正規分布になる.


\begin{eqnarray}
\int_{-\infty}^{\infty} \text{Norm}(x^{*} | \mu, \sigma^{2}) \text{Norm}(\mu, | \tilde{\delta}, \frac{\sigma^{2}}{\tilde{\gamma}}, \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I}) d\mu
&=& \frac{1}{\sqrt{(2\pi)^{2} \sigma^{2} \frac{\sigma^{2}}{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta} )^{2} \right\} }
\sqrt{\frac{2\pi \sigma^{2}}{\tilde{\gamma} + 1}} \\\\
&=& \frac{1}{\sqrt{2\pi \frac{\sigma^{2} (\tilde{\gamma} + 1) }{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}\frac{\tilde{\gamma} + 1}{\tilde{\gamma}}}(x^{*} - \tilde{\delta} )^{2} \right\} } \\\\
\end{eqnarray}

残りの積分を考えると, この正規分布と逆ガンマの積より正規逆ガンマ分布になり, 正規逆ガンマ分布を分散についてt分布になる.


\begin{eqnarray}
\int_{0}^{\infty} \frac{1}{\sqrt{2\pi \frac{\sigma^{2} (\tilde{\gamma} + 1) }{\tilde{\gamma}}}}
\exp{\left\{ -\frac{1}{2} \frac{1}{\sigma^{2}\frac{\tilde{\gamma} + 1}{\tilde{\gamma}}}(x^{*} - \tilde{\delta} )^{2} \right\} } \cdot  \frac{\tilde{\beta}^{\tilde{\alpha}}}{\Gamma[ \tilde{\alpha}] } \left( \frac{1}{\sigma^{2}} \right)^{\tilde{\alpha} + 1}
\exp{ \left[ - \frac{2 \tilde{\beta}}{2 \sigma^{2}}  \right] } d\sigma^{2}
&=& \int_{0}^{\infty} \text{Norm}(\tilde{\delta}, \sigma^{2}\frac{\tilde{\gamma} + 1}{\tilde{\gamma}})
\cdot \text{InvGamma}(\tilde{\alpha}, \tilde{\beta}) d\sigma^{2}
\end{eqnarray}

ここで,正規逆ガンマ分布の積分は以下の一般化t分布になる.


\begin{eqnarray}
\int_{0}^{\infty} \text{Norm}(\mu, \frac{\sigma^{2}}{\gamma}) \cdot \text{InvGamma }(\alpha, \beta) d\sigma^{2}
&=& \frac{\Gamma(\frac{d_f + 1}{2})}{\Gamma[ \frac{d_f}{2} ] }\frac{1}{\sqrt{d_f \pi }}\frac{1}{\frac{1}{\sqrt{\frac{\alpha}{\beta}\gamma}}} \left(1+ \frac{1}{d_f} \left( \frac{x - \mu}{\frac{1}{\sqrt{\frac{\alpha}{\beta}\gamma}}} \right)^{2} \right)^{-(\frac{d_f + 1}{2})} \\\\
&=& \text{Generilized Student-t}(\text{df}=2\alpha, \text{location} = \mu, \text{scale} = \frac{1}{\sqrt{\frac{\alpha}{\beta}\gamma}} ) \\\\
\end{eqnarray}

よって,予測分布は,


\begin{eqnarray}
Pr( x^{*} | \vec{x}_{1\cdots I} = \vec{d}_{1 \cdots I} )
&=& \int_{0}^{\infty} \text{Norm}(\tilde{\delta}, \frac{\sigma^{2}}{\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}) \cdot \text{InvGamma}(\tilde{\alpha}, \tilde{\beta}) d\sigma^{2} \\\\
&=& \text{Generalized Student-t}( \text{df}=2\tilde{\alpha}, \text{location} = \tilde{\delta}, \text{scale} = \frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}} ) \\\\
\end{eqnarray}

t分布の式を展開してまとめる


\begin{eqnarray}
\text{Generilized Student-t}(\text{df}=2\tilde{\alpha}, \text{location} = \tilde{\delta}, \text{scale} = \frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}} )
&=& \frac{\Gamma(\frac{2\tilde{\alpha} + 1}{2})}{\Gamma(\frac{2\tilde{\alpha}}{2})\sqrt{\pi 2\tilde{\alpha}}} \frac{1}{\frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}}}\left( 1 + \frac{1}{2\tilde{\alpha}}\left(\frac{x - \tilde{\delta}}{\frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}}}\right)^{2} \right)^{-\frac{2\tilde{\alpha}+1}{2}} \\\\
&=& \frac{\Gamma(\frac{2\tilde{\alpha} + 1}{2})}{\Gamma(\frac{2\tilde{\alpha}}{2})\sqrt{\pi 2\tilde{\alpha}}}
\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}
\left( 1 + \frac{1}{2\tilde{\alpha}}
\left(
\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}(x - \tilde{\delta})\right)^{2} \right)^{-\frac{2\tilde{\alpha}+1}{2}} \\\\
&=& \frac{\Gamma(\tilde{\alpha}+\frac{ 1}{2})}{\Gamma(\tilde{\alpha})\sqrt{2\pi}}
\sqrt{\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}
\left( 1 + \frac{1}{2}
\left( \frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right)^{-(\tilde{\alpha}+\frac{1}{2})} \\\\
&=& \frac{1}{\sqrt{2\pi}}
\frac{\sqrt{\tilde{\gamma}}}{\sqrt{\tilde{\gamma} + 1}}
\frac{1}{\sqrt{\tilde{\beta}}}
\left( 1 + \frac{1}{2} \left( \frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right)^{-(\tilde{\alpha}+\frac{1}{2})}
\frac{\Gamma(\tilde{\alpha}+\frac{ 1}{2})}{\Gamma(\tilde{\alpha})} \\\\
\end{eqnarray}

$-(\tilde{\alpha}+\frac{1}{2})$ 乗の中身を展開する.積和の形を作っていく.


\begin{eqnarray}
\left[ 1 + \frac{1}{2}
\left( \frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}
&=& \left[ \frac{2}{2} + \frac{1}{2}\left( \frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}\\\\
&=& \left[ \frac{1}{2}
\left( 2+\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{ - ( \tilde{\alpha}+\frac{1}{2})} \\\\
&=& \left[ \frac{1}{2}
\left( \frac{\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}{\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}} 2 +\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2}) } \\\\
&=& \left[ \frac{1}{2}\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}
\left( \frac{1}{\frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}} 2 +(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}\\\\
&=& \left[ \frac{1}{\tilde{\beta}} \left( \tilde{\beta}+\frac{1}{2}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}\\\\
\end{eqnarray}

括弧の中身を展開する.


\begin{eqnarray}
\frac{1}{2}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x^{*} - \tilde{\delta})^{2}
&=& \frac{1}{2}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1} (x^{*2} -2x^{*}\tilde{\delta}+ \tilde{\delta}^{2})\\\\
&=& \frac{1}{2}\frac{1}{\tilde{\gamma} + 1} (\tilde{\gamma}x^{*2} -2x^{*}\tilde{\gamma}\tilde{\delta}+ \tilde{\gamma}\tilde{\delta}^{2})\\\\
&=& \frac{1}{2}\frac{1}{\tilde{\gamma} + 1} (\tilde{\gamma}x^{*2} -2x^{*}\tilde{\gamma}\tilde{\delta} + \tilde{\gamma}\tilde{\delta}^{2}
- x^{*2} + x^{*2} -\tilde{\gamma}^{2}\tilde{\delta}^{2}+\tilde{\gamma}^{2}\tilde{\delta}^{2})\\\\
&=& \frac{1}{2}\frac{1}{\tilde{\gamma} + 1} (\tilde{\gamma}x^{*2}+ x^{*2} +\tilde{\gamma}^{2}\tilde{\delta}^{2}+\tilde{\gamma}\tilde{\delta}^{2}
- (x^{*2}+2x^{*}\tilde{\gamma}\tilde{\delta} +\tilde{\gamma}^{2}\tilde{\delta}^{2}) )\\\\
&=& \frac{1}{2}\frac{1}{\tilde{\gamma} + 1} \left\{(\tilde{\gamma}+1)(x^{*2}+\tilde{\delta}^{2})
- (x^{*}+\tilde{\gamma}\tilde{\delta})^{2} \right\}\\\\
&=& \frac{(x^{*2}+\tilde{\delta}^{2})}{2}
- \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)} \\\\
\end{eqnarray}

よって,


\begin{eqnarray}
\left[ 1 + \frac{1}{2} \left( \frac{1}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}(x - \tilde{\delta})^{2}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}
&=& \left[\frac{1}{\tilde{\beta}}
\left( \tilde{\beta}+\frac{(x^{*2}+\tilde{\delta}^{2})}{2}
- \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)}\right) \right]^{-(\tilde{\alpha}+\frac{1}{2})}\\\\
&=& \frac{\tilde{\beta}^{\tilde{\alpha}+\frac{1}{2}}}{\left( \tilde{\beta}+\frac{(x^{*2}+\tilde{\delta}^{2})}{2}
- \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)}\right)^{(\tilde{\alpha}+\frac{1}{2})}}\\\\
\end{eqnarray}

t分布の元の式に代入すると,


\begin{eqnarray}
\text{Generilized Student-t}(\text{df}=2\tilde{\alpha}, \text{location} = \tilde{\delta}, \text{scale} = \frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}} )
&=& \frac{1}{\sqrt{2\pi}}
\frac{\sqrt{\tilde{\gamma}}}{\sqrt{\tilde{\gamma} + 1}}
\frac{1}{\sqrt{\tilde{\beta}}}
\frac{\tilde{\beta}^{\tilde{\alpha}+\frac{1}{2}}}{\left( \tilde{\beta}+\frac{(x^{*2}+\tilde{\delta}^{2})}{2}- \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)}\right)^{(\tilde{\alpha}+\frac{1}{2})}}
\frac{\Gamma(\tilde{\alpha}+\frac{ 1}{2})}{\Gamma(\tilde{\alpha})} \\\\
&=& \frac{1}{\sqrt{2\pi}}
\frac{\sqrt{\tilde{\gamma}}}{\sqrt{\tilde{\gamma} + 1}}
\frac{\tilde{\beta}^{\tilde{\alpha}}}{\left( \tilde{\beta}+\frac{(x^{*2}+\tilde{\delta}^{2})}{2} - \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)}\right)^{(\tilde{\alpha}+\frac{1}{2})}} \frac{\Gamma(\tilde{\alpha}+\frac{ 1}{2})}{\Gamma(\tilde{\alpha})}\\\\
\end{eqnarray}

\begin{eqnarray}
\breve{\alpha} &=& \tilde{\alpha}+\frac{ 1}{2}\\\\
\breve{\gamma} &=& \tilde{\gamma} + 1 \\\\
\breve{\beta} &=&  \tilde{\beta}+\frac{(x^{*2}+\tilde{\delta}^{2})}{2} - \frac{(x^{*}+\tilde{\gamma}\tilde{\delta})^{2}}{2(\tilde{\gamma} + 1)}\\\\
\end{eqnarray}

とおくと


\begin{eqnarray}
\text{Generilized Student-t}(\text{df}=2\tilde{\alpha}, \text{location} = \tilde{\delta}, \text{scale} = \frac{1}{\sqrt{\frac{\tilde{\alpha}}{\tilde{\beta}}\frac{\tilde{\gamma}}{\tilde{\gamma} + 1}}} )
&=& \frac{1}{\sqrt{2\pi}}
\frac{\sqrt{\tilde{\gamma}}}{\sqrt{\breve{\gamma}}}
\frac{\tilde{\beta}^{\tilde{\alpha}}}{ \breve{\beta}^{\breve{\alpha}}}
\frac{\Gamma(\breve{\alpha})}{\Gamma(\tilde{\alpha})}\\
\end{eqnarray}

テキスト:"Computer vision: models, learning and inference" by Simon Prince p.59 (4.25), (4.26)式と同じ式になる.

コード

以上の式を使って,観測データが与えられたら,

  • 事後分布のパラメータを計算
  • そこから予測分布のt分布を決定

するようなクラスをつくる.

# code by Baysian
import scipy as sp
from scipy import stats
from scipy import special

class Baysian_Normal():
    def __init__(self, prior_alpha, prior_beta, prior_delta, prior_gamma, data_x, *args, **kwargs):
        '''
        prior:
        √(γ/2πσ^2)・β^α/Γ(α)・(1/σ^2)^(α+1)exp(- (2β + γ(δ-μ)^2 / (2 σ^2))
        - alpha: df
        - N: data number

        Post:
        - alpha_post = alpha + N / 2
        - gamma_post = gamma + N
        - delta_post = (gamma*delta + sum (d)) / (gamma + N)
        - beta_post = beta + sum(d^2)/2 + gamma*delta^2/2 - (gamma*delta + sum(d))^2/(2(gamma + N))

        '''
        self.parameters_of_posterior = {}
        data = sp.array(data_x).flatten()
        n = len(data)

        posterior_alpha = prior_alpha + n / 2
        posterior_gamma = prior_gamma + n
        posterior_delta = (prior_gamma*prior_delta +
                           sp.sum(data)) / (prior_gamma + n)
        posterior_beta = prior_beta + sp.sum(data**2)/2 + (prior_gamma*prior_delta**2)/2 - (
            prior_gamma*prior_delta + sp.sum(data))**2/(2*(prior_gamma + n))

        self.parameters_of_posterior["alpha"] = posterior_alpha
        self.parameters_of_posterior["beta"] = posterior_beta
        self.parameters_of_posterior["gamma"] = posterior_gamma
        self.parameters_of_posterior["delta"] = posterior_delta

    def norm_inv_gamma_pdf(mu, sigma2, alpha, beta, gamma, delta):
        '''
        N(mu |δ,σ2/λ)・IG(σ2|α,β)
        return pdfM
        '''
        col = len(mu)
        row = len(sigma2)
        sigma2 = sp.sort(sigma2)[::-1]# desc
        IG = stats.invgamma(a=alpha, scale=beta)
        Mpdf_sigma2 = IG.pdf(x=sigma2).reshape(-1, 1)
        Mpdf_sigma2 = sp.repeat(Mpdf_sigma2, repeats=col, axis=1)

        Mpdf_N = sp.zeros_like(Mpdf_sigma2)
        for r in sp.arange(row):
            N = stats.norm(loc=delta, scale=sp.sqrt(sigma2[r]/gamma))
            Mpdf_N[r] = N.pdf(x=mu)
        return Mpdf_N * Mpdf_sigma2

    def posterior_NormInvGamma_pdf(self, mu, sigma2):
        return norm_inv_gamma_pdf(mu, sigma2, alpha=self.parameters_of_posterior["alpha"],
                                  beta=self.parameters_of_posterior["beta"],
                                  gamma=self.parameters_of_posterior["gamma"],
                                  delta=self.parameters_of_posterior["delta"])


    def sampling_posterior_NormInvGamma(self, N=1):
        IG = stats.invgamma(scale=self.parameters_of_posterior["beta"], a=self.parameters_of_posterior["alpha"])
        sigma2 = IG.rvs(size=N)

        mu = sp.array([])
        for s2 in sigma2:
            normal = stats.norm(loc=self.parameters_of_posterior["delta"], scale=s2/self.parameters_of_posterior["gamma"])
            mu = sp.append(mu, normal.rvs(size=1))

        return sp.vstack((mu, sigma2))

    def predictive_dist(self):
        '''
        t-dist
        '''
        tilde_alpha = self.parameters_of_posterior["alpha"]
        tilde_beta = self.parameters_of_posterior["beta"]
        tilde_gamma = self.parameters_of_posterior["gamma"]
        tilde_delta = self.parameters_of_posterior["delta"]
        return stats.t(loc=tilde_delta, scale=1/sp.sqrt(tilde_alpha/tilde_beta * tilde_gamma/(tilde_gamma+1)), df=2*tilde_alpha)



parameters = {
    "prior_alpha": 1,
    "prior_beta": 1,
    "prior_delta": 0,
    "prior_gamma": 1
}
bn = Baysian_Normal(**parameters, data_x=[0, 1, -1, 0.5, -0.5])
bn.parameters_of_posterior
{'alpha': 3.5, 'beta': 2.25, 'gamma': 6, 'delta': 0.0}