# James Hensman’s Weblog

## February 12, 2009

### Expected Values (Matrices)

Filed under: Uncategorized — jameshensman @ 9:41 am

I’ve been fiddling around in Python, implementing a Variational Bayesian Principal Component Analysis (VBPCA) algorithm. It’s true, I could do this happily in vibes, but where’s the fun in that? Besides, I think I learned more doing it this way.

Happily, there’s a nice paper by Chris Bishop, which explains clearly what’s going on, and gives you the update equations. For example:

${\bf m_x}^{(n)} = \langle \tau \rangle {\bf \Sigma_x} \langle {\bf W}^\top \rangle ({\bf t}_n - \langle {\mathbf \mu} \rangle)$

which involves taking the expected value of $\tau$, ${\bf W}$ and $\mathbf \mu$. These three are straightforward:

The distribution for $\tau$ is $p(\tau) = \textit{Gamma}(\tau \mid a,b)$, with $a$ the shape and $b$ the rate parameter, and so the expected value of $\tau$ is $a/b$.

The distribution for $\mu$ is $p(\mu) = \mathcal{N}(\mu \mid {\bf m_\mu},{\bf \Sigma_\mu})$, and so the expected value of $\mu$ is simply ${\bf m_\mu}$.

The distribution for $\bf W$ is a slightly more complex beast: $\bf W$ is a non-square matrix, and we have a Gaussian distribution for each row. That is: $p({\bf W}) = \prod_{i=1}^d \mathcal{N}({\bf w}_i \mid {\bf m}_w^{(i)}, {\bf \Sigma}_w)$. Bizarrely, the rows share a covariance matrix. The expected value of ${\bf W}$ is simple: it is just the matrix whose rows are the ${\bf m}_w^{(i)}$s, stacked on top of each other.
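To make the three expectations concrete, here is a minimal numpy sketch that plugs them into the update for ${\bf m_x}^{(n)}$ above. All of the numbers are made up, the variable names (`tau_mean`, `M_w` and so on) are mine rather than anything from the paper, and `Sigma_x` is just an identity placeholder because its own update isn't covered in this post.

```python
import numpy as np

rng = np.random.default_rng(0)
d, q = 5, 2  # assumed sizes: d observed dimensions, q latent dimensions

# Gamma(tau | a, b) with b as the rate parameter, so <tau> = a / b
a, b = 3.0, 2.0
tau_mean = a / b

# N(mu | m_mu, Sigma_mu): the expected value is just the mean
m_mu = rng.normal(size=d)
mu_mean = m_mu

# Each row of W is Gaussian with mean m_w^(i); <W> stacks those means
M_w = rng.normal(size=(d, q))  # row i holds m_w^(i)
W_mean = M_w

# Placeholder for Sigma_x (q x q); its update equation is not shown here
Sigma_x = np.eye(q)

# One (made-up) observed data point t_n
t_n = rng.normal(size=d)

# m_x^(n) = <tau> Sigma_x <W>^T (t_n - <mu>)
m_x = tau_mean * Sigma_x @ W_mean.T @ (t_n - mu_mean)
```

One thing to watch if you draw samples to check this: `numpy.random.Generator.gamma` takes a shape and a *scale*, so a rate of `b` corresponds to `scale=1/b`.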

The tricky bit comes in a different update equation, where we need to evaluate $\langle {\bf W}^\top {\bf W} \rangle$. The first thing to notice is that (where ${\bf W}$ is a d by q matrix and ${\bf w}_i$ is its $i$th row, written as a column vector):

${\bf W^\top W} = \sum_{i=1}^d {\bf w}_i {\bf w}_i^\top$.
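If the sum-of-outer-products identity looks suspicious, it is easy to check numerically. This is only a quick sanity check on a random matrix (the sizes are arbitrary), not part of the algorithm:

```python
import numpy as np

rng = np.random.default_rng(1)
d, q = 4, 3
W = rng.normal(size=(d, q))

# Iterating over a 2-D array yields its rows, so w_i is the i-th row of W
outer_sum = sum(np.outer(w_i, w_i) for w_i in W)

# W^T W (q x q) should equal the sum of the d row outer products
identity_holds = np.allclose(W.T @ W, outer_sum)
```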

Since $p({\bf w}_i)$ is Gaussian with mean ${\bf m}_w^{(i)}$ and covariance ${\bf \Sigma}_w$, $\langle {\bf w}_i {\bf w}_i^\top \rangle = {\bf m}_w^{(i)}{{\bf m}_w^{(i)}}^\top + {\bf \Sigma}_w$. Now we can write:

$\langle {\bf W^\top W} \rangle = \langle \sum_{i=1}^d {\bf w}_i {\bf w}_i^\top \rangle = \sum_{i=1}^d \langle {\bf w}_i {\bf w}_i^\top \rangle = \sum_{i=1}^d \left({\bf m}_w^{(i)}{{\bf m}_w^{(i)}}^\top + {\bf \Sigma}_w \right) = \langle {\bf W} \rangle^\top \langle {\bf W} \rangle + d\,{\bf \Sigma}_w$
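Because every row shares the same ${\bf \Sigma}_w$, the sum collapses to the stacked means transposed times themselves, plus $d$ copies of the covariance. Here is a hedged numpy sketch of that, with a Monte Carlo estimate alongside for comparison. The function name `expected_WtW` and all the test values are my own inventions for illustration.

```python
import numpy as np

def expected_WtW(M_w, Sigma_w):
    """<W^T W> when row i of W is N(m_w^(i), Sigma_w).

    M_w is the d x q matrix of stacked row means; Sigma_w is the shared
    q x q row covariance.
    """
    d = M_w.shape[0]
    # sum_i (m_i m_i^T + Sigma_w) = M_w^T M_w + d * Sigma_w
    return M_w.T @ M_w + d * Sigma_w

rng = np.random.default_rng(2)
d, q = 6, 2
M_w = rng.normal(size=(d, q))
A = rng.normal(size=(q, q))
Sigma_w = A @ A.T + np.eye(q)  # an arbitrary symmetric positive definite matrix

analytic = expected_WtW(M_w, Sigma_w)

# Monte Carlo comparison: draw many W's row by row and average W^T W
L = np.linalg.cholesky(Sigma_w)
n_samples = 200_000
noise = rng.normal(size=(n_samples, d, q)) @ L.T  # each row has covariance Sigma_w
W_samples = M_w + noise
estimate = np.einsum('sij,sik->jk', W_samples, W_samples) / n_samples
```

With enough samples, `estimate` should sit close to `analytic`, up to Monte Carlo noise.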

Simple when you know how.

Edit: Anyone know how to get a bold $\mu$? I’ve tried {\bf \mu} and \mathbf{\mu}.

1. Yo, this what you want? $\boldsymbol{\mu}$ – the command is \boldsymbol{}. They’re actually a whole other symbol set provided by some AMS thing I think. It’s annoying isn’t it!