James Hensman’s Weblog

January 26, 2009

Working with log likelihoods

Filed under: python — jameshensman @ 1:41 pm
Tags: ,

Sometimes, when you have a series of number representing the log-probability of something, you need to add up the probabilities. Perhaps to normalise them, or perhaps to weight them… whatever.  You end up writing (or Mike ends up writing):

logsumexp = lambda x: np.log(sum([np.exp(xi) for xi in x]))

Which is going to suck when them members of x are small. Small enough that the precision of the 64 bit float you holding them runs out, and they exponentiate to zero (somewhere near -700).  Your code is going to barf when it get to the np.log part, and finds it can’t log zero.

One solution is to add a constant to each member of x, so that you don’t work so close to the limits of the precision, and remove the constant later:

def logsumexp(x):
    x += 700
    x = np.sum(np.exp(x))
    return np.log(x) - 700

Okay, so my choice of 700 is a little arbitrary, but that (-700) is where the precision starts to run out, and it works for me. Of course, if your numbers are way smaller than that, you may have a problem.

Edit: grammar. And I’m getting used to this whole weblog shenanigan. Oh, and <code>blah</code> looks rubbish: I'm going to stop doing that.


Blog at WordPress.com.