James Hensman’s Weblog

January 29, 2009

python.copy()

Filed under: python — jameshensman @ 10:48 am

Assignment in python is exactly that. Assignment, not copying. For those of us who have switched to the language of the gods from some inferior mortal language (Matlab), this can lead to some frustration.

For example, within my Variational Factor Analysis (VBFA) class, I need to keep a record of something I’m calling b_phi_hat. One of the methods in the class involves updating this little vector, which depends on its initial (prior) value, b_phi. Like this:

import numpy as np
class VBFA:
    def __init__(self):
        self.b_phi = np.mat(np.zeros((5,1)))
        #blah blah...

    def update_phi(self):
        self.b_phi_hat = self.b_phi
        for i in range(5):
            self.b_phi_hat[i] = self.something()

update_phi() gets called hundreds of times when the class is used. Spot the problem? It’s on line 8, where b_phi_hat is assigned to b_phi. Assignment doesn’t copy anything: both names refer to the same object, so when the loop runs on the next two lines it modifies the original, not a copy of it. After the first call, line 8 no longer ‘refreshes’ b_phi_hat from the prior; it just keeps whatever value is already there.
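You can see the same thing in a standalone sketch (a plain array standing in for b_phi, outside the class):

import numpy as np

b_phi = np.zeros((5, 1))      # stand-in for the prior
b_phi_hat = b_phi             # assignment: both names now refer to the same array

print(b_phi_hat is b_phi)     # True -- no copy was made
b_phi_hat[0] = 3.14           # "updating" b_phi_hat...
print(b_phi[0])               # ...has silently changed the prior as well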

What I should have written is:

import numpy as np
class VBFA:
    def __init__(self):
        self.b_phi = np.mat(np.zeros((5,1)))
        #blah blah...
    
    def update_phi(self):
        self.b_phi_hat = self.b_phi.copy()  # an explicit copy: b_phi itself is left alone
        for i in range(5):
            self.b_phi_hat[i] = self.something()

which explicitly makes a copy of the original on line 8, so b_phi_hat is refreshed from the prior on every call and b_phi itself is never touched.
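The same quick check as before confirms the copy is independent (again a minimal sketch with a plain array in place of b_phi):

import numpy as np

b_phi = np.zeros((5, 1))
b_phi_hat = b_phi.copy()      # an independent copy of the prior

print(b_phi_hat is b_phi)     # False -- two separate arrays now
b_phi_hat[0] = 3.14
print(b_phi[0])               # still 0.0: the prior is untouched

(.copy() on a numpy array copies the data; for nested Python objects you’d reach for copy.deepcopy instead.)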


January 26, 2009

Working with log likelihoods

Filed under: python — jameshensman @ 1:41 pm

Sometimes, when you have a series of numbers representing the log-probability of something, you need to add up the probabilities. Perhaps to normalise them, or perhaps to weight them… whatever. You end up writing (or Mike ends up writing):

logsumexp = lambda x: np.log(sum([np.exp(xi) for xi in x]))

Which is going to suck when the members of x are small. Small enough that the precision of the 64-bit float you’re holding them in runs out, and they exponentiate to zero (somewhere near -700). Your code is going to barf when it gets to the np.log part and finds it can’t take the log of zero.
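To see the failure in isolation (a small sketch, with made-up log-likelihoods):

import numpy as np

logsumexp = lambda x: np.log(sum([np.exp(xi) for xi in x]))

x = [-800.0, -801.0, -802.0]    # plausible log-likelihoods, invented for illustration
print(np.exp(-800.0))           # 0.0 -- underflow, the detail is gone
print(logsumexp(x))             # -inf, with a "divide by zero encountered in log" warning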

One solution is to add a constant to each member of x, so that you don’t work so close to the limits of the precision, and remove the constant later:

import numpy as np

def logsumexp(x):
    x = np.asarray(x, dtype=float) + 700  # shift away from the underflow limit (without modifying the caller's array)
    x = np.sum(np.exp(x))
    return np.log(x) - 700                # undo the shift

Okay, so my choice of 700 is a little arbitrary, but that (-700) is where the precision starts to run out, and it works for me. Of course, if your numbers are way smaller than that, you may have a problem.
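A common variant (not what I’ve done above, but worth knowing) avoids picking a constant at all by shifting by the largest element of x: the biggest term becomes exp(0) = 1, and anything that still underflows was negligible anyway. A sketch:

import numpy as np

def logsumexp_maxshift(x):
    # shift by the maximum so the largest term exponentiates to exactly 1
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log(np.sum(np.exp(x - m)))

print(logsumexp_maxshift([-800.0, -801.0, -802.0]))  # approx -799.59, no underflow trouble

(Modern SciPy ships this as scipy.special.logsumexp.)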

Edit: grammar. And I’m getting used to this whole weblog shenanigan. Oh, and <code>blah</code> looks rubbish: I'm going to stop doing that.
