# James Hensman’s Weblog

## June 14, 2010

### Multiple Matrix Multiplication in numpy

Filed under: Uncategorized — jameshensman @ 10:45 am

There are two ways to deal with matrices in numpy. The standard numpy array in it 2D form can do all kinds of matrixy stuff, like dot products, transposes, inverses, or factorisations, though the syntax can be a little clumsy. For those who just can’t let go of matlab, there’s a matrix object which prettifies the syntax somewhat. For example, to do $A = B^\top C^{-1}$

```import numpy as np

#using arrays
B = np.random.randn(2,2) # random array
C = np.random.randn(2,2)

A = np.dot(B.T, np.linalg.inv(C))

#using matrices
B = np.mat(B) #cast as matrix
C = np.mat(C)

A = B.T*C.I
```

Despite the nicer syntax in the matrix version, I prefer to use arrays. This is in part because they play nicer with the rest of numpy, but mostly because they behave better in higher dimensions. This is really useful when you have to deal with lots of matrices, which seems to occur all the time in my work:

```A = np.random.randn(100,2,2) # a hundred 2x2 arrays
A2 = [np.mat(np.random.randn(2,2)) for i in range(100)] # a hundered 2x2 matrices
```

## Transposing

Transposing the list of matrices is easy: just use the .T operator. This doesn’t work for the 100x2x2 array though, since it switches the axis in a way we don;t want. The solution is to manually specify which axes to switch.

```A2T = [a.T for a in A2] # matrix version
AT = np.transpose(A,(0,2,1)) #array version 1
#or
AT = np.rollaxis(A,-2,-1)# array version 2
```

## Multiplication

Suppose you have a series of matrices which you want to (right) multiply by another matrix. This is messy with the matrix object, where you need to do list comprehension, but nice as pie with the array object.

```#matrix version
A = [np.mat(np.random.randn(2,2)) for i in range(100)]
B = np.mat(np.random.randn(2,2))
AB = [a*B for a in A]

#array version
A = np.random.randn(100,2,2)
B = np.random.randn(2,2)
AB = np.dot(A,B)
```

Left-multiplication is a little harder, but possible using a transpose trick: $(BA)^\top = A^\top B^\top$

```#matrix version
BA = [Ba for a in A]

#array version
BA = np.transpose(np.dot(np.transpose(A,(0,2,1)),B.T),(0,2,1))
```

Okay, the syntax is getting ugly there, I’ll admit. Suppose now that you had two sets of matrices, and wanted the product of each element, as in $C_i = A_i B_i, \, i=1 \ldots 100$

```#matrix version
A = [np.mat(np.random.randn(2,2)) for i in range(100)]
B = [np.mat(np.random.randn(2,2)) for i in range(100)]
AB = [a*b for a,b in zip(A,B)]

#array version
A = np.random.randn(100,2,2)
B = np.random.randn(100,2,2)
np.sum(np.transpose(A,(0,2,1)).reshape(100,2,2,1)*B.reshape(100,2,1,2),-3)
```

I’ll admit that the syntax of this is very weird. To see how it works, consider taking the (matrix) product of two arrays in a similar manner:

```A = np.random.randn(2,2)
B = np.random.randn(2,2)
np.sum(A.T.reshape(2,2,1)*B.reshape(2,1,2),0)
```

The reshaping persuades numpy to broadcast the multiplication, which results in a 2x2x2 cube of numbers. Summing the numbers along the first dimension of the cube results in matrix multiplication. I have some scribbles which illustrate this, which I’ll post if anyone wants.

## Speed

The main motivation for using arrays in this manner is speed. I find for loops in python to be rather slow (including within list comps), so I prefer to use numpy array methods whenever possible. The following runs a quick test, multiplying 1000 3×3 matrices together. It’s a little crude, but it shows the numpy.array method to be 10 times faster than the list comp of np.matrix.

```import numpy as np
import timeit

#compare multiple matrix multiplication using list coms of matrices and deep arrays

#1) the matrix method
setup1 = """
import numpy as np
A = [np.mat(np.random.randn(3,3)) for i in range(1000)]
B = [np.mat(np.random.randn(3,3)) for i in range(1000)]
"""

test1 = """
AB = [a*b for a,b in zip(A,B)]
"""

timer1 = timeit.Timer(test1,setup1)
print timer1.timeit(100)

#2) the array method
setup2 = """
import numpy as np
A = np.random.randn(1000,3,3)
B = np.random.randn(1000,3,3)
"""

test2 = """
AB = np.sum(np.transpose(A,(0,2,1)).reshape(1000,3,3,1)*B.reshape(1000,3,1,3),0)
"""

timer2 = timeit.Timer(test2,setup2)
print timer2.timeit(100)
```

In the likely event that there’s a better way to do this, or I’ve screwed up some code somewhere, please leave me a comment 🙂

1. James, you are awesome! Just the solution I was looking for! I was using map function which is the same as list comp: i.e. a = (n,2,3) b =(n,3,2) then map(dot, a, b) produces the desired results but it is slow. Also, if numpy were to allow arrays of matrices: say A is an array of matrix(2,3) and B is array of matrix(3,2), then array multiplication forces each matrix of A to be multiplied by each matrix of B, but since A and B are matrices, their type would force matrix multiplication instead of array multiplication. Seems like a more cleaner way of implementing it.

Comment by Jacques — October 28, 2011 @ 2:39 am

• Thanks Jaques 🙂

I try to avoid matrix types wherever possible! An abomination of the numpy namespace imho.

I like the map trick — I don’t understand why it should be slow?

James.

Comment by jameshensman — October 28, 2011 @ 8:09 am

2. x = matrix([[1,2],[3,4]]) # create a matrix, OK
y = array(x,subok=True) # produces array of 1 matrix, OK
y = array([x,x],subok=True) # does not produce array of 2 matrices, each matrix is converted to an array, inconsistent with array docstring, NOT OK
y = array([x,x],dtype=matrix) # doesn’t work either 😦
y = array([1,x,x]) # produces array of objects, y[0] = int, y[1] = matrix, y[2] = matrix, it seems that objects must have different dimensions to retain their type, numpy is trying to be too smart NOT OK
y[1]*y[2] # indeed performs matrix multiply using * and not dot

Now lets extend our example to x = rand(10,3,4) and y = rand(10,4,2), we wish to use the notation z=x*y to produce z of shape (10,3,2); this is cleaner if you need to multiply lots of array of matrices of different dimensions together; other wise you would have map(dot, map(dot, a, b), c), etc… or write a new function with multiple arguments: map(newfunc, a, b, c, d, e, f, …)

The notation of x*y is much cleaner and allows for a*b*c*d*e*f; even better, if q = rand(10,3,3), you can do cool things like q**100, because it would be matrix multiply each member of q with itself, 100 times. Let’s see how one can accomplish this trick.

npts = 10; # number of matrices to be multiplied
xx = empty((npts,),dtype=object) # create a 1D array of empty object references
yy = xx.copy() # create another 1D array of empty object references

for n in range(npts): # xx and yy are now FINALLY 1D array of matrix references
xx[n]=mat(x[n])
yy[n]=mat(y[n])

zz = xx*yy # will work as advertised, no use of dot, or map

# you would need another for loop to obtain z in array form from zz since zz is (npts,) and z needs to be (npts,3,2)

This way you can chain multiply arrays of matrices and their inverses. Although a for loop is slow, but we are not assigning the actual matrices but rather their references to xx and yy; hence the loop will be very fast, i.e. it’s only npts long. If we don’t like loops we can use map with lambda function to accomplish the same thing. You can even put the whole thing inside of a function, all the function does is take a 3D number array and produces a 1D array of matrix references. if x[n] is square, xx**(-1) will be the inverse.

numpy has a function called vectorize(), it’s like map but with broadcasting. But you would still need to use dot(dot(a,b),c) which is messy, but you don’t need a multidimensional eye(). If you want to do things like a*b*c*d**(-1)*f*g*q, the trick above would give you readable and speedy execution.

You are correct, map and vectorize are not slow, but their notation is hard to read for long chains of vector matrix multiplies.

I have tried to use vectors of matrices of different dimensions (i.e none-square) based on your method, but the results depend on which axis one finally sums at the end. I did not investigate it further.

Comment by Jacques — November 2, 2011 @ 3:36 am

3. Hi Jaques,

Excellent work! Note that you can also do:
y = array([x,x],dtype=object)

Also of note, an implementation of the dot function which takes multiple arguments:
http://www.scipy.org/Cookbook/MultiDot

Using the map function, on lists of arrays you could then do

map(mdot, xx, yy, zz)

James.

Comment by jameshensman — November 2, 2011 @ 10:39 am

• Thanks James! appreciate the info.

Comment by Jacques — November 2, 2011 @ 1:39 pm

4. Nice. Thanks for your post. Remark: You may want to replace the “.reshape(100,2,2,1)” with “[:,:,:,np.newaxis]” and do something similar for B.

Comment by Oren — January 16, 2012 @ 2:26 am

• Hi Oren,

Yep, I hadn’t discivered the joy of newaxis when I wrote this. Now my code is full of it! actually, newaxis seems to be simply a null pointer, so my code often uses A[:,:,None,:] or similar. Not sure if this will break in future numpy versions though.

Comment by jameshensman — January 16, 2012 @ 11:15 am

5. Hey, great post, thanks a lot! I linked it on Stack Overflow where a question was asked that you answered here 🙂 As I mentioned there, however, in some cases using np.einsum is faster (it was in one case that I tested, using a timing setup similar to yours, but with 10000x5x5 “matrix-array’s”). Its syntax is also cleaner and clearer (if you’re used to tensor/index notation).

I believe there’s a tiny error in your last example, by the way: shouldn’t the last zero in line 28 be a minus three?

Comment by Patrick Bos — February 21, 2012 @ 8:30 am

• Here’s the Stack Overflow link, if you’re interested: http://stackoverflow.com/questions/9284421/fast-way-to-invert-or-dot-kxnxn-matrix/9367940#9367940

The main question was actually how to do an inverse in this way; any ideas?

Comment by Patrick Bos — February 21, 2012 @ 8:32 am

I too would love to be able to do matrix inverses on slices of an array. I have a feeling that it shouldn’t be too hard to do this because of the was data is stored in numpy. Each slice should be contiguous, or, depending on the array ordering, there should be the same ‘skip’ between every row of the slice. In theory then it’s possible to point the lapack routines at the slices independently, since they can be called with a rowskip parameter, but this isn’t available through numpy/scipy. It might be possible with a little fortran and f2py though.

Of course, even with a fortran wrapper, you’d be simply doing all the inverses in a loop. To get a _real_ parallel implementation, you’d have to start hacking around in lapack to get the inverse operatinos to happen in parallel. Thus, I made myself a little wrapper which just does the inverses in a list comprehension.

Comment by jameshensman — February 22, 2012 @ 9:14 am

6. […] The main motivation for using arrays in this manner is speed. I find for loops in python to be rather slow (including within list comps), so I prefer to use numpy array methods whenever possible. The following runs a quick test, multiplying 1000 3×3 matrices together. Multiple Matrix Multiplication in numpy « James Hensman’s Weblog […]

Pingback by Python Quick Hacks and Codes | Pearltrees — March 7, 2012 @ 11:45 pm

7. James, This post still comes up high when searching with google for efficient ways to multiply a list of matrices with another list of matrices (as in your ‘the syntax is very weird’ example) so it would be worth updating it with einsum recipes. I actually wanted to do normal matrix multiplation on each element ie np.dot(A, B)
I added to your timer script resulting in #1 506ms #2 67ms #3 1.7ms (using einsum)

import numpy as np
import timeit

#compare multiple matrix multiplication using list coms of matrices and deep arrays

#1) the matrix method
setup1 = “””
import numpy as np
A = [np.mat(np.random.randn(3,3)) for i in range(1000)]
B = [np.mat(np.random.randn(3,3)) for i in range(1000)]
“””

test1 = “””
AB = [a*b for a,b in zip(A,B)]
“””

timer1 = timeit.Timer(test1,setup1)
print timer1.timeit(100)

#2) the array method
setup2 = “””
import numpy as np
A = np.random.randn(1000,3,3)
B = np.random.randn(1000,3,3)
“””

test2 = “””
AB = np.sum(np.transpose(A,(0,2,1)).reshape(1000,3,3,1)*B.reshape(1000,3,1,3),-3)
“””

timer2 = timeit.Timer(test2,setup2)
print timer2.timeit(100)

#3) the einsum method use the same arrays as setup2
test3 = “””
AB = np.einsum(‘…ij,…ij->…ij’, A, B)
“””

timer3 = timeit.Timer(test3,setup2)
print timer3.timeit(100)

“””
to do ‘normal’ matrix multiplication the equivalent of np.dot(A[i], B[i]) for i in range(1000)

AB = np.einsum(‘…ij,…jk->…ik’, A, B)
“””

Comment by paddyg — January 14, 2014 @ 7:55 pm

8. This is what I was CRAVING for!!!! Thanks so much!

Comment by Diego Alonso Cortez — March 2, 2016 @ 9:16 pm

Create a free website or blog at WordPress.com.