While fooling around with the shapes may make it much clearer what you are trying to accomplish, the easiest way to handle this type of problem without thinking too much is with np.einsum:
In [5]: np.einsum('ij, jkl', M, a)
Out[5]:
array([[[  0,   1,   2,   3,   4],
        [  5,   6,   7,   8,   9],
        [ 10,  11,  12,  13,  14],
        [ 15,  16,  17,  18,  19]],
       [[  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0]],
       [[-40, -41, -42, -43, -44],
        [-45, -46, -47, -48, -49],
        [-50, -51, -52, -53, -54],
        [-55, -56, -57, -58, -59]]])
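To make the subscripts less magical, here is a small self-contained check (my own sketch, not part of the session above). It assumes the same `M` and `a` as in the question, and the names `implicit`, `explicit`, `via_tensordot` and `via_dot` are just illustrative. It shows that `'ij, jkl'` sums over the shared index `j` and keeps `i`, `k`, `l`, which is the same contraction you would get from `np.tensordot` or from `np.dot` with swapped axes:

```python
import numpy as np

a = np.arange(60).reshape((3, 4, 5))
M = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]])

# Implicit mode: the repeated index j is summed over; the remaining
# indices (i, k, l) form the output in alphabetical order.
implicit = np.einsum('ij, jkl', M, a)

# The same contraction with the output indices spelled out.
explicit = np.einsum('ij,jkl->ikl', M, a)

# Equivalent formulations without einsum: contract axis 1 of M
# against axis 0 of a.
via_tensordot = np.tensordot(M, a, axes=([1], [0]))
via_dot = np.dot(M, a.swapaxes(0, 1))

assert np.array_equal(implicit, explicit)
assert np.array_equal(implicit, via_tensordot)
assert np.array_equal(implicit, via_dot)
print(implicit.shape)
```

Writing the `->ikl` part explicitly is a matter of taste, but it documents the output layout and avoids surprises if the index letters are ever renamed.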
Plus it often comes with a performance bonus:
In [17]: a = np.random.randint(256, size=(3, 1000, 2000))
In [18]: %timeit np.dot(M, a.swapaxes(0,1))
10 loops, best of 3: 116 ms per loop
In [19]: %timeit np.einsum('ij, jkl', M, a)
10 loops, best of 3: 60.7 ms per loop
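If you want to repeat the comparison outside IPython, something along these lines should work. This is only a sketch: the array is smaller than the one timed above so that it runs quickly, the helper names `with_dot` and `with_einsum` are made up for the example, and the actual numbers will depend on your NumPy version, dtype and machine, so do not expect the same ratio.

```python
import timeit

import numpy as np

M = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]])
# Smaller than the (3, 1000, 2000) array above, to keep the run short.
a = np.random.randint(256, size=(3, 200, 300))


def with_dot():
    return np.dot(M, a.swapaxes(0, 1))


def with_einsum():
    return np.einsum('ij, jkl', M, a)


# Both approaches must agree before comparing their speed.
assert np.array_equal(with_dot(), with_einsum())

# Best of 3 repeats, 10 calls each, reported per call.
t_dot = min(timeit.repeat(with_dot, number=10, repeat=3)) / 10
t_einsum = min(timeit.repeat(with_einsum, number=10, repeat=3)) / 10
print('dot:    %.3f ms per call' % (t_dot * 1e3))
print('einsum: %.3f ms per call' % (t_einsum * 1e3))
```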
EDIT: einsum is very powerful voodoo. You can also do what the OP asks in the comment below, as follows:
>>> a = np.arange(60).reshape((3,4,5))
>>> M = np.array([[1,0,0], [0,0,0], [0,0,-1]])
>>> M = M.reshape((3,3,1,1)).repeat(4,axis=2).repeat(5,axis=3)
>>> np.einsum('ijkl,jkl->ikl', M, a)
array([[[  0,   1,   2,   3,   4],
        [  5,   6,   7,   8,   9],
        [ 10,  11,  12,  13,  14],
        [ 15,  16,  17,  18,  19]],
       [[  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0]],
       [[-40, -41, -42, -43, -44],
        [-45, -46, -47, -48, -49],
        [-50, -51, -52, -53, -54],
        [-55, -56, -57, -58, -59]]])
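For what it's worth, here is a sketch that checks what `'ijkl,jkl->ikl'` computes against a plain double loop: at every position `(k, l)` the 3x3 matrix stored there is applied to the length-3 vector `a[:, k, l]`. The names `M_full`, `expected` and `M_view` are mine. The `np.broadcast_to` variant at the end is an addition of my own, not something from the original answer; it only helps in the special case where every position shares the same matrix, because it avoids the copies made by `repeat`.

```python
import numpy as np

a = np.arange(60).reshape((3, 4, 5))
M = np.array([[1, 0, 0], [0, 0, 0], [0, 0, -1]])

# One 3x3 matrix per (k, l) position: shape (3, 3, 4, 5).
M_full = M.reshape((3, 3, 1, 1)).repeat(4, axis=2).repeat(5, axis=3)
result = np.einsum('ijkl,jkl->ikl', M_full, a)

# Reference implementation: apply the matrix stored at (k, l) to the
# vector a[:, k, l], one position at a time.
expected = np.empty_like(result)
for k in range(4):
    for l in range(5):
        expected[:, k, l] = M_full[:, :, k, l].dot(a[:, k, l])

assert np.array_equal(result, expected)

# When the matrix is identical everywhere, a read-only broadcast view
# has the same shape as M_full without copying the data.
M_view = np.broadcast_to(M[:, :, None, None], (3, 3, 4, 5))
assert np.array_equal(np.einsum('ijkl,jkl->ikl', M_view, a), result)
```

If the matrices really do differ from position to position, build the `(3, 3, 4, 5)` array directly and pass it to the same `einsum` call.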