Class 2 (Monday 17 October)#

These tasks are designed to be worked on in the practical class on Monday 17 October.

Experimenting with Numba#

Let’s start by looking at the matvec code we wrote last week.

import numpy as np

def slow_matvec(matrix, vector):
    assert matrix.shape[1] == vector.shape[0]
    result = []
    for r in range(matrix.shape[0]):
        value = 0
        for c in range(matrix.shape[1]):
            value += matrix[r, c] * vector[c]
        result.append(value)
    return np.array(result)


# Example of using this function
matrix = np.random.rand(3, 3)
vector = np.random.rand(3)
print(slow_matvec(matrix, vector))
print(matrix @ vector)

Use numba.njit to tell Numba to just-in-time (JIT) compile this function.
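
If you've not used Numba before: you can either write @numba.njit above the function definition, or wrap the function after it has been defined. For example (a sketch; fast_matvec is just an illustrative name):

import numba

# Ask Numba to JIT compile slow_matvec. This is equivalent to writing
# @numba.njit above the definition of the function.
fast_matvec = numba.njit(slow_matvec)

print(fast_matvec(matrix, vector))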

Numba appears to give incorrect results for this function. This is because Numba interprets value = 0 as “make an integer variable that is equal to 0”, and will then not allow value to take non-integer values. To fix this, replace value = 0 with value = 0.0.
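
For example, a sketch of the corrected function with this fix applied:

import numba
import numpy as np

@numba.njit
def fast_matvec(matrix, vector):
    assert matrix.shape[1] == vector.shape[0]
    result = []
    for r in range(matrix.shape[0]):
        # 0.0 rather than 0, so that Numba types value as a float
        value = 0.0
        for c in range(matrix.shape[1]):
            value += matrix[r, c] * vector[c]
        result.append(value)
    return np.array(result)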

Using matplotlib, make a plot that shows the time this function takes to compute a matrix-vector product with and without Numba acceleration. Add timings for your faster_matvec function to the same plot. The first time you call your JIT-compiled function, it will need to do the compilation: you may want to measure the time the first run takes separately.
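
One possible shape for this experiment (a sketch: the sizes and the use of time.perf_counter are arbitrary choices, and fast_matvec is assumed to be your JIT-compiled function from above):

from time import perf_counter
import matplotlib.pyplot as plt
import numpy as np

sizes = [2 ** i for i in range(2, 10)]
slow_times = []
fast_times = []

# Call the JIT-compiled function once first, so that the compilation
# time is not included in the timings below
fast_matvec(np.random.rand(2, 2), np.random.rand(2))

for n in sizes:
    matrix = np.random.rand(n, n)
    vector = np.random.rand(n)

    start = perf_counter()
    slow_matvec(matrix, vector)
    slow_times.append(perf_counter() - start)

    start = perf_counter()
    fast_matvec(matrix, vector)
    fast_times.append(perf_counter() - start)

# Timings for faster_matvec can be added in the same way
plt.plot(sizes, slow_times, "o-", label="slow_matvec")
plt.plot(sizes, fast_times, "o-", label="slow_matvec + njit")
plt.xscale("log")
plt.yscale("log")
plt.xlabel("Matrix size (n)")
plt.ylabel("Time (s)")
plt.legend()
plt.show()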

jit vs njit#

Add another line to your plot to show the timings if you use numba.jit instead of numba.njit. Which is faster?

numba.njit will use “no Python mode”, while numba.jit uses “Python compatibility mode”. We would expect numba.njit to produce faster code, but numba.jit is able to compile a wider range of functions.
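
As a sketch of the difference (assuming matvec is the corrected, undecorated Python function):

import numba

# "Python compatibility mode": anything Numba cannot compile falls back
# to the Python interpreter
jit_matvec = numba.jit(matvec)

# "no Python mode": Numba must compile the whole function, and raises an
# error if it cannot
njit_matvec = numba.njit(matvec)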

Parallel range#

Replace any ranges in your function with numba.prange: this will make your function use a parallel for loop. Compare the timings of your function with and without parallel ranges. How big does your matrix need to be before parallelisation becomes worth using?
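
For example, a sketch of one way to do this. Two details are easy to miss: prange only runs in parallel if you also pass parallel=True to the decorator, and the result is preallocated here because appending to a list from inside a parallel loop would not be safe:

import numba
import numpy as np

@numba.njit(parallel=True)
def parallel_matvec(matrix, vector):
    assert matrix.shape[1] == vector.shape[0]
    result = np.zeros(matrix.shape[0])
    # The iterations of this loop can be shared between threads
    for r in numba.prange(matrix.shape[0]):
        value = 0.0
        for c in range(matrix.shape[1]):
            value += matrix[r, c] * vector[c]
        result[r] = value
    return result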

Optimising your code#

Take the fastest version of your function that you've obtained so far. Is there anything else you can do to make it faster? Try a few things and see if you can gain any more speed.
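
One thing worth trying is Numba's fastmath option (a sketch; fastmath=True relaxes strict IEEE 754 floating point rules, which can allow LLVM to vectorise the inner loop at the cost of slightly different rounding):

import numba
import numpy as np

@numba.njit(parallel=True, fastmath=True)
def fastest_matvec(matrix, vector):
    assert matrix.shape[1] == vector.shape[0]
    result = np.zeros(matrix.shape[0])
    for r in numba.prange(matrix.shape[0]):
        value = 0.0
        for c in range(matrix.shape[1]):
            value += matrix[r, c] * vector[c]
        result[r] = value
    return result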

Compare the time your function takes to the time NumPy takes to compute a matrix-vector product. How close to NumPy's speed can you get?
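
A sketch of this comparison (assuming fastest_matvec is your best version so far; remember to call it once before timing so that compilation time is excluded):

from time import perf_counter
import numpy as np

n = 1000
matrix = np.random.rand(n, n)
vector = np.random.rand(n)

# Trigger the JIT compilation before timing
fastest_matvec(matrix, vector)

start = perf_counter()
expected = matrix @ vector
print("NumPy:", perf_counter() - start)

start = perf_counter()
result = fastest_matvec(matrix, vector)
print("Numba:", perf_counter() - start)

# Check that the two results agree (fastmath may change the last few bits)
assert np.allclose(result, expected)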