LSA Assignment 1 - Matrix-matrix multiplication
The deadline for submitting this assignment is Midnight Friday 30 August 2024.
The easiest ways to create a PDF file of your work are:
Write your code and answers in a Jupyter notebook, then select File -> Download as -> PDF via LaTeX (.pdf).
Write your code and answers on Google Colab, then select File -> Print, and print it as a PDF.
Tasks you are required to carry out and questions you are required to answer are shown in bold below.
The assignment
In this assignment, we will look at computing the product \(AB\) of two matrices \(A,B\in\mathbb{R}^{n\times n}\). The following snippet of code defines a function that computes the product of two matrices. As an example, the product of two 10 by 10 matrices is printed. The final line prints matrix1 @ matrix2; the @ symbol denotes matrix multiplication, and Python will get Numpy to compute the product of two matrices. By looking at the output, it’s possible to check that the two results are the same.
import numpy as np


def slow_matrix_product(mat1, mat2):
    """Multiply two matrices."""
    assert mat1.shape[1] == mat2.shape[0]
    result = []
    for c in range(mat2.shape[1]):
        column = []
        for r in range(mat1.shape[0]):
            value = 0
            for i in range(mat1.shape[1]):
                value += mat1[r, i] * mat2[i, c]
            column.append(value)
        result.append(column)
    return np.array(result).transpose()


matrix1 = np.random.rand(10, 10)
matrix2 = np.random.rand(10, 10)
print(slow_matrix_product(matrix1, matrix2))
print(matrix1 @ matrix2)
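For reference, the value accumulated in the innermost loop of the snippet is the usual entrywise definition of the matrix product. The index names \(r\), \(c\) and \(i\) follow the loop variables in the code (which count from 0 rather than 1); the entry notation \(a_{ri}\), \(b_{ic}\) is introduced here for illustration only.

\[
(AB)_{rc} = \sum_{i=1}^{n} a_{ri}\, b_{ic}, \qquad r, c = 1, \dots, n.
\]

Each of the \(n^2\) entries needs \(n\) multiplications, so the three nested loops perform \(n^3\) multiplications in total.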
The function in this snippet isn’t very good.
Part 1: a better function
Write your own function called faster_matrix_product that computes the product of two matrices more efficiently than slow_matrix_product.
Your function may use functions from Numpy (e.g. np.dot) to complete part of its calculation, but your function should not use np.dot or @ to compute the full matrix-matrix product.
Before you look at the performance of your function, you should check that it is computing the correct results. Write a Python script using an assert statement that checks that your function gives the same result as using @ for random 2 by 2, 3 by 3, 4 by 4, and 5 by 5 matrices.
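One way such a check could be structured is sketched below. It assumes that agreement up to floating-point rounding (np.allclose) is an acceptable meaning of "the same result", since exact equality with == can fail when the additions are done in a different order. The names stand_in_product and check_against_numpy are hypothetical, and stand_in_product is only a placeholder for the function under test, not a suggested answer.

```python
import numpy as np


def stand_in_product(mat1, mat2):
    """Placeholder for the function under test (a plain triple loop)."""
    result = np.zeros((mat1.shape[0], mat2.shape[1]))
    for r in range(mat1.shape[0]):
        for c in range(mat2.shape[1]):
            for i in range(mat1.shape[1]):
                result[r, c] += mat1[r, i] * mat2[i, c]
    return result


def check_against_numpy(product, sizes=(2, 3, 4, 5)):
    """Assert that product(a, b) agrees with a @ b for random square matrices."""
    for n in sizes:
        a = np.random.rand(n, n)
        b = np.random.rand(n, n)
        # Compare up to rounding error rather than with exact equality.
        assert np.allclose(product(a, b), a @ b), f"mismatch for {n} by {n}"
    return True


check_against_numpy(stand_in_product)
```

Passing your own function in place of stand_in_product runs the same four checks against it.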
In a text box, give two brief reasons (1-2 sentences for each) why your function is better than slow_matrix_product. At least one of your reasons should be related to the time you expect the two functions to take.
Next, we want to compare the speed of slow_matrix_product and faster_matrix_product. Write a Python script that runs the two functions for matrices of a range of sizes, and use matplotlib to create a plot showing the time taken for different sized matrices for both functions. You should be able to run the functions for matrices of size up to around 1000 by 1000 (but if you’re using an older/slower computer, you may decide to decrease the maximum size slightly). You do not need to run your functions for every size between your minimum and maximum, but should choose a set of 10-15 values that will give you an informative plot.
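A timing script of this kind might be organised as in the sketch below. The helper names time_product and plot_timings are hypothetical, time.perf_counter is one reasonable choice of timer among several, and the log-log axes are an assumption about what makes the plot informative. Numpy's @ is timed here purely as a stand-in so that the example is self-contained, and the sizes are kept small for illustration.

```python
from time import perf_counter

import numpy as np


def time_product(product, n, repeats=1):
    """Return the best wall-clock time (seconds) to multiply two n by n matrices."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = perf_counter()
        product(a, b)
        best = min(best, perf_counter() - start)
    return best


def plot_timings(sizes, timings):
    """Plot a {label: [times]} dictionary against sizes on log-log axes."""
    import matplotlib.pyplot as plt  # imported here so timing works without it

    for label, times in timings.items():
        plt.plot(sizes, times, "o-", label=label)
    plt.xscale("log")
    plt.yscale("log")
    plt.xlabel("Matrix size n")
    plt.ylabel("Time (s)")
    plt.legend()
    plt.show()


# Roughly logarithmically spaced sizes; a small range is used for illustration.
sizes = [2, 4, 8, 16, 32, 64]
timings = {"Numpy (@)": [time_product(lambda a, b: a @ b, n) for n in sizes]}
```

Adding one dictionary entry per function to compare and then calling plot_timings(sizes, timings) draws all the lines on one set of axes.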
Part 2: speeding it up with Numba
In the second part of this assignment, you’re going to use Numba to speed up your function.
Create a copy of your function faster_matrix_product that is just-in-time (JIT) compiled using Numba. To demonstrate the speed improvement achieved by using Numba, make a plot (similar to the one you made in the first part) that shows the times taken to multiply matrices using faster_matrix_product, faster_matrix_product with Numba JIT compilation, and Numpy (@). Numpy’s matrix-matrix multiplication is highly optimised, so you should not expect to be as fast as it.
You may be able to achieve further speed up of your function by adjusting the memory layout used. The function np.asfortranarray will make a copy of an array that uses Fortran-style ordering, for example:
import numpy as np
a = np.random.rand(10, 10)
fortran_a = np.asfortranarray(a)
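To see what the conversion changes, the flags and strides of the two arrays can be inspected, as in this short sketch. It assumes the default float64 entries (8 bytes each) produced by np.random.rand.

```python
import numpy as np

a = np.random.rand(10, 10)        # C-style (row-major) by default
fortran_a = np.asfortranarray(a)  # same values, column-major layout

print(a.flags["C_CONTIGUOUS"], a.flags["F_CONTIGUOUS"])
print(fortran_a.flags["C_CONTIGUOUS"], fortran_a.flags["F_CONTIGUOUS"])

# Strides are the byte steps needed to move one index along each axis.
print(a.strides)          # (80, 8): entries in a row are adjacent in memory
print(fortran_a.strides)  # (8, 80): entries in a column are adjacent in memory
```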
Make a plot that compares the times taken by your JIT compiled function when the inputs have different combinations of C-style and Fortran-style ordering (i.e. the plot should have lines for when both inputs are C-style, when the first is C-style and the second is Fortran-style, and so on). Focusing on the fact that it is more efficient to access memory that is close to previous accesses, comment (in 1-2 sentences) on why one of these orderings appears to be faster than the others. (Numba can do a lot of different things when compiling code, so depending on your function there may or may not be a large difference: if there is little change in speeds for your function, you can comment on which ordering you might expect to be faster and why, but conclude that Numba is doing something more advanced.)
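The four input combinations could be generated as in the sketch below, which uses itertools.product over the two layouts. The dictionary layouts and the list cases are hypothetical names, and Numpy's @ appears only to confirm that every pair holds the same values; in the comparison itself each pair would be passed to the JIT compiled function and timed.

```python
from itertools import product

import numpy as np

layouts = {"C": np.ascontiguousarray, "F": np.asfortranarray}

n = 100
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# One (label, first input, second input) entry per combination: C/C, C/F, F/C, F/F.
cases = [
    (f"{first}-style, {second}-style", layouts[first](a), layouts[second](b))
    for first, second in product(layouts, repeat=2)
]

for label, first_input, second_input in cases:
    # Each pair holds the same values, so the products should agree.
    assert np.allclose(first_input @ second_input, a @ b)
    print(label, first_input.strides, second_input.strides)
```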