NumPy & Multi-Dimension Arrays - NumPy Magic
Thomas J. Kennedy
1 Where are We?
In the previous lecture, we have just finished refactoring the Least Squares Approximation Solver.
This lecture will focus on a second refactoring pass through the solver code. The code generated in this lecture can be accessed on GitHub.
2 Back to main
I am not entirely satisfied with main
.
def main():
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]
# Compute X
x_values = [x for x, _ in points]
x_values = np.array(x_values)
x_values_squared = x_values ** 2
matrix_X = np.column_stack((np.ones(len(points)), x_values, x_values_squared))
# Compute Y
matrix_Y = np.array([y for _, y in points])
The first list (i.e., points
) represents our input data. We end up creating x_values
and y_values
based on these points. If this were part of a larger application… the points would come from an input file. Let us introduce a new function that takes the points and returns two lists.
def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
"""
Take a list of points and return a list of x values and a list of y values
"""
x_values = [x for x, _ in points]
x_values = np.array(x_values)
y_values = np.array([y for _, y in points])
return x_values, y_values
The logic for splitting the points into two np.array
objects should its own function.
2.1 Getting x_values and y_values
We can still improve this function. If we take…
x_values = [x for x, _ in points]
x_values = np.array(x_values)
the lines can be combined into one…
x_values = np.array([x for x, _ in points])
We can make the same tweak to the line for y_values
.
y_values = np.array([y for _, y in points])
The final function ends up as two lines (not including the return).
def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
"""
Take a list of points and return a list of x values and a list of y values
"""
x_values = np.array([x for x, _ in points])
y_values = np.array([y for _, y in points])
return x_values, y_values
But… we can make the process a little cleaner with hsplit
. Why not convert the points to an np.array
with…
np.array(points)
and the split the columns into two lists…
x_values, y_values = np.hsplit(np.array(points), 2)
Note that the 2
indicates a split into two (2) np.array
objects (one for each column).
The real final function is actually…
def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
"""
Take a list of points and return a list of x values and a list of y values
"""
x_values, y_values = np.hsplit(np.array(points), 2)
return x_values, y_values
2.2 Back to main
The output in main is a bit tricky. While .format
is useful, we need to center string literals for titles.
print("{:*^40}".format("XTX"))
I believe the string .center
member function is a more readable and self-documenting choice.
print("XTX".center(40, "*"))
Let us make that change to every heading and finalize main…
def main():
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]
x_values, y_values = get_x_and_y_values(points)
# Compute X and XT
x_values_squared = x_values ** 2
matrix_X = np.column_stack((np.ones(len(points)), x_values, x_values_squared))
matrix_XT = matrix_X.transpose()
# Compute Y
matrix_Y = y_values
# Compute XTX and XTY
matrix_XTX = np.matmul(matrix_XT, matrix_X)
matrix_XTY = np.matmul(matrix_XT, matrix_Y)
# Construct augmented XTX|XTY matrix
matrix_XTY = matrix_XTY.reshape(matrix_XTX.shape[0], 1)
matrix_augmented = np.hstack((matrix_XTX, matrix_XTY))
print("XTX".center(40, "*"))
print(matrix_XTX)
print()
print("XTY".center(40, "*"))
print(matrix_XTY)
print()
print("XTX|XTY".center(40, "*"))
print(matrix_augmented)
solution = solve_matrix(matrix_augmented)
print()
print("Solution".center(40, "*"))
print(np.polynomial.Polynomial(solution))
3 Looking at solve_matrix
The solve_matrix
itself is little more than a loop where other functions are invoked.
def solve_matrix(matrix_augmented: np.array) -> np.array:
"""
Solve a matrix and return the resulting solution vector
Args:
matrix_augmented: an n-by-n matrix with a vector augmented in the
right-most column
Returns:
constants c_0, c_1, c_2, ... c_n depending on the number of rows in the
supplied matrix
"""
# Get the number of rows in the matrix
num_rows, _ = matrix_augmented.shape
for current_row_idx in range(0, num_rows):
swap_current_row_with_largest_row(matrix_augmented, current_row_idx)
scale_row(matrix_augmented, current_row_idx)
eliminate(matrix_augmented, current_row_idx)
_backsolve(matrix_augmented)
return matrix_augmented[:, -1].flatten()
There is no refactoring left in this top-level function.
3.1 Swapping the Largest Row
The swap_current_row_with_largest_row
can be simplified with some NumPy magic. Let us take another look at the find largest row logic.
num_rows, _ = matrix.shape
# Find the row with the largest column entry
row_idx = current_idx
largest_idx = row_idx
current_col = current_idx
for j in range(row_idx + 1, num_rows):
if matrix[largest_idx, row_idx] < matrix[j, current_col]:
largest_idx = j
The first three lines are variable declarations and definitions. However, the loop itself can be replaced with a call to argmax
. This function (i.e., argmax
) will return the position (index) of the largest entry in an np.array
.
largest_idx = idx + np.argmax(matrix[idx:, :], axis=0)[idx]
Well… that code snippet is clear (as mud). Let us break the code into pieces:
-
axis=0
- find the maximum entry in each column. -
matrix_augmented[idx:, :]
- search every column in every row, starting at row “idx
”. -
‘[idx]’ - our interest lies with the row for the largest entry in column “
idx
”. -
idx + np.argmax(
- the search started at row “idx
” (which is treated as index 0). We need to shift back to the full set of rows.
That allows us to replace an entire loop with three lines.
# Find the row with the largest column entry
idx = current_idx
largest_idx = idx + np.argmax(matrix[idx:, :], axis=0)[idx]
# If the current row is not the largest row then swap
if largest_idx != current_idx:
matrix[[current_idx, largest_idx], :] = matrix[[largest_idx, current_idx], :]
The remainder of the function can remain unchanged.
4 Being Persnickety
The scale_row
function makes use of /=
.
scaling_factor = matrix[current_row_idx, current_row_idx]
matrix[current_row_idx, :] /= scaling_factor
I would prefer for the division operation to be a multiplication. This can be done with…
scaling_factor = matrix[current_row_idx, current_row_idx]
scaling_factor = np.reciprocal(scaling_factor)
or…
scaling_factor = np.reciprocal(matrix[current_row_idx, current_row_idx])
Let us make use of the latter option. The end result is…
def scale_row(matrix: np.array, current_row_idx: int) -> None:
"""
Scale every entry of the current row by the value of the corresponding
column (e.g., matrix[2,2])
"""
scaling_factor = np.reciprocal(matrix[current_row_idx, current_row_idx])
matrix[current_row_idx, :] *= scaling_factor
5 The End… For Real This Time
This is the end of the refactoring journey. The final code can be accessed on GitHub.