NumPy & Multi-Dimension Arrays - NumPy Magic

Thomas J. Kennedy

Contents:

1 Where are We?

In the previous lecture, we have just finished refactoring the Least Squares Approximation Solver.

This lecture will focus on a second refactoring pass through the solver code. The code generated in this lecture can be accessed on GitHub.

2 Back to main

I am not entirely satisfied with main.

def main():
    points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]

    # Compute X
    x_values = [x for x, _ in points]
    x_values = np.array(x_values)
    x_values_squared = x_values ** 2
    matrix_X = np.column_stack((np.ones(len(points)), x_values, x_values_squared))

    # Compute Y
    matrix_Y = np.array([y for _, y in points])

The first list (i.e., points) represents our input data. We end up creating x_values and y_values based on these points. If this were part of a larger application… the points would come from an input file. Let us introduce a new function that takes the points and returns two lists.

def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
    """
    Take a list of points and return a list of x values and a list of y values
    """

    x_values = [x for x, _ in points]
    x_values = np.array(x_values)

    y_values = np.array([y for _, y in points])

    return x_values, y_values

The logic for splitting the points into two np.array objects should its own function.

2.1 Getting x_values and y_values

We can still improve this function. If we take…

    x_values = [x for x, _ in points]
    x_values = np.array(x_values)

the lines can be combined into one…

    x_values = np.array([x for x, _ in points])

We can make the same tweak to the line for y_values.

    y_values = np.array([y for _, y in points])

The final function ends up as two lines (not including the return).

def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
    """
    Take a list of points and return a list of x values and a list of y values
    """

    x_values = np.array([x for x, _ in points])
    y_values = np.array([y for _, y in points])

    return x_values, y_values

But… we can make the process a little cleaner with hsplit. Why not convert the points to an np.array with…

np.array(points)

and the split the columns into two lists…

    x_values, y_values = np.hsplit(np.array(points), 2)

Note that the 2 indicates a split into two (2) np.array objects (one for each column).

The real final function is actually…

def get_x_and_y_values(points: list[tuple[float, float]]) -> tuple[np.array, np.array]:
    """
    Take a list of points and return a list of x values and a list of y values
    """

    x_values, y_values = np.hsplit(np.array(points), 2)

    return x_values, y_values

2.2 Back to main

The output in main is a bit tricky. While .format is useful, we need to center string literals for titles.

    print("{:*^40}".format("XTX"))

I believe the string .center member function is a more readable and self-documenting choice.

    print("XTX".center(40, "*"))

Let us make that change to every heading and finalize main…

def main():
    points = [(0.0, 0.0), (1.0, 1.0), (2.0, 4.0)]

    x_values, y_values = get_x_and_y_values(points)

    # Compute X and XT
    x_values_squared = x_values ** 2
    matrix_X = np.column_stack((np.ones(len(points)), x_values, x_values_squared))

    matrix_XT = matrix_X.transpose()

    # Compute Y
    matrix_Y = y_values

    # Compute XTX and XTY
    matrix_XTX = np.matmul(matrix_XT, matrix_X)
    matrix_XTY = np.matmul(matrix_XT, matrix_Y)

    # Construct augmented XTX|XTY matrix 
    matrix_XTY = matrix_XTY.reshape(matrix_XTX.shape[0], 1)
    matrix_augmented = np.hstack((matrix_XTX, matrix_XTY))

    print("XTX".center(40, "*"))
    print(matrix_XTX)

    print()
    print("XTY".center(40, "*"))
    print(matrix_XTY)

    print()
    print("XTX|XTY".center(40, "*"))
    print(matrix_augmented)

    solution = solve_matrix(matrix_augmented)

    print()
    print("Solution".center(40, "*"))
    print(np.polynomial.Polynomial(solution))

3 Looking at solve_matrix

The solve_matrix itself is little more than a loop where other functions are invoked.

def solve_matrix(matrix_augmented: np.array) -> np.array:
    """
    Solve a matrix and return the resulting solution vector

    Args:
        matrix_augmented: an n-by-n matrix with a vector augmented in the
        right-most column

    Returns:
        constants c_0, c_1, c_2, ... c_n depending on the number of rows in the
        supplied matrix
    """

    # Get the number of rows in the matrix
    num_rows, _ = matrix_augmented.shape

    for current_row_idx in range(0, num_rows):
        swap_current_row_with_largest_row(matrix_augmented, current_row_idx)
        scale_row(matrix_augmented, current_row_idx)
        eliminate(matrix_augmented, current_row_idx)

    _backsolve(matrix_augmented)

    return matrix_augmented[:, -1].flatten()

There is no refactoring left in this top-level function.

3.1 Swapping the Largest Row

The swap_current_row_with_largest_row can be simplified with some NumPy magic. Let us take another look at the find largest row logic.

    num_rows, _ = matrix.shape

    # Find the row with the largest column entry
    row_idx = current_idx
    largest_idx = row_idx
    current_col = current_idx
    for j in range(row_idx + 1, num_rows):
        if matrix[largest_idx, row_idx] < matrix[j, current_col]:
            largest_idx = j

The first three lines are variable declarations and definitions. However, the loop itself can be replaced with a call to argmax. This function (i.e., argmax) will return the position (index) of the largest entry in an np.array.

        largest_idx = idx + np.argmax(matrix[idx:, :], axis=0)[idx]

Well… that code snippet is clear (as mud). Let us break the code into pieces:

That allows us to replace an entire loop with three lines.

    # Find the row with the largest column entry
    idx = current_idx
    largest_idx = idx + np.argmax(matrix[idx:, :], axis=0)[idx]

    # If the current row is not the largest row then swap
    if largest_idx != current_idx:
        matrix[[current_idx, largest_idx], :] = matrix[[largest_idx, current_idx], :]

The remainder of the function can remain unchanged.

4 Being Persnickety

The scale_row function makes use of /=.

    scaling_factor = matrix[current_row_idx, current_row_idx]
    matrix[current_row_idx, :] /= scaling_factor

I would prefer for the division operation to be a multiplication. This can be done with…

    scaling_factor = matrix[current_row_idx, current_row_idx]
    scaling_factor = np.reciprocal(scaling_factor)

or…

    scaling_factor = np.reciprocal(matrix[current_row_idx, current_row_idx])

Let us make use of the latter option. The end result is…

def scale_row(matrix: np.array, current_row_idx: int) -> None:
    """
    Scale every entry of the current row by the value of the corresponding
    column (e.g., matrix[2,2])
    """

    scaling_factor = np.reciprocal(matrix[current_row_idx, current_row_idx])
    matrix[current_row_idx, :] *= scaling_factor

5 The End… For Real This Time

This is the end of the refactoring journey. The final code can be accessed on GitHub.