gpp_linear_algebra-inl¶

Contents:

gpp_linear_algebra-inl.hpp

gpp_linear_algebra-inl.hpp¶

The “inline header” for gpp_linear_algebra. This file contains inline function definitions and template definitions, particularly for functions/templates that contain more complex logic making them too lengthy/cumbersome for gpp_linear_algebra.hpp.

namespace optimal_learning

Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.

Functions
OL_NONNULL_POINTERS void OuterProduct(int size_m, int size_n, double alpha, double const *restrict vector_v, double const *restrict vector_u, double *restrict outer_prod)

Computes A = alpha*v*u^T + A, where v * u^T is the outer product of v with u, with v \in R^m and u \in R^n (aka A_{ij} += v_i * u_j) If v == u, then this update is symmetric and semi-definite (positive or negative depends on sign of alpha).

Mathematically, the update to A is rank 1. Numerically, it will almost never be.

Parameters:

size_m: length of v

size_n: length of u

alpha: scaling factor

vector_v[size_m]:

the vector v

vector_u[size_n]:

the vector u

Outputs:

outerprod[size_m][size_n]:

A such that A_{ij} = v_i * v_j

OL_NONNULL_POINTERS OL_WARN_UNUSED_RESULT double MatrixTrace(double const *restrict A, int size_m)

Computes the trace of a matrix, tr(A) = \sum_{i = 1}^n A_ii

Parameters:

A[size_m][size_m]:

matrix to take the trace of

size_m: dimension of A

Returns:

the trace of A

OL_NONNULL_POINTERS OL_WARN_UNUSED_RESULT double TraceOfGeneralMatrixMatrixMultiply(double const *restrict A, double const *restrict B, int size)
Computes tr(A*B) without explicitly forming A*B

Naively, computing tr(A*B) would involve: C_{i,k} = A_{i,j}*B_{j,k}, trace = C_{i,i} Spelled out:
for i = 1:n
  for j = 1:n
    for k = 1:n
     C_{i,k} += A_{i,j} * B_{j,k}
Then, trace = \sum_{i = 1}^n C_{i,i} This has cost O(n^3), dominated by the matrix-matrix product.

Notice that since we only need the diagonal of C to compute the trace, there is no need to form C_{1,10}, for example. Thus the above loop can be reorganized: trace = A_{i,j} * B_{j,i}

For the pseudocode, we take the original code and set k = i:
for i = 1:n
  for j = 1:n
      trace += A_{i,j} * B_{j,i}
Which is O(n^2).

Finally, for matrix-products A*B, when A is stored column-major, we use the view of matrix-vector multiply, Ax, as a weighted sum of the columns of A. This improves memory access patterns. But here, such an arrangement is impossible and we have to access A in a “bad” ordering.

A, B do not need to be square (as long as their product, A*B, is square), but we make this assumption for convenience.

Parameters:

A[size][size]: left matrix

B[size][size]: right matrix

size: dimension of matrices

Returns:

tr(A*B)