gpp_linear_algebra-inl

Contents:

gpp_linear_algebra-inl.hpp

The “inline header” for gpp_linear_algebra. This file contains inline function definitions and template definitions, particularly for functions/templates that contain more complex logic making them too lengthy/cumbersome for gpp_linear_algebra.hpp.

namespace optimal_learning

Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.

Functions

OL_NONNULL_POINTERS void OuterProduct(int size_m, int size_n, double alpha, double const *restrict vector_v, double const *restrict vector_u, double *restrict outer_prod)

Computes A = alpha*v*u^T + A, where v * u^T is the outer product of v with u, with v \in R^m and u \in R^n (aka A_{ij} += alpha * v_i * u_j). If v == u, then this update is symmetric and semi-definite (positive or negative depending on the sign of alpha).

Mathematically, the update to A is rank 1. Numerically, it will almost never be.

Parameters:
size_m:length of v
size_n:length of u
alpha:scaling factor
vector_v[size_m]:
 the vector v
vector_u[size_n]:
 the vector u
Outputs:
outer_prod[size_m][size_n]:
 A such that A_{ij} += alpha * v_i * u_j
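As an illustration, a minimal sketch of this rank-1 update (not the library's implementation; the column-major layout and the name OuterProductSketch are assumptions here):

```cpp
// Sketch of the rank-1 update A += alpha * v * u^T, assuming column-major
// storage for outer_prod: entry (i, j) lives at index j*size_m + i.
void OuterProductSketch(int size_m, int size_n, double alpha,
                        const double* vector_v, const double* vector_u,
                        double* outer_prod) {
  for (int j = 0; j < size_n; ++j) {    // loop over columns of A
    const double alpha_u_j = alpha * vector_u[j];
    for (int i = 0; i < size_m; ++i) {  // column j of A accumulates (alpha*u_j) * v
      outer_prod[j * size_m + i] += alpha_u_j * vector_v[i];
    }
  }
}
```

Traversing column-by-column keeps the writes to outer_prod contiguous under the assumed column-major layout.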

OL_NONNULL_POINTERS OL_WARN_UNUSED_RESULT double MatrixTrace(double const *restrict A, int size_m)

Computes the trace of a square matrix, tr(A) = \sum_{i = 1}^{m} A_{ii}

Parameters:
A[size_m][size_m]:
 matrix to take the trace of
size_m:dimension of A
Returns:
the trace of A
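For reference, a minimal sketch of this computation (not the library's code; the name MatrixTraceSketch is an assumption). Note that the diagonal entries of a square matrix sit at the same indices whether the storage is row- or column-major:

```cpp
// Sketch of the trace: tr(A) = sum of diagonal entries. For a square
// size_m x size_m matrix, A_{ii} lives at flat index i*(size_m + 1)
// in either row-major or column-major storage.
double MatrixTraceSketch(const double* A, int size_m) {
  double trace = 0.0;
  for (int i = 0; i < size_m; ++i) {
    trace += A[i * (size_m + 1)];  // diagonal entry A_{ii}
  }
  return trace;
}
```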

OL_NONNULL_POINTERS OL_WARN_UNUSED_RESULT double TraceOfGeneralMatrixMatrixMultiply(double const *restrict A, double const *restrict B, int size)

Computes tr(A*B) without explicitly forming A*B.

Naively, computing tr(A*B) would involve forming C_{i,k} = \sum_j A_{i,j} * B_{j,k} and then taking trace = \sum_i C_{i,i}. Spelled out:

for i = 1:n
  for j = 1:n
    for k = 1:n
      C_{i,k} += A_{i,j} * B_{j,k}

Then, trace = \sum_{i = 1}^n C_{i,i}. This has cost O(n^3), dominated by the matrix-matrix product.

Notice that since we only need the diagonal of C to compute the trace, there is no need to form C_{1,10}, for example. Thus the above loop can be reorganized as: trace = \sum_{i,j} A_{i,j} * B_{j,i}

For the pseudocode, we take the original code and set k = i:

for i = 1:n
  for j = 1:n
    trace += A_{i,j} * B_{j,i}

Which is O(n^2).
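The O(n^2) loop above can be sketched as follows (a sketch, not the library's implementation; the column-major layout and the name TraceOfProductSketch are assumptions):

```cpp
// Sketch of tr(A*B) in O(n^2) without forming A*B, per the reorganized
// loop above. Assumes column-major storage: A_{i,j} is at A[j*size + i].
double TraceOfProductSketch(const double* A, const double* B, int size) {
  double trace = 0.0;
  for (int i = 0; i < size; ++i) {
    for (int j = 0; j < size; ++j) {
      // accumulate A_{i,j} * B_{j,i}: row i of A against column i of B
      trace += A[j * size + i] * B[i * size + j];
    }
  }
  return trace;
}
```

Note the strided reads of A (row i is scattered through memory in column-major storage), which is the "bad" access ordering discussed below.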

Finally, for matrix products A*B with A stored column-major, we normally use the view of matrix-vector multiply, A*x, as a weighted sum of the columns of A; this improves memory access patterns. Here no such arrangement is possible: for each i we must read row i of A, which is a strided, "bad" access ordering in column-major storage.

A, B do not need to be square (as long as their product, A*B, is square), but we make this assumption for convenience.

Parameters:
A[size][size]:left matrix
B[size][size]:right matrix
size:dimension of matrices
Returns:
tr(A*B)