gpp_cuda_math¶
gpp_cuda_math.hpp¶
This file contains declarations of GPU functions (host code) that are called by C++ code. The functions compute Expected Improvement and its gradient, and provide GPU utilities (memory allocation, device setup, etc.).
namespace optimal_learning
Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.
Variables

const unsigned int kEINumBlocks
Number of blocks assigned for computing Expected Improvement on GPU.
const unsigned int kEINumThreads
Number of threads per block assigned for computing Expected Improvement on GPU.
const unsigned int kGradEINumBlocks
Number of blocks assigned for computing Gradient of Expected Improvement on GPU.
const unsigned int kGradEINumThreads
Number of threads per block assigned for computing Gradient of Expected Improvement on GPU.
const CudaError kCudaSuccess
CudaError struct encoding a successful CUDA operation.
class CudaError
This C struct contains error information used by the exception handling in gpp_expected_improvement_gpu.hpp/cpp. The file/line and function information are empty strings if the error code is cudaSuccess (i.e., no error).
Public Members

cudaError_t err
error code returned by CUDA API functions (an enum type)
char const * file_and_line_info
file and line info of the call that returned the error
char const * func_info
name of the function that returned the error
gpp_cuda_math.cu¶
This file contains the implementations of all GPU functions. It holds both device code (executed on the GPU) and host code (executed on the CPU); both are compiled by NVCC, NVIDIA's CUDA compiler.
Defines

OL_CUDA_STRINGIFY_EXPANSION_INNER(x)
Macro to stringify the expansion of a macro. For example, say we are on line 53:
- #__LINE__ --> "__LINE__"
- OL_CUDA_STRINGIFY_EXPANSION(__LINE__) --> "53"
OL_CUDA_STRINGIFY_EXPANSION_INNER is not meant to be used directly, but we need #x inside a macro for this expansion to work.
This is a standard trick; see bottom of: http://gcc.gnu.org/onlinedocs/cpp/Stringification.html
OL_CUDA_STRINGIFY_EXPANSION(x)
OL_CUDA_STRINGIFY_FILE_AND_LINE
Macro to stringify and format the current file and line number. For example, if the macro is invoked from line 893 of file gpp_foo.cpp, this macro produces the compile-time string-constant: (gpp_foo.cpp: 893)
OL_CUDA_ERROR_RETURN(X)
Macro that checks the error code (of type cudaError_t) returned by CUDA API functions. If an error occurred, the macro builds a C struct containing the error code, the name of the function where the error occurred, and the file and line info, and then returns from the enclosing function.
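A minimal sketch of this error-return pattern in plain C, using stand-in definitions for cudaError_t and the CudaError struct (the field names follow the documentation above; the macro body and the fake failing call are assumptions for illustration — the real macro records file and line via OL_CUDA_STRINGIFY_FILE_AND_LINE rather than __FILE__):

```c
#include <stdio.h>

/* Stand-ins for the CUDA runtime's error enum and the library's CudaError struct. */
typedef enum { cudaSuccess = 0, cudaErrorMemoryAllocation = 2 } cudaError_t;

typedef struct {
  cudaError_t err;                 /* error returned by a CUDA API function */
  char const *file_and_line_info;  /* file/line where the error occurred */
  char const *func_info;           /* name of the function that returned the error */
} CudaError;

/* Sketch of OL_CUDA_ERROR_RETURN: check the code; on failure, fill a CudaError
   and return it from the enclosing function, short-circuiting the rest. */
#define OL_CUDA_ERROR_RETURN(X)                              \
  do {                                                       \
    cudaError_t error_code_ = (X);                           \
    if (error_code_ != cudaSuccess) {                        \
      CudaError result_ = {error_code_, __FILE__, __func__}; \
      return result_;                                        \
    }                                                        \
  } while (0)

/* A fake CUDA call that always fails, to exercise the macro. */
static cudaError_t FakeCudaMalloc(void) { return cudaErrorMemoryAllocation; }

CudaError AllocateSomething(void) {
  OL_CUDA_ERROR_RETURN(FakeCudaMalloc());  /* returns early on failure */
  CudaError ok = {cudaSuccess, "", ""};
  return ok;
}
```

The do/while(0) wrapper makes the macro behave like a single statement, so it composes safely with if/else in the calling code.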
namespace optimal_learning
Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.
Functions

CudaError CudaGetEI(double const *restrict mu, double const *restrict chol_var, int num_union, int num_mc, double best, uint64_t base_seed, bool configure_for_test, double *restrict gpu_mu, double *restrict gpu_chol_var, double *restrict random_number_ei, double *restrict gpu_random_number_ei, double *restrict gpu_ei_storage, double *restrict ei_val)

Compute Expected Improvement by Monte-Carlo on the GPU. This function is only meant to be used by CudaExpectedImprovementEvaluator::ComputeExpectedImprovement(...) in gpp_expected_improvement_gpu.hpp/cpp.
- Parameters:
  - mu[num_union]: the mean of the GP evaluated at the points of interest
  - chol_var[num_union][num_union]: Cholesky factorization of the GP variance evaluated at the points of interest
  - num_union: number of points of interest
  - num_mc: number of Monte-Carlo iterations
  - best: best function evaluation obtained so far
  - base_seed: base seed for the GPU's RNG; will be offset by GPU thread index (see curand)
  - configure_for_test: whether to record random_number_ei
- Outputs:
  - gpu_mu[num_union]: device pointer to memory storing mu on the GPU
  - gpu_chol_var[num_union][num_union]: device pointer to memory storing chol_var on the GPU
  - random_number_ei[num_union][num_iteration][num_threads][num_blocks]: random numbers used for computing EI; for testing purposes only
  - gpu_random_number_ei[num_union][num_iteration][num_threads][num_blocks]: device pointer to memory storing the random numbers used for computing EI; for testing purposes only
  - gpu_ei_storage[num_threads][num_blocks]: device pointer to memory storing EI values on the GPU
  - ei_val[1]: pointer to the computed Expected Improvement value
- Returns:
  - CudaError state, which contains the error information and the file name, line, and function name where the error occurred
CudaError CudaGetGradEI(double const *restrict mu, double const *restrict grad_mu, double const *restrict chol_var, double const *restrict grad_chol_var, int num_union, int num_to_sample, int dim, int num_mc, double best, uint64_t base_seed, bool configure_for_test, double *restrict gpu_mu, double *restrict gpu_grad_mu, double *restrict gpu_chol_var, double *restrict gpu_grad_chol_var, double *restrict random_number_grad_ei, double *restrict gpu_random_number_grad_ei, double *restrict gpu_grad_ei_storage, double *restrict grad_ei)

Compute the gradient of Expected Improvement by Monte-Carlo on the GPU. This function is only meant to be used by CudaExpectedImprovementEvaluator::ComputeGradExpectedImprovement(...) in gpp_expected_improvement_gpu.hpp/cpp.
- Parameters:
  - mu[num_union]: the mean of the GP evaluated at the points of interest
  - grad_mu[dim][num_to_sample]: the gradient of the GP mean evaluated at the points of interest
  - chol_var[num_union][num_union]: Cholesky factorization of the GP variance evaluated at the points of interest
  - grad_chol_var[dim][num_union][num_union][num_to_sample]: gradient of the Cholesky factorization of the GP variance evaluated at the points of interest
  - num_union: number of points in the union (aka q+p)
  - num_to_sample: number of points to sample (aka q)
  - dim: dimension of the point space
  - num_mc: number of Monte-Carlo iterations
  - best: best function evaluation obtained so far
  - base_seed: base seed for the GPU's RNG; will be offset by GPU thread index (see curand)
  - configure_for_test: whether to record random_number_grad_ei
- Outputs:
  - gpu_mu[num_union]: device pointer to memory storing mu on the GPU
  - gpu_grad_mu[dim][num_to_sample]: device pointer to memory storing grad_mu on the GPU
  - gpu_chol_var[num_union][num_union]: device pointer to memory storing chol_var on the GPU
  - gpu_grad_chol_var[dim][num_union][num_union][num_to_sample]: device pointer to memory storing grad_chol_var on the GPU
  - random_number_grad_ei[num_union][num_threads][num_blocks]: random numbers used for computing gradEI; for testing purposes only
  - gpu_random_number_grad_ei[num_union][num_threads][num_blocks]: device pointer to memory storing the random numbers used for computing gradEI; for testing purposes only
  - gpu_grad_ei_storage[dim][num_to_sample][num_threads][num_blocks]: device pointer to memory storing gradient-of-EI values on the GPU
  - grad_ei[dim][num_to_sample]: pointer to the gradient of Expected Improvement
- Returns:
  - CudaError state, which contains the error information and the file name, line, and function name where the error occurred
CudaError CudaMallocDeviceMemory(size_t size, void **restrict address_of_ptr_to_gpu_memory)

Allocate GPU device memory for storing an array; analogous to malloc() in C. Thin wrapper around cudaMalloc() that handles errors. See: http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html
Do not dereference address_of_ptr_to_gpu_memory outside the GPU device. Do not dereference address_of_ptr_to_gpu_memory if the error code (return_value.err) is not cudaSuccess.
- Parameters:
  - size: number of bytes to allocate
  - address_of_ptr_to_gpu_memory: address of the pointer to the allocated device memory on the GPU
- Returns:
  - CudaError state, which contains the error information and the file name, line, and function name where the error occurred
CudaError CudaFreeDeviceMemory(void *restrict ptr_to_gpu_memory)

Free GPU device memory on the GPU; analogous to free() in C. Thin wrapper around cudaFree() that handles errors. See: http://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY.html
- Parameters:
  - ptr_to_gpu_memory: pointer to GPU memory to free; MUST have been returned by a previous call to cudaMalloc()
- Returns:
  - CudaError state, which contains the error information and the file name, line, and function name where the error occurred
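The calling pattern for the two wrappers can be sketched in plain C, with host malloc()/free() standing in for cudaMalloc()/cudaFree(). The wrapper names and signatures follow the documentation above, but the bodies and the stand-in types are illustrative assumptions, not the library's implementation:

```c
#include <stdlib.h>

/* Stand-ins for the CUDA runtime's error enum and the library's CudaError struct. */
typedef enum { cudaSuccess = 0, cudaErrorMemoryAllocation = 2 } cudaError_t;

typedef struct {
  cudaError_t err;
  char const *file_and_line_info;
  char const *func_info;
} CudaError;

/* Sketch of CudaMallocDeviceMemory: allocate and report failure via CudaError. */
CudaError CudaMallocDeviceMemory(size_t size, void **address_of_ptr_to_gpu_memory) {
  *address_of_ptr_to_gpu_memory = malloc(size);  /* stand-in for cudaMalloc() */
  CudaError result = {cudaSuccess, "", ""};
  if (*address_of_ptr_to_gpu_memory == NULL) {
    result.err = cudaErrorMemoryAllocation;
    result.file_and_line_info = __FILE__;
    result.func_info = __func__;
  }
  return result;
}

/* Sketch of CudaFreeDeviceMemory: release memory from a prior allocation. */
CudaError CudaFreeDeviceMemory(void *ptr_to_gpu_memory) {
  free(ptr_to_gpu_memory);  /* stand-in for cudaFree() */
  CudaError result = {cudaSuccess, "", ""};
  return result;
}
```

A caller would check result.err against cudaSuccess before dereferencing the returned pointer, exactly as the warning in the allocation function's documentation prescribes.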
CudaError CudaSetDevice(int devID)

Set up the GPU device; all subsequent GPU function calls will run on the device activated by this function.
- Parameters:
  - devID: the ID of the GPU device to set up
- Returns:
  - CudaError state, which contains the error information and the file name, line, and function name where the error occurred