Stan Math Library  2.20.0
reverse mode automatic differentiation
multiply_transpose.hpp
#ifndef STAN_MATH_OPENCL_MULTIPLY_TRANSPOSE_HPP
#define STAN_MATH_OPENCL_MULTIPLY_TRANSPOSE_HPP
#ifdef STAN_OPENCL
// The #include directives for matrix_cl, the multiply_transpose kernel and
// check_opencl_error are not shown in this listing.

namespace stan {
namespace math {
/**
 * Computes the product of a square OpenCL matrix with its transpose.
 *
 * @param A input matrix
 * @return the product of the input matrix and its transpose
 */
inline matrix_cl multiply_transpose(const matrix_cl& A) {
  // The result has one row and one column for every row of A.
  matrix_cl temp(A.rows(), A.rows());
  if (A.size() == 0)
    return temp;
  // Padding the matrix so that its dimensions are divisible by local
  // improves performance because we can omit if statements in the
  // multiply kernel.
  int local = opencl_kernels::multiply_transpose.make_functor.get_opts().at(
      "THREAD_BLOCK_SIZE");
  int Mpad = ((A.rows() + local - 1) / local) * local;
  int wpt = opencl_kernels::multiply_transpose.make_functor.get_opts().at(
      "WORK_PER_THREAD");
  try {
    // Each thread computes wpt elements, so the second dimension of both
    // the global and the local range is divided by wpt.
    opencl_kernels::multiply_transpose(cl::NDRange(Mpad, Mpad / wpt),
                                       cl::NDRange(local, local / wpt), A, temp,
                                       A.rows(), A.cols());
  } catch (const cl::Error& e) {
    check_opencl_error("multiply self transpose", e);
  }
  return temp;
}
}  // namespace math
}  // namespace stan

#endif
#endif
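The padding arithmetic in the listing can be tried out without an OpenCL device. The following is a minimal sketch, not part of the library: `pad_to_multiple`, `Range2D`, `LaunchShape` and `launch_shape` are hypothetical names introduced here, and `Range2D` is only a plain stand-in for a two-dimensional `cl::NDRange`. It assumes the same integer arithmetic as the `Mpad` line and the same division by `WORK_PER_THREAD` in the second dimension.

```cpp
#include <cassert>

// Hypothetical helper: rounds n up to the nearest multiple of block,
// using the same integer arithmetic as the Mpad line in the listing.
inline int pad_to_multiple(int n, int block) {
  return ((n + block - 1) / block) * block;
}

// Hypothetical plain struct standing in for a two-dimensional cl::NDRange.
struct Range2D {
  int dim0;
  int dim1;
};

// Global and local work sizes for one kernel launch.
struct LaunchShape {
  Range2D global;
  Range2D local;
};

// Mirrors the two cl::NDRange arguments built in the listing: the second
// dimension of each range is divided by work_per_thread.
inline LaunchShape launch_shape(int rows, int thread_block_size,
                                int work_per_thread) {
  const int padded = pad_to_multiple(rows, thread_block_size);
  return {{padded, padded / work_per_thread},
          {thread_block_size, thread_block_size / work_per_thread}};
}
```

With the default options shown for the kernel (`THREAD_BLOCK_SIZE` 32, `WORK_PER_THREAD` 4), a matrix with 100 rows is padded to 128, giving a global range of (128, 32) and a local range of (32, 8).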
The matrix_cl class - allocates memory space on the OpenCL device, functions for transferring matrices...
Represents a matrix on the OpenCL device.
Definition: matrix_cl.hpp:29
checking OpenCL error numbers
const kernel_cl< in_buffer, out_buffer, int, int > multiply_transpose("multiply_transpose", {thread_block_helpers, multiply_transpose_kernel_code}, {{"THREAD_BLOCK_SIZE", 32}, {"WORK_PER_THREAD", 4}})
matrix_cl multiply_transpose(const matrix_cl &A)
Computes the product of a square OpenCL matrix with its transpose.
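For checking results from the device, a plain CPU version of the same product is handy. The sketch below is an assumption-laden reference, not the OpenCL kernel: `multiply_transpose_reference` is a hypothetical name, the matrix is stored row-major in a `std::vector<double>`, and the symmetry of A * A^T is used to compute only the lower triangle. Because the listing allocates a rows by rows result and passes both `A.rows()` and `A.cols()` to the kernel, the sketch accepts rectangular input as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for C = A * A^T.
// a holds a rows x cols matrix in row-major order; the result is a
// rows x rows matrix, also in row-major order.
inline std::vector<double> multiply_transpose_reference(
    const std::vector<double>& a, std::size_t rows, std::size_t cols) {
  assert(a.size() == rows * cols);
  std::vector<double> c(rows * rows, 0.0);
  for (std::size_t i = 0; i < rows; ++i) {
    // C is symmetric, so compute the lower triangle and mirror it.
    for (std::size_t j = 0; j <= i; ++j) {
      double sum = 0.0;
      for (std::size_t k = 0; k < cols; ++k) {
        sum += a[i * cols + k] * a[j * cols + k];
      }
      c[i * rows + j] = sum;
      c[j * rows + i] = sum;
    }
  }
  return c;
}
```

For the 2 x 2 matrix with rows (1, 2) and (3, 4), this returns the matrix with rows (5, 11) and (11, 25).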
void check_opencl_error(const char *function, const cl::Error &e)
Throws a domain error specifying the OpenCL error that occurred.
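The catch-and-rethrow pattern used in the listing can be illustrated without the OpenCL headers. The sketch below is not the library's implementation: `StubClError` is a hypothetical stand-in for `cl::Error`, `check_error_sketch` is a hypothetical name, and the message format is an assumption chosen only to show the calling function's name and the error code being carried into a `std::domain_error`.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for cl::Error: an integer error code plus a short
// description of the failing call.
struct StubClError {
  int code;
  std::string message;
};

// Hypothetical sketch of the translation step: always throws a
// std::domain_error that names the calling function and the error.
inline void check_error_sketch(const char* function, const StubClError& e) {
  throw std::domain_error(std::string(function) + ": " + e.message
                          + " (OpenCL error code " + std::to_string(e.code)
                          + ")");
}
```

Callers that wrap a kernel launch this way only need to handle standard exceptions rather than OpenCL-specific ones.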

     [ Stan Home Page ] © 2011–2018, Stan Development Team.