CD2H gitForager

Samsung/OCLAlgo

Name: OCLAlgo

Owner: Samsung

Description: C++11 concept for OpenCL computations

Created: 2014-03-25 11:57:03.0

Updated: 2017-08-02 18:27:05.0

Pushed: 2014-03-14 20:03:49.0

Homepage: null

Size: 644

Language: C++

README

OCLAlgo framework

Brief Description

OCLAlgo is an OpenCL framework based on the C++ OpenCL API wrapper and C++11. It provides simple access to OpenCL platforms and devices for synchronous and asynchronous calculations.

Details

OCLAlgo provides simple initialization of an OpenCL platform and device, selected by either:

a part of the platform and device names (not case-sensitive)
the numbers (indices) of the platform and device in your system

// 1st case
oclalgo::Queue queue("NVIDIA", "GeForce");
// 2nd case
oclalgo::Queue queue(0, 0);
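
The name-based lookup can be pictured as a case-insensitive substring test against each platform or device name. The following is a minimal sketch of that idea only: `ContainsIgnoreCase` is a hypothetical helper, not an OCLAlgo function, and how the library actually matches names is not shown in this README.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of OCLAlgo): returns true if `part`
// occurs anywhere in `name`, ignoring ASCII case.
bool ContainsIgnoreCase(const std::string& name, const std::string& part) {
  auto it = std::search(
      name.begin(), name.end(), part.begin(), part.end(),
      [](char a, char b) {
        return std::tolower(static_cast<unsigned char>(a)) ==
               std::tolower(static_cast<unsigned char>(b));
      });
  // An empty `part` matches any name, including an empty one.
  return part.empty() || it != name.end();
}
```

With such a test, a query like "geforce" would select any device whose reported name contains that text in any letter case.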

To enqueue an OpenCL kernel, follow four simple steps:

1. Create KernelArg objects for the OpenCL memory buffers with the corresponding markers (IN, OUT, IN_OUT).
2. Create a Task object from the OpenCL program name (*.cl file), the kernel name, the compilation options, and the arguments in the same order as in the OpenCL kernel. To pass primitive types (such as int, float, double, or char) to the kernel, pass them directly to the Task constructor.
3. Create a Grid object to define the dimensions of the OpenCL task.
4. Enqueue the created task with the corresponding grid.

/* 1 */
oclalgo::KernelArg arg(host_array, oclalgo::ArgType::IN_OUT);
int add_value = 101;
/* 2 */
oclalgo::Task task("vector.cl", "modify", "", arg, add_value);
/* 3 */
oclalgo::Grid grid(cl::NDRange(size));
/* 4 */
auto ocl_res = queue.EnqueueTask(task, grid);
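
The grid fixes the global size of the task. When a local (work-group) size is also given, OpenCL 1.x requires the global size in each dimension to be a multiple of the local size, so the element count is often rounded up before the grid is built. A small sketch of that rounding follows. `RoundUpToMultiple` is a hypothetical helper, not an OCLAlgo function, and whether `oclalgo::Grid` performs this rounding itself is not stated in this README.

```cpp
#include <cstddef>

// Hypothetical helper: smallest multiple of `local_size` that is >= `size`.
// Assumes local_size > 0.
std::size_t RoundUpToMultiple(std::size_t size, std::size_t local_size) {
  const std::size_t remainder = size % local_size;
  return remainder == 0 ? size : size + (local_size - remainder);
}
```

For example, 1000 elements with a local size of 64 give a global size of 1024. The kernel must then skip work-items whose index is 1000 or larger.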

To get the output results, call .get(). It waits until OpenCL finishes the task and then returns a std::vector with the output OpenCL buffers (the buffers that were marked as ArgType::OUT or ArgType::IN_OUT when they were passed to the KernelArg object). You can also call .wait() to wait until OpenCL finishes the task.

std::vector<cl::Buffer> v_res = ocl_res.get();
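
The blocking behaviour of .get() and .wait() resembles that of `std::future` from the C++11 standard library. The sketch below uses `std::future` purely as an analogy: it is an assumption that `oclalgo::future` behaves the same way, and `ComputeOnDevice` is a hypothetical stand-in for an enqueued task that yields one output buffer.

```cpp
#include <future>
#include <vector>

// Hypothetical stand-in for an enqueued task: returns its "output buffers"
// (here a single buffer holding i + add_value for each index i).
std::vector<std::vector<int>> ComputeOnDevice(int size, int add_value) {
  std::vector<int> out(size);
  for (int i = 0; i < size; ++i) out[i] = i + add_value;
  return {out};
}

// Launches the work asynchronously, then blocks until the result is ready.
std::vector<std::vector<int>> RunAndCollect(int size, int add_value) {
  std::future<std::vector<std::vector<int>>> res =
      std::async(std::launch::async, ComputeOnDevice, size, add_value);
  res.wait();        // blocks until the task has finished; the result stays stored
  return res.get();  // hands over the stored result; callable once per std::future
}
```

Calling wait() before get() is redundant in this sketch, since get() blocks on its own. It appears only to show that the two calls differ in whether the result is handed over.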

To copy an OpenCL buffer to a host array or vice versa, call Queue::memcpy. Memory objects can be copied between the host and OpenCL devices synchronously or asynchronously; in the asynchronous case an oclalgo::future object is returned.

queue.memcpy(host_array, v_res[0]);

License

Example

As an example of using the framework, the oclalgo::Matrix and oclalgo::DMatrix classes were added. These classes implement basic matrix operations (+, -, *) using host (oclalgo::Matrix) and device (oclalgo::DMatrix) resources.

The test function below checks that the oclalgo::Queue class works correctly for vector addition.

#include <algorithm>
#include <iostream>
#include <string>

#include <gtest/gtest.h>
#include <src/gtest_main.cc>
#include <oclalgo/queue.h>

TEST(Queue, VectorAdd) {
  try {
    // create OpenCL queue for sync/async task running using
    // a part of the platform and device names
    oclalgo::Queue queue("NVIDIA", "GeForce");

    // create and initialize input shared arrays
    int size = 1024;
    oclalgo::shared_array<int> a(size), b(size);
    for (int i = 0; i < size; ++i) {
      a[i] = i;
      b[i] = size - i;
    }

    // initialize OpenCL kernel arguments
    using oclalgo::ArgType;
    using oclalgo::BufferArg;
    BufferArg a_arg = queue.CreateKernelArg(a, ArgType::IN);
    BufferArg b_arg = queue.CreateKernelArg(b, ArgType::IN);
    BufferArg c_arg = queue.CreateKernelArg<int>(size, ArgType::OUT);

    // create task using OpenCL program and kernel names, compilation options
    // and arguments in the same order as in the OpenCL kernel
    oclalgo::Task task = queue.CreateTask("vector.cl", "vector_add", "",
                                          a_arg, b_arg, c_arg);

    // create grid to define dimensions of the OpenCL task
    // in global and local (group size) space
    oclalgo::Grid grid = oclalgo::Grid(cl::NDRange(size));

    // enqueue OpenCL task (EnqueueTask() returns an oclalgo::future object
    // for async task running)
    auto ocl_res = queue.EnqueueTask(task, grid);

    // copy device memory with the result to the host
    // (ocl_res.get() waits until OpenCL finishes the task
    // and returns a std::vector with the output OpenCL buffers,
    // which were marked as ArgType::OUT or ArgType::IN_OUT when created)
    queue.memcpy(a, ocl_res.get()[0]);

    // check result: a[i] + b[i] == size for every i
    auto it = std::find_if(a.get_raw(), a.get_raw() + a.size(),
                           [size](int x) { return x != size; });
    ASSERT_EQ(a.get_raw() + a.size(), it);
  } catch (const cl::Error& e) {
    std::cerr << e.what() << " (err_code = "
              << oclalgo::Queue::StatusStr(e.err()) << ")" << std::endl;
    throw;  // rethrow the original exception without copying it
  }
}
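
The final check in the test works because a[i] + b[i] equals size for every index, so after the copy every element of a should hold size. The host-only sketch below reproduces that reasoning without OpenCL. `std::vector` stands in for `oclalgo::shared_array`, and `VectorAddHost` is a hypothetical reference for what the `vector_add` kernel is assumed to compute (the kernel source is not part of this README).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical host reference for the assumed behaviour of `vector_add`.
// Assumes b has at least as many elements as a.
std::vector<int> VectorAddHost(const std::vector<int>& a,
                               const std::vector<int>& b) {
  std::vector<int> c(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) c[i] = a[i] + b[i];
  return c;
}

// Builds the same inputs as the test and reports whether every sum == size.
bool AllSumsEqualSize(int size) {
  std::vector<int> a(size), b(size);
  for (int i = 0; i < size; ++i) {
    a[i] = i;
    b[i] = size - i;
  }
  const std::vector<int> c = VectorAddHost(a, b);
  return std::find_if(c.begin(), c.end(),
                      [size](int x) { return x != size; }) == c.end();
}
```

A reference like this can also serve as the expected value when a device result is compared element by element.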

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.