mirror of https://github.com/boostorg/compute.git synced 2026-01-31 20:12:23 +00:00

402 Commits

Author SHA1 Message Date
Kyle Lutz
bacec5b8fe Add uniform_real_distribution
This adds a random number distribution which generates random
numbers in a uniform distribution.

Also adds a convenience algorithm which fills a range with
uniformly distributed random numbers between two values.
2013-08-13 20:40:42 -04:00
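The fill described in this commit can be pictured with a host-side analogue. This is a minimal sketch using the C++ standard library, not the Boost.Compute API; `fill_uniform` and its `seed` parameter are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical host-side helper (not the Boost.Compute API): fills `out`
// with pseudo-random floats drawn uniformly from the nominal range [lo, hi).
inline void fill_uniform(std::vector<float>& out, float lo, float hi,
                         unsigned seed)
{
    assert(lo < hi); // the distribution requires a non-empty range
    std::mt19937 engine(seed);
    std::uniform_real_distribution<float> dist(lo, hi);
    for (float& x : out) {
        x = dist(engine);
    }
}
```

A fixed seed keeps the sequence reproducible, which is convenient when comparing device output against a host reference.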
Kyle Lutz
767589fe0d Rearrange type headers
This rearranges the type headers to live under the
<boost/compute/types/...> directory instead of the
top-level <boost/compute/...> directory.
2013-08-13 20:37:56 -04:00
Kyle Lutz
b539e8413c Add Doxygen documentation
This replaces the BoostBook/XML based reference documentation
with Doxygen auto-generated documentation.
2013-07-16 21:48:16 -04:00
Kyle Lutz
b3d2fbb7eb Add fill_async() algorithm
This adds a fill_async() which fills a range with a
given value asynchronously.
2013-07-02 21:57:19 -04:00
Kyle Lutz
5203506c16 Add support for on-device copy_async()
This adds support for copy_async() when copying between
memory objects on a compute device.
2013-07-02 21:57:19 -04:00
Kyle Lutz
8459fdeb0e Change meta_kernel::exec*() methods to return events
This changes the exec() and exec_1d() methods in the meta_kernel
class to return event objects.
2013-07-02 21:57:19 -04:00
Kyle Lutz
d8f5a5b503 Change enqueue_*_buffer() methods to return events
This changes the enqueue_copy_buffer() and enqueue_fill_buffer()
methods in the command_queue class to return event objects.
2013-07-02 21:57:19 -04:00
Kyle Lutz
c1bf707b41 Add event::get_command_type() method
This adds a get_command_type() method to the event class
which returns the OpenCL type for an event object.
2013-07-02 21:57:19 -04:00
Kyle Lutz
ee5f581094 Add command_queue::enqueue_migrate_memory_objects() method
This adds an enqueue_migrate_memory_objects() method to the
command_queue class which allows memory objects to be migrated
between compute devices and to the host.
2013-07-02 21:57:19 -04:00
Kyle Lutz
2ca028c37b Improve reduce() performance
This makes a few tweaks to the reduce() algorithm in order to
improve performance. An unnecessary barrier() has been removed
and now multiple values are reduced on the initial read.
2013-07-02 21:57:15 -04:00
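The idea of reducing multiple values on the initial read can be illustrated on the host. The sketch below is an assumption about the general technique, not the kernel from this commit: each of `lanes` virtual work-items first sums a strided slice of the input, and a tree reduction then combines the partial sums. The name `reduce_sum` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of a two-phase reduction.
// `lanes` plays the role of the work-group size and must be a power of two.
inline long long reduce_sum(const std::vector<int>& in, std::size_t lanes)
{
    assert(lanes > 0 && (lanes & (lanes - 1)) == 0);

    // Phase 1: every lane accumulates several input values on its first pass.
    std::vector<long long> partial(lanes, 0);
    for (std::size_t lane = 0; lane < lanes; ++lane) {
        for (std::size_t i = lane; i < in.size(); i += lanes) {
            partial[lane] += in[i];
        }
    }

    // Phase 2: tree reduction over the per-lane partial sums.
    for (std::size_t stride = lanes / 2; stride > 0; stride /= 2) {
        for (std::size_t lane = 0; lane < stride; ++lane) {
            partial[lane] += partial[lane + stride];
        }
    }
    return partial[0];
}
```

Doing more work per lane before the tree phase shortens the tree, which is the part that needs synchronization on a real device.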
Denis Demidov
84394de119 Get rid of type conversion warnings inside VS2010 2013-06-24 09:57:22 +02:00
Denis Demidov
b28d8697bc Silence MSVC security warning C4996 in system.hpp 2013-06-24 09:55:40 +02:00
Denis Demidov
f5c86057a1 Get rid of clang v3.3 warning -Wconstexpr-not-const 2013-06-21 15:27:00 +04:00
Kyle Lutz
f2b812019c Fix bugs with char/uchar/bool literals in meta_kernel
This fixes a few issues that occurred when using char, uchar
and bool literals with meta_kernel.
2013-06-19 23:55:22 -04:00
Kyle Lutz
e01569049b Add type_name<bool>() specialization
This adds a type_name() specialization for bool.
2013-06-19 23:48:49 -04:00
Kyle Lutz
0d285d8a30 Change meta_kernel::add_arg(name, value) to add_set_arg()
This changes the meta_kernel::add_arg() overload with a name
and a value to a separate method. This fixes a conflict when
using add_arg() with string values.
2013-06-11 21:19:47 -04:00
Kyle Lutz
7fb77ef9c5 Add test for any/all/none_of() with NaN and inf
This adds a test for the any_of(), all_of() and none_of() functions
with NaN and Inf values.
2013-06-11 21:16:15 -04:00
Kyle Lutz
8e51a0a162 Refactor lambda expression framework to use meta_kernel
This refactors the lambda expression framework to use meta_kernel
to construct kernel source code instead of using plain strings.
2013-06-11 21:14:28 -04:00
Kyle Lutz
64e94549b3 Add specialization for get<N>() with zip_iterator
This adds a specialization for the get<N>() function when used
with zip_iterators. Now, only the N'th iterator for the expression
will be dereferenced instead of dereferencing all of the iterators
into a tuple and then extracting the N'th component.
2013-06-11 20:37:23 -04:00
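The optimization above can be sketched with a plain tuple of iterators. This is a host-side illustration under assumed names (`zip_like`, `get_component`), not the library's zip_iterator: the accessor dereferences only the N'th underlying iterator and never builds the full tuple of values.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a zip iterator: a tuple of underlying iterators.
template <class... Iters>
struct zip_like {
    std::tuple<Iters...> iters;
};

// Reads only the N'th underlying iterator; the others are never dereferenced.
template <std::size_t N, class... Iters>
auto get_component(const zip_like<Iters...>& z)
    -> decltype(*std::get<N>(z.iters))
{
    return *std::get<N>(z.iters);
}
```

Because the other iterators are untouched, the accessor stays valid even when one of them is not dereferenceable at that position.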
Kyle Lutz
15bc98b94f Remove cv-qualifiers from get<N>()'s value-type
This removes the cv-qualifiers for the value-type returned from
get<N>() expressions. This fixes issues when specializing based
on the type (e.g. pair, tuple).
2013-06-11 20:29:06 -04:00
Kyle Lutz
98b593b937 Fix meta_kernel streaming operators with float
This fixes a bug in the meta_kernel streaming operators with
float values. Now, float scalar and vector literals are inserted
into the kernel source with the proper 'f' suffix.
2013-06-11 20:23:47 -04:00
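A source generator has to emit float values in a form the kernel compiler reads as single precision. The helper below is a minimal sketch of that idea, assuming finite inputs; `float_literal` is a hypothetical name and not part of meta_kernel.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: renders a finite float as a C-style source literal.
// std::showpoint forces a decimal point, since "2f" is not a valid literal
// while "2.00000000f" is. NaN and infinity are not handled here.
inline std::string float_literal(float value)
{
    std::ostringstream out;
    out.precision(std::numeric_limits<float>::max_digits10);
    out << std::showpoint << value << 'f';
    return out.str();
}
```

Using `max_digits10` digits means the literal parses back to the same float value.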
Kyle Lutz
36dd3f1306 Improve the system::find_default_device() method
This makes some improvements to the system::find_default_device()
method. Now, the devices on the system will only be queried once
when searching for the default device. This reduces the number of
calls to clGetPlatformIDs() and clGetDeviceIDs().

Also, in the case that no GPU or CPU devices are found, the first
device on the system will be selected as the default device. This
fixes issues when using Boost.Compute with pocl.
2013-05-24 20:07:38 -04:00
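The fallback rule can be sketched over a device list that has already been queried once. The types and the function name below are hypothetical, and the GPU-before-CPU preference is an assumption; the commit only states that the first device is used when neither kind is found.

```cpp
#include <cassert>
#include <vector>

// Hypothetical device description; a real one would wrap a cl_device_id.
enum class device_kind { gpu, cpu, other };

struct device_info {
    device_kind kind;
    int id;
};

// Picks a default from a list that was queried once up front.
// Assumed preference: GPU, then CPU, then simply the first device.
// Returns nullptr when the list is empty.
inline const device_info* pick_default(const std::vector<device_info>& devices)
{
    for (const device_info& d : devices) {
        if (d.kind == device_kind::gpu) return &d;
    }
    for (const device_info& d : devices) {
        if (d.kind == device_kind::cpu) return &d;
    }
    return devices.empty() ? nullptr : &devices.front();
}
```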
Kyle Lutz
aa7fd2f6fa Add asserts for clRelease*() functions in destructors
This adds assert() calls verifying that the clRelease*() functions
in the destructors for the OpenCL wrapper classes return
CL_SUCCESS.
2013-05-23 23:15:43 -04:00
Kyle Lutz
b5068b2027 Fix minor version macro
This fixes the minor version macro.
2013-05-23 22:46:52 -04:00
Kyle Lutz
5b12d04d4e Mark streaming operators for boost::tuple<> inline
This marks the meta_kernel streaming operators for
boost::tuple<> literals as inline.
2013-05-22 22:50:51 -04:00
Kyle Lutz
c2187b89c0 Mark streaming operator for std::pair<> inline
This marks the meta_kernel streaming operator for
std::pair<> literals as inline.
2013-05-22 22:50:46 -04:00
Kyle Lutz
0405c3cdc3 Check for valid range in reverse()
This adds a check to the reverse() algorithm to ensure that
the range contains at least two elements. Previously, passing
zero or one element ranges to reverse() would result in errors.
2013-05-22 22:41:12 -04:00
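The guard described above amounts to an early return before any work is scheduled. The following host-side sketch shows the shape of such a check; `reverse_checked` is a hypothetical name, and the swap loop stands in for the device kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side analogue of a guarded reverse.
template <class RandomIt>
void reverse_checked(RandomIt first, RandomIt last)
{
    assert(first <= last); // assumed precondition: a valid range
    const std::ptrdiff_t n = last - first;
    if (n < 2) {
        return; // zero or one element: nothing to swap, so skip the "kernel"
    }
    for (std::ptrdiff_t i = 0; i < n / 2; ++i) {
        std::swap(first[i], first[n - 1 - i]);
    }
}
```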
Kyle Lutz
f07caa1ddd Fix compilation error in future<void> assignment operator
This fixes a compilation error which occurred when assigning
to a future<void> from a future<T>. For different future types
the event member variable is private and must be accessed via
the get_event() method.
2013-05-21 23:20:36 -04:00
Kyle Lutz
bac6fb7332 Check for valid pattern size in fill() dispatcher
This checks for a valid pattern value size before dispatching
to the clEnqueueFillBuffer() function for the fill() algorithm.
2013-05-21 23:17:32 -04:00
Kyle Lutz
2560600122 Fix issues with boost::tuple<>, char, and fill()
This fixes issues when using boost::tuple<> containing char
types with the fill() algorithm.
2013-05-21 23:10:56 -04:00
Kyle Lutz
9141732b3e Fix issues with std::pair<>, char, and fill()
This fixes issues when using std::pair<> containing char
types with the fill() algorithm.
2013-05-21 23:10:56 -04:00
Kyle Lutz
f4ecbd1e6c Fix issues with char literals in meta_kernel
This fixes issues when using char and unsigned char literals in
a meta_kernel. Previously the character values would be directly
inserted without quotes (e.g. c instead of 'c') which led to
kernel compilation errors.
2013-05-21 23:10:40 -04:00
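The fix comes down to emitting the character inside single quotes when generating kernel source. A minimal sketch, with `char_literal` as a hypothetical helper that is not part of meta_kernel; it escapes only the quote and backslash characters and leaves non-printable characters unhandled.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: renders a char as a quoted C-style source literal.
inline std::string char_literal(char c)
{
    std::string out = "'";
    if (c == '\'' || c == '\\') {
        out += '\\'; // escape characters that would end or corrupt the literal
    }
    out += c;
    out += '\'';
    return out;
}
```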
Kyle Lutz
1caebe6de8 Fix bug in in-place scan()
This fixes a bug when creating a temporary vector for use in the
in-place scan() algorithm. Previously, a separate command queue
was used to copy the input values to the temporary vector. Now,
the same command queue is used for copying the input values and
performing the scan.
2013-05-20 23:05:51 -04:00
Kyle Lutz
9f231d7b13 Fix conversion warnings in buffer_iterator
This fixes conversion warnings for buffer_iterator.
2013-05-20 23:05:40 -04:00
Kyle Lutz
3bc5bfaf78 Remove timer class
This removes the timer class. The technique of measuring the time
difference between two different OpenCL markers on a command queue
is not portable to all OpenCL implementations (only works on NVIDIA).

A new internal timer class has been added which uses boost::chrono
(or std::chrono if BOOST_COMPUTE_TIMER_USE_STD_CHRONO is defined).
This new timer is used by the benchmarks to measure time elapsed
on the host.
2013-05-20 21:08:42 -04:00
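A host-side timer of the kind described needs only a monotonic clock. This is a sketch with std::chrono under an assumed class name (`host_timer`); it is not the internal timer class from the commit.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical host-side timer measuring wall-clock time on the host.
class host_timer {
public:
    typedef std::chrono::steady_clock clock_type;

    host_timer() : start_(clock_type::now()) {}

    // Restarts the measurement from the current instant.
    void reset() { start_ = clock_type::now(); }

    // Milliseconds elapsed since construction or the last reset().
    double elapsed_ms() const
    {
        return std::chrono::duration<double, std::milli>(
                   clock_type::now() - start_).count();
    }

private:
    clock_type::time_point start_;
};
```

When timing device work this way, the queue has to be drained before reading the clock; otherwise only the enqueue cost is measured.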
Kyle Lutz
fab7be5f43 Add inplace_merge() algorithm
This adds a simple inplace_merge() algorithm which merges
two contiguous sorted ranges in-place.

For now, the implementation simply copies the ranges to
two temporary vectors and calls merge().
2013-05-20 20:50:12 -04:00
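The copy-then-merge strategy described above can be mirrored on the host. The sketch below uses std::merge and a hypothetical function name, `inplace_merge_via_copies`; it illustrates the approach and is not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted ranges [0, middle) and [middle, size) of `v` by copying
// both halves to temporaries and merging them back into `v`.
inline void inplace_merge_via_copies(std::vector<int>& v, std::size_t middle)
{
    assert(middle <= v.size());
    const std::ptrdiff_t mid = static_cast<std::ptrdiff_t>(middle);
    const std::vector<int> left(v.begin(), v.begin() + mid);
    const std::vector<int> right(v.begin() + mid, v.end());
    std::merge(left.begin(), left.end(), right.begin(), right.end(), v.begin());
}
```

The cost is extra memory equal to the size of the input, which is the trade-off the commit message accepts "for now".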
Kyle Lutz
b43e79b983 Add support for get<N>() in lambda expressions
This adds support for using the get<N>() function in lambda
expressions to extract a single component of an aggregate type.

Also adds a test of using boost::tuple<> to store a user-defined
data type on the device and sort them by their first component
using a lambda expression as the comparator.
2013-05-20 20:50:10 -04:00
Kyle Lutz
e46828a9d6 Fix issues involving iterators with void value_type
This fixes a few issues encountered when using iterators with a
void value_type (e.g. std::insert_iterator<>).

The is_contiguous_iterator meta-function was refactored to always
return false for iterators with a void value_type and avoid
instantiating types for containers with a void value_type
(e.g. std::vector<void>::iterator) which previously resulted
in compilation errors.
2013-05-20 19:57:13 -04:00
Kyle Lutz
4ab37ada07 Add system-wide default command queue
This adds a system-wide default command queue. This queue is
accessible via the new static system::default_queue() method.
The default command queue is created for the default compute
device in the default context and is analogous to the default
stream in CUDA.

This changes how algorithms operate when invoked without an
explicit command queue. Previously, each algorithm had two
overloads: the first expected a command queue to be explicitly
passed and the second would create and use a temporary command
queue. Now, all algorithms take a command queue argument which
has a default value equal to system::default_queue().

This fixes a number of race-conditions and performance issues
throughout the library associated with creating, using, and
destroying many separate command queues.
2013-05-15 20:59:56 -04:00
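The change from two overloads to one defaulted parameter can be sketched without OpenCL. All names below (`queue_like`, `default_queue`, `fill_like`) are hypothetical stand-ins; the queue here only counts submitted commands.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a command queue; it only counts submissions.
struct queue_like {
    int submitted = 0;
};

// One shared queue, created on first use and reused afterwards.
inline queue_like& default_queue()
{
    static queue_like q;
    return q;
}

// A single overload: callers pass a queue or fall back to the shared one.
inline void fill_like(std::vector<int>& v, int value,
                      queue_like& queue = default_queue())
{
    for (int& x : v) {
        x = value;
    }
    ++queue.submitted;
}
```

Calling `fill_like(v, 7)` uses the shared queue, while `fill_like(v, 7, my_queue)` keeps explicit control; no temporary queue is created in either case.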
Kyle Lutz
a2bda0610d Fix memory issues with device_ptr and allocator
This fixes a few memory handling issues between device_ptr,
buffer_iterator, buffer_value, allocator, and malloc/free.

Previously, memory buffers that were allocated by allocator and
malloc were being retained (via clRetainMemObject() in buffer's
constructor) by device_ptr, buffer_iterator and buffer_value.

Now, false is passed for the retain parameter to buffer's
constructor so that the buffer's reference count is not
incremented. Furthermore, the classes now set the buffer to
null before being destructed so that they will not decrement its
reference count (which normally occurs in buffer's destructor).

The main effect of this change is that objects which refer to a
memory buffer but do not own it (e.g. device_ptr, buffer_iterator)
will not modify the reference count for the buffer. This fixes a
number of memory leaks which occurred in longer running programs.
2013-05-13 22:27:02 -04:00
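The ownership rule described above can be modeled with a plain counter. In this sketch `resource` stands in for an OpenCL memory object, `handle` for a wrapper with a retain flag, and `view` for a non-owning type such as an iterator; all three names are hypothetical.

```cpp
#include <cassert>

// Hypothetical reference-counted resource standing in for a cl_mem object.
struct resource {
    int refcount = 1;
};

// Wrapper that optionally retains on construction and releases on destruction.
class handle {
public:
    handle(resource* r, bool retain) : r_(r)
    {
        if (r_ && retain) ++r_->refcount;
    }
    ~handle()
    {
        if (r_) --r_->refcount;
    }
    handle(const handle&) = delete;
    handle& operator=(const handle&) = delete;

    // Drops the pointer so the destructor will not release it.
    void detach() { r_ = nullptr; }

private:
    resource* r_;
};

// Non-owning view: constructs the handle without retaining and detaches it
// before the member destructor runs, so the count is never touched.
class view {
public:
    explicit view(resource* r) : h_(r, false) {}
    ~view() { h_.detach(); }

private:
    handle h_;
};
```

The body of `~view()` runs before the destructor of its `h_` member, which is why detaching there is enough to suppress the release.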
Kyle Lutz
a5ddeae614 Add scalar<T> container
This adds a new scalar<T> "container" which stores a single
value in a memory buffer. This simplifies memory handling in
algorithms which read and write a single value.
2013-05-11 20:20:27 -04:00
Kyle Lutz
130f8c30f1 Rename kernel::num_args() method to arity()
This renames the kernel::num_args() method to arity().
2013-05-11 20:15:00 -04:00
Kyle Lutz
ffec5fd34a Remove unnecessary includes from transform_reduce
This removes a couple of unnecessary includes from the
transform_reduce.hpp header file.
2013-05-11 20:10:28 -04:00
Kyle Lutz
178676df4f Refactor the system::default_device() method
This refactors the system::default_device() method. Now, the
default compute device for the system is only found once and
stored in a static variable. This eliminates many redundant
calls to clGetPlatformIDs() and clGetDeviceIDs().

Also, the default_cpu_device() and default_gpu_device() methods
have been removed and their usages replaced with default_device().
2013-05-10 22:49:05 -04:00
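Caching the result in a function-local static is a common way to run an expensive lookup once. The sketch below assumes hypothetical names (`find_device`, `query_count`) and uses a string in place of a device object.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter for how often the expensive lookup runs.
inline int& query_count()
{
    static int n = 0;
    return n;
}

// Stand-in for the platform and device enumeration.
inline std::string find_device()
{
    ++query_count();
    return "device-0";
}

// The lookup runs on the first call only; later calls reuse the stored value.
inline const std::string& default_device()
{
    static const std::string device = find_device();
    return device;
}
```

Since C++11, initialization of a function-local static is thread-safe, so concurrent first calls still trigger a single lookup.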
Kyle Lutz
d40eddc56b Fix compilation error with get<N>() and tuple
This fixes a compilation error which occurred when using
the get<N>() function with tuple types.
2013-05-10 21:51:28 -04:00
Kyle Lutz
705b3f35a3 Fix narrowing conversion warnings in device
This fixes a couple of narrowing conversion warnings in the
device partitioning methods which were seen when compiling
VexCL with Boost.Compute in C++11 mode.
2013-05-09 22:04:00 -04:00
Kyle Lutz
9a64f6b39a Add get<N>() function
This adds a get<N>() function which returns the n'th element
of an aggregate type (e.g. vector type, pair, tuple).

This unifies the functionality of, and replaces, the get_pair()
and vector_component() functions.
2013-05-05 12:46:05 -04:00
Kyle Lutz
3e840fa306 Add transform_if() algorithm
This adds a new algorithm named transform_if() which applies
a given unary function to an input value only if it passes a
separate predicate function.
2013-05-05 11:51:21 -04:00
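A host-side analogue makes the behavior concrete. The sketch assumes that elements failing the predicate are skipped and the output is written compactly; the commit message does not spell this out, and `transform_if_like` is a hypothetical name.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Applies `fn` to each element that satisfies `pred` and writes the results
// contiguously to `out`. Returns the end of the written output range.
template <class In, class Out, class UnaryFn, class Pred>
Out transform_if_like(In first, In last, Out out, UnaryFn fn, Pred pred)
{
    for (; first != last; ++first) {
        if (pred(*first)) {
            *out = fn(*first);
            ++out;
        }
    }
    return out;
}
```

With the input `{1, 2, 3, 4}`, a predicate selecting even values, and a squaring function, the output under this assumption is `{4, 16}`.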
Kyle Lutz
49a34442e5 Remove unused histogram() algorithm
This removes the unused histogram() algorithm.
2013-05-05 10:56:14 -04:00
Dominic Meiser
7c5e321c2a Fixing build issues under Windows 2013-05-03 18:37:09 -04:00