2
0
mirror of https://github.com/boostorg/compute.git synced 2026-01-31 08:02:16 +00:00
Commit Graph

402 Commits

Author SHA1 Message Date
Kyle Lutz
ac148e8f1f Fix extra semicolon warning in interop/eigen/core.hpp 2014-01-13 18:27:40 -08:00
Denis Demidov
5e912dff1c Move BOOST_COMPUTE_MAX_ARITY definition to compute/config.hpp 2014-01-10 09:55:35 +04:00
Denis Demidov
52bae83504 Make zip_iterator take more than three elements
This uses Boost.Preprocessor macros to allow zip iterators to work with
arbitrary number of elements (the current limit is maximum boost::tuple
size which is 10 by default).

Refs #50
2014-01-09 23:39:58 +04:00
Denis Demidov
d24749ae52 Use SHA1 for online cache keys
This makes online cache use sha1 of the program source as key.
Introduces boost::compute::detail::sha1() function, which is moved
from compute::program into its own header file.
2014-01-07 23:07:18 +04:00
Kyle Lutz
6f52e3ce1f Merge pull request #46 from ddemidov/offline-cache
Use the original program source for program creation/compilation
2014-01-07 10:11:08 -08:00
Denis Demidov
41d2052c2a Fix linkage problem with detail::getenv()
detail::getenv() function was not declared inline, which led to
`multiple definition` errors at link time when a program consisted of
multiple objects that included Boost.Compute headers.

Fixed the problem and added core.multiple_objects test.
2014-01-07 21:29:18 +04:00
Denis Demidov
f519ad3639 Use the original program source for program creation/compilation
Instead of building the program from source with the added comment
block (used for distinction between different platforms and devices
when offline cache is in use), only use the altered source for the
hash computation. This way users will not get unexpected results from
program.source().
2014-01-07 21:05:26 +04:00
Kyle Lutz
aad03486d9 Add interop support
This adds interoperability support between Boost.Compute and various
other C/C++ libraries (Eigen, OpenCV, OpenGL, Qt and VTK). This eases
development for users using external libraries with Boost.Compute.
2014-01-06 23:35:38 -08:00
Kyle Lutz
b47e74df6f Add is_fundamental type-trait 2014-01-06 23:04:36 -08:00
Kyle Lutz
eca81df028 Merge pull request #39 from ddemidov/offline-cache
Implements offline kernel caching
2014-01-06 22:47:52 -08:00
Denis Demidov
562f149b18 Implements offline kernel caching
See kylelutz/compute#21

This adds program::build_with_source() function that both creates and
builds the program for the given context with supplied source and
compile options. In case BOOST_COMPUTE_USE_OFFLINE_CACHE macro is
defined, it also saves the compiled program binary for reuse in the
offline cache located in $HOME/.boost_compute folder on UNIX-like
systems and in %APPDATA%/boost_compute folder on Windows.

All internal uses of program::create_with_source() followed by
program::build() are replaced with program::build_with_source().
2014-01-07 09:07:00 +04:00
Kyle Lutz
b17888b604 Move future header to async directory 2014-01-06 18:44:37 -08:00
Kyle Lutz
55eeada078 Add getenv() wrapper
This adds a getenv() wrapper which can be used to avoid having to
explicitly disable MSVC warnings when checking for environment
variables.
2014-01-06 07:53:07 -08:00
Kyle Lutz
e337f632da Add height() and width() methods to image2d 2014-01-05 18:36:29 -08:00
Kyle Lutz
3bc4a6366d Add BOOST_COMPUTE_STRINGIZE_SOURCE() macro 2014-01-05 18:30:34 -08:00
Kyle Lutz
6b30645d6d Remove extra semicolon in accumulate.hpp 2014-01-05 18:18:45 -08:00
Kyle Lutz
0d9be38326 Fix issues with gather() algorithm
This fixes some issues with the gather algorithm and also
adds another test for it.
2013-12-21 15:34:29 -08:00
Kyle Lutz
55783258e7 Add cache support to meta_kernel::compile()
This updates the meta_kernel::compile() method to support
caching of program objects. The programs are cached based
on a hash of their source code.
2013-12-21 11:44:02 -08:00
Kyle Lutz
ac1ff45eff Add reduce_on_gpu() algorithm
This adds a improved reduce() algorithm implementation for
GPUs. Also adds checks to accumulate() which allow it to
use the higher-performance reduce() algorithm if possible.
2013-12-21 10:56:55 -08:00
Kyle Lutz
26612823a4 Add merge() overload with custom compare function
This adds a merge() function overload which uses a custom compare
function instead of the default less<T>() to compare the values.
2013-12-07 15:15:37 -08:00
Kyle Lutz
6b6f66b6ba Add reduce() overload without function argument
This adds adds an overload of the reduce() function which
uses plus<T>() as the reductor. This simplifies the common
case of calculating the sum for a range of values.
2013-12-07 15:02:04 -08:00
Kyle Lutz
ba9e64e316 Remove init argument from reduce()
This removes the init argument from reduce. This simplifies the
implementation and avoids copying a value from the host to the
device on every call to reduce.

If an initial value is required, the accumulate function can be
called instead.
2013-12-07 14:49:46 -08:00
Kyle Lutz
7db9ad715f Fix compilation error on Windows for context error handler
This fixes a compilation error which occurs on Windows when
registering the default error handler callback when creating
a new context object.

In OpenCL 1.1 and later the callback function is expected to
use the __stdcall calling convention. This is optionally defined
by the CL_CALLBACK macro on WIN32 platforms. If available, it is
defined with the BOOST_COMPUTE_CL_CALLBACK macro which is then
used to annotate the callback functions.
2013-12-06 23:11:01 -08:00
Kyle Lutz
4b2aa35326 Increase work-group size for copy() kernel
This increases the work-group size for the copy() kernel to 256 which
improves performance on several benchmarks.
2013-11-23 12:22:55 -08:00
Kyle Lutz
701bc8a5f3 Add nth_element() algorithm
This adds an implementation of the nth_element() algorithm. For
now the algorithm is trivially implemented by calling sort().
2013-11-15 20:51:13 -08:00
Kyle Lutz
0daa62e41f Add experimental copy_index_if() algorithm
This adds an experimental algorithm like copy_if() which copies
the index of the values for which predicate returns true instead
of the values themselves.
2013-11-15 20:30:30 -08:00
Kyle Lutz
adde232fc8 Add context error handler
This adds an error handler function which is invoked when an OpenCL
context encounters an error condition. The context error is converted
to a C++ exception containing the error information and thrown.
2013-11-15 20:26:01 -08:00
Kyle Lutz
953ebb4e26 Add variadic tuple support
This adds support for variadic tuples on C++11 compilers.
2013-11-15 20:07:39 -08:00
Kyle Lutz
b5ff4743bb Add field() function
This adds a new function which will return the named field
from a value. For example, this can be used to return one of
the components of a pair object or to swizzle a vector value.
2013-11-10 15:44:45 -08:00
Kyle Lutz
8213697307 Add BOOST_COMPUTE_FUNCTION() macro
This adds a new macro to ease the definition of custom user
functions. The BOOST_COMPUTE_FUNCTION() macro creates a new
boost::compute::function<> object with the provided return
type, argument types, function name and OpenCL source code.
2013-11-10 15:32:15 -08:00
Kyle Lutz
8608e60116 Refactor invoked_function<>
This refactors the invoked_function<> classes. Previously each
function arity (e.g. unary, binary) had a separate invoked_function<>
template class. Now they all use the same class which simplifies the
logic in function<> and meta_kernel.
2013-11-10 15:31:56 -08:00
Kyle Lutz
43678410be Fix bugs with type definitions in meta_kernel
This fixes a bug in which type definitions were being inserted
into meta_kernel's multiple times. Also forces zip_iterator to
insert its type definitions when used in a kernel.
2013-11-10 15:13:46 -08:00
Kyle Lutz
a0b635e201 Add type_name<void>() specialization
This adds a type_name<>() specialization for void types.
2013-11-10 14:35:04 -08:00
Kyle Lutz
85812f4e93 Add BOOST_COMPUTE_TYPE_NAME() macro
This adds a macro for registering custom type names for C++ types
to be used in OpenCL kernel code. Internally the macro specializes
the type_name<T>() function.
2013-10-02 21:40:22 -04:00
Kyle Lutz
a2b7595f36 Make type_name<T>() inline
This adds the inline specifier to the type_name<T>() function.
2013-10-02 21:23:09 -04:00
Kyle Lutz
feb510a019 Add unpack() function adaptor
This adds a new unpack() function adaptor which converts
a function with N arguments to a function which takes a
single tuple argument with N components.

This is useful for calling built-in functions with the tuples
values returned from zip_iterator. This also removes the now
un-needed binary_transform_iterator.
2013-09-24 23:05:08 -04:00
Kyle Lutz
736f3a17a6 Add min_and_max reduce() test
This adds a test for computing the minimum and maximum
values of a vector simultaneously using reduce() with a
custom reduction function.

Also fixes a bug in reduce() in which inplace_reduce() was
being used even if the input type and result type differed.
2013-09-24 22:47:16 -04:00
Kyle Lutz
a1155bc343 Store source strings for binary and ternary functions
This fixes an issue in which the source strings for binary
and ternary functions were not being stored and thus not
being inserted into kernels when they were invoked.
2013-09-24 22:42:50 -04:00
Kyle Lutz
dc6b3228eb Add as() and convert() type-conversion functions
This adds the as() and convert() functions for converting
between OpenCL types.
2013-09-24 22:27:50 -04:00
Kyle Lutz
3412d0935d Add not1() and not2() function adaptors
This adds the not1() and not2() function adaptors which
negate unary and binary functions respectively.
2013-09-24 22:22:52 -04:00
Kyle Lutz
07e4a6b3aa Remove BLAS functions
This removes the incomplete BLAS API functions.
2013-09-24 22:19:56 -04:00
Kyle Lutz
d16309f57e Add program_cache
This adds a program cache which can be used by algorithms and other
functions to store programs which may be re-used. This improves
performance by reducing the need for costly recompilation of commonly
used programs.

Program caches are context specific and multiple copies of the same
context will use the same program cache. They are created and accessed
by the global get_program_cache() function.

For now, only a few algorithms and functions (radix sort, mersenne
twister, fixed size sorts) make use of the program cache.
2013-09-07 22:58:34 -04:00
Kyle Lutz
d04e628367 Add experimental sort_by_transform() algorithm
This adds a sort_by_transform() algorithm which sorts a sets of
values based on the value of a transform function.

For example, this can be used to sort a set of vectors by their
length (when used with the length<T>() function) or by a single
component (when used with the get<N>() function).
2013-09-07 17:10:15 -04:00
Kyle Lutz
3389a5c741 Add sort_by_key() algorithm
This adds a new sort_by_key() algorithm which sorts a range
of values by a range of keys with a comparison operator.

For now this is only implemented by the serial insertion sort
algorithm. In the future it will be ported to the other sorting
algorithms (e.g. radix sort).
2013-09-07 17:02:08 -04:00
Kyle Lutz
f9d887e30d Add experimental tabulate() algorithm
This adds a tabulate() algorithm which fills a range with values
calculated from a function given each elements index.
2013-09-07 16:53:08 -04:00
Kyle Lutz
a96c9c0182 Add result argument to reduce() algorithm
This adds an output iterator result argument to the reduce()
algorithm. Now, instead of returning the reduced result, the
result is written to an output iterator. This allows the value
to stay on the device and avoids a device-to-host copy in cases
where the result is not needed on the host (e.g. it is part of
a larger computation).

This is an API breaking change to users of reduce(). Affected code
should now declare a result variable and then pass a pointer to it
as the new result argument.
2013-09-07 15:36:49 -04:00
Kyle Lutz
a8f4421739 Add copy() specialization for host-to-host transfers
This adds a copy() specialization for host-to-host transfers
which simply forwards the call to std::copy().

This is useful in templated algorithms which may in certain
circumstances copy() between data ranges on the host.
2013-09-07 15:29:48 -04:00
Kyle Lutz
78a561eff1 Add scan_on_cpu() algorithm
This adds a new scan_on_cpu() algorithm which implements the scan()
algorithm for CPU devices. Also renames the existing scan() algorithm
to scan_on_gpu().

This fixes some tests failures on POCL which were caused by the prior
GPU scan() algorithm not functioning properly with POCL.
2013-09-07 15:03:42 -04:00
Kyle Lutz
518d39fc2b Use bitwise-and to check device::type()
This changes the checks for the device type to use the bitwise-and
operator instead of the equaility operator. The returned type is a
bitset and this would cause errors when multiple bits were set.

This fixes a bug on POCL which returns the device type as a
combination of CL_DEVICE_TYPE_DEFAULT and CL_DEVICE_TYPE_CPU. Now
the correct device type (device::cpu) is detected for POCL.
2013-09-07 14:16:20 -04:00
Kyle Lutz
3a7b90ff06 Fix issue with comparison operators in lambda expressions
This fixes an issue in which comparison operators (e.g. <, ==)
in lambda expressions would return the wrong result type causing
compilation errors.

Also adds a few test cases to ensure the correct result type
and that lambda expressions can be properly used with count_if().
2013-08-15 22:10:03 -04:00