This adds a get<N>() function which returns the n'th element
of an aggregate type (e.g. vector type, pair, tuple).
This unifies the functionality of, and replaces, the get_pair()
and vector_component() functions.
This changes the vector class to not auto-initialize values
when it is created or resized. This improves performance by
eliminating a call to fill(). If needed, user code can call
fill() explicitly on the newly allocated values.
This removes the documentation for the non-existent platforms()
and platform_count() methods in the platform class. These methods
have been moved to the system class and are documented there.
This changes the enqueue_nd_range_kernel() method to return an
event object. This allows clients to monitor the progress of a
kernel executing on a device.
This cleans up the constructor methods for the OpenCL wrapper
classes and unifies the API used for creating a wrapper class
object from the underlying OpenCL objects.
Now, every wrapper class has a constructor taking the OpenCL
object and an optional boolean retain parameter which indicates
whether the constructor should increment the reference count.
This remove the enqueue_wait_for_event() method from the
command_queue class as the clEnqueueWaitForEvents() function
has been deprecated in OpenCL 1.2.
This implements the merge() algorithm which merges two
ranges of sorted values into a single sorted range.
The current implementation uses a simple serial merge
algorithm. A GPU optimized version is coming soon.