From 7e8b2bb40357d82b018c4f4a7fd9fe4498f39952 Mon Sep 17 00:00:00 2001
From: Hans Dembinski
Date: Fri, 13 Oct 2017 11:20:59 +0200
Subject: [PATCH] replaced container_storage with array_storage, simplified comparison between storage types

---
 doc/benchmarks.qbk                            |   6 +-
 doc/changelog.qbk                             |   2 +-
 doc/concepts.qbk                              |   2 +-
 doc/rationale.qbk                             |   2 +-
 include/boost/histogram.hpp                   |   2 +-
 include/boost/histogram/axis.hpp              |   2 +-
 include/boost/histogram/histogram_fwd.hpp     |   4 +-
 .../histogram/histogram_impl_dynamic.hpp      |   1 +
 .../boost/histogram/histogram_impl_static.hpp |   1 +
 include/boost/histogram/serialization.hpp     |   6 +-
 .../histogram/storage/adaptive_storage.hpp    |  58 +--------
 .../boost/histogram/storage/array_storage.hpp | 112 ++++++++++++++++++
 .../histogram/storage/container_storage.hpp   |  99 ----------------
 include/boost/histogram/storage/operators.hpp |  35 ++++++
 test/adaptive_storage_test.cpp                |  36 +++---
 ...torage_test.cpp => array_storage_test.cpp} |  46 ++-----
 test/histogram_test.cpp                       |  33 +++---
 test/speed_cpp.cpp                            |  19 ++-
 18 files changed, 215 insertions(+), 251 deletions(-)
 create mode 100644 include/boost/histogram/storage/array_storage.hpp
 delete mode 100644 include/boost/histogram/storage/container_storage.hpp
 create mode 100644 include/boost/histogram/storage/operators.hpp
 rename test/{container_storage_test.cpp => array_storage_test.cpp} (59%)

diff --git a/doc/benchmarks.qbk b/doc/benchmarks.qbk
index 7aa23b61..e8bb4e52 100644
--- a/doc/benchmarks.qbk
+++ b/doc/benchmarks.qbk
@@ -10,13 +10,13 @@ The following plot shows results of a benchmark on a 9 GHz Macbook Pro. Random n
  [[root] [[@https://root.cern.ch ROOT classes] (`TH1I` for 1D, `TH3I` for 3D and `THnI` for 6D)]]
  [[py:numpy] [numpy functions ([python]`numpy.histogram` for 1D, `numpy.histogramdd` for 2D, 3D, and 6D)]]
  [[py:hd_sd] [[classref boost::histogram::histogram] with [classref boost::histogram::adaptive_storage<>], called from Python]]
- [[hs_ss] [[classref boost::histogram::histogram] with [classref boost::histogram::container_storage>]]]
+ [[hs_ss] [[classref boost::histogram::histogram] with [classref boost::histogram::array_storage]]]
  [[hs_sd] [[classref boost::histogram::histogram] with [classref boost::histogram::adaptive_storage<>]]]
- [[hd_ss] [[classref boost::histogram::histogram] with [classref boost::histogram::container_storage>]]]
+ [[hd_ss] [[classref boost::histogram::histogram] with [classref boost::histogram::array_storage]]]
  [[hd_sd] [[classref boost::histogram::histogram] with [classref boost::histogram::adaptive_storage<>]]]
 ]
-[classref boost::histogram::histogram] is always faster than [classref boost::histogram::histogram] and safer to use, as more checks are done at compile time. It is recommended when working in C++ only. [classref boost::histogram::adaptive_storage] is faster than [classref boost::histogram::container_storage] for histograms with many bins, because it uses the cache more effectively due to its smaller memory consumption per bin. If the number of bins is small, it is slower because of the instruction and allocation overhead of handling memory in a dynamic way.
+[classref boost::histogram::histogram] is always faster than [classref boost::histogram::histogram] and safer to use, as more checks are done at compile time. It is recommended when working in C++ only. [classref boost::histogram::adaptive_storage] is faster than [classref boost::histogram::array_storage] for histograms with many bins, because it uses the cache more effectively due to its smaller memory consumption per bin. If the number of bins is small, it is slower because of overhead of handling memory in a dynamic way.
 The histograms in this library are mostly faster than the competition, in some cases by a factor of 2. Simultaneously they are more flexible, since binning strategies can be customised. The Python-wrapped histogram is slower than numpy's own specialized function for 1D, but beats numpy's multi-dimensional histogramming function by a factor 2 to 3.
diff --git a/doc/changelog.qbk b/doc/changelog.qbk
index 85fba763..2d91c097 100644
--- a/doc/changelog.qbk
+++ b/doc/changelog.qbk
@@ -5,7 +5,7 @@
 * Added static_histogram (v1.0 only had dynamic_histogram).
 * Merged wfill(...) and fill(...) interface.
 * Support custom allocators in storage classes.
-* Replaced static_storage with container_storage, which may use any STL-compatible container with random access iterators as a backend, including std::array.
+* Replaced static_storage with array_storage.
 * Replaced dynamic_storage with adaptive_storage, which adds the capability to grow the bin counter into a cpp_int, thus avoiding integer overflow completely.
 * Serialization uses binary_archive instead of text_archive. The latter is portable, but the performance is terrible.
 * Python interface changed: histograms are now iterable, returning axis classes
diff --git a/doc/concepts.qbk b/doc/concepts.qbk
index e5e88695..00f53c8f 100644
--- a/doc/concepts.qbk
+++ b/doc/concepts.qbk
@@ -54,7 +54,7 @@ To support weighted fills, an additional method is required:
 * `void weighted_increase(std::size_t index, value_type weight)`
-[classref boost::histogram::container_storage] is a simple example of a storage type which does not support weighted fills.
+[classref boost::histogram::array_storage] is a simple example of a storage type which does not support weighted fills.
 [endsect]
diff --git a/doc/rationale.qbk b/doc/rationale.qbk
index c48d9e84..044665be 100644
--- a/doc/rationale.qbk
+++ b/doc/rationale.qbk
@@ -54,7 +54,7 @@ Library users can create their own axis classes and use them with the library, b
 [section:storage_types Storage types]
-Dense (aka contiguous) storage in memory is needed for fast bin lookup, which is of the random-access variety and may be happening in a tight loop. All storage types therefore implement dense storage of bin counters. [classref boost::histogram::container_storage] implements a storage based on an STL-conforming container and that could be the end of story, but there are several issues with this approach. For one, it is not convenient, because the user has to decide what type to use to hold the bin counts and it is not an obvious choice. The integer needs to be large enough to avoid counter overflow, but if it is too large and only a fraction of the bits are used, then it is a waste of memory. Using floating point numbers is even more dangerous. They don't overflow, but cap the bin count when the bits in the mantissa are used up.
+Dense (aka contiguous) storage in memory is needed for fast bin lookup, which is of the random-access variety and may be happening in a tight loop. All storage types therefore implement dense storage of bin counters. [classref boost::histogram::array_storage] implements a storage based on a heap-allocated array. That could be the end of story, but there are several issues with this approach. For one, it is not convenient, because the user has to decide what type to use to hold the bin counts and it is not an obvious choice. The integer needs to be large enough to avoid counter overflow, but if it is too large and only a fraction of the bits are used, then it is a waste of memory. Using floating point numbers is even more dangerous. They don't overflow, but cap the bin count when the bits in the mantissa are used up.
 The standard storage used in the library is [classref boost::histogram::adaptive_storage], which solves these issues in an effective way, based on the following insight.
diff --git a/include/boost/histogram.hpp b/include/boost/histogram.hpp
index d42561f8..d1409d64 100644
--- a/include/boost/histogram.hpp
+++ b/include/boost/histogram.hpp
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 /**
diff --git a/include/boost/histogram/axis.hpp b/include/boost/histogram/axis.hpp
index f6bf8ebf..f57852a6 100644
--- a/include/boost/histogram/axis.hpp
+++ b/include/boost/histogram/axis.hpp
@@ -22,11 +22,11 @@
 #endif
 #include
 #include
+#include
 #include
 #include
 #include
 #include
-#include
 // forward declaration for serialization
 namespace boost {
diff --git a/include/boost/histogram/histogram_fwd.hpp b/include/boost/histogram/histogram_fwd.hpp
index b188cd09..fa527aaf 100644
--- a/include/boost/histogram/histogram_fwd.hpp
+++ b/include/boost/histogram/histogram_fwd.hpp
@@ -8,7 +8,6 @@
 #define _BOOST_HISTOGRAM_HISTOGRAM_FWD_HPP_
 #include
-#include
 #include
 #include
@@ -18,6 +17,9 @@ namespace histogram {
 using Static = std::integral_constant;
 using Dynamic = std::integral_constant;
+template