From ed8fe334645659b5e731948d736250b1858d86b4 Mon Sep 17 00:00:00 2001 From: Daniel James Date: Wed, 18 Jan 2017 07:46:47 -0500 Subject: [PATCH] Update fiber documentation --- .../fiber/doc/html/fiber/performance.html | 262 ++++++++ .../libs/fiber/doc/html/fiber/stack.html | 632 ++++++++++++++++++ .../fiber/doc/html/fiber/stack/valgrind.html | 48 ++ 3 files changed, 942 insertions(+) create mode 100644 boost_1_63_0/libs/fiber/doc/html/fiber/performance.html create mode 100644 boost_1_63_0/libs/fiber/doc/html/fiber/stack.html create mode 100644 boost_1_63_0/libs/fiber/doc/html/fiber/stack/valgrind.html diff --git a/boost_1_63_0/libs/fiber/doc/html/fiber/performance.html b/boost_1_63_0/libs/fiber/doc/html/fiber/performance.html new file mode 100644 index 0000000..9ef5e63 --- /dev/null +++ b/boost_1_63_0/libs/fiber/doc/html/fiber/performance.html @@ -0,0 +1,262 @@ + + + +Performance + + + + + + + + + + + + + + + +
Boost C++ LibrariesHomeLibrariesPeopleFAQMore
+
+
+
+ + +

+ Performance measurements were taken using std::chrono::high_resolution_clock, + with overhead corrections. The code was compiled with gcc-6.20, using build + options: variant = release, optimization = speed. Tests were executed on an Intel + Core i7-4770S 3.10GHz, 4 cores with 8 hyperthreads (4C/8T), running Linux (4.7.0/x86_64).
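The overhead correction mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the benchmark harness used for the tables: the helper names `clock_overhead` and `measure` are hypothetical, and the overhead is estimated simply by timing back-to-back clock reads.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

using clock_type  = std::chrono::high_resolution_clock;
using duration_ns = std::chrono::nanoseconds;

// Hypothetical helper: estimate the cost of one clock_type::now() call
// by timing many back-to-back calls and averaging.
inline duration_ns clock_overhead(std::size_t samples = 1000) {
    assert(samples > 0);
    auto const start = clock_type::now();
    for (std::size_t i = 0; i < samples; ++i) {
        (void)clock_type::now();
    }
    auto const stop = clock_type::now();
    return std::chrono::duration_cast<duration_ns>(stop - start)
           / static_cast<duration_ns::rep>(samples);
}

// Hypothetical helper: time a callable and subtract the estimated clock
// overhead, clamping the result at zero.
template <typename Fn>
duration_ns measure(Fn&& fn, duration_ns overhead) {
    auto const start = clock_type::now();
    fn();
    auto const stop = clock_type::now();
    auto const raw = std::chrono::duration_cast<duration_ns>(stop - start);
    return raw > overhead ? raw - overhead : duration_ns::zero();
}
```

A caller would compute the overhead once and pass it to every `measure` call, so that all measurements are corrected by the same amount.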

+

+ Measurements headed 1C/1T were run in a single-threaded process. +

+

+ The microbenchmark skynet + from Alexander Temerev was ported and used for performance measurements. At + the root the test spawns 10 threads-of-execution (ToE), e.g. actor/goroutine/fiber + etc. Each spawned ToE spawns 10 additional ToEs ... until 100000 ToEs are + created. ToEs return their ordinal numbers (0 ... 99999), which are summed + on the previous level and sent back upstream, until reaching the root. The + test was run 10-20 times, producing a range of values for each measurement.
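The shape of the skynet recursion can be sketched sequentially. This is a plain function rather than the ported benchmark: the name `skynet` and its parameters are assumptions made for illustration, and each recursive call stands in for what would be a separately spawned ToE.

```cpp
#include <cassert>
#include <cstdint>

// Sequential sketch: a node covers `size` ordinals starting at `first` and
// splits its range across `fanout` children. Assumes `size` is a power of
// `fanout`, as in the benchmark (100000 = 10^5).
inline std::uint64_t skynet(std::uint64_t first, std::uint64_t size,
                            std::uint64_t fanout) {
    assert(size >= 1 && fanout > 1);
    if (size == 1) {
        return first;  // a leaf returns its own ordinal number
    }
    assert(size % fanout == 0);
    std::uint64_t const child_size = size / fanout;
    std::uint64_t sum = 0;
    for (std::uint64_t i = 0; i < fanout; ++i) {
        // In the benchmark each child would be its own fiber/thread/actor,
        // and the partial sums would be sent back to the parent.
        sum += skynet(first + i * child_size, child_size, fanout);
    }
    return sum;
}
```

With 100000 leaves and a fan-out of 10, the root receives the sum of 0 ... 99999, which is 4999950000.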

+
+

Table 1.1. time per actor/erlang process/goroutine (other languages) (average over + 100,000)

+
+ Haskell | stack-1.0.4                    0.32 µs
+ Erlang | erts-7.0                        0.64 µs - 1.21 µs
+ Go | go1.6.1 (GOMAXPROCS == default)     1.52 µs - 1.64 µs
+ Go | go1.6.1 (GOMAXPROCS == 8)           0.70 µs - 0.98 µs
+
+

+ std::thread cannot be tested at this time (C++14) + because its API does not allow setting the thread stack size (default on Linux + 2MB, on Windows 1MB), so an out-of-memory error is likely. With pthreads the stack + size is set to 8kB.

+
+

Table 1.2. time per thread (average over *10,000* - unable to spawn 100,000 threads)

+
+ pthread        14.4 µs - 20.8 µs
+ std::thread    18.8 µs - 28.1 µs
+
+

+ The test utilizes 4 cores with Simultaneous Multithreading enabled (8 logical + CPUs). The fiber stacks are allocated by fixedsize_stack.

+

+ As the benchmark shows, the memory allocation algorithm is significant for + performance in a multithreaded environment. The tests use glibc’s memory allocation + algorithm (based on ptmalloc2) as well as Google’s TCMalloc + (via linkflags="-ltcmalloc").[9]

+

+ The shared_work scheduling algorithm uses one global queue, + containing fibers ready to run, shared between all threads. The work is distributed + equally over all threads. In the work_stealing scheduling + algorithm, each thread has its own local queue. Fibers that are ready to run + are pushed to and popped from the local queue. If the queue runs out of ready + fibers, fibers are stolen from the local queues of other participating threads. +
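The difference between the two queueing strategies can be sketched with toy data structures. The types below are hypothetical and deliberately unsynchronized; they are not the Boost.Fiber `shared_work` or `work_stealing` classes, and a real scheduler would need locking or lock-free queues. Stealing from the opposite end of the victim's queue is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

using task_id = int;  // hypothetical stand-in for a ready fiber

// Work sharing: one global ready queue used by every worker.
struct shared_queue {
    std::deque<task_id> ready;

    void push(task_id t) { ready.push_back(t); }

    bool pop(task_id& out) {
        if (ready.empty()) return false;
        out = ready.front();
        ready.pop_front();
        return true;
    }
};

// Work stealing: one local ready queue per worker.
struct stealing_queues {
    std::vector<std::deque<task_id>> local;

    explicit stealing_queues(std::size_t workers) : local(workers) {}

    void push(std::size_t worker, task_id t) { local[worker].push_back(t); }

    // Take from the worker's own queue first; if it is empty, steal from
    // the back of another worker's queue.
    bool pop(std::size_t worker, task_id& out) {
        if (!local[worker].empty()) {
            out = local[worker].front();
            local[worker].pop_front();
            return true;
        }
        for (std::size_t victim = 0; victim < local.size(); ++victim) {
            if (victim != worker && !local[victim].empty()) {
                out = local[victim].back();
                local[victim].pop_back();
                return true;
            }
        }
        return false;  // nothing ready anywhere
    }
};
```

In the shared variant every worker contends on the same queue, whereas in the stealing variant a worker touches another queue only when its own has run dry.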

+
+

Table 1.3. time per fiber (average over 100,000)

+
+ fiber (4C/8T, work stealing, tcmalloc)    0.13 µs - 0.26 µs
+ fiber (4C/8T, work stealing)              0.35 µs - 0.66 µs
+ fiber (4C/8T, work sharing, tcmalloc)     0.62 µs - 0.80 µs
+ fiber (4C/8T, work sharing)               0.90 µs - 1.11 µs
+ fiber (1C/1T, round robin, tcmalloc)      0.90 µs - 1.03 µs
+ fiber (1C/1T, round robin)                0.91 µs - 1.28 µs
+
+
+

+

[9] + Tais B. Ferreira, Rivalino Matias, Autran Macedo, Lucio B. Araujo, “An + Experimental Study on Memory Allocators in Multicore and Multithreaded Applications”, + PDCAT ’11 Proceedings of the 2011 12th International Conference on Parallel + and Distributed Computing, Applications and Technologies, pages 92-98

+
+
+ + + +
+
+
+ + diff --git a/boost_1_63_0/libs/fiber/doc/html/fiber/stack.html b/boost_1_63_0/libs/fiber/doc/html/fiber/stack.html new file mode 100644 index 0000000..71de2b1 --- /dev/null +++ b/boost_1_63_0/libs/fiber/doc/html/fiber/stack.html @@ -0,0 +1,632 @@ + + + +Stack allocation + + + + + + + + + + + + + + + +
+
+
+
+ + +

+ A fiber internally uses an execution_context + which manages a set of registers and a stack. The memory used by the stack + is allocated/deallocated via a stack_allocator which is + required to model a stack-allocator + concept.

+

+ A stack_allocator can be passed to fiber::fiber() or to fibers::async(). +

+

+ + stack-allocator + concept +

+

+ A stack_allocator must satisfy the stack-allocator + concept requirements shown in the following table, in which a is an object of a stack_allocator + type, sctx is a stack_context, and size is a std::size_t: +

+
+ expression            return type      notes
+ a(size)                                creates a stack allocator
+ a.allocate()          stack_context    creates a stack
+ a.deallocate(sctx)    void             deallocates the stack created by a.allocate()
+
+ + + + + +
[Important]Important

+ The implementation of allocate() might include logic to protect against + exceeding the context's available stack size rather than leaving it as undefined + behaviour. +

+
+ + + + + +
[Important]Important

+ Calling deallocate() + with a stack_context not obtained from + allocate() + results in undefined behaviour. +

+
+ + + + + +
[Note]Note

+ The memory for the stack is not required to be aligned; alignment takes place + inside execution_context. +

+

+ See also Boost.Context + stack allocation. In particular, traits_type + methods are as described for boost::context::stack_traits. +

+
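The three expressions of the stack-allocator concept can be illustrated with a toy type. Everything below is a hypothetical sketch: `toy_stack_context` and `toy_malloc_stack` are not Boost types, the context holds only the two members referred to in this section (`sp` and `size`), and a downward-growing stack is assumed so that `sp` is the highest address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for stack_context; not the Boost type.
struct toy_stack_context {
    void*       sp   = nullptr;  // assumed: highest address of the stack
    std::size_t size = 0;        // size of the stack in bytes
};

class toy_malloc_stack {
public:
    // a(size): creates a stack allocator
    explicit toy_malloc_stack(std::size_t size) : size_(size) {}

    // a.allocate(): creates a stack
    toy_stack_context allocate() {
        void* base = std::malloc(size_);
        if (base == nullptr) throw std::bad_alloc();
        toy_stack_context sctx;
        sctx.size = size_;
        sctx.sp   = static_cast<char*>(base) + size_;  // top of the block
        return sctx;
    }

    // a.deallocate(sctx): releases a stack obtained from allocate()
    void deallocate(toy_stack_context& sctx) {
        assert(sctx.sp != nullptr);
        std::free(static_cast<char*>(sctx.sp) - sctx.size);  // back to base
        sctx = toy_stack_context();
    }

private:
    std::size_t size_;
};
```

Because `deallocate()` recovers the base address from `sp` and `size`, passing a context that did not come from `allocate()` would hand `std::free()` an arbitrary pointer, which mirrors the undefined behaviour noted above.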

+

+
+ + + Class + protected_fixedsize_stack +
+

+

+

+ Boost.Fiber provides the class protected_fixedsize_stack which + models the stack-allocator + concept. It appends a guard page at the end of each stack + to protect against exceeding the stack. If the guard page is accessed (read + or write operation) a segmentation fault/access violation is generated by the + operating system. +

+
+ + + + + +
[Important]Important

+ Using protected_fixedsize_stack is expensive. + Launching a new fiber with a stack of this type incurs the overhead of setting + the memory protection; once allocated, this stack is just as efficient to + use as fixedsize_stack. +

+
+ + + + + +
[Note]Note

+ The appended guard page + is not mapped to physical memory, only virtual + addresses are used. +

+
#include <boost/fiber/protected_fixedsize_stack.hpp>
+
+namespace boost {
+namespace fibers {
+
+struct protected_fixedsize_stack {
+    protected_fixedsize_stack(std::size_t size = traits_type::default_size());
+
+    stack_context allocate();
+
+    void deallocate( stack_context &);
+};
+
+}}
+
+
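How a guard page can be attached to a stack may be sketched on a POSIX system with `mmap()` and `mprotect()`. This is an assumption-laden illustration, not the library's implementation: the names `guarded_stack`, `allocate_guarded` and `deallocate_guarded` are hypothetical, and a downward-growing stack is assumed, so the guard page sits at the lowest address.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type; not part of Boost.Fiber.
struct guarded_stack {
    void*       sp;    // highest address (downward-growing stack assumed)
    std::size_t size;  // total mapping size, guard page included
};

inline std::size_t page_size() {
    return static_cast<std::size_t>(::sysconf(_SC_PAGESIZE));
}

inline guarded_stack allocate_guarded(std::size_t usable) {
    std::size_t const page  = page_size();
    // Round the usable size up to whole pages, then add one guard page.
    std::size_t const pages = (usable + page - 1) / page;
    std::size_t const total = (pages + 1) * page;
    void* base = ::mmap(nullptr, total, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) throw std::bad_alloc();
    // Revoke all access to the lowest page; touching it raises a fault.
    if (::mprotect(base, page, PROT_NONE) != 0) {
        ::munmap(base, total);
        throw std::bad_alloc();
    }
    return guarded_stack{static_cast<char*>(base) + total, total};
}

inline void deallocate_guarded(guarded_stack& s) {
    assert(s.sp != nullptr);
    ::munmap(static_cast<char*>(s.sp) - s.size, s.size);
    s.sp   = nullptr;
    s.size = 0;
}
```

The extra `mprotect()` system call per stack is the kind of cost that makes this style of allocation more expensive at fiber launch, while ordinary use of the stack afterwards is unaffected.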

+

+
+ + + Member + function allocate() +
+

+

+
stack_context allocate();
+
+
+

+
+
Preconditions:
+

+ traits_type::minimum_size() + <= size + and traits_type::is_unbounded() + || ( + size <= + traits_type::maximum_size() + ). +

+
Effects:
+

+ Allocates memory of at least size + bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture + (the stack grows downwards/upwards) the stored address is the highest/lowest + address of the stack. +

+
+
+
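The size precondition can be read as "at least the minimum, and either unbounded or at most the maximum". A minimal sketch of that reading follows; the traits types and their limits are invented for illustration and are not the values reported by the real `traits_type`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical traits with made-up limits; real values come from traits_type.
struct toy_traits {
    static bool        is_unbounded() { return false; }
    static std::size_t minimum_size() { return 8 * 1024; }
    static std::size_t maximum_size() { return 8 * 1024 * 1024; }
};

// Same limits, but reporting that no upper bound applies.
struct toy_unbounded_traits : toy_traits {
    static bool is_unbounded() { return true; }
};

// Assumed grouping of the precondition:
// minimum_size() <= size and ( is_unbounded() or size <= maximum_size() )
template <typename Traits>
bool size_is_valid(std::size_t size) {
    return Traits::minimum_size() <= size
        && (Traits::is_unbounded() || size <= Traits::maximum_size());
}
```

The explicit parentheses matter: without them the `and` would bind more tightly than the `||`, which would give a different condition.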

+

+
+ + + Member + function deallocate() +
+

+

+
void deallocate( stack_context & sctx);
+
+
+

+
+
Preconditions:
+

+ sctx.sp is valid, traits_type::minimum_size() <= sctx.size and traits_type::is_unbounded() || ( sctx.size <= traits_type::maximum_size() ). +

+
Effects:
+

+ Deallocates the stack space. +

+
+
+

+

+
+ + + Class + pooled_fixedsize_stack +
+

+

+

+ Boost.Fiber provides the class pooled_fixedsize_stack which + models the stack-allocator + concept. In contrast to protected_fixedsize_stack it + does not append a guard page at the end of each stack. The memory is managed + internally by boost::pool<>. +

+
#include <boost/fiber/pooled_fixedsize_stack.hpp>
+
+namespace boost {
+namespace fibers {
+
+struct pooled_fixedsize_stack {
+    pooled_fixedsize_stack(std::size_t stack_size = traits_type::default_size(), std::size_t next_size = 32, std::size_t max_size = 0);
+
+    stack_context allocate();
+
+    void deallocate( stack_context &);
+};
+
+}}
+
+

+

+
+ + + Constructor +
+

+

+
pooled_fixedsize_stack(std::size_t stack_size, std::size_t next_size, std::size_t max_size);
+
+
+

+
+
Preconditions:
+

+ traits_type::is_unbounded() + || ( + traits_type::maximum_size() + >= stack_size) and 0 + < next_size. +

+
Effects:
+

+ Allocates memory of at least stack_size + bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture + (the stack grows downwards/upwards) the stored address is the highest/lowest + address of the stack. Argument next_size + determines the number of stacks to request from the system the first + time that *this + needs to allocate system memory. The third argument max_size + controls how much memory might be allocated for stacks — a value of zero + means no upper limit. +

+
+
+
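The role of next_size can be illustrated with a toy pool that requests memory for several stacks at once and recycles freed ones. This sketch is not boost::pool<>: the class `toy_stack_pool` is hypothetical, it always grows by the same number of stacks, and it ignores max_size entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pool of equally sized stacks; not thread safe.
class toy_stack_pool {
public:
    toy_stack_pool(std::size_t stack_size, std::size_t next_size)
        : stack_size_(stack_size), next_size_(next_size) {
        assert(stack_size > 0 && next_size > 0);
    }

    // Hands out one stack-sized block; requests a new chunk from the
    // system only when the free list is empty.
    char* allocate() {
        if (free_.empty()) grow();
        char* block = free_.back();
        free_.pop_back();
        return block;
    }

    // Returns a block to the free list for reuse; memory goes back to
    // the system only when the pool itself is destroyed.
    void deallocate(char* block) { free_.push_back(block); }

    std::size_t chunks() const { return chunks_.size(); }

private:
    // Request room for next_size_ stacks in a single allocation.
    void grow() {
        std::unique_ptr<char[]> chunk(new char[stack_size_ * next_size_]);
        for (std::size_t i = 0; i < next_size_; ++i) {
            free_.push_back(chunk.get() + i * stack_size_);
        }
        chunks_.push_back(std::move(chunk));
    }

    std::size_t stack_size_;
    std::size_t next_size_;
    std::vector<std::unique_ptr<char[]>> chunks_;
    std::vector<char*> free_;
};
```

Launching many fibers then costs one system allocation per next_size stacks rather than one per fiber, and a stack released by a finished fiber can be handed straight to the next one.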

+

+
+ + + Member + function allocate() +
+

+

+
stack_context allocate();
+
+
+

+
+
Preconditions:
+

+ traits_type::is_unbounded() + || ( + traits_type::maximum_size() + >= stack_size). +

+
Effects:
+

+ Allocates memory of at least stack_size + bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture + (the stack grows downwards/upwards) the stored address is the highest/lowest + address of the stack. +

+
+
+

+

+
+ + + Member + function deallocate() +
+

+

+
void deallocate( stack_context & sctx);
+
+
+

+
+
Preconditions:
+

+ sctx.sp is valid, traits_type::is_unbounded() || ( traits_type::maximum_size() >= sctx.size). +

+
Effects:
+

+ Deallocates the stack space. +

+
+
+
+ + + + + +
[Note]Note

+ This stack allocator is not thread safe. +

+

+

+
+ + + Class + fixedsize_stack +
+

+

+

+ Boost.Fiber provides the class fixedsize_stack which + models the stack-allocator + concept. In contrast to protected_fixedsize_stack it + does not append a guard page at the end of each stack. The memory is simply + managed by std::malloc() + and std::free(). +

+
#include <boost/fiber/fixedsize_stack.hpp>
+
+namespace boost {
+namespace fibers {
+
+struct fixedsize_stack {
+    fixedsize_stack(std::size_t size = traits_type::default_size());
+
+    stack_context allocate();
+
+    void deallocate( stack_context &);
+};
+
+}}
+
+

+

+
+ + + Member function + allocate() +
+

+

+
stack_context allocate();
+
+
+

+
+
Preconditions:
+

+ traits_type::minimum_size() + <= size + and traits_type::is_unbounded() + || ( + traits_type::maximum_size() + >= size). +

+
Effects:
+

+ Allocates memory of at least size + bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture + (the stack grows downwards/upwards) the stored address is the highest/lowest + address of the stack. +

+
+
+

+

+
+ + + Member + function deallocate() +
+

+

+
void deallocate( stack_context & sctx);
+
+
+

+
+
Preconditions:
+

+ sctx.sp is valid, traits_type::minimum_size() <= sctx.size and traits_type::is_unbounded() || ( traits_type::maximum_size() >= sctx.size). +

+
Effects:
+

+ Deallocates the stack space. +

+
+
+

+

+
+ + + Class + segmented_stack +
+

+

+

+ Boost.Fiber supports usage of a segmented_stack, + i.e. the stack grows on demand. The fiber is created with a minimal stack size + which will be increased as required. Class segmented_stack models + the stack-allocator concept. + In contrast to protected_fixedsize_stack and + fixedsize_stack it creates a stack which grows on demand. +

+
+ + + + + +
[Note]Note

+ Segmented stacks are currently only supported by gcc + from version 4.7 and clang + from version 3.4 onwards. In order to use + a segmented_stack Boost.Fiber + must be built with property segmented-stacks, + e.g. toolset=gcc segmented-stacks=on on the + b2/bjam command line.

+
#include <boost/fiber/segmented_stack.hpp>
+
+namespace boost {
+namespace fibers {
+
+struct segmented_stack {
+    segmented_stack(std::size_t stack_size = traits_type::default_size());
+
+    stack_context allocate();
+
+    void deallocate( stack_context &);
+};
+
+}}
+
+

+

+
+ + + Member function + allocate() +
+

+

+
stack_context allocate();
+
+
+

+
+
Preconditions:
+

+ traits_type::minimum_size() + <= size + and traits_type::is_unbounded() + || ( + traits_type::maximum_size() + >= size). +

+
Effects:
+

+ Allocates memory of at least size + bytes and stores a pointer to the stack and its actual size in sctx. Depending on the architecture + (the stack grows downwards/upwards) the stored address is the highest/lowest + address of the stack. +

+
+
+

+

+
+ + + Member + function deallocate() +
+

+

+
void deallocate( stack_context & sctx);
+
+
+

+
+
Preconditions:
+

+ sctx.sp is valid, traits_type::minimum_size() <= sctx.size and traits_type::is_unbounded() || ( traits_type::maximum_size() >= sctx.size). +

+
Effects:
+

+ Deallocates the stack space. +

+
+
+
+ + + + + +
[Note]Note

+ If the library is compiled for segmented stacks, segmented_stack is + the only available stack allocator. +

+
+ + + +
+
+
+ + diff --git a/boost_1_63_0/libs/fiber/doc/html/fiber/stack/valgrind.html b/boost_1_63_0/libs/fiber/doc/html/fiber/stack/valgrind.html new file mode 100644 index 0000000..a01c651 --- /dev/null +++ b/boost_1_63_0/libs/fiber/doc/html/fiber/stack/valgrind.html @@ -0,0 +1,48 @@ + + + +Support for valgrind + + + + + + + + + + + + + + + +
+
+
+
+ +

+ Running programs that switch stacks under valgrind causes problems. The + b2 command-line property valgrind=on lets + valgrind treat the memory regions as stack space, which suppresses the errors.

+
+ + + +
+
+
+ +