mirror of https://github.com/boostorg/fiber.git synced 2026-02-21 02:52:18 +00:00
fiber/libs/task/doc/work_stealing.qbk
Oliver Kowalke 39ec793737 initial checkin
2011-02-09 18:41:35 +01:00

[/
Copyright Oliver Kowalke 2009.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
]
[section:workstealing Work-Stealing]
Traditional __thread_pools__ do not scale because they use a single global-queue protected by a global-lock. The frequency at which
__worker_threads__ acquire the global-lock becomes a limiting factor for throughput if:
* the tasks become smaller
* more processors are added
A __work_stealing__ algorithm can be used to solve this problem. It uses a special kind of queue which has two ends: it allows
lock-free pushes and pops at the ['private end] (accessed by the __worker_thread__ owning the queue), but requires synchronization
at the ['public end] (accessed by the other __worker_threads__). Synchronization is also necessary when the queue is so small
that private and public operations could conflict.
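
The two ends of such a queue can be pictured with the minimal sketch below. The class `two_ended_queue` and its member functions are hypothetical names chosen for illustration and are not part of the library. As a deliberate simplification, a single `std::mutex` guards every operation, so the sketch shows only which end each party uses, not the lock-free protocol at the private end.

```cpp
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical illustration only: the back of the deque plays the role of
// the "private end" (owning worker), the front the "public end" (thieves).
// Simplification: one mutex guards all operations; a real work-stealing
// queue would avoid locking at the private end.
template <typename T>
class two_ended_queue {
public:
    // The owner pushes new work at the private end.
    void push(T value) {
        std::lock_guard<std::mutex> lock(mtx_);
        items_.push_back(std::move(value));
    }

    // The owner takes the most recently pushed item (private end).
    // Returns false if the queue is empty.
    bool pop(T& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (items_.empty()) return false;
        out = std::move(items_.back());
        items_.pop_back();
        return true;
    }

    // Another worker takes the oldest item (public end).
    // Returns false if the queue is empty.
    bool steal(T& out) {
        std::lock_guard<std::mutex> lock(mtx_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

private:
    std::mutex mtx_;
    std::deque<T> items_;
};
```

With the values 1, 2, 3 pushed in that order, `steal` would hand out 1 while `pop` would hand out 3, so the owner and a thief only compete for the same item when the queue is nearly empty.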
The pool contains one global-queue (__bounded_queue__ or __unbounded_queue__) protected by a global-lock, and each __worker_thread__
has its own private worker-queue. If work is enqueued by a __worker_thread__, the __task__ is stored in that thread's worker-queue. If the
work is enqueued by an application thread, it goes into the global-queue. When __worker_threads__ are looking for work, they use the
following search order:
* look into the private worker-queue - tasks can be dequeued without locks
* look in the global-queue - locks are used for synchronization
* check other worker-queues ('stealing' tasks from private worker queues of other __worker_threads__) - requires locks
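
The search order above could be sketched roughly as follows. All names (`worker_queue`, `global_queue`, `find_work`, `work_source`) are hypothetical and chosen for illustration; they are not the library's API. Two assumptions are made: the global-queue is drained oldest-first, and the private worker-queue is locked as well, only to keep the sketch safe to call from several threads (the text above describes that step as lock-free).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical types for illustration; a task is reduced to an integer id.
typedef int task_id;

struct worker_queue {
    std::mutex mtx;              // sketch only: also taken by the owner
    std::deque<task_id> tasks;   // back = private end, front = public end
};

struct global_queue {
    std::mutex mtx;              // plays the role of the global-lock
    std::deque<task_id> tasks;
};

enum work_source { from_local, from_global, from_steal, no_work };

// Looks for the next task in the order: own queue, global queue, steal.
// `self` is the index of the calling worker inside `workers`.
work_source find_work(std::vector<worker_queue>& workers, std::size_t self,
                      global_queue& global, task_id& out) {
    assert(self < workers.size());

    // 1. Private worker-queue: take the newest task from the private end.
    {
        worker_queue& mine = workers[self];
        std::lock_guard<std::mutex> lock(mine.mtx);
        if (!mine.tasks.empty()) {
            out = mine.tasks.back();
            mine.tasks.pop_back();
            return from_local;
        }
    }

    // 2. Global-queue: synchronized through the global-lock.
    {
        std::lock_guard<std::mutex> lock(global.mtx);
        if (!global.tasks.empty()) {
            out = global.tasks.front();
            global.tasks.pop_front();
            return from_global;
        }
    }

    // 3. Other worker-queues: steal the oldest task from the public end.
    for (std::size_t i = 0; i < workers.size(); ++i) {
        if (i == self) continue;
        worker_queue& victim = workers[i];
        std::lock_guard<std::mutex> lock(victim.mtx);
        if (!victim.tasks.empty()) {
            out = victim.tasks.front();
            victim.tasks.pop_front();
            return from_steal;
        }
    }
    return no_work;
}
```

A worker loop would call `find_work` repeatedly and back off or block when it returns `no_work`; how a real pool waits for new work is outside the scope of this sketch.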
When many tasks are queued recursively (so-called __sub_tasks__), the use of a worker-queue per thread substantially reduces the
synchronization necessary to complete the work. Cache effects caused by sharing the global-queue information are reduced as well.
The owning __worker_thread__ operates on its private worker-queue in LIFO order, whereas other __worker_threads__ take tasks from
that queue in FIFO order (steals).
* If tasks are pushed to and popped from the private worker-queue in LIFO order, chances are that their memory is still hot in the cache.
* If a __worker_thread__ steals work in FIFO order, the chances increase that a larger 'chunk' of work will be stolen (possibly
reducing the need for further steals). Because the __sub_tasks__ are stored in LIFO order, the oldest items are closer to the
['public end] of the queue (forming a tree). Stealing such an older __task__ also steals the (probably) larger subtree of tasks
that unfolds when the stolen work item gets executed. Since a __sub_task__ is just part of a larger __task__, we don't need to worry about
execution order.
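
To make the 'larger chunk' argument concrete, the sketch below models a task as an index range that a worker splits recursively. The names `range_task`, `size_of` and `unfold` are hypothetical, and the halving scheme is only an assumed example of how __sub_tasks__ might be created.

```cpp
#include <cassert>
#include <deque>
#include <utility>

// Hypothetical model: a task is a half-open index range [first, second).
typedef std::pair<int, int> range_task;

inline int size_of(const range_task& t) { return t.second - t.first; }

// Splits `root` the way a worker might: the right half is pushed to the
// private end (back) as a sub-task, the left half is kept and split again
// until it holds a single element.
// Returns the queue as it looks once the worker has reached that leaf.
std::deque<range_task> unfold(range_task root) {
    assert(root.first < root.second);
    std::deque<range_task> queue;
    while (size_of(root) > 1) {
        int mid = root.first + size_of(root) / 2;
        queue.push_back(range_task(mid, root.second)); // sub-task for later
        root.second = mid;                             // continue with the left half
    }
    return queue;
}
```

For the range [0, 16) the queue holds [8, 16), [4, 8), [2, 4), [1, 2) from the public end to the private end. A steal therefore takes eight elements of work in one operation, while the owner's next pop takes a single, recently touched element.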
[endsect]