2
0
mirror of https://github.com/boostorg/thread.git synced 2026-01-23 18:12:12 +00:00
Files
thread/doc/rationale.xml
William E. Kempf 9cb9224b3c More BoostBook changes.
[SVN r18149]
2003-04-01 15:29:55 +00:00

410 lines
20 KiB
XML

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE library PUBLIC "-//Boost//DTD BoostBook XML V1.0//EN"
"http://www.boost.org/tools/boostbook/dtd/boostbook.dtd" [
<!ENTITY % threads.entities SYSTEM "entities.xml">
%threads.entities;
]>
<section id="threads.rationale" last-revision="$Date$">
<title>Rationale</title>
<para>This page explains the rationale behind various design decisions in the
&Boost.Threads; library. Having the rationale documented here should explain
how we arrived at the current design as well as prevent future rehashing of
discussions and thought processes that have already occurred. It can also give
users a lot of insight into the design process required for this
library.</para>
<section id="threads.rationale.Boost.Threads">
<title>Rationale for the Creation of &Boost.Threads;</title>
<para>Processes often have a degree of "potential parallelism" and it can
often be more intuitive to design systems with this in mind. Further, these
parallel processes can result in more responsive programs. The benefits for
multithreaded programming are quite well known to most modern programmers,
yet the C++ language doesn't directly support this concept.</para>
<para>Many platforms support multithreaded programming despite the fact that
the language doesn't support it. They do this through external libraries,
which are, unfortunately, platform specific. POSIX has tried to address this
problem through the standardization of a "pthread" library. However, this is
a standard only on POSIX platforms, so its portability is limited.</para>
<para>Another problem with POSIX and other platform specific thread
libraries is that they are almost universally C based libraries. This leaves
several C++ specific issues unresolved, such as what happens when an
exception is thrown in a thread. Further, there are some C++ concepts, such
as destructors, that can make usage much easier than what's available in a C
library.</para>
<para>What's truly needed is C++ language support for threads. However, the
C++ standards committee needs existing practice or a good proposal as a
starting point for adding this to the standard.</para>
<para>The &Boost.Threads; library was developed to provide a C++ developer
with a portable interface for writing multithreaded programs on numerous
platforms. There's a hope that the library can be the basis for a more
detailed proposal for the C++ standards committee to consider for inclusion
in the next C++ standard.</para>
</section>
<section id="threads.rationale.primitives">
<title>Rationale for the Low Level Primitives Supported in &Boost.Threads;</title>
<para>The &Boost.Threads; library supplies a set of low level primitives for
writing multithreaded programs, such as mutexes and condition variables. In
fact, the first release of &Boost.Threads; supports only these low level
primitives. However, computer science research has shown that use of these
primitives is difficult since it's difficult to mathematically prove that a
usage pattern is correct, meaning it doesn't result in race conditions or
deadlocks. There are several algebras (such as CSP, CCS and Join calculus)
that have been developed to help write provably correct parallel
processes. In order to prove the correctness these processes must be coded
using higher level abstractions. So why does &Boost.Threads; support the
lower level concepts?</para>
<para>The reason is simple: the higher level concepts need to be implemented
using at least some of the lower level concepts. So having portable lower
level concepts makes it easier to develop the higher level concepts and will
allow researchers to experiment with various techniques.</para>
<para>Beyond this theoretical application of higher level concepts, however,
the fact remains that many multithreaded programs are written using only the
lower level concepts, so they are useful in and of themselves, even if it's
hard to prove that their usage is correct. Since many users will be familiar
with these lower level concepts but be unfamiliar with any of the higher
level concepts there's also an argument for accessibility.</para>
</section>
<section id="threads.rationale.locks">
<title>Rationale for the Lock Design</title>
<para>Programmers who are used to multithreaded programming issues will
quickly note that the Boost.Thread's design for mutex lock concepts is not
<link linkend="threads.glossary.thread-safe">thread-safe</link> (this is
clearly documented as well). At first this may seem like a serious design
flaw. Why have a multithreading primitive that's not thread-safe
itself?</para>
<para>A lock object is not a synchronization primitive. A lock object's sole
responsibility is to ensure that a mutex is both locked and unlocked in a
manner that won't result in the common error of locking a mutex and then
forgetting to unlock it. This means that instances of a lock object are only
going to be created, at least in theory, within block scope and won't be
shared between threads. Only the mutex objects will be created outside of
block scope and/or shared between threads. Though it's possible to create a
lock object outside of block scope and to share it between threads to do so
would not be a typical usage (in fact, to do so would likely be an
error). Nor are there any cases when such usage would be required.</para>
<para>Lock objects must maintain some state information. In order to allow a
program to determine if a try_lock or timed_lock was successful the lock
object must retain state indicating the success or failure of the call made
in its constructor. If a lock object were to have such state and remain
thread-safe it would need to synchronize access to the state information
which would result in roughly doubling the time of most operations. Worse,
since checking the state can occur only by a call after construction we'd
have a race condition if the lock object were shared between threads.</para>
<para>So, to avoid the overhead of synchronizing access to the state
information and to avoid the race condition the &Boost.Threads; library
simply does nothing to make lock objects thread-safe. Instead, sharing a
lock object between threads results in undefined behavior. Since the only
proper usage of lock objects is within block scope this isn't a problem, and
so long as the lock object is properly used there's no danger of any
multithreading issues.</para>
</section>
<section id="threads.rationale.non-copyable">
<title>Rationale for NonCopyable Thread Type</title>
<para>Programmers who are used to C libraries for multithreaded programming
are likely to wonder why &Boost.Threads; uses a noncopyable design for
<classname>boost::thread</classname>. After all, the C thread types are
copyable, and you often have a need for copying them within user
code. However, careful comparison of C designs to C++ designs shows a flaw
in this logic.</para>
<para>All C types are copyable. It is, in fact, not possible to make a
noncopyable type in C. For this reason types that represent system resources
in C are often designed to behave very similarly to a pointer to dynamic
memory. There's an API for acquiring the resource and an API for releasing
the resources. For memory we have pointers as the type and alloc/free for
the acquisition and release APIs. For files we have FILE* as the type and
fopen/fclose for the acquisition and release APIs. You can freely copy
instances of the types but must manually manage the lifetime of the actual
resource through the acquisition and release APIs.</para>
<para>C++ designs recognize that the acquisition and release APIs are error
prone and try to eliminate possible errors by acquiring the resource in the
constructor and releasing it in the destructor. The best example of such a
design is the std::iostream set of classes which can represent the same
resource as the FILE* type in C. A file is opened in the std::fstream's
constructor and closed in its destructor. However, if an iostream were
copyable it could lead to a file being closed twice, an obvious error, so
the std::iostream types are noncopyable by design. This is the same design
used by boost::thread, which is a simple and easy to understand design
that's consistent with other C++ standard types.</para>
<para>During the design of boost::thread it was pointed out that it would be
possible to allow it to be a copyable type if some form of "reference
management" were used, such as ref-counting or ref-lists, and many argued
for a boost::thread_ref design instead. The reasoning was that copying
"thread" objects was a typical need in the C libraries, and so presumably
would be in the C++ libraries as well. It was also thought that
implementations could provide more efficient reference management than
wrappers (such as boost::shared_ptr) around a noncopyable thread
concept. Analysis of whether or not these arguments would hold true doesn't
appear to bear them out. To illustrate the analysis we'll first provide
pseudo-code illustrating the six typical usage patterns of a thread
object.</para>
<section id="threads.rationale.non-copyable.simple">
<title>1. Simple creation of a thread.</title>
<programlisting>
void foo()
{
create_thread(&amp;bar);
}
</programlisting>
</section>
<section id="threads.rationale.non-copyable.joined">
<title>2. Creation of a thread that's later joined.</title>
<programlisting>
void foo()
{
thread = create_thread(&amp;bar);
join(thread);
}
</programlisting>
</section>
<section id="threads.rationale.non-copyable.loop">
<title>3. Simple creation of several threads in a loop.</title>
<programlisting>
void foo()
{
for (int i=0; i&lt;NUM_THREADS; ++i)
create_thread(&amp;bar);
}
</programlisting>
</section>
<section id="threads.rationale.non-copyable.loop-join">
<title>4. Creation of several threads in a loop which are later joined.</title>
<programlisting>
void foo()
{
for (int i=0; i&lt;NUM_THREADS; ++i)
threads[i] = create_thread(&amp;bar);
for (int i=0; i&lt;NUM_THREADS; ++i)
threads[i].join();
}
</programlisting>
</section>
<section id="threads.rationale.non-copyable.pass">
<title>5. Creation of a thread whose ownership is passed to another object/method.</title>
<programlisting>
void foo()
{
thread = create_thread(&amp;bar);
manager.owns(thread);
}
</programlisting>
</section>
<section id="threads.rationale.non-copyable.shared">
<title>6. Creation of a thread whose ownership is shared between multiple
objects.</title>
<programlisting>
void foo()
{
thread = create_thread(&amp;bar);
manager1.add(thread);
manager2.add(thread);
}
</programlisting>
</section>
<para>Of these usage patterns there's only one that requires reference
management (number 6). Hopefully it's fairly obvious that this usage pattern
simply won't occur as often as the other usage patterns. So there really
isn't a "typical need" for a thread concept, though there is some
need.</para>
<para>Since the need isn't typical we must use different criteria for
deciding on either a thread_ref or thread design. Possible criteria include
ease of use and performance. So let's analyze both of these
carefully.</para>
<para>With ease of use we can look at existing experience. The standard C++
objects that represent a system resource, such as std::iostream, are
noncopyable, so we know that C++ programmers must at least be experienced
with this design. Most C++ developers are also used to smart pointers such
as boost::shared_ptr, so we know they can at least adapt to a thread_ref
concept with little effort. So existing experience isn't going to lead us to
a choice.</para>
<para>The other thing we can look at is how difficult it is to use both
types for the six usage patterns above. If we find it overly difficult to
use a concept for any of the usage patterns there would be a good argument
for choosing the other design. So we'll code all six usage patterns using
both designs.</para>
<section>
<title>1.</title>
<programlisting>
void foo()
{
thread thrd(&amp;bar);
}
void foo()
{
thread_ref thrd = create_thread(&amp;bar);
}
</programlisting>
</section>
<section>
<title>2.</title>
<programlisting>
void foo()
{
thread thrd(&amp;bar);
thrd.join();
}
void foo()
{
thread_ref thrd =
create_thread(&amp;bar);thrd-&gt;join();
}
</programlisting>
</section>
<section>
<title>3.</title>
<programlisting>
void foo()
{
for (int i=0; i&lt;NUM_THREADS; ++i)
thread thrd(&amp;bar);
}
void foo()
{
for (int i=0; i&lt;NUM_THREADS; ++i)
thread_ref thrd = create_thread(&amp;bar);
}
</programlisting>
</section>
<section>
<title>4.</title>
<programlisting>
void foo()
{
std::auto_ptr&lt;thread&gt; threads[NUM_THREADS];
for (int i=0; i&lt;NUM_THREADS; ++i)
threads[i] = std::auto_ptr&lt;thread&gt;(new thread(&amp;bar));
for (int i= 0; i&lt;NUM_THREADS;
++i)threads[i]-&gt;join();
}
void foo()
{
thread_ref threads[NUM_THREADS];
for (int i=0; i&lt;NUM_THREADS; ++i)
threads[i] = create_thread(&amp;bar);
for (int i= 0; i&lt;NUM_THREADS;
++i)threads[i]-&gt;join();
}
</programlisting>
</section>
<section>
<title>5.</title>
<programlisting>
void foo()
{
thread thrd* = new thread(&amp;bar);
manager.owns(thread);
}
void foo()
{
thread_ref thrd = create_thread(&amp;bar);
manager.owns(thrd);
}
</programlisting>
</section>
<section>
<title>6.</title>
<programlisting>
void foo()
{
boost::shared_ptr&lt;thread&gt; thrd(new thread(&amp;bar));
manager1.add(thrd);
manager2.add(thrd);
}
void foo()
{
thread_ref thrd = create_thread(&amp;bar);
manager1.add(thrd);
manager2.add(thrd);
}
</programlisting>
</section>
<para>This shows the usage patterns being nearly identical in complexity for
both designs. The only actual added complexity occurs because of the use of
operator new in (4), (5) and (6) and the use of std::auto_ptr and
boost::shared_ptr in (4) and (6) respectively. However, that's not really
much added complexity, and C++ programmers are used to using these idioms
any way. Some may dislike the presence of operator new in user code, but
this can be eliminated by proper design of higher level concepts, such as
the boost::thread_group class that simplifies example (4) down to:</para>
<programlisting>
void foo()
{
thread_group threads;
for (int i=0; i&lt;NUM_THREADS; ++i)
threads.create_thread(&amp;bar);
threads.join_all();
}
</programlisting>
<para>So ease of use is really a wash and not much help in picking a
design.</para>
<para>So what about performance? If you look at the above code examples we
can analyze the theoretical impact to performance that both designs
have. For (1) we can see that platforms that don't have a ref-counted native
thread type (POSIX, for instance) will be impacted by a thread_ref
design. Even if the native thread type is ref-counted there may be an impact
if more state information has to be maintained for concepts foreign to the
native API, such as clean up stacks for Win32 implementations. For (2) the
performance impact will be identical to (1). The same for (3). For (4)
things get a little more interesting and we find that theoretically at least
the thread_ref may perform faster since the thread design requires dynamic
memory allocation/deallocation. However, in practice there may be dynamic
allocation for the thread_ref design as well, it will just be hidden from
the user. As long as the implementation has to do dynamic allocations the
thread_ref loses again because of the reference management. For (5) we see
the same impact as we do for (4). For (6) we still have a possible impact to
the thread design because of dynamic allocation but thread_ref no longer
suffers because of its reference management, and in fact, theoretically at
least, the thread_ref may do a better job of managing the references. All of
this indicates that thread wins for (1), (2) and (3), with (4) and (5) the
winner depends on the implementation and the platform but the thread design
probably has a better chance, and with (6) it will again depend on the
implementation and platform but this time we favor thread_ref
slightly. Given all of this it's a narrow margin, but the thread design
prevails.</para>
<para>Given this analysis, and the fact that noncopyable objects for system
resources are the normal designs that C++ programmers are used to dealing
with, the &Boost.Threads; library has gone with a noncopyable design.</para>
</section>
<section id="threads.rationale.events">
<title>Rationale for not providing <emphasis>Event Variables</emphasis></title>
<para><emphasis>Event variables</emphasis> are simply far too
error-prone. <classname>boost::condition</classname> variables are a much safer
alternative.</para>
<para>[Note that Graphical User Interface <emphasis>events</emphasis> are a
different concept, and are not what is being discussed here.]</para>
<para>Event variables were one of the first synchronization primitives. They
are still used today, for example, in the native Windows multithreading
API.</para>
<para>Yet both respected computer science researchers and experienced
multithreading practitioners believe event variables are so inherently
error-prone that they should never be used, and thus should not be part of a
multithreading library.</para>
<para>Per Brinch Hansen &cite.Hansen73; analyzed event variables in some
detail, pointing out [emphasis his] that "<emphasis>event operations force
the programmer to be aware of the relative speeds of the sending and
receiving processes</emphasis>". His summary:</para>
<blockquote>
<para>We must therefore conclude that event variables of the previous type
are impractical for system design. <emphasis>The effect of an interaction
between two processes must be independent of the speed at which it is
carried out.</emphasis></para>
</blockquote>
<para>Experienced programmers using the Windows platform today report that
event variables are a continuing source of errors, even after previous bad
experiences caused them to be very careful in their use of event
variables. Overt problems can be avoided, for example, by teaming the event
variable with a mutex, but that may just convert a <link
linkend="threads.glossary.race-condition">race condition</link> into another
problem, such as excessive resource use. One of the most distressing aspects
of the experience reports is the claim that many defects are latent. That
is, the programs appear to work correctly, but contain hidden timing
dependencies which will cause them to fail when environmental factors or
usage patterns change, altering relative thread timings.</para>
<para>The decision to exclude event variables from &Boost.Threads; has been
surprising to some Windows programmers. They have written programs which
work using event variables, and wonder what the problem is. It seems similar
to the "goto considered harmful" controversy of 30 years ago. It isn't that
events, like gotos, can't be made to work, but rather that virtually all
programs using alternatives will be easier to write, debug, read, maintain,
and be less likely to contain latent defects.</para>
<para>[Rationale provided by Beman Dawes]</para>
</section>
</section>