thread/doc/rationale.html

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<title>Boost.Threads - Rationale</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary=
    "header">
  <tr>
    <td valign="top" width="300">
      <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../c++boost.gif" border="0"></a></h3>
    </td>
    <td valign="top">
      <h1 align="center">Boost.Threads</h1>
      <h2 align="center">Rationale</h2>
    </td>
  </tr>
</table>
<hr>
<dl class="index">
  <dt><a href="#introduction">Introduction</a></dt>
  <dt><a href="#library">Rationale for the Creation of <b>Boost.Threads</b></a></dt>
  <dt><a href="#primitives">Rationale for the Low Level Primitives Supported in
    <b>Boost.Threads</b></a></dt>
  <dt><a href="#lock_objects">Rationale for the Lock Design</a></dt>
  <dt><a href="#non-copyable">Rationale for NonCopyable Thread Type</a></dt>
  <dt><a href="#events">Rationale for not providing <i>Event Variables</i></a></dt>
</dl>
<h2><a name="introduction"></a>Introduction</h2>
<p>This page explains the rationale behind various design decisions in the <b>Boost.Threads</b>
  library. Having the rationale documented here should explain how we arrived
  at the current design as well as prevent future rehashing of discussions and
  thought processes that have already occurred. It can also give users a lot of
  insight into the design process required for this library.</p>
<h2><a name="library"></a>Rationale for the Creation of <b>Boost.Threads</b></h2>
<p>Processes often have a degree of &quot;potential parallelism&quot; and it can
  often be more intuitive to design systems with this in mind. Further, these
  parallel processes can result in more responsive programs. The benefits for
  multithreaded programming are quite well known to most modern programmers, yet
  the C++ language doesn&#39;t directly support this concept.</p>
<p>Many platforms support multithreaded programming despite the fact that the
  language doesn&#39;t support it. They do this through external libraries, which
  are, unfortunately, platform specific. POSIX has tried to address this problem
  through the standardization of a &quot;pthread&quot; library. However, this
  is a standard only on POSIX platforms, so its portability is limited.</p>
<p>Another problem with POSIX and other platform specific thread libraries is
  that they are almost universally C based libraries. This leaves several C++
  specific issues unresolved, such as what happens when an exception is thrown
  in a thread. Further, there are some C++ concepts, such as destructors, that
  can make usage much easier than what&#39;s available in a C library.</p>
<p>What&#39;s truly needed is C++ language support for threads. However, the C++
  standards committee needs existing practice or a good proposal as a starting
  point for adding this to the standard.</p>
<p>The <b>Boost.Threads</b> library was developed to provide a C++ developer with
  a portable interface for writing multithreaded programs on numerous platforms.
  There&#39;s a hope that the library can be the basis for a more detailed proposal
  for the C++ standards committee to consider for inclusion in the next C++ standard.</p>
<h2><a name="primitives"></a>Rationale for the Low Level Primitives Supported
  in <b>Boost.Threads</b></h2>
<p>The <b>Boost.Threads</b> library supplies a set of low level primitives for
  writing multithreaded programs, such as mutexes and condition variables. In
  fact, the first release of <b>Boost.Threads</b> supports only these low level
  primitives. However, computer science research has shown that use of these primitives
  is difficult since it's difficult to mathematically prove that a usage pattern
  is correct, meaning it doesn&#39;t result in race conditions or deadlocks. There
  are several algebras (such as CSP, CCS and Join calculus) that have been developed
  to help write provably correct parallel processes. In order to prove the correctness
  these processes must be coded using higher level abstractions. So why does <b>Boost.Threads</b>
  support the lower level concepts?</p>
<p>The reason is simple: the higher level concepts need to be implemented using
  at least some of the lower level concepts. So having portable lower level concepts
  makes it easier to develop the higher level concepts and will allow researchers
  to experiment with various techniques.</p>
<p>Beyond this theoretical application of higher level concepts, however, the
  fact remains that many multithreaded programs are written using only the lower
  level concepts, so they are useful in and of themselves, even if it&#39;s hard
  to prove that their usage is correct. Since many users will be familiar with
  these lower level concepts but be unfamiliar with any of the higher level concepts
  there&#39;s also an argument for accessibility.</p>
<h2><a name="lock_objects"></a>Rationale for the Lock Design</h2>
<p>Programmers who are used to multithreaded programming issues will quickly note
  that the Boost.Thread&#39;s design for mutex lock concepts is not <a href="definitions.html#Thread-safe">thread-safe</a>
  (this is clearly documented as well). At first this may seem like a serious
  design flaw. Why have a multithreading primitive that&#39;s not thread-safe
  itself?</p>
<p>A lock object is not a synchronization primitive. A lock object&#39;s sole
  responsibility is to ensure that a mutex is both locked and unlocked in a manner
  that won&#39;t result in the common error of locking a mutex and then forgetting
  to unlock it. This means that instances of a lock object are only going to be
  created, at least in theory, within block scope and won&#39;t be shared between
  threads. Only the mutex objects will be created outside of block scope and/or
  shared between threads. Though it&#39;s possible to create a lock object outside
  of block scope and to share it between threads to do so would not be a typical
  usage (in fact, to do so would likely be an error). Nor are there any cases
  when such usage would be required.</p>
<p>Lock objects must maintain some state information. In order to allow a program
  to determine if a try_lock or timed_lock was successful the lock object must
  retain state indicating the success or failure of the call made in its constructor.
  If a lock object were to have such state and remain thread-safe it would need
  to synchronize access to the state information which would result in roughly
  doubling the time of most operations. Worse, since checking the state can occur
  only by a call after construction we&#39;d have a race condition if the lock
  object were shared between threads.</p>
<p>So, to avoid the overhead of synchronizing access to the state information
  and to avoid the race condition the <b>Boost.Threads</b> library simply does
  nothing to make lock objects thread-safe. Instead, sharing a lock object between
  threads results in undefined behavior. Since the only proper usage of lock objects
  is within block scope this isn&#39;t a problem, and so long as the lock object
  is properly used there&#39;s no danger of any multithreading issues.</p>
<h2><a name="non-copyable"></a>Rationale for NonCopyable Thread Type</h2>
<p>Programmers who are used to C libraries for multithreaded programming are likely
  to wonder why <b>Boost.Threads</b> uses a noncopyable design for <a href="thread.html">boost::thread</a>.
  After all, the C thread types are copyable, and you often have a need for copying
  them within user code. However, careful comparison of C designs to C++ designs
  shows a flaw in this logic.</p>
<p>All C types are copyable. It is, in fact, not possible to make a noncopyable
  type in C. For this reason types that represent system resources in C are often
  designed to behave very similarly to a pointer to dynamic memory. There&#39;s
  an API for acquiring the resource and an API for releasing the resources. For
  memory we have pointers as the type and alloc/free for the acquisition and release
  APIs. For files we have FILE* as the type and fopen/fclose for the acquisition
  and release APIs. You can freely copy instances of the types but must manually
  manage the lifetime of the actual resource through the acquisition and release
  APIs.</p>
<p>C++ designs recognize that the acquisition and release APIs are error prone
  and try to eliminate possible errors by acquiring the resource in the constructor
  and releasing it in the destructor. The best example of such a design is the
  std::iostream set of classes which can represent the same resource as the FILE*
  type in C. A file is opened in the std::fstream&#39;s constructor and closed
  in its destructor. However, if an iostream were copyable it could lead to a
  file being closed twice, an obvious error, so the std::iostream types are noncopyable
  by design. This is the same design used by boost::thread, which is a simple
  and easy to understand design that&#39;s consistent with other C++ standard
  types.</p>
<p>During the design of boost::thread it was pointed out that it would be possible
  to allow it to be a copyable type if some form of &quot;reference management&quot;
  were used, such as ref-counting or ref-lists, and many argued for a boost::thread_ref
  design instead. The reasoning was that copying &quot;thread&quot; objects was
  a typical need in the C libraries, and so presumably would be in the C++ libraries
  as well. It was also thought that implementations could provide more efficient
  reference management than wrappers (such as boost::shared_ptr) around a noncopyable
  thread concept. Analysis of whether or not these arguments would hold true doesn&#39;t
  appear to bear them out. To illustrate the analysis we&#39;ll first provide
  pseudo-code illustrating the six typical usage patterns of a thread object.</p>
<h3>1. Simple creation of a thread.</h3>
<pre>
void foo()
{
   create_thread(&amp;bar);
}
</pre>
<h3>2. Creation of a thread that's later joined.</h3>
<pre>
Void foo()
{
   thread = create_thread(&amp;bar);
   join(thread);
}
</pre>
<h3>3. Simple creation of several threads in a loop.</h3>
<pre>
Void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      create_thread(&amp;bar);
}
</pre>
<h3>4. Creation of several threads in a loop which are later joined.</h3>
<pre>
Void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = create_thread(&amp;bar);
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i].join();
}
</pre>
<h3>5. Creation of a thread whose ownership is passed to another object/method.</h3>
<pre>
Void foo()
{
   thread = create_thread(&amp;bar);
   manager.owns(thread);
}
</pre>
<h3>6. Creation of a thread whose ownership is shared between multiple objects.</h3>
<pre>
Void foo()
{
   thread = create_thread(&amp;bar);
   manager1.add(thread);
   manager2.add(thread);
}
</pre>
<p>Of these usage patterns there&#39;s only one that requires reference management
  (number 6). Hopefully it&#39;s fairly obvious that this usage pattern simply
  won&#39;t occur as often as the other usage patterns. So there really isn&#39;t
  a &quot;typical need&quot; for a thread concept, though there is some need.</p>
<p>Since the need isn&#39;t typical we must use different criteria for deciding
  on either a thread_ref or thread design. Possible criteria include ease of use
  and performance. So let&#39;s analyze both of these carefully.</p>
<p>With ease of use we can look at existing experience. The standard C++ objects
  that represent a system resource, such as std::iostream, are noncopyable, so
  we know that C++ programmers must at least be experienced with this design.
  Most C++ developers are also used to smart pointers such as boost::shared_ptr,
  so we know they can at least adapt to a thread_ref concept with little effort.
  So existing experience isn&#39;t going to lead us to a choice.</p>
<p>The other thing we can look at is how difficult it is to use both types for
  the six usage patterns above. If we find it overly difficult to use a concept
  for any of the usage patterns there would be a good argument for choosing the
  other design. So we&#39;ll code all six usage patterns using both designs.</p>
<h3>1.</h3>
<pre>
void foo()
{
   thread thrd(&amp;bar);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
}
</pre>
<h3>2.</h3>
<pre>
void foo()
{
   thread thrd(&amp;bar);
   thrd.join();
}

void foo()
{
   thread_ref thrd =
   create_thread(&amp;bar);thrd-&gt;join();
}
</pre>
<h3>3.</h3>
<pre>
void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      thread thrd(&amp;bar);
}

void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      thread_ref thrd = create_thread(&amp;bar);
}
</pre>
<h3>4.</h3>
<pre>
void foo()
{
   std::auto_ptr&lt;thread&gt; threads[NUM_THREADS];
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = std::auto_ptr&lt;thread&gt;(new thread(&amp;bar));
   for (int i= 0; i&lt;NUM_THREADS;
      ++i)threads[i]-&gt;join();
}

void foo()
{
   thread_ref threads[NUM_THREADS];
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = create_thread(&amp;bar);
   for (int i= 0; i&lt;NUM_THREADS;
      ++i)threads[i]-&gt;join();
}
</pre>
<h3>5.</h3>
<pre>
void foo()
{
   thread thrd* = new thread(&amp;bar);
   manager.owns(thread);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
   manager.owns(thrd);
}
</pre>
<h3>6.</h3>
<pre>
void foo()
{
   boost::shared_ptr&lt;thread&gt; thrd(new thread(&amp;bar));
   manager1.add(thrd);
   manager2.add(thrd);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
   manager1.add(thrd);
   manager2.add(thrd);
}
</pre>
<p>This shows the usage patterns being nearly identical in complexity for both
  designs. The only actual added complexity occurs because of the use of operator
  new in (4), (5) and (6) and the use of std::auto_ptr and boost::shared_ptr in
  (4) and (6) respectively. However, that&#39;s not really much added complexity,
  and C++ programmers are used to using these idioms any way. Some may dislike
  the presence of operator new in user code, but this can be eliminated by proper
  design of higher level concepts, such as the boost::thread_group class that
  simplifies example (4) down to:</p>
<pre>
void foo()
{
   thread_group threads;
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads.create_thread(&amp;bar);
   threads.join_all();
}
</pre>
<p>So ease of use is really a wash and not much help in picking a design.</p>
<p>So what about performance? If you look at the above code examples we can analyze
  the theoretical impact to performance that both designs have. For (1) we can
  see that platforms that don&#39;t have a ref-counted native thread type (POSIX,
  for instance) will be impacted by a thread_ref design. Even if the native thread
  type is ref-counted there may be an impact if more state information has to
  be maintained for concepts foreign to the native API, such as clean up stacks
  for Win32 implementations. For (2) the performance impact will be identical
  to (1). The same for (3). For (4) things get a little more interesting and we
  find that theoretically at least the thread_ref may perform faster since the
  thread design requires dynamic memory allocation/deallocation. However, in practice
  there may be dynamic allocation for the thread_ref design as well, it will just
  be hidden from the user. As long as the implementation has to do dynamic allocations
  the thread_ref loses again because of the reference management. For (5) we see
  the same impact as we do for (4). For (6) we still have a possible impact to
  the thread design because of dynamic allocation but thread_ref no longer suffers
  because of its reference management, and in fact, theoretically at least, the
  thread_ref may do a better job of managing the references. All of this indicates
  that thread wins for (1), (2) and (3), with (4) and (5) the winner depends on
  the implementation and the platform but the thread design probably has a better
  chance, and with (6) it will again depend on the implementation and platform
  but this time we favor thread_ref slightly. Given all of this it&#39;s a narrow
  margin, but the thread design prevails.</p>
<p>Given this analysis, and the fact that noncopyable objects for system resources
  are the normal designs that C++ programmers are used to dealing with, the <b>Boost.Threads</b>
  library has gone with a noncopyable design.</p>
<h2><a name="events"></a>Rationale for not providing <i>Event Variables</i></h2>
<p><i>Event variables</i> are simply far too error-prone. <a href=
        "condition.html">Condition variables</a> are a much safer alternative.</p>
<p>[Note that Graphical User Interface <i>events</i> are a different concept,
  and are not what is being discussed here.]</p>
<p>Event variables were one of the first synchronization primitives. They are
  still used today, for example, in the native Windows multithreading API.</p>
<p>Yet both respected computer science researchers and experienced multithreading
  practitioners believe event variables are so inherently error-prone that they
  should never be used, and thus should not be part of a multithreading library.</p>
<p>Per Brinch Hansen <a href="bibliography.html#Brinch-Hansen-73"> [Brinch Hansen
  73]</a> analyzed event variables in some detail, pointing out [emphasis his]
  that &quot;<i>event operations force the programmer to be aware of the relative
  speeds of the sending and receiving processes</i>&quot;. His summary:</p>
<blockquote>
  <p>We must therefore conclude that event variables of the previous type are
    impractical for system design. <i>The effect of an interaction between two
    processes must be independent of the speed at which it is carried out.</i></p>
</blockquote>
<p>Experienced programmers using the Windows platform today report that event
  variables are a continuing source of errors, even after previous bad experiences
  caused them to be very careful in their use of event variables. Overt problems
  can be avoided, for example, by teaming the event variable with a mutex, but
  that may just convert a <a href=
        "definitions.html#Race condition">race condition</a> into another problem,
  such as excessive resource use. One of the most distressing aspects of the experience
  reports is the claim that many defects are latent. That is, the programs appear
  to work correctly, but contain hidden timing dependencies which will cause them
  to fail when environmental factors or usage patterns change, altering relative
  thread timings.</p>
<p>The decision to exclude event variables from <b>Boost.Threads</b> has been
  surprising to some Windows programmers. They have written programs which work
  using event variables, and wonder what the problem is. It seems similar to the
  &quot;goto considered harmful&quot; controversy of 30 years ago. It isn&#39;t
  that events, like gotos, can&#39;t be made to work, but rather that virtually
  all programs using alternatives will be easier to write, debug, read, maintain,
  and be less likely to contain latent defects.</p>
<p>[Rationale provided by Beman Dawes]</p>
<hr>
<p>Revised
  <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->
  05 November, 2001
  <!--webbot bot="Timestamp" endspan i-checksum="39359" -->
</p>
<p><i>&copy; Copyright <a href="mailto:wekempf@cox.net">William E. Kempf</a> 2001-2002.
  All Rights Reserved.</i></p>
<p>Permission to use, copy, modify, distribute and sell this software and its
  documentation for any purpose is hereby granted without fee, provided that the
  above copyright notice appear in all copies and that both that copyright notice
  and this permission notice appear in supporting documentation. William E. Kempf
  makes no representations about the suitability of this software for any purpose.
  It is provided &quot;as is&quot; without express or implied warranty.</p>
</body>
</html>