
<html>
    <head>
        <meta http-equiv="Content-Type" content=
        "text/html; charset=iso-8859-1">
        <meta name="keywords" content="threads, BTL, thread library, C++">

        <title>Boost.Threads, rationale</title>
    </head>

    <body bgcolor="#ffffff" link="#0000ff" vlink="#800080">
        <table summary="header" border="0" cellpadding="7" cellspacing="0"
        width="100%">
            <tr>
                <td valign="top" width="300">
                    <h3><img height="86" alt="C++ Boost" src=
                    "../../../c++boost.gif" width="277"></h3>
                </td>

                <td valign="top">
                    <h1 align="center">Boost.Threads</h1>

                    <h2 align="center">Rationale</h2>
                </td>
            </tr>
        </table>
        <hr>

        <p>This page explains the rationale behind various design decisions in
        the <b>Boost.Threads</b> library. Having the rationale documented here
        should explain how we arrived at the current design as well as prevent
        future rehashing of discussions and thought processes that have already
        occurred. It can also give users a lot of insight into the design
        process required for this library.</p>

        <h2><a name="library">Rationale for the Creation of
        Boost.Threads</a></h2>

        <p>Processes often have a degree of &quot;potential parallelism&quot;
        and it can often be more intuitive to design systems with this in mind.
        Further, these parallel processes can result in more responsive
        programs. The benefits of multi-threaded programming are well known to
        most modern programmers, yet the C++ language doesn&#39;t directly
        support this concept.</p>

        <p>Many platforms support multi-threaded programming despite the fact
        that the language doesn&#39;t support it. They do this through external
        libraries, which are, unfortunately, platform specific. POSIX has tried
        to address this problem through the standardization of a
        &quot;pthread&quot; library. However, this is a standard only on POSIX
        platforms, so its portability is limited.</p>

        <p>Another problem with POSIX and other platform specific thread
        libraries is that they are almost universally C based libraries. This
        leaves several C++ specific issues unresolved, such as what happens
        when an exception is thrown in a thread. Further, there are some C++
        concepts, such as destructors, that can make usage much easier than
        what&#39;s available in a C library.</p>

        <p>What&#39;s truly needed is C++ language support for threads.
        However, the C++ standards committee needs existing practice or a good
        proposal as a starting point for adding this to the standard.</p>

        <p>The Boost.Threads library was developed to provide a C++ developer
        with a portable interface for writing multi-threaded programs on
        numerous platforms. There&#39;s a hope that the library can be the
        basis for a more detailed proposal for the C++ standards committee to
        consider for inclusion in the next C++ standard.</p>

        <h2><a name="primitives">Rationale for the Low Level Primitives
        Supported in Boost.Threads</a></h2>

        <p>The Boost.Threads library supplies a set of low level primitives for
        writing multi-threaded programs, such as mutexes and condition variables.
        In fact, the first release of Boost.Threads supports only these low level
        primitives. However, computer science research has shown that use of these
        primitives is difficult since there&#39;s no way to mathematically prove
        that a usage pattern is correct, meaning it doesn&#39;t result in race
        conditions or deadlocks. Several algebras (such as CSP, CCS and
        Join calculus) have been developed to help write provably correct
        parallel processes. To prove correctness, these processes must be
        coded using higher level abstractions. So why does Boost.Threads
        support the lower level concepts?</p>

        <p>The reason is simple: the higher level concepts need to be
        implemented using at least some of the lower level concepts. So having
        portable lower level concepts makes it easier to develop the higher
        level concepts and will allow researchers to experiment with various
        techniques.</p>

        <p>Beyond this theoretical application of higher level concepts,
        however, the fact remains that many multi-threaded programs are written
        using only the lower level concepts, so they are useful in and of
        themselves, even if it&#39;s hard to prove that their usage is correct.
        Since many users will be familiar with these lower level concepts but
        unfamiliar with any of the higher level concepts, there&#39;s also an
        argument for accessibility.</p>

        <h2><a name="lock_objects">Rationale for the Lock Design</a></h2>

        <p>Programmers who are used to multi-threaded programming issues will
        quickly note that the Boost.Threads design for mutex lock concepts
        is not <a href="definitions.html#Thread-safe">thread-safe</a> (this is
        clearly documented as well). At first this may seem like a serious
        design flaw. Why have a multi-threading primitive that&#39;s not
        thread-safe itself?</p>

        <p>A lock object is not a synchronization primitive. A lock
        object&#39;s sole responsibility is to ensure that a mutex is both
        locked and unlocked in a manner that won&#39;t result in the common
        error of locking a mutex and then forgetting to unlock it. This means
        that instances of a lock object are only going to be created, at least
        in theory, within block scope and won&#39;t be shared between threads.
        Only the mutex objects will be created outside of block scope and/or
        shared between threads. Though it&#39;s possible to create a lock
        object outside of block scope and to share it between threads, doing so
        would not be typical usage, nor are there any cases where such usage
        would be required.</p>
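        <p>The block-scope usage described above can be sketched as follows.
        This is only an illustration: it assumes a C++11 compiler and uses the
        standard library types std::mutex and std::lock_guard as stand-ins for
        the Boost.Threads mutex and lock concepts. The names counter_mutex,
        counter, increment and run_increments are hypothetical.</p>

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// The mutex lives outside block scope and is shared between threads.
std::mutex counter_mutex;
long counter = 0;

// Each iteration creates its own lock object at block scope. The destructor
// releases the mutex, so it cannot be left locked by accident.
void increment(int times)
{
    assert(times >= 0);
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex); // locks here
        ++counter;
    }                                                    // unlocks each pass
}

// Starts several threads that share the mutex but never share a lock object.
long run_increments(int num_threads, int times)
{
    counter = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i)
        threads.emplace_back(increment, times);
    for (std::thread& t : threads)
        t.join();
    return counter;
}
```

        <p>Calling run_increments(4, 1000) should return 4000, since every
        increment happens while the mutex is held. No lock object ever leaves
        the block in which it was created.</p>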

        <p>Lock objects must maintain some state information. In order to allow
        a program to determine if a try_lock or timed_lock was successful the
        lock object must retain state indicating the success or failure of the
        call made in its constructor. If a lock object were to have such state
        and remain thread-safe it would need to synchronize access to the state
        information which would result in roughly doubling the time of most
        operations. Worse, since checking the state can occur only by a call
        after construction we&#39;d have a race condition if the lock object
        were shared between threads.</p>
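        <p>As a rough illustration of the state a lock object carries, the
        sketch below uses std::unique_lock with std::try_to_lock from the C++11
        standard library as a stand-in for a Boost.Threads try lock. The names
        resource_mutex and try_do_work are hypothetical.</p>

```cpp
#include <cassert>
#include <mutex>

std::mutex resource_mutex;

// Attempts to take the mutex without blocking. The lock object records
// whether the attempt succeeded, and that state is queried after construction.
bool try_do_work(int& work_done)
{
    std::unique_lock<std::mutex> lock(resource_mutex, std::try_to_lock);
    if (!lock.owns_lock())
        return false;          // not acquired, so nothing will be unlocked
    assert(lock.owns_lock());
    ++work_done;
    return true;               // destructor unlocks because the lock owns it
}
```

        <p>The query made through owns_lock() is only meaningful to the thread
        that constructed the lock. If the lock object were shared, another
        thread could change that state between construction and the query,
        which is the race condition described above.</p>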

        <p>So, to avoid the overhead of synchronizing access to the state
        information and to avoid the race condition the Boost.Threads library
        simply does nothing to make lock objects thread-safe. Instead, sharing
        a lock object between threads results in undefined behavior. Since the
        only proper usage of lock objects is within block scope this isn&#39;t
        a problem, and so long as the lock object is properly used there&#39;s
        no danger of any multi-threading issues.</p>

        <h2><a name="thread">Rationale for Non-copyable Thread Type</a></h2>

        <p>Programmers who are used to C libraries for multi-threaded
        programming are likely to wonder why Boost.Threads uses a non-copyable
        design for <a href="thread.html">boost::thread</a>. After all, the C
        thread types are copyable, and you often have a need for copying them
        within user code. However, careful comparison of C designs to C++
        designs shows a flaw in this logic.</p>

        <p>All C types are copyable. It is, in fact, not possible to make a
        non-copyable type in C. For this reason types that represent system
        resources in C are often designed to behave very similarly to a pointer
        to dynamic memory. There&#39;s an API for acquiring the resource and an
        API for releasing the resource. For memory we have pointers as the
        type and malloc/free for the acquisition and release APIs. For files we
        have FILE* as the type and fopen/fclose for the acquisition and release
        APIs. You can freely copy instances of the types but must manually
        manage the lifetime of the actual resource through the acquisition and
        release APIs.</p>

        <p>C++ designs recognize that the acquisition and release APIs are
        error prone and try to eliminate possible errors by acquiring the
        resource in the constructor and releasing it in the destructor. The
        best example of such a design is the std::iostream set of classes which
        can represent the same resource as the FILE* type in C. A file is
        opened in the std::fstream&#39;s constructor and closed in its
        destructor. However, if an iostream were copyable it could lead to a
        file being closed twice, an obvious error, so the std::iostream types
        are noncopyable by design. This is the same design used by
        boost::thread, which is a simple and easy to understand design
        that&#39;s consistent with other C++ standard types.</p>
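        <p>A minimal sketch of this acquire-in-constructor,
        release-in-destructor design follows. It assumes C++11 (deleted copy
        operations rather than the private declarations common in older code),
        and the class name file_handle is hypothetical. It is not part of
        Boost.Threads or the standard library.</p>

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <type_traits>

// Hypothetical wrapper that owns a C FILE*: the file is opened in the
// constructor and closed exactly once in the destructor.
class file_handle
{
public:
    file_handle(const char* path, const char* mode)
        : fp_(std::fopen(path, mode)) {}
    ~file_handle() { if (fp_) std::fclose(fp_); }

    // Copying is disabled: two copies would both call fclose on one FILE*.
    file_handle(const file_handle&) = delete;
    file_handle& operator=(const file_handle&) = delete;

    bool is_open() const { return fp_ != nullptr; }
    std::FILE* get() const { assert(fp_ != nullptr); return fp_; }

private:
    std::FILE* fp_;
};

// The compiler rejects copies of the wrapper, as it does for std::fstream.
static_assert(!std::is_copy_constructible<file_handle>::value,
              "file_handle must not be copyable");
static_assert(!std::is_copy_constructible<std::fstream>::value,
              "std::fstream is not copyable");
```

        <p>Because the copy operations are deleted, an attempt to copy the
        wrapper fails at compile time instead of closing the file twice at run
        time.</p>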

        <p>During the design of boost::thread it was pointed out that it would
        be possible to allow it to be a copyable type if some form of
        &quot;reference management&quot; were used, such as ref-counting or
        ref-lists, and many argued for a boost::thread_ref design instead. The
        reasoning was that copying &quot;thread&quot; objects was a typical
        need in the C libraries, and so presumably would be in the C++
        libraries as well. It was also thought that implementations could
        provide more efficient reference management than wrappers (such as
        boost::shared_ptr) around a noncopyable thread concept. Analysis
        doesn&#39;t appear to bear these arguments out. To illustrate the
        analysis we&#39;ll first provide pseudo-code illustrating the six
        typical usage patterns of a thread object.</p>

        <h3>1. Simple creation of a thread.</h3>
<pre>
void foo()
{
   create_thread(&amp;bar);
}
</pre>

        <h3>2. Creation of a thread that&#39;s later joined.</h3>
<pre>
void foo()
{
   thread = create_thread(&amp;bar);
   join(thread);
}
</pre>

        <h3>3. Simple creation of several threads in a loop.</h3>
<pre>
void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      create_thread(&amp;bar);
}
</pre>

        <h3>4. Creation of several threads in a loop which are later
        joined.</h3>
<pre>
void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = create_thread(&amp;bar);
   for (int i=0; i&lt;NUM_THREADS; ++i)
      join(threads[i]);
}
</pre>

        <h3>5. Creation of a thread whose ownership is passed to another
        object/method.</h3>
<pre>
void foo()
{
   thread = create_thread(&amp;bar);
   manager.owns(thread);
}
</pre>

        <h3>6. Creation of a thread whose ownership is shared between multiple
        objects.</h3>
<pre>
void foo()
{
   thread = create_thread(&amp;bar);
   manager1.add(thread);
   manager2.add(thread);
}
</pre>

        <p>Of these usage patterns there&#39;s only one that requires reference
        management (number 6). Hopefully it&#39;s fairly obvious that this
        usage pattern simply won&#39;t occur as often as the other usage
        patterns. So there really isn&#39;t a &quot;typical need&quot; for a
        thread_ref concept, though there is some need.</p>

        <p>Since the need isn&#39;t typical we must use different criteria for
        deciding on either a thread_ref or thread design. Possible criteria
        include ease of use and performance. So let&#39;s analyze both of these
        carefully.</p>

        <p>With ease of use we can look at existing experience. The standard
        C++ objects that represent a system resource, such as std::iostream,
        are noncopyable, so we know that C++ programmers must at least be
        experienced with this design. Most C++ developers are also used to
        smart pointers such as boost::shared_ptr, so we know they can at least
        adapt to a thread_ref concept with little effort. So existing
        experience isn&#39;t going to lead us to a choice.</p>

        <p>The other thing we can look at is how difficult it is to use both
        types for the six usage patterns above. If we find it overly difficult
        to use a concept for any of the usage patterns there would be a good
        argument for choosing the other design. So we&#39;ll code all six usage
        patterns using both designs.</p>

        <h3>1.</h3>
<pre>
void foo()
{
   thread thrd(&amp;bar);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
}
</pre>

        <h3>2.</h3>
<pre>
void foo()
{
   thread thrd(&amp;bar);
   thrd.join();
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
   thrd-&gt;join();
}
</pre>

        <h3>3.</h3>
<pre>
void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      thread thrd(&amp;bar);
}

void foo()
{
   for (int i=0; i&lt;NUM_THREADS; ++i)
      thread_ref thrd = create_thread(&amp;bar);
}
</pre>

        <h3>4.</h3>
<pre>
void foo()
{
   std::auto_ptr&lt;thread&gt; threads[NUM_THREADS];
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = std::auto_ptr&lt;thread&gt;(new thread(&amp;bar));
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i]-&gt;join();
}

void foo()
{
   thread_ref threads[NUM_THREADS];
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i] = create_thread(&amp;bar);
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads[i]-&gt;join();
}
</pre>

        <h3>5.</h3>
<pre>
void foo()
{
   thread* thrd = new thread(&amp;bar);
   manager.owns(thrd);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
   manager.owns(thrd);
}
</pre>

        <h3>6.</h3>
<pre>
void foo()
{
   boost::shared_ptr&lt;thread&gt; thrd(new thread(&amp;bar));
   manager1.add(thrd);
   manager2.add(thrd);
}

void foo()
{
   thread_ref thrd = create_thread(&amp;bar);
   manager1.add(thrd);
   manager2.add(thrd);
}
</pre>

        <p>This shows the usage patterns being nearly identical in complexity
        for both designs. The only actual added complexity occurs because of
        the use of operator new in (4), (5) and (6) and the use of
        std::auto_ptr and boost::shared_ptr in (4) and (6) respectively.
        However, that&#39;s not really much added complexity, and C++
        programmers are used to using these idioms anyway. Some may dislike
        the presence of operator new in user code, but this can be eliminated
        by proper design of higher level concepts, such as the
        boost::thread_group class that simplifies example (4) down to:</p>
<pre>
void foo()
{
   thread_group threads;
   for (int i=0; i&lt;NUM_THREADS; ++i)
      threads.create_thread(&amp;bar);
   threads.join_all();
}
</pre>

        <p>So ease of use is really a wash and not much help in picking a
        design.</p>
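        <p>To show how a higher level concept can hide operator new, here is a
        rough sketch of a group class built on a noncopyable thread type. It
        assumes C++11 and uses std::thread and std::unique_ptr as stand-ins.
        The class simple_thread_group and the helpers bar_calls, bar and
        run_group are hypothetical and are not the boost::thread_group
        implementation.</p>

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical group that owns heap-allocated, noncopyable thread objects so
// that user code never writes operator new itself.
class simple_thread_group
{
public:
    template <typename F>
    void create_thread(F f)
    {
        threads_.push_back(std::unique_ptr<std::thread>(new std::thread(f)));
    }

    // Must be called before destruction: destroying a joinable std::thread
    // terminates the program.
    void join_all()
    {
        for (auto& t : threads_)
            if (t->joinable())
                t->join();
    }

    std::size_t size() const { return threads_.size(); }

private:
    std::vector<std::unique_ptr<std::thread>> threads_;
};

std::atomic<int> bar_calls(0);
void bar() { ++bar_calls; }

// Mirrors usage pattern (4): create several threads, then join them all.
int run_group(int num_threads)
{
    assert(num_threads >= 0);
    bar_calls = 0;
    simple_thread_group threads;
    for (int i = 0; i < num_threads; ++i)
        threads.create_thread(&bar);
    threads.join_all();
    return bar_calls;
}
```

        <p>The dynamic allocation is still there, but it is confined to one
        member function of the group rather than repeated in user code.</p>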

        <p>So what about performance? From the code examples above we can
        analyze the theoretical impact each design has on performance. For
        (1), platforms that don&#39;t have a ref-counted native thread type
        (POSIX, for instance) will be impacted by a thread_ref design. Even if
        the native thread type is ref-counted there may be an impact if more
        state information has to be maintained for concepts foreign to the
        native API, such as clean up stacks for Win32 implementations. For (2)
        and (3) the performance impact is identical to (1). For (4) things get
        a little more interesting: theoretically, at least, thread_ref may
        perform faster since the thread design requires dynamic memory
        allocation/deallocation. In practice, however, there may be dynamic
        allocation for the thread_ref design as well; it will just be hidden
        from the user. As long as the implementation has to do dynamic
        allocations, thread_ref loses again because of the reference
        management. For (5) we see the same impact as for (4). For (6) the
        thread design may still be impacted by dynamic allocation, but
        thread_ref no longer suffers because of its reference management,
        and in fact, theoretically at least, thread_ref may do a better job
        of managing the references. All of this indicates that thread wins for
        (1), (2) and (3); for (4) and (5) the winner depends on the
        implementation and the platform, but the thread design probably has a
        better chance; and for (6) it again depends on the implementation and
        platform, but this time thread_ref is slightly favored. Given all of
        this it&#39;s a narrow margin, but the thread design prevails.</p>

        <p>Given this analysis, and the fact that noncopyable objects for
        system resources are the normal designs that C++ programmers are used
        to dealing with, the Boost.Threads library has gone with a noncopyable
        design.</p>

        <h2>Rationale for not providing <i><a name="Events">Event</a>
        Variables</i></h2>

        <p><i>Event variables</i> are simply far too error-prone. <a href=
        "condition.html">Condition variables</a> are a much safer
        alternative.</p>

        <p>[Note that Graphical User Interface <i>events</i> are a different
        concept, and are not what is being discussed here.]</p>

        <p>Event variables were one of the first synchronization primitives.
        They are still used today, for example, in the native Windows
        multithreading API.</p>

        <p>Yet both respected computer science researchers and experienced
        multithreading practitioners believe event variables are so inherently
        error-prone that they should never be used, and thus should not be part
        of a multithreading library.</p>

        <p>Per Brinch Hansen <a href="bibliography.html#Brinch-Hansen-73">
        [Brinch Hansen 73]</a> analyzed event variables in some detail,
        pointing out [emphasis his] that &quot;<i>event operations force the
        programmer to be aware of the relative speeds of the sending and
        receiving processes</i>&quot;. His summary:</p>

        <blockquote>
            <p>We must therefore conclude that event variables of the previous
            type are impractical for system design. <i>The effect of an
            interaction between two processes must be independent of the speed
            at which it is carried out.</i></p>
        </blockquote>

        <p>Experienced programmers using the Windows platform today report that
        event variables are a continuing source of errors, even after previous
        bad experiences caused them to be very careful in their use of event
        variables. Overt problems can be avoided, for example, by teaming the
        event variable with a mutex, but that may just convert a <a href=
        "definitions.html#Race condition">race condition</a> into another
        problem, such as excessive resource use. One of the most distressing
        aspects of the experience reports is the claim that many defects are
        latent. That is, the programs appear to work correctly, but contain
        hidden timing dependencies which will cause them to fail when
        environmental factors or usage patterns change, altering relative
        thread timings.</p>
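        <p>For contrast, the sketch below shows the condition variable pattern
        in which the result does not depend on the relative speeds of the two
        threads. It assumes C++11 and uses std::condition_variable as a
        stand-in for the Boost.Threads condition type. The names mailbox,
        send, receive and exchange are hypothetical.</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

struct mailbox
{
    std::mutex mtx;
    std::condition_variable cond;
    bool ready = false;   // persistent state, not a transient signal
    int value = 0;
};

// Publishes a value under the mutex, then notifies any waiter.
void send(mailbox& box, int v)
{
    {
        std::lock_guard<std::mutex> lock(box.mtx);
        box.value = v;
        box.ready = true;
    }
    box.cond.notify_one();
}

// Checks the predicate under the mutex, so the call works whether the sender
// ran before or after it, and it tolerates spurious wakeups.
int receive(mailbox& box)
{
    std::unique_lock<std::mutex> lock(box.mtx);
    box.cond.wait(lock, [&box] { return box.ready; });
    box.ready = false;
    return box.value;
}

// Runs the exchange with the sender either strictly first or concurrent.
int exchange(int v, bool sender_first)
{
    mailbox box;
    if (sender_first) {
        send(box, v);          // notification occurs before any waiter exists
        return receive(box);
    }
    std::thread sender(send, std::ref(box), v);
    int got = receive(box);
    sender.join();
    return got;
}
```

        <p>Because the receiver tests the ready flag rather than relying on
        being present when the notification is sent, an early notification is
        not lost. A bare event that is signaled before the receiver waits
        offers no such guarantee unless extra state is layered on top.</p>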

        <p>The decision to exclude event variables from Boost.Threads has been
        surprising to some Windows programmers. They have written programs
        which work using event variables, and wonder what the problem is. It
        seems similar to the &quot;goto considered harmful&quot; controversy of
        30 years ago. It isn&#39;t that events, like gotos, can&#39;t be made
        to work, but rather that virtually all programs using alternatives will
        be easier to write, debug, read, and maintain, and less likely to
        contain latent defects.</p>

        <p>[Rationale provided by Beman Dawes]</p>
        <hr>

        <p>Revised
        05 November, 2001</p>

        <p><i>&copy; Copyright <a href="mailto:williamkempf@hotmail.com">
        William E. Kempf</a> 2001 all rights reserved.</i></p>
    </body>
</html>