diff --git a/doc/index.html b/doc/index.html
index 5260835d..8fdea3b0 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -46,6 +46,7 @@
Boost.Threads Rationale
This page explains the rationale behind various design decisions in the Boost.Threads library. Having the rationale documented here should explain how we arrived at the current design, as well as prevent future rehashing of discussions and thought processes that have already occurred. It can also give users insight into the design process required for this library.
Processes often have a degree of "potential parallelism", and it can be more intuitive to design systems with this in mind. Further, these parallel processes can result in more responsive programs. The benefits of multi-threaded programming are well known to most modern programmers, yet the C++ language doesn't directly support this concept.
Many platforms support multi-threaded programming despite the fact that the language doesn't support it. They do this through external libraries, which are, unfortunately, platform specific. POSIX has tried to address this problem through the standardization of a "pthread" library. However, this is a standard only on POSIX platforms, so its portability is limited.
Another problem with POSIX and other platform-specific thread libraries is that they are almost universally C-based. This leaves several C++-specific issues unresolved, such as what happens when an exception is thrown in a thread. Further, there are some C++ concepts, such as destructors, that can make usage much easier than what's available in a C library.
What's truly needed is C++ language support for threads. However, the C++ standards committee needs existing practice or a good proposal as a starting point for adding this to the standard.
The Boost.Threads library was developed to provide C++ developers with a portable interface for writing multi-threaded programs on numerous platforms. The hope is that the library can be the basis for a more detailed proposal for the C++ standards committee to consider for inclusion in the next C++ standard.
The Boost.Threads library supplies a set of low level primitives for writing multi-threaded programs, such as semaphores, mutexes and condition variables. In fact, the first release of Boost.Threads supports only these low level primitives. However, research has shown that use of these primitives is difficult, since there's no way to mathematically prove that a usage pattern is correct, meaning free of race conditions and deadlocks. Several algebras (such as CSP, CCS and the Join calculus) have been developed to help write provably correct parallel processes. In order to prove correctness, these processes must be coded using higher level abstractions. So why does Boost.Threads support the lower level concepts?
The reason is simple: the higher level concepts need to be implemented using at least some of the lower level concepts. So having portable lower level concepts makes it easier to develop the higher level concepts and will allow researchers to experiment with various techniques.
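As an illustration of building a higher level concept out of the lower level ones, the following sketch implements a simple "gate" that threads can wait on until it is opened. It is only a sketch: it assumes interfaces along the lines of boost::mutex, boost::mutex::scoped_lock and boost::condition, and the gate class itself is hypothetical, not part of the library.

#include <boost/thread/mutex.hpp>
#include <boost/thread/condition.hpp>

// Hypothetical higher level concept built only from the lower level
// primitives: a gate that blocks callers of wait() until open() is called.
class gate
{
public:
    gate() : open_(false) { }

    void open()
    {
        boost::mutex::scoped_lock lock(mutex_);
        open_ = true;
        cond_.notify_all();      // wake every thread blocked in wait()
    }

    void wait()
    {
        boost::mutex::scoped_lock lock(mutex_);
        while (!open_)           // loop guards against spurious wake ups
            cond_.wait(lock);
    }

private:
    boost::mutex mutex_;
    boost::condition cond_;
    bool open_;
};

Because the gate is expressed entirely in terms of the portable primitives, the same implementation works on every platform Boost.Threads supports.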
Beyond this theoretical application of higher level concepts, however, the fact remains that many multi-threaded programs are written using only the lower level concepts, so they are useful in and of themselves, even if it's hard to prove that their usage is correct. Since many users will be familiar with these lower level concepts but unfamiliar with any of the higher level concepts, there's also an argument for accessibility.
Programmers who are used to multi-threaded programming issues will quickly note that the Boost.Threads design for mutex lock concepts is not thread safe (this is clearly documented as well). At first this may seem like a serious design flaw. Why have a multi-threading primitive that's not thread safe itself?
A lock object is not a synchronization primitive. A lock object's sole responsibility is to ensure that a mutex is both locked and unlocked in a manner that won't result in the common error of locking a mutex and then forgetting to unlock it. This means that instances of a lock object are only going to be created, at least in theory, within block scope and won't be shared between threads. Only the mutex objects will be created outside of block scope and/or shared between threads. Though it's possible to create a lock object outside of block scope and to share it between threads, doing so would not be typical usage, nor are there any cases where such usage would be required.
Lock objects must maintain some state information. In order to allow a program to determine whether a try_lock or timed_lock was successful, the lock object must retain state indicating the success or failure of the call made in its constructor. If a lock object were to have such state and remain thread safe, it would need to synchronize access to that state, which would roughly double the time of most operations. Worse, since the state can only be checked by a call made after construction, we'd have a race condition if the lock object were shared between threads.
So, to avoid the overhead of synchronizing access to the state information and to avoid the race condition, the Boost.Threads library simply does nothing to make lock objects thread safe. Instead, sharing a lock object between threads results in undefined behavior. Since the only proper usage of lock objects is within block scope this isn't a problem, and so long as the lock object is properly used there's no danger of any multi-threading issues.
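A minimal sketch of the intended usage may make this clearer. It assumes lock interfaces along the lines of the Boost.Threads scoped_lock and scoped_try_lock concepts, with a constructor that attempts the lock and a locked() query; the exact constructor semantics shown are assumptions for illustration, and the counters and mutex objects are placeholders, not library names.

#include <boost/thread/mutex.hpp>

boost::mutex     guard;        // shared between threads
int              counter;      // protected by guard

boost::try_mutex try_guard;    // shared between threads
int              try_counter;  // protected by try_guard

void increment()
{
    // The lock object lives only in this block and is never shared,
    // so it needs no internal synchronization of its own.
    boost::mutex::scoped_lock lock(guard);
    ++counter;
}   // the destructor unlocks, even if an exception is thrown

void increment_if_free()
{
    // Whether the constructor's try_lock succeeded is state kept inside
    // the lock object; only the thread that created it ever examines it.
    boost::try_mutex::scoped_try_lock lock(try_guard);
    if (lock.locked())
        ++try_counter;
}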
Programmers who are used to C libraries for multi-threaded programming are likely to wonder why Boost.Threads uses a noncopyable design for boost::thread. After all, the C thread types are copyable, and you often have a need to copy them within user code. However, careful comparison of C designs to C++ designs shows a flaw in this logic.
All C types are copyable. It is, in fact, not possible to create a noncopyable type in C. For this reason, types that represent system resources in C are often designed to behave very much like a pointer to dynamic memory: there's an API for acquiring the resource and an API for releasing it. For memory we have pointers as the type and malloc/free as the acquisition and release APIs. For files we have FILE* as the type and fopen/fclose as the acquisition and release APIs. You can freely copy instances of these types, but you must manually manage the lifetime of the actual resource through the acquisition and release APIs.
C++ designs recognize that the acquisition and release APIs are error prone and try to eliminate possible errors by acquiring the resource in the constructor and releasing it in the destructor. The best example of such a design is the std::iostream set of classes, which can represent the same resource as the FILE* type in C. A file is opened in the std::fstream's constructor and closed in its destructor. However, if an iostream were copyable it could lead to a file being closed twice, an obvious error, so the std::iostream types are noncopyable by design. This is the same design used by boost::thread; it is simple, easy to understand, and consistent with other C++ standard types.
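The contrast is easy to see side by side in a small sketch using only the standard library (the file name and strings are arbitrary):

#include <cstdio>
#include <fstream>

void c_style()
{
    std::FILE* f = std::fopen("log.txt", "w");   // acquire
    if (!f)
        return;
    std::fputs("hello\n", f);
    std::fclose(f);                              // must remember to release
}

void cpp_style()
{
    std::ofstream f("log.txt");                  // acquired in the constructor
    f << "hello\n";
}   // released in the destructor; copying is disallowed, so no double close

boost::thread applies the same reasoning: the object that represents the resource is noncopyable, so there is never a question of which copy is responsible for it.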
During the design of boost::thread it was pointed out that it would be possible to make it a copyable type if some form of "reference management" were used, such as ref-counting or ref-lists, and many argued for a boost::thread_ref design instead. The reasoning was that copying "thread" objects is a typical need in the C libraries, and so presumably would be in a C++ library as well. It was also thought that implementations could provide more efficient reference management than wrappers (such as boost::shared_ptr) around a noncopyable thread concept. Analysis of whether these arguments hold true doesn't appear to bear them out. To illustrate the analysis, we'll first provide pseudo-code for the six typical usage patterns of a thread object.
(1) Create a thread that runs detached (it is never joined):

void foo()
{
    create_thread(&bar);
}

(2) Create a thread and later join it:

void foo()
{
    thread = create_thread(&bar);
    join(thread);
}

(3) Create several threads that run detached:

void foo()
{
    for (int i=0; i<NUM_THREADS; ++i)
        create_thread(&bar);
}

(4) Create several threads and later join them all:

void foo()
{
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i] = create_thread(&bar);
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i].join();
}

(5) Create a thread and transfer ownership of it to another object:

void foo()
{
    thread = create_thread(&bar);
    manager.owns(thread);
}

(6) Create a thread and share it between several objects:

void foo()
{
    thread = create_thread(&bar);
    manager1.add(thread);
    manager2.add(thread);
}
Of these usage patterns there's only one that requires reference management (number 6). Hopefully it's fairly obvious that this usage pattern simply won't occur as often as the other usage patterns. So there really isn't a "typical need" for a copyable thread concept, though there is some need.
Since the need isn't typical, we must use different criteria for deciding between a thread_ref and a thread design. Possible criteria include ease of use and performance, so let's analyze both of these carefully.
For ease of use we can look at existing experience. The standard C++ objects that represent a system resource, such as std::iostream, are noncopyable, so we know that C++ programmers are at least experienced with this design. Most C++ developers are also used to smart pointers such as boost::shared_ptr, so we know they could adapt to a thread_ref concept with little effort. So existing experience isn't going to lead us to a choice.
The other thing we can look at is how difficult it is to use each type for the six usage patterns above. If we find it overly difficult to use one concept for any of the usage patterns, there would be a good argument for choosing the other design. So we'll code all six usage patterns using both designs.
(1) Using the thread design:

void foo()
{
    thread thrd(&bar);
}

Using the thread_ref design:

void foo()
{
    thread_ref thrd = create_thread(&bar);
}

(2) Using the thread design:

void foo()
{
    thread thrd(&bar);
    thrd.join();
}

Using the thread_ref design:

void foo()
{
    thread_ref thrd = create_thread(&bar);
    thrd->join();
}

(3) Using the thread design:

void foo()
{
    for (int i=0; i<NUM_THREADS; ++i)
        thread thrd(&bar);
}

Using the thread_ref design:

void foo()
{
    for (int i=0; i<NUM_THREADS; ++i)
        thread_ref thrd = create_thread(&bar);
}

(4) Using the thread design:

void foo()
{
    std::auto_ptr<thread> threads[NUM_THREADS];
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i] = std::auto_ptr<thread>(new thread(&bar));
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i]->join();
}

Using the thread_ref design:

void foo()
{
    thread_ref threads[NUM_THREADS];
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i] = create_thread(&bar);
    for (int i=0; i<NUM_THREADS; ++i)
        threads[i]->join();
}

(5) Using the thread design:

void foo()
{
    thread* thrd = new thread(&bar);
    manager.owns(thrd);
}

Using the thread_ref design:

void foo()
{
    thread_ref thrd = create_thread(&bar);
    manager.owns(thrd);
}

(6) Using the thread design:

void foo()
{
    boost::shared_ptr<thread> thrd(new thread(&bar));
    manager1.add(thrd);
    manager2.add(thrd);
}

Using the thread_ref design:

void foo()
{
    thread_ref thrd = create_thread(&bar);
    manager1.add(thrd);
    manager2.add(thrd);
}
This shows the usage patterns to be nearly identical in complexity for both designs. The only real added complexity comes from the use of operator new in (4), (5) and (6) and the use of std::auto_ptr and boost::shared_ptr in (4) and (6) respectively. However, that's not much added complexity, and C++ programmers are used to these idioms anyway. Some may dislike the presence of operator new in user code, but this can be eliminated by proper design of higher level concepts, such as the boost::thread_group class, which simplifies example (4) down to:
void foo()
{
    thread_group threads;
    for (int i=0; i<NUM_THREADS; ++i)
        threads.create_thread(&bar);
    threads.join_all();
}
So ease of use is really a wash and not much help in picking a design.
So what about performance? Looking at the above code examples, we can analyze the theoretical impact both designs have on performance. For (1) we can see that platforms that don't have a ref-counted native thread type (POSIX, for instance) will be impacted by a thread_ref design. Even if the native thread type is ref-counted, there may be an impact if more state information has to be maintained for concepts foreign to the native API, such as cleanup stacks for Win32 implementations. For (2) the performance impact will be identical to (1), and the same holds for (3). For (4) things get a little more interesting: theoretically, at least, thread_ref may perform faster, since the thread design requires dynamic memory allocation/deallocation. In practice, however, there may be dynamic allocation in the thread_ref design as well; it will just be hidden from the user. As long as the implementation has to do dynamic allocations, thread_ref loses again because of its reference management. For (5) we see the same impact as for (4). For (6) the thread design still pays for dynamic allocation, but thread_ref no longer suffers for its reference management, since the thread design now needs boost::shared_ptr to do the same job; in fact, theoretically at least, thread_ref may do a better job of managing the references. All of this indicates that thread wins for (1), (2) and (3); for (4) and (5) the winner depends on the implementation and the platform, but the thread design probably has the better chance; and for (6) it again depends on the implementation and platform, but this time thread_ref is slightly favored. It's a narrow margin, but the thread design prevails.
Given this analysis, and the fact that noncopyable objects for system resources are the normal designs that C++ programmers are used to dealing with, the Boost.Threads library has gone with a noncopyable design.
Copyright William E. Kempf 2001. All rights reserved.