mirror of
https://github.com/boostorg/interval.git
synced 2026-02-21 15:12:24 +00:00
587 lines
27 KiB
HTML
587 lines
27 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
|
||
"http://www.w3.org/TR/html4/loose.dtd">
|
||
<html>
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
||
<link rel="stylesheet" type="text/css" href="../../../../boost.css">
|
||
<title>Rounding Policies</title>
|
||
</head>
|
||
|
||
<body lang="en">
|
||
<h1>Rounding Policies</h1>
|
||
|
||
<p>In order to be as general as possible, the library uses a class to compute
|
||
all the necessary functions rounded upward or downward. This class is the
|
||
first parameter of <code>policies</code>, it is also the type named
|
||
<code>rounding</code> in the policy definition of <code>interval</code>.</p>
|
||
|
||
<p>By default, it is <code>interval_lib::rounded_math<T></code>. The
|
||
class <code>interval_lib::rounded_math</code> is already specialized for the
|
||
standard floating types (<code>float</code> , <code>double</code> and
|
||
<code>long double</code>). So if the base type of your intervals is not one
|
||
of these, a good solution would probably be to provide a specialization of
|
||
this class. But if the default specialization of
|
||
<code>rounded_math<T></code> for <code>float</code>,
|
||
<code>double</code>, or <code>long double</code> is not what you seek, or you
|
||
do not want to specialize <code>interval_lib::rounded_math<T></code>
|
||
(say because you prefer to work in your own namespace) you can also define
|
||
your own rounding policy and pass it directly to
|
||
<code>interval_lib::policies</code>.</p>
|
||
|
||
<h2>Requirements</h2>
|
||
|
||
<p>Here comes what the class is supposed to provide. The domains are written
|
||
next to their respective functions (as you can see, the functions do not have
|
||
to worry about invalid values, but they have to handle infinite
|
||
arguments).</p>
|
||
<pre>/* Rounding requirements */
|
||
struct rounding {
|
||
// defaut constructor, destructor
|
||
rounding();
|
||
~rounding();
|
||
// mathematical operations
|
||
T add_down(T, T); // [-∞;+∞][-∞;+∞]
|
||
T add_up (T, T); // [-∞;+∞][-∞;+∞]
|
||
T sub_down(T, T); // [-∞;+∞][-∞;+∞]
|
||
T sub_up (T, T); // [-∞;+∞][-∞;+∞]
|
||
T mul_down(T, T); // [-∞;+∞][-∞;+∞]
|
||
T mul_up (T, T); // [-∞;+∞][-∞;+∞]
|
||
T div_down(T, T); // [-∞;+∞]([-∞;+∞]-{0})
|
||
T div_up (T, T); // [-∞;+∞]([-∞;+∞]-{0})
|
||
T sqrt_down(T); // ]0;+∞]
|
||
T sqrt_up (T); // ]0;+∞]
|
||
T exp_down(T); // [-∞;+∞]
|
||
T exp_up (T); // [-∞;+∞]
|
||
T log_down(T); // ]0;+∞]
|
||
T log_up (T); // ]0;+∞]
|
||
T cos_down(T); // [0;2π]
|
||
T cos_up (T); // [0;2π]
|
||
T tan_down(T); // ]-π/2;π/2[
|
||
T tan_up (T); // ]-π/2;π/2[
|
||
T asin_down(T); // [-1;1]
|
||
T asin_up (T); // [-1;1]
|
||
T acos_down(T); // [-1;1]
|
||
T acos_up (T); // [-1;1]
|
||
T atan_down(T); // [-∞;+∞]
|
||
T atan_up (T); // [-∞;+∞]
|
||
T sinh_down(T); // [-∞;+∞]
|
||
T sinh_up (T); // [-∞;+∞]
|
||
T cosh_down(T); // [-∞;+∞]
|
||
T cosh_up (T); // [-∞;+∞]
|
||
T tanh_down(T); // [-∞;+∞]
|
||
T tanh_up (T); // [-∞;+∞]
|
||
T asinh_down(T); // [-∞;+∞]
|
||
T asinh_up (T); // [-∞;+∞]
|
||
T acosh_down(T); // [1;+∞]
|
||
T acosh_up (T); // [1;+∞]
|
||
T atanh_down(T); // [-1;1]
|
||
T atanh_up (T); // [-1;1]
|
||
T median(T, T); // [-∞;+∞][-∞;+∞]
|
||
T int_down(T); // [-∞;+∞]
|
||
T int_up (T); // [-∞;+∞]
|
||
// conversion functions
|
||
T conv_down(U);
|
||
T conv_up (U);
|
||
// unprotected rounding class
|
||
typedef ... unprotected_rounding;
|
||
};</pre>
|
||
|
||
<p>The constructor and destructor of the rounding class have a very important
|
||
semantic requirement: they are responsible for setting and resetting the
|
||
rounding modes of the computation on T. For instance, if T is a standard
|
||
floating point type and floating point computation is performed according to
|
||
the Standard IEEE 754, the constructor can save the current rounding state,
|
||
each <code>_up</code> (resp. <code>_down</code>) function will round up
|
||
(resp. down), and the destructor will restore the saved rounding state.
|
||
Indeed this is the behavior of the default rounding policy.</p>
|
||
|
||
<p>The meaning of all the mathematical functions up until
|
||
<code>atanh_up</code> is clear: each function returns number representable in
|
||
the type <code>T</code> which is a lower bound (for<code> _down</code>) or
|
||
upper bound (for <code>_up</code>) on the true mathematical result of the
|
||
corresponding function. The function <code>median</code> computes the average
|
||
of its two arguments rounded to its nearest representable number. The
|
||
functions <code>int_down</code> and <code>int_up</code> compute the nearest
|
||
integer smaller or bigger than their argument. Finally,
|
||
<code>conv_down</code> and <code>conv_up</code> are responsible of the
|
||
conversions of values of other types to the base number type: the first one
|
||
must round down the value and the second one must round it up.</p>
|
||
|
||
<p>The type <code>unprotected_rounding</code> allows to remove all controls.
|
||
For reasons why one might to do this, see the <a
|
||
href="#Protection">protection</a> paragraph below.</p>
|
||
|
||
<h2>Overview of the provided classes</h2>
|
||
|
||
<p>A lot of classes are provided. The classes are organized by level. At the
|
||
bottom is the class <code>rounding_control</code>. At the next level come
|
||
<code>rounded_arith_exact</code>, <code>rounded_arith_std</code> and
|
||
<code>rounded_arith_opp</code>. Then there are
|
||
<code>rounded_transc_dummy</code>, <code>rounded_transc_exact</code>,
|
||
<code>rounded_transc_std</code> and <code>rounded_transc_opp</code>. And
|
||
finally are <code>save_state</code> and <code>save_state_nothing</code>. Each
|
||
of these classes provide a set of members that are required by the classes of
|
||
the next level. For example, a <code>rounded_transc_...</code> class needs
|
||
the members of a <code>rounded_arith_...</code> class.</p>
|
||
|
||
<p>When they exist in two versions <code>_std</code> and <code>_opp</code>,
|
||
the first one does switch the rounding mode each time, and the second one
|
||
tries to keep it oriented toward plus infinity. The main purpose of the
|
||
<code>_opp</code> version is to speed up the computations through the use of
|
||
the "opposite trick" (see the <a href="#perf">performance notes</a>). This
|
||
version requires the rounding mode to be upward before entering any
|
||
computation functions of the class. It guarantees that the rounding mode will
|
||
still be upward at the exit of the functions.</p>
|
||
|
||
<p>Please note that it is really a very bad idea to mix the <code>_opp</code>
|
||
version with the <code>_std</code> since they do not have compatible
|
||
properties.</p>
|
||
|
||
<p>There is a third version named <code>_exact</code> which computes the
|
||
functions without changing the rounding mode. It is an "exact" version
|
||
because it is intended for a base type that produces exact results.</p>
|
||
|
||
<p>The last version is the <code>_dummy</code> version. It does not do any
|
||
computations but still produces compatible results.</p>
|
||
|
||
<p>Please note that it is possible to use the "exact" version for an inexact
|
||
base type, e.g. <code>float</code> or <code>double</code>. In that case, the
|
||
inclusion property is no longer guaranteed, but this can be useful to speed
|
||
up the computation when the inclusion property is not desired strictly. For
|
||
instance, in computer graphics, a small error due to floating-point roundoff
|
||
is acceptable as long as an approximate version of the inclusion property
|
||
holds.</p>
|
||
|
||
<p>Here comes what each class defines. Later, when they will be described
|
||
more thoroughly, these members will not be repeated. Please come back here in
|
||
order to see them. Inheritance is also used to avoid repetitions.</p>
|
||
<pre>template <class T>
|
||
struct rounding_control
|
||
{
|
||
typedef ... rounding_mode;
|
||
void set_rounding_mode(rounding_mode);
|
||
void get_rounding_mode(rounding_mode&);
|
||
void downward ();
|
||
void upward ();
|
||
void to_nearest();
|
||
T to_int(T);
|
||
T force_rounding(T);
|
||
};
|
||
|
||
template <class T, class Rounding>
|
||
struct rounded_arith_... : Rounding
|
||
{
|
||
void init();
|
||
T add_down(T, T);
|
||
T add_up (T, T);
|
||
T sub_down(T, T);
|
||
T sub_up (T, T);
|
||
T mul_down(T, T);
|
||
T mul_up (T, T);
|
||
T div_down(T, T);
|
||
T div_up (T, T);
|
||
T sqrt_down(T);
|
||
T sqrt_up (T);
|
||
T median(T, T);
|
||
T int_down(T);
|
||
T int_up (T);
|
||
};
|
||
|
||
template <class T, class Rounding>
|
||
struct rounded_transc_... : Rounding
|
||
{
|
||
T exp_down(T);
|
||
T exp_up (T);
|
||
T log_down(T);
|
||
T log_up (T);
|
||
T cos_down(T);
|
||
T cos_up (T);
|
||
T tan_down(T);
|
||
T tan_up (T);
|
||
T asin_down(T);
|
||
T asin_up (T);
|
||
T acos_down(T);
|
||
T acos_up (T);
|
||
T atan_down(T);
|
||
T atan_up (T);
|
||
T sinh_down(T);
|
||
T sinh_up (T);
|
||
T cosh_down(T);
|
||
T cosh_up (T);
|
||
T tanh_down(T);
|
||
T tanh_up (T);
|
||
T asinh_down(T);
|
||
T asinh_up (T);
|
||
T acosh_down(T);
|
||
T acosh_up (T);
|
||
T atanh_down(T);
|
||
T atanh_up (T);
|
||
};
|
||
|
||
template <class Rounding>
|
||
struct save_state_... : Rounding
|
||
{
|
||
save_state_...();
|
||
~save_state_...();
|
||
typedef ... unprotected_rounding;
|
||
};</pre>
|
||
|
||
<h2>Synopsis.</h2>
|
||
<pre>namespace boost {
|
||
namespace numeric {
|
||
namespace interval_lib {
|
||
|
||
<span style="color: #FF0000">/* basic rounding control */</span>
|
||
template <class T> struct rounding_control;
|
||
|
||
<span style="color: #FF0000">/* arithmetic functions rounding */</span>
|
||
template <class T, class Rounding = rounding_control<T> > struct rounded_arith_exact;
|
||
template <class T, class Rounding = rounding_control<T> > struct rounded_arith_std;
|
||
template <class T, class Rounding = rounding_control<T> > struct rounded_arith_opp;
|
||
|
||
<span style="color: #FF0000">/* transcendental functions rounding */</span>
|
||
template <class T, class Rounding> struct rounded_transc_dummy;
|
||
template <class T, class Rounding = rounded_arith_exact<T> > struct rounded_transc_exact;
|
||
template <class T, class Rounding = rounded_arith_std<T> > struct rounded_transc_std;
|
||
template <class T, class Rounding = rounded_arith_opp<T> > struct rounded_transc_opp;
|
||
|
||
<span style="color: #FF0000">/* rounding-state-saving classes */</span>
|
||
template <class Rounding> struct save_state;
|
||
template <class Rounding> struct save_state_nothing;
|
||
|
||
<span style="color: #FF0000">/* default policy for type T */</span>
|
||
template <class T> struct rounded_math;
|
||
template <> struct rounded_math<float>;
|
||
template <> struct rounded_math<double>;
|
||
|
||
<span style="color: #FF0000">/* some metaprogramming to convert a protected to unprotected rounding */</span>
|
||
template <class I> struct unprotect;
|
||
|
||
} // namespace interval_lib
|
||
} // namespace numeric
|
||
} // namespace boost</pre>
|
||
|
||
<h2>Description of the provided classes</h2>
|
||
|
||
<p>We now describe each class in the order they appear in the definition of a
|
||
rounding policy (this outermost-to-innermost order is the reverse order from
|
||
the synopsis).</p>
|
||
|
||
<h3 id="Protection">Protection control</h3>
|
||
|
||
<p>Protection refers to the fact that the interval operations will be
|
||
surrounded by rounding mode controls. Unprotecting a class means to remove
|
||
all the rounding controls. Each rounding policy provides a type
|
||
<code>unprotected_rounding</code>. The required type
|
||
<code>unprotected_rounding</code> gives another rounding class that enables
|
||
to work when nested inside rounding. For example, the first three lines below
|
||
should all produce the same result (because the first operation is the
|
||
rounding constructor, and the last is its destructor, which take care of
|
||
setting the rounding modes); and the last line is allowed to have an
|
||
undefined behavior (since no rounding constructor or destructor is ever
|
||
called).</p>
|
||
<pre>T c; { rounding rnd; c = rnd.add_down(a, b); }
|
||
T c; { rounding rnd1; { rounding rnd2; c = rnd2.add_down(a, b); } }
|
||
T c; { rounding rnd1; { rounding::unprotected_rounding rnd2; c = rnd2.add_down(a, b); } }
|
||
T d; { rounding::unprotected_rounding rnd; d = rnd.add_down(a, b); }</pre>
|
||
|
||
<p>Naturally <code>rounding::unprotected_rounding</code> may simply be
|
||
<code>rounding</code> itself. But it can improve performance if it is a
|
||
simplified version with empty constructor and destructor. In order to avoid
|
||
undefined behaviors, in the library, an object of type
|
||
<code>rounding::unprotected_rounding</code> is guaranteed to be created only
|
||
when an object of type <code>rounding</code> is already alive. See the <a
|
||
href="#perf">performance notes</a> for some additional details.</p>
|
||
|
||
<p>The support library defines a metaprogramming class template
|
||
<code>unprotect</code> which takes an interval type <code>I</code> and
|
||
returns an interval type <code>unprotect<I>::type</code> where the
|
||
rounding policy has been unprotected. Some information about the types:
|
||
<code>interval<T, interval_lib::policies<Rounding, _>
|
||
>::traits_type::rounding</code> <b>is</b> the same type as
|
||
<code>Rounding</code>, and <code>unprotect<interval<T,
|
||
interval_lib::policies<Rounding, _> > >::type</code> <b>is</b>
|
||
the same type as <code>interval<T,
|
||
interval_lib::policies<Rounding::unprotected, _> ></code>.</p>
|
||
|
||
<h3>State saving</h3>
|
||
|
||
<p>First comes <code>save_state</code>. This class is responsible for saving
|
||
the current rounding mode and calling init in its constructor, and for
|
||
restoring the saved rounding mode in its destructor. This class also defines
|
||
the <code>unprotected_rounding</code> type.</p>
|
||
|
||
<p>If the rounding mode does not require any state-saving or initialization,
|
||
<code>save_state_nothing</code> can be used instead of
|
||
<code>save_state</code>.</p>
|
||
|
||
<h3>Transcendental functions</h3>
|
||
|
||
<p>The classes <code>rounded_transc_exact</code>,
|
||
<code>rounded_transc_std</code> and <code>rounded_transc_opp</code> expect
|
||
the std namespace to provide the functions exp log cos tan acos asin atan
|
||
cosh sinh tanh acosh asinh atanh. For the <code>_std</code> and
|
||
<code>_opp</code> versions, all these functions should respect the current
|
||
rounding mode fixed by a call to downward or upward.</p>
|
||
|
||
<p><strong>Please note:</strong> Unfortunately, the latter is rarely the
|
||
case. It is the reason why a class <code>rounded_transc_dummy</code> is
|
||
provided which does not depend on the functions from the std namespace. There
|
||
is no magic, however. The functions of <code>rounded_transc_dummy</code> do
|
||
not compute anything. They only return valid values. For example,
|
||
<code>cos_down</code> always returns -1. In this way, we do verify the
|
||
inclusion property for the default implementation, even if this has strictly
|
||
no value for the user. In order to have useful values, another policy should
|
||
be used explicitely, which will most likely lead to a violation of the
|
||
inclusion property. In this way, we ensure that the violation is clearly
|
||
pointed out to the user who then knows what he stands against. This class
|
||
could have been used as the default transcendental rounding class, but it was
|
||
decided it would be better for the compilation to fail due to missing
|
||
declarations rather than succeed thanks to valid but unusable functions.</p>
|
||
|
||
<h3>Basic arithmetic functions</h3>
|
||
|
||
<p>The classes <code>rounded_arith_std</code> and
|
||
<code>rounded_arith_opp</code> expect the operators + - * / and the function
|
||
<code>std::sqrt</code> to respect the current rounding mode.</p>
|
||
|
||
<p>The class <code>rounded_arith_exact</code> requires
|
||
<code>std::floor</code> and <code>std::ceil</code> to be defined since it can
|
||
not rely on <code>to_int</code>.</p>
|
||
|
||
<h3>Rounding control</h3>
|
||
|
||
<p>The functions defined by each of the previous classes did not need any
|
||
explanation. For example, the behavior of <code>add_down</code> is to compute
|
||
the sum of two numbers rounded downward. For <code>rounding_control</code>,
|
||
the situation is a bit more complex.</p>
|
||
|
||
<p>The basic function is <code>force_rounding</code> which returns its
|
||
argument correctly rounded accordingly to the current rounding mode if it was
|
||
not already the case. This function is necessary to handle delayed rounding.
|
||
Indeed, depending on the way the computations are done, the intermediate
|
||
results may be internaly stored in a more precise format and it can lead to a
|
||
wrong rounding. So the function enforces the rounding. <a
|
||
href="#extreg">Here</a> is an example of what happens when the rounding is
|
||
not enforced.</p>
|
||
|
||
<p>The function <code>get_rounding_mode</code> returns the current rounding
|
||
mode, <code>set_rounding_mode</code> sets the rounding mode back to a
|
||
previous value returned by <code>get_rounding_mode</code>.
|
||
<code>downward</code>, <code>upward</code> and <code>to_nearest</code> sets
|
||
the rounding mode in one of the three directions. This rounding mode should
|
||
be global to all the functions that use the type <code>T</code>. For example,
|
||
after a call to <code>downward</code>, <code>force_rounding(x+y)</code> is
|
||
expected to return the sum rounded toward -∞.</p>
|
||
|
||
<p>The function <code>to_int</code> computes the nearest integer accordingly
|
||
to the current rounding mode.</p>
|
||
|
||
<p>The non-specialized version of <code>rounding_control</code> does not do
|
||
anything. The functions for the rounding mode are empty, and
|
||
<code>to_int</code> and <code>force_rounding</code> are identity functions.
|
||
The <code>pi_</code> constant functions return suitable integers (for
|
||
example, <code>pi_up</code> returns <code>T(4)</code>).</p>
|
||
|
||
<p>The class template <code>rounding_control</code> is specialized for
|
||
<code>float</code>, <code>double</code> and <code>long double</code> in order
|
||
to best use the floating point unit of the computer.</p>
|
||
|
||
<h2>Template class <tt>rounded_math</tt></h2>
|
||
|
||
<p>The default policy (aka <code>rounded_math<T></code>) is simply
|
||
defined as:</p>
|
||
<pre>template <class T> struct rounded_math<T> : save_state_nothing<rounded_arith_exact<T> > {};</pre>
|
||
|
||
<p>and the specializations for <code>float</code>, <code>double</code> and
|
||
<code>long double</code> use <code>rounded_arith_opp</code>, as in:</p>
|
||
<pre>template <> struct rounded_math<float> : save_state<rounded_arith_opp<float> > {};
|
||
template <> struct rounded_math<double> : save_state<rounded_arith_opp<double> > {};
|
||
template <> struct rounded_math<long double> : save_state<rounded_arith_opp<long double> > {};</pre>
|
||
|
||
<h2 id="perf">Performance Issues</h2>
|
||
|
||
<p>This paragraph deals mostly with the performance of the library with
|
||
intervals using the floating-point unit (FPU) of the computer. Let's consider
|
||
the sum of [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] as an example. The
|
||
result is [<code>down</code>(<i>a</i>+<i>c</i>),
|
||
<code>up</code>(<i>b</i>+<i>d</i>)], where <code>down</code> and
|
||
<code>up</code> indicate the rounding mode needed.</p>
|
||
|
||
<h3>Rounding Mode Switch</h3>
|
||
|
||
<p>If the FPU is able to use a different rounding mode for each operation,
|
||
there is no problem. For example, it's the case for the Alpha processor: each
|
||
floating-point instruction can specify a different rounding mode. However,
|
||
the IEEE-754 Standard does not require such a behavior. So most of the FPUs
|
||
only provide some instructions to set the rounding mode for all subsequent
|
||
operations. And generally, these instructions need to flush the pipeline of
|
||
the FPU.</p>
|
||
|
||
<p>In this situation, the time needed to sum [<i>a</i>,<i>b</i>] and
|
||
[<i>c</i>,<i>d</i>] is far worse than the time needed to calculate
|
||
<i>a</i>+<i>b</i> and <i>c</i>+<i>d</i> since the two additions cannot be
|
||
parallelized. Consequently, the objective is to diminish the number of
|
||
rounding mode switches.</p>
|
||
|
||
<p>If this library is not used to provide exact computations, but only for
|
||
pair arithmetic, the solution is quite simple: do not use rounding. In that
|
||
case, doing the sum [<i>a</i>,<i>b</i>] and [<i>c</i>,<i>d</i>] will be as
|
||
fast as computing <i>a</i>+<i>b</i> and <i>c</i>+<i>d</i>. Everything is
|
||
perfect.</p>
|
||
|
||
<p>However, if exact computations are required, such a solution is totally
|
||
unthinkable. So, are we penniless? No, there is still a trick available.
|
||
Indeed, down(<i>a</i>+<i>c</i>) = -up(-<i>a</i>-<i>c</i>) if the unary minus
|
||
is an exact operation. It is now possible to calculate the whole sum with the
|
||
same rounding mode. Generally, the cost of the mode switching is worse than
|
||
the cost of the sign changes.</p>
|
||
|
||
<h4>Speeding up consecutive operations</h4>
|
||
|
||
<p>The interval addition is not the only operation; most of the interval
|
||
operations can be computed by setting the rounding direction of the FPU only
|
||
once. So the operations of the floating point rounding policy assume that the
|
||
direction is correctly set. This assumption is usually not true in a program
|
||
(the user and the standard library expect the rounding direction to be to
|
||
nearest), so these operations have to be enclosed in a shell that sets the
|
||
floating point environment. This protection is done by the constructor and
|
||
destructor of the rounding policy.</p>
|
||
|
||
<p>Les us now consider the case of two consecutive interval additions:
|
||
[<i>a</i>,<i>b</i>] + [<i>c</i>,<i>d</i>] + [<i>e</i>,<i>f</i>]. The generated
|
||
code should look like:</p>
|
||
<pre>init_rounding_mode(); // rounding object construction during the first addition
|
||
t1 = -(-a - c);
|
||
t2 = b + d;
|
||
restore_rounding_mode(); // rounding object destruction
|
||
init_rounding_mode(); // rounding object construction during the second addition
|
||
x = -(-t1 - e);
|
||
y = t2 + f;
|
||
restore_rounding_mode(); // rounding object destruction
|
||
// the result is the interval [x,y]</pre>
|
||
|
||
<p>Between the two operations, the rounding direction is restored, and then
|
||
initialized again. Ideally, compilers should be able to optimize this useless
|
||
code away. But unfortunately they are not, and this slows the code down by an
|
||
order of magnitude. In order to avoid this bottleneck, the user can tell to
|
||
the interval operations that they do not need to be protected anymore. It will
|
||
then be up to the user to protect the interval computations. The compiler will
|
||
then be able to generate such a code:</p>
|
||
<pre>init_rounding_mode(); // done by the user
|
||
x = -(-a - c - e);
|
||
y = b + d + f;
|
||
restore_rounding_mode(); // done by the user</pre>
|
||
|
||
<p>The user will have to create a rounding object. And as long as this object
|
||
is alive, unprotected versions of the interval operations can be used. They
|
||
are selected by using an interval type with a specific rounding policy. If the
|
||
initial interval type is <code>I</code>, then <code>I::traits_type::rounding</code>
|
||
is the type of the rounding object, and <code>interval_lib::unprotect<I>::type</code>
|
||
is the type of the unprotected interval type.</p>
|
||
|
||
<p>Because the rounding mode of the FPU is changed during the life of the
|
||
rounding object, any arithmetic floating point operation that does not involve
|
||
the interval library can lead to unexpected results. And reciprocally, using
|
||
unprotected interval operation when no rounding object is alive will produce
|
||
intervals that are not guaranteed anymore to contain the real result.</p>
|
||
|
||
<h4 id="perfexp">Example</h4>
|
||
|
||
<p>Here is an example of Horner's scheme to compute the value of a polynom.
|
||
The rounding mode switches are disabled for the whole computation.</p>
|
||
<pre>// I is an interval class, the polynom is a simple array
|
||
template<class I>
|
||
I horner(const I& x, const I p[], int n) {
|
||
|
||
// save and initialize the rounding mode
|
||
typename I::traits_type::rounding rnd;
|
||
|
||
// define the unprotected version of the interval type
|
||
typedef typename boost::numeric::interval_lib::unprotect<I>::type R;
|
||
|
||
const R& a = x;
|
||
R y = p[n - 1];
|
||
for(int i = n - 2; i >= 0; i--) {
|
||
y = y * a + (const R&)(p[i]);
|
||
}
|
||
return y;
|
||
|
||
// restore the rounding mode with the destruction of rnd
|
||
}</pre>
|
||
|
||
<p>Please note that a rounding object is specially created in order to protect
|
||
all the interval computations. Each interval of type I is converted in
|
||
an interval of type R before any operations. If this conversion is not done,
|
||
the result is still correct, but the interest of this whole optimization has
|
||
disappeared. Whenever possible, it is good to convert to <code>const
|
||
R&</code> instead of <code>R</code>: indeed, the function could already
|
||
be called inside an unprotection block so the types <code>R</code> and
|
||
<code>I</code> would be the same interval, no need for a conversion.</p>
|
||
|
||
<h4>Uninteresting remark</h4>
|
||
|
||
<p>It was said at the beginning that the Alpha processors can use a specific
|
||
rounding mode for each operation. However, due to the instruction format, the
|
||
rounding toward plus infinity is not available. Only the rounding toward
|
||
minus infinity can be used. So the trick using the change of sign becomes
|
||
essential, but there is no need to save and restore the rounding mode on both
|
||
sides of an operation.</p>
|
||
|
||
<h3 id="extreg">Extended Registers</h3>
|
||
|
||
<p>There is another problem besides the cost of the rounding mode switch.
|
||
Some FPUs use extended registers (for example, float computations will be
|
||
done with double registers, or double computations with long double
|
||
registers). Consequently, many problems can arise.</p>
|
||
|
||
<p>The first one is due to to the extended precision of the mantissa. The
|
||
rounding is also done on this extended precision. And consequently, we still
|
||
have down(<i>a</i>+<i>b</i>) = -up(-<i>a</i>-<i>b</i>) in the extended
|
||
registers. But back to the standard precision, we now have
|
||
down(<i>a</i>+<i>b</i>) < -up(-<i>a</i>-<i>b</i>) instead of an equality.
|
||
A solution could be not to use this method. But there still are other
|
||
problems, with the comparisons between numbers for example.</p>
|
||
|
||
<p>Naturally, there is also a problem with the extended precision of the
|
||
exponent. To illustrate this problem, let <i>m</i> be the biggest number
|
||
before +<i>inf</i>. If we calculate 2*[<i>m</i>,<i>m</i>], the answer should
|
||
be [<i>m</i>,<i>inf</i>]. But due to the extended registers, the FPU will
|
||
first store [<i>2m</i>,<i>2m</i>] and then convert it to
|
||
[<i>inf</i>,<i>inf</i>] at the end of the calculus (when the rounding mode is
|
||
toward +<i>inf</i>). So the answer is no more accurate.</p>
|
||
|
||
<p>There is only one solution: to force the FPU to convert the extended
|
||
values back to standard precision after each operation. Some FPUs provide an
|
||
instruction able to do this conversion (for example the PowerPC processors).
|
||
But for the FPUs that do not provide it (the x86 processors), the only
|
||
solution is to write the values to memory and read them back. Such an
|
||
operation is obviously very expensive.</p>
|
||
|
||
<h2>Some Examples</h2>
|
||
|
||
<p>Here come several cases:</p>
|
||
<ul>
|
||
<li>if you need precise computations with the <code>float</code> or
|
||
<code>double</code> types, use the default
|
||
<code>rounded_math<T></code>;</li>
|
||
<li>for fast wide intervals without any rounding nor precision, use
|
||
<code>save_state_nothing<rounded_transc_exact<T>
|
||
></code>;</li>
|
||
<li>for an exact type (like int or rational with a little help for infinite
|
||
and NaN values) without support for transcendental functions, the
|
||
solution could be <code>save_state_nothing<rounded_transc_dummy<T,
|
||
rounded_arith_exact<T> > ></code> or directly
|
||
<code>save_state_nothing<rounded_arith_exact<T> ></code>;</li>
|
||
<li>if it is a more complex case than the previous ones, the best thing is
|
||
probably to directly define your own policy.</li>
|
||
</ul>
|
||
<hr>
|
||
|
||
<p>Revised: 2005-12-19<br>
|
||
Copyright (c) Guillaume Melquiond, Sylvain Pion, Herv<72> Br<42>nnimann, 2002.
|
||
Polytechnic University.<br>
|
||
Copyright (c) Guillaume Melquiond, 2004-2005. ENS Lyon.</p>
|
||
</body>
|
||
</html>
|