mirror of
https://github.com/boostorg/math.git
synced 2026-01-27 19:12:08 +00:00
255 lines
10 KiB
Plaintext
255 lines
10 KiB
Plaintext
[section:implementation Additional Implementation Notes]
|
|
|
|
The majority of the implementation notes are included with the documentation
|
|
of each function or distribution. The notes here are of a more general nature,
|
|
and reflect more the general implementation philosophy used.
|
|
|
|
[h4 Implemention philosophy]
|
|
|
|
"First be right, then be fast."
|
|
|
|
There will always be potential compromises
|
|
to be made between speed and accuracy.
|
|
It may be possible to find faster methods,
|
|
particularly for certain limited ranges of arguments,
|
|
but for most applications of math functions and distributions,
|
|
we judge that speed is rarely as important as accuracy.
|
|
|
|
So our priority is accuracy.
|
|
|
|
To permit evaluation of accuracy of the special functions,
|
|
production of extremely accurate tables of test values
|
|
has received considerable effort.
|
|
|
|
(It also required much CPU effort -
|
|
there was some danger of molten plastic dripping from the bottom of JM's laptop,
|
|
so instead, PAB's Dual-core desktop was kept 50% busy for *days*
|
|
calculating some tables of test values!)
|
|
|
|
For a specific RealType, say float or double,
|
|
it may be possible to find approximations for some functions
|
|
that are simpler and thus faster, but less accurate
|
|
(perhaps because there are no refining iterations,
|
|
for example, when calculating inverse functions).
|
|
|
|
If these prove accurate enough to be "fit for his purpose",
|
|
then a user may substitute his custom specialization.
|
|
|
|
For example, there are approximations dating back from times when computation was a *lot* more expensive:
|
|
|
|
H Goldberg and H Levine, Approximate formulas for percentage points and normalisation of t and chi squared, Ann. Math. Stat., 17(4), 216 - 225 (Dec 1946).
|
|
|
|
A H Carter, Approximations to percentage points of the z-distribution, Biometrika 34(2), 352 - 358 (Dec 1947).
|
|
|
|
These could still provide sufficient accuracy for some speed-critical applications.
|
|
|
|
[h4 Accuracy and Representation of Test Values]
|
|
|
|
In order to be accurate enough for as many as possible real types,
|
|
constant values are given to 50 decimal digits if available
|
|
(though many sources proved only accurate to 64-bit doubles).
|
|
Values are specified as long double types by appending L,
|
|
unless they are exactly representable, for example integers.
|
|
This avoids the risk of loss of accuracy converting from double, the default type.
|
|
Values are used after static_cast<RealType>(1.2345L)
|
|
to provide the appropriate real type for spot tests.
|
|
|
|
Functions that return constants values, like kurtosis for example, are written as
|
|
|
|
`static_cast<RealType>(-3) / 5;`
|
|
|
|
to provide the most accurate value
|
|
that the compiler can compute for the real type.
|
|
(The denominator is an integer and so will be promoted exactly).
|
|
|
|
So tests for one third, *not* exactly representable with radix two floating-point,
|
|
(should) use, for example:
|
|
|
|
`static_cast<RealType>(1) / 3;`
|
|
|
|
If a function is very sensitive to changes in input,
|
|
specifying an inexact value as input (such as 0.1) can throw
|
|
the result off by a noticeable amount: 0.1f is "wrong"
|
|
by ~1e-7 for example (because 0.1 has no exact binary representation).
|
|
That is why exact binary values - halves, quarters, and eighths etc -
|
|
are used in test code along with the occasional fraction `a/b` with `b`
|
|
a power of two (in order to ensure that the result is an exactly
|
|
representable binary value).
|
|
|
|
|
|
[h4 Handling Unsuitable Arguments]
|
|
|
|
In
|
|
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1665.pdf Errors in Mathematical Special Functions], J. Marraffino & M. Paterno
|
|
it is proposed that signalling a domain error is mandatory
|
|
when the argument would give an mathematically undefined result.
|
|
|
|
*Guideline 1
|
|
|
|
[:A mathematical function is said to be defined at a point a = (a1, a2, . . .)
|
|
if the limits as x = (x1, x2, . . .) 'approaches a from all directions agree'.
|
|
The defined value may be any number, or +infinity, or -infinity.]
|
|
|
|
Put crudely, if the function goes to + infinity
|
|
and then emerges 'round-the-back' with - infinity,
|
|
it is NOT defined.
|
|
|
|
[:The library function which approximates a mathematical function shall signal a domain error
|
|
whenever evaluated with argument values for which the mathematical function is undefined.]
|
|
|
|
*Guideline 2
|
|
|
|
[:The library function which approximates a mathematical function
|
|
shall signal a domain error whenever evaluated with argument values
|
|
for which the mathematical function obtains a non-real value.]
|
|
|
|
This implementation is believed to follow these proposals.
|
|
|
|
See [link math_toolkit.special.error_handling error handling]
|
|
for a detailed explanation of the mechanism, and
|
|
[link math_toolkit.dist.stat_tut.weg.error_eg error_handling example]
|
|
and
|
|
[@/../../example/error_handling_example.cpp error_handling_example.cpp]
|
|
|
|
[caution If you enable throw but do NOT have try & catch block,
|
|
then the program will terminate with an uncaught exception and probably abort.
|
|
Therefore to get the benefit of helpful error messages, enabling *all* exceptions
|
|
*and* using try&catch is recommended for all applications.
|
|
However, for simplicity, this is not done for most examples.]
|
|
|
|
[h4 Handling of Functions that are Not Implemented]
|
|
|
|
Functions that are not implemented for any reason,
|
|
usually because they are not defined, (or we were unable to find a definition),
|
|
are handled as domain errors.
|
|
|
|
If the instruction
|
|
|
|
`#define BOOST_MATH_THROW_ON_DOMAIN_ERROR`
|
|
|
|
appears before the first Boost include, then if the un-implemented function is called,
|
|
a domain_error exception will be thrown,
|
|
and caught to provide an appropriate error message.
|
|
|
|
Otherwise the value of std::numeric_limits<T>::quiet_NaN() will be returned.
|
|
|
|
[warning If `std::numeric_limits<T>::has_quiet_NaN` is false
|
|
(for example T is a User-defined type),
|
|
then an exception will always be thrown when a domain error occurs.
|
|
Catching exceptions is therefore strongly recommended.]
|
|
|
|
[h4 Median of distributions]
|
|
|
|
There are many distributions for which we have been unable to find an analytic formula,
|
|
and this has deterred us from implementing
|
|
[@http://en.wikipedia.org/wiki/Median median functions], the mid-point in a list of values.
|
|
|
|
However a useful median approximation for distribution `dist` may be available from
|
|
|
|
`quantile(dist, 0.5)`.
|
|
|
|
[@http://www.amstat.org/publications/jse/v13n2/vonhippel.html Mean, Median, and Skew, Paul T von Hippel]
|
|
|
|
[@http://documents.wolfram.co.jp/teachersedition/MathematicaBook/24.5.html Descriptive Statistics,]
|
|
|
|
[@http://documents.wolfram.co.jp/v5/Add-onsLinks/StandardPackages/Statistics/DescriptiveStatistics.html and ]
|
|
|
|
[@http://documents.wolfram.com/v5/TheMathematicaBook/AdvancedMathematicsInMathematica/NumericalOperationsOnData/3.8.1.html
|
|
Mathematica Basic Statistics.] give more detail, in particular for discrete distributions.
|
|
|
|
|
|
[h4 Handling of Floating-Point Infinity]
|
|
|
|
Some functions and distributions are well defined with + or - infinity as
|
|
argument(s), but after some experiments with handling infinite arguments
|
|
as special cases, we concluded that it was more useful to forbid this,
|
|
and instead to return the result of __domain_error.
|
|
|
|
Handling infinity as special cases was additionally complicated
|
|
because, unlike built-in types on most - but not all - platforms,
|
|
not all User-Defined Types are
|
|
specialized to provide `std::numeric_limits<RealType>::infinity()`
|
|
and would return zero rather than any representation of infinity.
|
|
|
|
The rationale is that non-finiteness may happen because of error
|
|
or overflow in the users code, and it will be more helpful for this
|
|
to be diagnosed promptly rather than just continuing.
|
|
The code also became much more complicated, more error-prone,
|
|
much more work to test, and much less readable.
|
|
|
|
We have also tried to catch boundary cases where the mathematical specification
|
|
would result in divide by zero or overflow and signalling these similarly.
|
|
|
|
[h4 Scale, Shape and Location]
|
|
|
|
We considered adding location and scale to the list of functions, for example:
|
|
|
|
template <class RealType>
|
|
inline RealType scale(const triangular_distribution<RealType>& dist)
|
|
{
|
|
RealType lower = dist.lower();
|
|
RealType mode = dist.mode();
|
|
RealType upper = dist.upper();
|
|
RealType result; // of checks.
|
|
if(false == detail::check_triangular(BOOST_CURRENT_FUNCTION, lower, mode, upper, &result))
|
|
{
|
|
return result;
|
|
}
|
|
return (upper - lower);
|
|
}
|
|
|
|
but found that these concepts are not defined (or their definition too contentious)
|
|
for too many distributions to be generally applicable.
|
|
Because they are non-member functions, they can be added if required.
|
|
|
|
[h4 Notes on Implementation of Specific Functions & Distributions]
|
|
|
|
* Default parameters for the Triangular Distribution.
|
|
We are uncertain about the best default parameters.
|
|
Some sources suggest that the Standard Triangular Distribution has
|
|
lower = 0, mode = half and upper = 1.
|
|
However as a approximation for the normal distribution,
|
|
the most common usage, lower = -1, mode = 0 and upper = 1 would be more suitable.
|
|
|
|
[h4 Sources of Test Data]
|
|
|
|
We found a large number of sources of test data.
|
|
We have assumed that these are /"known good"/
|
|
if they agree with the results from our test
|
|
and only consulted other sources for their /'vote'/
|
|
in the case of serious disagreement.
|
|
The accuracy, actual and claimed, vary very widely.
|
|
Only [@http://functions.wolfram.com/ Wolfram Mathematica functions]
|
|
provided a higher accuracy than
|
|
C++ double (64-bit floating-point) and was regarded as
|
|
the most-trusted source by far.
|
|
|
|
A useful index of sources is:
|
|
[@http://www.sal.hut.fi/Teaching/Resources/ProbStat/table.html
|
|
Web-oriented Teaching Resources in Probability and Statistics]
|
|
|
|
[@http://espse.ed.psu.edu/edpsych/faculty/rhale/hale/507Mat/statlets/free/pdist.htm Statlet]:
|
|
Is a Javascript application that calculates and plots probability distributions,
|
|
and provides the most complete range of distributions:
|
|
|
|
[:Bernoulli, Binomial, discrete uniform, geometric, hypergeometric,
|
|
negative binomial, Poisson, beta, Cauchy-Lorentz, chi-sequared, Erlang,
|
|
exponential, extreme value, Fisher, gamma, Laplace, logistic,
|
|
lognormal, normal, Parteo, Student's t, triangular, uniform, and Weibull.]
|
|
|
|
It calculates pdf, cdf, survivor, log survivor, hazard, tail areas,
|
|
& critical values for 5 tail values.
|
|
|
|
It is also the only independent source found for the Weibull distribution,
|
|
unfortunately it appears to suffer from very poor accuracy in areas where
|
|
the underlying special function is known to be difficult to implement.
|
|
|
|
[endsect][/section:implementation Implementation Notes]
|
|
|
|
[/
|
|
Copyright 2006 John Maddock and Paul A. Bristow.
|
|
Distributed under the Boost Software License, Version 1.0.
|
|
(See accompanying file LICENSE_1_0.txt or copy at
|
|
http://www.boost.org/LICENSE_1_0.txt).
|
|
]
|