[section:perf Performance]

[template perf[name value] [value]]

[template para[text] '''<para>'''[text]'''</para>''']

By and large the performance of this library should be acceptable
for most needs. However, you should note that the library's primary
emphasis is on accuracy and numerical stability, and /not/ speed.

In terms of the algorithms used, this library aims to use the same "best
of breed" algorithms as many other libraries: the principal difference
is that this library is implemented in C++ - taking advantage of all
the abstraction mechanisms that C++ offers - whereas most traditional
numeric libraries are implemented in C or FORTRAN. Traditionally,
languages such as C or FORTRAN are perceived as easier to optimise
than more complex languages like C++, so in a sense this library
provides a good test of current compiler technology, and of the
"abstraction penalty" - if any - of C++ compared to other languages.

[heading Getting the Best Performance from this Library]

By far the most important thing you can do when using this library
is to turn on your compiler's optimisation options. As the following
table shows, the penalty for using the library in debug mode can be
quite large.

[caution
In all of the following tables, the best performing
result in each row is assigned a relative value of "1" and shown
in bold, so a score of "2" means ['"twice as slow as the best
performing result".] Actual timings in seconds per function call
are also shown in parentheses.

Results were obtained on a system
with an Intel 2.8GHz Pentium 4 processor with 2Gb of RAM, running
either Windows XP or Mandriva Linux.

As usual
with performance results, these should be taken with a large pinch
of salt: relative performance is known to shift quite a bit depending
upon the architecture of the particular test system used. Furthermore,
our performance results were obtained using our own test data:
these test values are designed to provide good coverage of our code and test
all the appropriate corner cases. They do not necessarily represent
"typical" usage: whatever that may be.
]

[table Performance Comparison of Release and Debug Settings
[[Function]
[Microsoft Visual C++ 8.0

Debug settings: /Od /ZI
]
[Microsoft Visual C++ 8.0

Release settings: /Ox /arch:SSE2
]]
[[__erf][[perf msvc-debug-erf..[para 16.65][para (1.028e-006s)]]][[perf msvc-erf..[para *1.00*][para (6.173e-008s)]]]]
[[__erf_inv][[perf msvc-debug-erf_inv..[para 19.28][para (1.215e-006s)]]][[perf msvc-erf_inv..[para *1.00*][para (6.302e-008s)]]]]
[[__ibeta and __ibetac][[perf msvc-debug-ibeta..[para 8.32][para (1.540e-005s)]]][[perf msvc-ibeta..[para *1.00*][para (1.852e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf msvc-debug-ibeta_inv..[para 10.25][para (7.492e-005s)]]][[perf msvc-ibeta_inv..[para *1.00*][para (7.311e-006s)]]]]
[[__ibeta_inva, __ibetac_inva, __ibeta_invb and __ibetac_invb][[perf msvc-debug-ibeta_invab..[para 8.57][para (2.441e-004s)]]][[perf msvc-ibeta_invab..[para *1.00*][para (2.847e-005s)]]]]
[[__gamma_p and __gamma_q][[perf msvc-debug-igamma..[para 10.98][para (1.044e-005s)]]][[perf msvc-igamma..[para *1.00*][para (9.504e-007s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf msvc-debug-igamma_inv..[para 10.25][para (3.721e-005s)]]][[perf msvc-igamma_inv..[para *1.00*][para (3.631e-006s)]]]]
[[__gamma_p_inva and __gamma_q_inva][[perf msvc-debug-igamma_inva..[para 11.26][para (1.124e-004s)]]][[perf msvc-igamma_inva..[para *1.00*][para (9.982e-006s)]]]]
]

[heading Comparing Compilers]

After a good choice of build settings, the next most important thing
you can do is to choose your compiler
- and the standard C library it sits on top of - very carefully. GCC-3.x
has been found to be particularly bad at inlining code
and at performing the kinds of high level transformations that good C++ performance
demands (thankfully GCC-4.x is somewhat better in this respect).

[table Performance Comparison of Various Windows Compilers
[[Function]
[Intel C++ 10.0

( /Ox /Qipo /QxN )
]
[Microsoft Visual C++ 8.0

( /Ox /arch:SSE2 )
]
[Cygwin G++ 3.4

( /O3 )
]]
[[__erf][[perf intel-erf..[para *1.00*][para (4.118e-008s)]]][[perf msvc-erf..[para 1.50][para (6.173e-008s)]]][[perf gcc-erf..[para 3.24][para (1.336e-007s)]]]]
[[__erf_inv][[perf intel-erf_inv..[para *1.00*][para (4.439e-008s)]]][[perf msvc-erf_inv..[para 1.42][para (6.302e-008s)]]][[perf gcc-erf_inv..[para 7.88][para (3.500e-007s)]]]]
[[__ibeta and __ibetac][[perf intel-ibeta..[para *1.00*][para (1.631e-006s)]]][[perf msvc-ibeta..[para 1.14][para (1.852e-006s)]]][[perf gcc-ibeta..[para 3.05][para (4.975e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf intel-ibeta_inv..[para *1.00*][para (6.133e-006s)]]][[perf msvc-ibeta_inv..[para 1.19][para (7.311e-006s)]]][[perf gcc-ibeta_inv..[para 2.60][para (1.597e-005s)]]]]
[[__ibeta_inva, __ibetac_inva, __ibeta_invb and __ibetac_invb][[perf intel-ibeta_invab..[para *1.00*][para (2.453e-005s)]]][[perf msvc-ibeta_invab..[para 1.16][para (2.847e-005s)]]][[perf gcc-ibeta_invab..[para 2.83][para (6.947e-005s)]]]]
[[__gamma_p and __gamma_q][[perf intel-igamma..[para *1.00*][para (6.735e-007s)]]][[perf msvc-igamma..[para 1.41][para (9.504e-007s)]]][[perf gcc-igamma..[para 2.78][para (1.872e-006s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf intel-igamma_inv..[para *1.00*][para (2.637e-006s)]]][[perf msvc-igamma_inv..[para 1.38][para (3.631e-006s)]]][[perf gcc-igamma_inv..[para 3.31][para (8.736e-006s)]]]]
[[__gamma_p_inva and __gamma_q_inva][[perf intel-igamma_inva..[para *1.00*][para (7.716e-006s)]]][[perf msvc-igamma_inva..[para 1.29][para (9.982e-006s)]]][[perf gcc-igamma_inva..[para 2.56][para (1.974e-005s)]]]]
]

[heading Performance Tuning Macros]

There are a small number of performance tuning options
that are determined by configuration macros. These should be set
in boost/math/tools/user.hpp, or else reported to the Boost development
mailing list so that the appropriate option for a given compiler and
OS platform can be set automatically in our configuration setup.

[table
[[Macro][Meaning]]
[[BOOST_MATH_POLY_METHOD]
[Determines how polynomials and most rational functions
are evaluated. Define to one
of the values 0, 1, 2 or 3: see below for the meaning of these values.]]
[[BOOST_MATH_RATIONAL_METHOD]
[Determines how symmetrical rational functions are evaluated: mostly
this only affects how the Lanczos approximation is evaluated, and how
the `evaluate_rational` function behaves. Define to one
of the values 0, 1, 2 or 3: see below for the meaning of these values.
]]
[[BOOST_MATH_MAX_POLY_ORDER]
[The maximum order of polynomial or rational function that will
be evaluated by a method other than 0 (a simple "for" loop).
]]
[[BOOST_MATH_INT_TABLE_TYPE(RT, IT)]
[Many of the coefficients of the polynomials and rational functions
used by this library are integers. Normally these are stored in tables
as integers, but if mixed integer / floating point arithmetic is much
slower than regular floating point arithmetic then they can be stored
as tables of floating point values instead. If mixed arithmetic is slow
then add:

    #define BOOST_MATH_INT_TABLE_TYPE(RT, IT) RT

to boost/math/tools/user.hpp, otherwise the default of:

    #define BOOST_MATH_INT_TABLE_TYPE(RT, IT) IT

set in boost/math/config.hpp is fine, and may well result in smaller
code.
]]
]

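Putting these options together, a boost/math/tools/user.hpp tuned for a particular
platform might contain something along the lines of the following sketch. The
values shown are purely illustrative examples, not recommendations; the meaning
of the method values is given in the table below:

    // Illustrative contents of boost/math/tools/user.hpp - example values only.
    //
    // Evaluate polynomials with the unrolled, second order Horner scheme (method 2):
    #define BOOST_MATH_POLY_METHOD 2
    // Evaluate symmetrical rational functions with unrolled Horner's method (method 1):
    #define BOOST_MATH_RATIONAL_METHOD 1
    // Anything above this order falls back to a simple "for" loop (method 0):
    #define BOOST_MATH_MAX_POLY_ORDER 20
    // Store integer coefficients as the real type if mixed
    // integer / floating point arithmetic is slow on this platform:
    #define BOOST_MATH_INT_TABLE_TYPE(RT, IT) RT
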
The values to which `BOOST_MATH_POLY_METHOD` and `BOOST_MATH_RATIONAL_METHOD`
may be set are as follows:

[table
[[Value][Effect]]
[[0][The polynomial or rational function is evaluated using Horner's
method, and a simple for-loop.

Note that if the order of the polynomial
or rational function is a runtime parameter, or the order is
greater than the value of `BOOST_MATH_MAX_POLY_ORDER`, then
this method is always used, irrespective of the value
of `BOOST_MATH_POLY_METHOD` or `BOOST_MATH_RATIONAL_METHOD`.]]
[[1][The polynomial or rational function is evaluated without
the use of a loop, and using Horner's method. This only occurs
if the order of the polynomial is known at compile time and is less
than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
[[2][The polynomial or rational function is evaluated without
the use of a loop, and using a second order Horner's method.
In theory this permits two operations to occur in parallel
for polynomials, and four in parallel for rational functions.
This only occurs
if the order of the polynomial is known at compile time and is less
than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
[[3][The polynomial or rational function is evaluated without
the use of a loop, and using a second order Horner's method.
In theory this permits two operations to occur in parallel
for polynomials, and four in parallel for rational functions.
This differs from method "2" in that the code is carefully ordered
to make the parallelisation more obvious to the compiler, rather than
relying on the compiler's optimiser to spot the parallelisation
opportunities.
This only occurs
if the order of the polynomial is known at compile time and is less
than or equal to `BOOST_MATH_MAX_POLY_ORDER`.]]
]

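To illustrate the difference between these schemes, the following sketch shows
roughly what method 0 and methods 2 and 3 reduce to for a polynomial of order 3.
This is illustrative C++ only, not the code the library actually generates:

    // Method 0: Horner's method with a simple run time loop:
    double horner(const double* c, unsigned order, double x)
    {
       double result = c[order];
       for(unsigned i = order; i > 0; --i)
          result = result * x + c[i - 1];
       return result;
    }

    // Methods 2 and 3: second order Horner's method, unrolled for order 3.
    // The even and odd coefficients are accumulated in x*x as two independent
    // chains, which can in theory execute in parallel:
    double horner2_order3(const double* c, double x)
    {
       double x2   = x * x;
       double even = c[2] * x2 + c[0];      // c0 + c2*x^2
       double odd  = c[3] * x2 + c[1];      // c1 + c3*x^2
       return odd * x + even;               // c0 + c1*x + c2*x^2 + c3*x^3
    }
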
To determine which
of these options is best for your particular compiler/platform, build
the performance test application with your usual release settings,
and run the program with the --tune command line option.

In practice the difference between methods is rather small at present,
as the following table shows. However, parallelisation\/vectorisation
is likely to become more important in the future: quite likely the methods
currently supported will need to be supplemented or replaced by ones better
suited to highly vectorisable processors.

[table A Comparison of Polynomial Evaluation Methods
[[Compiler/platform][Method 0][Method 1][Method 2][Method 3]]
[[Microsoft C++ 8.0, Polynomial evaluation] [[perf msvc-Polynomial-method-0..[para 1.34][para (1.161e-007s)]]][[perf msvc-Polynomial-method-1..[para 1.13][para (9.777e-008s)]]][[perf msvc-Polynomial-method-2..[para 1.07][para (9.289e-008s)]]][[perf msvc-Polynomial-method-3..[para *1.00*][para (8.678e-008s)]]]]
[[Microsoft C++ 8.0, Rational evaluation] [[perf msvc-Rational-method-0..[para *1.00*][para (1.443e-007s)]]][[perf msvc-Rational-method-1..[para 1.03][para (1.492e-007s)]]][[perf msvc-Rational-method-2..[para 1.20][para (1.736e-007s)]]][[perf msvc-Rational-method-3..[para 1.07][para (1.540e-007s)]]]]
[[Intel C++ 10.0 (Windows), Polynomial evaluation] [[perf intel-Polynomial-method-0..[para 1.03][para (7.702e-008s)]]][[perf intel-Polynomial-method-1..[para 1.03][para (7.702e-008s)]]][[perf intel-Polynomial-method-2..[para *1.00*][para (7.446e-008s)]]][[perf intel-Polynomial-method-3..[para 1.03][para (7.690e-008s)]]]]
[[Intel C++ 10.0 (Windows), Rational evaluation] [[perf intel-Rational-method-0..[para *1.00*][para (1.245e-007s)]]][[perf intel-Rational-method-1..[para *1.00*][para (1.245e-007s)]]][[perf intel-Rational-method-2..[para 1.18][para (1.465e-007s)]]][[perf intel-Rational-method-3..[para 1.06][para (1.318e-007s)]]]]
[[GNU G++ 4.2 (Linux), Polynomial evaluation] [[perf gcc-4_2-ld-Polynomial-method-0..[para 1.61][para (1.220e-007s)]]][[perf gcc-4_2-ld-Polynomial-method-1..[para 1.68][para (1.269e-007s)]]][[perf gcc-4_2-ld-Polynomial-method-2..[para 1.23][para (9.275e-008s)]]][[perf gcc-4_2-ld-Polynomial-method-3..[para *1.00*][para (7.566e-008s)]]]]
[[GNU G++ 4.2 (Linux), Rational evaluation] [[perf gcc-4_2-ld-Rational-method-0..[para 1.26][para (1.660e-007s)]]][[perf gcc-4_2-ld-Rational-method-1..[para 1.33][para (1.758e-007s)]]][[perf gcc-4_2-ld-Rational-method-2..[para *1.00*][para (1.318e-007s)]]][[perf gcc-4_2-ld-Rational-method-3..[para 1.15][para (1.513e-007s)]]]]
[[Intel C++ 10.0 (Linux), Polynomial evaluation] [[perf intel-linux-Polynomial-method-0..[para 1.15][para (9.154e-008s)]]][[perf intel-linux-Polynomial-method-1..[para 1.15][para (9.154e-008s)]]][[perf intel-linux-Polynomial-method-2..[para *1.00*][para (7.934e-008s)]]][[perf intel-linux-Polynomial-method-3..[para *1.00*][para (7.934e-008s)]]]]
[[Intel C++ 10.0 (Linux), Rational evaluation] [[perf intel-linux-Rational-method-0..[para *1.00*][para (1.245e-007s)]]][[perf intel-linux-Rational-method-1..[para *1.00*][para (1.245e-007s)]]][[perf intel-linux-Rational-method-2..[para 1.35][para (1.684e-007s)]]][[perf intel-linux-Rational-method-3..[para 1.04][para (1.294e-007s)]]]]
]

There is one final performance tuning option that is available as a compile-time
[link math_toolkit.policy policy]. Normally when evaluating functions at `double`
precision, these are actually evaluated at `long double` precision internally:
this helps to ensure that as close to full `double` precision as possible is
achieved, but may slow down execution in some environments. The default for
this policy can be changed by
[link math_toolkit.policy.pol_ref.policy_defaults
defining the macro `BOOST_MATH_PROMOTE_DOUBLE_POLICY`]
to `false`, or
[link math_toolkit.policy.pol_ref.internal_promotion
by specifying a specific policy] when calling the special
functions or distributions. See also the
[link math_toolkit.policy.pol_tutorial policy tutorial].

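For example, here is one way to call a special function at `double` precision
without the internal promotion to `long double`, a minimal sketch using the
`promote_double` policy (the choice of `erf` here is arbitrary, any of the
special functions or distributions can be called this way):

    #include <boost/math/special_functions/erf.hpp>
    #include <boost/math/policies/policy.hpp>

    double erf_no_promotion(double x)
    {
       using namespace boost::math::policies;
       // Evaluate erf internally at double precision,
       // without promotion to long double:
       return boost::math::erf(x, policy<promote_double<false> >());
    }
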
[table Performance Comparison with and Without Internal Promotion to long double
[[Function]
[GCC 4.2, Linux

(with internal promotion of double to long double).
]
[GCC 4.2, Linux

(without promotion of double).
]
]
[[__erf][[perf gcc-4_2-ld-erf..[para 1.48][para (1.387e-007s)]]][[perf gcc-4_2-erf..[para *1.00*][para (9.377e-008s)]]]]
[[__erf_inv][[perf gcc-4_2-ld-erf_inv..[para 1.11][para (4.009e-007s)]]][[perf gcc-4_2-erf_inv..[para *1.00*][para (3.598e-007s)]]]]
[[__ibeta and __ibetac][[perf gcc-4_2-ld-ibeta..[para 1.29][para (5.354e-006s)]]][[perf gcc-4_2-ibeta..[para *1.00*][para (4.137e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf gcc-4_2-ld-ibeta_inv..[para 1.44][para (2.220e-005s)]]][[perf gcc-4_2-ibeta_inv..[para *1.00*][para (1.538e-005s)]]]]
[[__ibeta_inva, __ibetac_inva, __ibeta_invb and __ibetac_invb][[perf gcc-4_2-ld-ibeta_invab..[para 1.25][para (7.009e-005s)]]][[perf gcc-4_2-ibeta_invab..[para *1.00*][para (5.607e-005s)]]]]
[[__gamma_p and __gamma_q][[perf gcc-4_2-ld-igamma..[para 1.26][para (3.116e-006s)]]][[perf gcc-4_2-igamma..[para *1.00*][para (2.464e-006s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf gcc-4_2-ld-igamma_inv..[para 1.27][para (1.178e-005s)]]][[perf gcc-4_2-igamma_inv..[para *1.00*][para (9.291e-006s)]]]]
[[__gamma_p_inva and __gamma_q_inva][[perf gcc-4_2-ld-igamma_inva..[para 1.20][para (2.765e-005s)]]][[perf gcc-4_2-igamma_inva..[para *1.00*][para (2.311e-005s)]]]]
]

[heading Comparisons to Other Open Source Libraries]

We've run our performance tests both for our own code and against other
open source implementations of the same functions. The results are
presented below to give you a rough idea of how they all compare.

[caution
You should exercise extreme caution when interpreting
these results: relative performance may vary by platform, and the tests use
data that gives good code coverage of /our/ code, but which may skew the
results towards the corner cases. Finally, remember that different
libraries make different choices with regard to performance versus
numerical stability.
]

[heading Comparison to GSL-1.9 and Cephes]

All the results were measured on a 2.8GHz Intel Pentium 4, 2Gb RAM, Windows XP
machine, with all the libraries compiled with Microsoft Visual C++ 2005 using
the `/Ox /arch:SSE2` options.

[table
[[Function][Boost][GSL-1.9][Cephes]]
[[__tgamma][[perf msvc-gamma..[para 1.50][para (2.566e-007s)]]][[perf msvc-gamma-gsl..[para 1.54][para (2.627e-007s)]]][[perf msvc-gamma-cephes..[para *1.00*][para (1.709e-007s)]]]]
[[__lgamma][[perf msvc-lgamma..[para 1.73][para (2.688e-007s)]]][[perf msvc-lgamma-gsl..[para 3.61][para (5.621e-007s)]]][[perf msvc-lgamma-cephes..[para *1.00*][para (1.556e-007s)]]]]
[[__gamma_p and __gamma_q][[perf msvc-igamma..[para *1.00*][para (9.504e-007s)]]][[perf msvc-igamma-gsl..[para 2.15][para (2.042e-006s)]]][[perf msvc-igamma-cephes..[para 2.57][para (2.439e-006s)]]]]
[[__gamma_p_inv and __gamma_q_inv][[perf msvc-igamma_inv..[para *1.00*][para (3.631e-006s)]]][N\/A][+INF [footnote Cephes gets stuck in an infinite loop while trying to execute our test cases.]]]
[[__ibeta and __ibetac][[perf msvc-ibeta..[para *1.00*][para (1.852e-006s)]]][[perf msvc-ibeta-cephes..[para 1.07][para (1.974e-006s)]]][[perf msvc-ibeta-cephes..[para 1.07][para (1.974e-006s)]]]]
[[__ibeta_inv and __ibetac_inv][[perf msvc-ibeta_inv..[para *1.00*][para (7.311e-006s)]]][N\/A][[perf msvc-ibeta_inv-cephes..[para 2.24][para (1.637e-005s)]]]]
]

[heading Comparison to the R Statistical Library on Windows]

All the results were measured on a 2.8GHz Intel Pentium 4, 2Gb RAM, Windows XP
machine, with the test program compiled with Microsoft Visual C++ 2005, and
R-2.5.0 compiled in "standalone mode" with MinGW-3.4
(R-2.5.0 appears not to be buildable with Visual C++).

[table A Comparison to the R Statistical Library on Windows XP
[[Statistical Function][Boost][R]]
[[__beta_distrib CDF][[perf msvc-dist-beta-cdf..[para 1.20][para (1.916e-006s)]]][[perf msvc-dist-beta-R-cdf..[para *1.00*][para (1.597e-006s)]]]]
[[__beta_distrib Quantile][[perf msvc-dist-beta-quantile..[para *1.00*][para (6.570e-006s)]]][[perf msvc-dist-beta-R-quantile..[para 74.66[footnote There are a small number of our test cases where the R library fails to converge on a result: these tend to dominate the performance result.]][para (4.905e-004s)]]]]
[[__binomial_distrib CDF][[perf msvc-dist-binomial-cdf..[para *1.00*][para (5.276e-007s)]]][[perf msvc-dist-binom-R-cdf..[para 2.45][para (1.293e-006s)]]]]
[[__binomial_distrib Quantile][[perf msvc-dist-binomial-quantile..[para *1.00*][para (4.013e-006s)]]][[perf msvc-dist-binom-R-quantile..[para 1.32][para (5.280e-006s)]]]]
[[__cauchy_distrib CDF][[perf msvc-dist-cauchy-cdf..[para *1.00*][para (1.231e-007s)]]][[perf msvc-dist-cauchy-R-cdf..[para 1.28][para (1.576e-007s)]]]]
[[__cauchy_distrib Quantile][[perf msvc-dist-cauchy-quantile..[para *1.00*][para (1.498e-007s)]]][[perf msvc-dist-cauchy-R-quantile..[para *1.00*][para (1.498e-007s)]]]]
[[__chi_squared_distrib CDF][[perf msvc-dist-chi_squared-cdf..[para *1.00*][para (7.889e-007s)]]][[perf msvc-dist-chisq-R-cdf..[para 2.48][para (1.955e-006s)]]]]
[[__chi_squared_distrib Quantile][[perf msvc-dist-chi_squared-quantile..[para *1.00*][para (4.303e-006s)]]][[perf msvc-dist-chisq-R-quantile..[para 1.61][para (6.925e-006s)]]]]
[[__exp_distrib CDF][[perf msvc-dist-exponential-cdf..[para *1.00*][para (1.955e-007s)]]][[perf msvc-dist-exp-R-cdf..[para 1.97][para (3.844e-007s)]]]]
[[__exp_distrib Quantile][[perf msvc-dist-exponential-quantile..[para 1.07][para (1.206e-007s)]]][[perf msvc-dist-exp-R-quantile..[para *1.00*][para (1.126e-007s)]]]]
[[__F_distrib CDF][[perf msvc-dist-fisher_f-cdf..[para *1.00*][para (1.309e-006s)]]][[perf msvc-dist-f-R-cdf..[para 2.12][para (2.780e-006s)]]]]
[[__F_distrib Quantile][[perf msvc-dist-fisher_f-quantile..[para *1.00*][para (7.204e-006s)]]][[perf msvc-dist-f-R-quantile..[para 1.78][para (1.280e-005s)]]]]
[[__gamma_distrib CDF][[perf msvc-dist-gamma-cdf..[para *1.00*][para (1.076e-006s)]]][[perf msvc-dist-gamma-R-cdf..[para 2.07][para (2.227e-006s)]]]]
[[__gamma_distrib Quantile][[perf msvc-dist-gamma-quantile..[para *1.00*][para (5.189e-006s)]]][[perf msvc-dist-gamma-R-quantile..[para 1.14][para (5.937e-006s)]]]]
[[__lognormal_distrib CDF][[perf msvc-dist-lognormal-cdf..[para *1.00*][para (2.078e-007s)]]][[perf msvc-dist-lnorm-R-cdf..[para 1.41][para (2.930e-007s)]]]]
[[__lognormal_distrib Quantile][[perf msvc-dist-lognormal-quantile..[para *1.00*][para (6.692e-007s)]]][[perf msvc-dist-lnorm-R-quantile..[para 1.63][para (1.090e-006s)]]]]
[[__negative_binomial_distrib CDF][[perf msvc-dist-negative_binomial-cdf..[para *1.00*][para (9.005e-007s)]]][[perf msvc-dist-nbinom-R-cdf..[para 2.42][para (2.178e-006s)]]]]
[[__negative_binomial_distrib Quantile][[perf msvc-dist-negative_binomial-quantile..[para *1.00*][para (9.601e-006s)]]][[perf msvc-dist-nbinom-R-quantile..[para 53.59[footnote The R library appears to use a linear-search strategy that can perform very badly in a small number of pathological cases, but may or may not be more efficient in "typical" cases.]][para (5.145e-004s)]]]]
[[__normal_distrib CDF][[perf msvc-dist-normal-cdf..[para *1.00*][para (5.926e-008s)]]][[perf msvc-dist-norm-R-cdf..[para 3.01][para (1.785e-007s)]]]]
[[__normal_distrib Quantile][[perf msvc-dist-normal-quantile..[para *1.00*][para (1.248e-007s)]]][[perf msvc-dist-norm-R-quantile..[para 1.05][para (1.311e-007s)]]]]
[[__poisson_distrib CDF][[perf msvc-dist-poisson-cdf..[para *1.00*][para (8.999e-007s)]]][[perf msvc-dist-pois-R-cdf..[para 2.42][para (2.175e-006s)]]]]
[[__poisson_distrib Quantile][[perf msvc-dist-poisson-quantile..[para *1.00*][para (1.853e-006s)]]][[perf msvc-dist-pois-R-quantile..[para 2.17][para (4.014e-006s)]]]]
[[__students_t_distrib CDF][[perf msvc-dist-students_t-cdf..[para *1.00*][para (1.223e-006s)]]][[perf msvc-dist-t-R-cdf..[para 1.13][para (1.376e-006s)]]]]
[[__students_t_distrib Quantile][[perf msvc-dist-students_t-quantile..[para *1.00*][para (2.570e-006s)]]][[perf msvc-dist-t-R-quantile..[para 1.04][para (2.668e-006s)]]]]
[[__weibull_distrib CDF][[perf msvc-dist-weibull-cdf..[para *1.00*][para (4.741e-007s)]]][[perf msvc-dist-weibull-R-cdf..[para 1.46][para (6.943e-007s)]]]]
[[__weibull_distrib Quantile][[perf msvc-dist-weibull-quantile..[para *1.00*][para (7.926e-007s)]]][[perf msvc-dist-weibull-R-quantile..[para 1.08][para (8.542e-007s)]]]]
]

[heading Comparison to the R Statistical Library on Linux]

All the results were measured on a 2.8GHz Intel Pentium 4, 2Gb RAM, Mandriva Linux
machine, with the test program and R-2.5.0 compiled with GNU G++ 4.2.0.

[table A Comparison to the R Statistical Library on Linux
[[Statistical Function][Boost][R]]
[[__beta_distrib CDF][[perf gcc-4_2-dist-beta-cdf..[para 1.71][para (3.508e-006s)]]][[perf gcc-4_2-dist-beta-R-cdf..[para *1.00*][para (2.050e-006s)]]]]
[[__beta_distrib Quantile][[perf gcc-4_2-dist-beta-quantile..[para *1.00*][para (1.294e-005s)]]][[perf gcc-4_2-dist-beta-R-quantile..[para 44.06[footnote There are a small number of our test cases where the R library fails to converge on a result: these tend to dominate the performance result.]][para (5.701e-004s)]]]]
[[__binomial_distrib CDF][[perf gcc-4_2-dist-binomial-cdf..[para 1.22][para (1.342e-006s)]]][[perf gcc-4_2-dist-binom-R-cdf..[para *1.00*][para (1.104e-006s)]]]]
[[__binomial_distrib Quantile][[perf gcc-4_2-dist-binomial-quantile..[para 1.36][para (7.083e-006s)]]][[perf gcc-4_2-dist-binom-R-quantile..[para *1.00*][para (5.194e-006s)]]]]
[[__cauchy_distrib CDF][[perf gcc-4_2-dist-cauchy-cdf..[para *1.00*][para (1.372e-007s)]]][[perf gcc-4_2-dist-cauchy-R-cdf..[para 1.47][para (2.017e-007s)]]]]
[[__cauchy_distrib Quantile][[perf gcc-4_2-dist-cauchy-quantile..[para *1.00*][para (1.542e-007s)]]][[perf gcc-4_2-dist-cauchy-R-quantile..[para 1.14][para (1.752e-007s)]]]]
[[__chi_squared_distrib CDF][[perf gcc-4_2-dist-chi_squared-cdf..[para 1.04][para (1.820e-006s)]]][[perf gcc-4_2-dist-chisq-R-cdf..[para *1.00*][para (1.753e-006s)]]]]
[[__chi_squared_distrib Quantile][[perf gcc-4_2-dist-chi_squared-quantile..[para 1.39][para (9.345e-006s)]]][[perf gcc-4_2-dist-chisq-R-quantile..[para *1.00*][para (6.728e-006s)]]]]
[[__exp_distrib CDF][[perf gcc-4_2-dist-exponential-cdf..[para *1.00*][para (2.195e-007s)]]][[perf gcc-4_2-dist-exp-R-cdf..[para 1.17][para (2.561e-007s)]]]]
[[__exp_distrib Quantile][[perf gcc-4_2-dist-exponential-quantile..[para *1.00*][para (1.123e-007s)]]][[perf gcc-4_2-dist-exp-R-quantile..[para 1.03][para (1.155e-007s)]]]]
[[__F_distrib CDF][[perf gcc-4_2-dist-fisher_f-cdf..[para *1.00*][para (2.744e-006s)]]][[perf gcc-4_2-dist-f-R-cdf..[para 1.08][para (2.970e-006s)]]]]
[[__F_distrib Quantile][[perf gcc-4_2-dist-fisher_f-quantile..[para 1.14][para (1.550e-005s)]]][[perf gcc-4_2-dist-f-R-quantile..[para *1.00*][para (1.359e-005s)]]]]
[[__gamma_distrib CDF][[perf gcc-4_2-dist-gamma-cdf..[para 1.29][para (2.578e-006s)]]][[perf gcc-4_2-dist-gamma-R-cdf..[para *1.00*][para (1.992e-006s)]]]]
[[__gamma_distrib Quantile][[perf gcc-4_2-dist-gamma-quantile..[para 1.77][para (1.020e-005s)]]][[perf gcc-4_2-dist-gamma-R-quantile..[para *1.00*][para (5.757e-006s)]]]]
[[__lognormal_distrib CDF][[perf gcc-4_2-dist-lognormal-cdf..[para *1.00*][para (1.782e-007s)]]][[perf gcc-4_2-dist-lnorm-R-cdf..[para 2.00][para (3.564e-007s)]]]]
[[__lognormal_distrib Quantile][[perf gcc-4_2-dist-lognormal-quantile..[para *1.00*][para (7.093e-007s)]]][[perf gcc-4_2-dist-lnorm-R-quantile..[para 1.07][para (7.607e-007s)]]]]
[[__negative_binomial_distrib CDF][[perf gcc-4_2-dist-negative_binomial-cdf..[para 1.03][para (2.209e-006s)]]][[perf gcc-4_2-dist-nbinom-R-cdf..[para *1.00*][para (2.141e-006s)]]]]
[[__negative_binomial_distrib Quantile][[perf gcc-4_2-dist-negative_binomial-quantile..[para *1.00*][para (1.826e-005s)]]][[perf gcc-4_2-dist-nbinom-R-quantile..[para 30.07[footnote The R library appears to use a linear-search strategy that can perform very badly in a small number of pathological cases, but may or may not be more efficient in "typical" cases.]][para (5.490e-004s)]]]]
[[__normal_distrib CDF][[perf gcc-4_2-dist-normal-cdf..[para *1.00*][para (8.542e-008s)]]][[perf gcc-4_2-dist-norm-R-cdf..[para 2.09][para (1.782e-007s)]]]]
[[__normal_distrib Quantile][[perf gcc-4_2-dist-normal-quantile..[para *1.00*][para (1.362e-007s)]]][[perf gcc-4_2-dist-norm-R-quantile..[para 1.26][para (1.722e-007s)]]]]
[[__poisson_distrib CDF][[perf gcc-4_2-dist-poisson-cdf..[para 1.10][para (1.953e-006s)]]][[perf gcc-4_2-dist-pois-R-cdf..[para *1.00*][para (1.775e-006s)]]]]
[[__poisson_distrib Quantile][[perf gcc-4_2-dist-poisson-quantile..[para 1.12][para (4.214e-006s)]]][[perf gcc-4_2-dist-pois-R-quantile..[para *1.00*][para (3.752e-006s)]]]]
[[__students_t_distrib CDF][[perf gcc-4_2-dist-students_t-cdf..[para 1.55][para (2.441e-006s)]]][[perf gcc-4_2-dist-t-R-cdf..[para *1.00*][para (1.576e-006s)]]]]
[[__students_t_distrib Quantile][[perf gcc-4_2-dist-students_t-quantile..[para 1.33][para (3.972e-006s)]]][[perf gcc-4_2-dist-t-R-quantile..[para *1.00*][para (2.990e-006s)]]]]
[[__weibull_distrib CDF][[perf gcc-4_2-dist-weibull-cdf..[para *1.00*][para (6.640e-007s)]]][[perf gcc-4_2-dist-weibull-R-cdf..[para 1.06][para (7.031e-007s)]]]]
[[__weibull_distrib Quantile][[perf gcc-4_2-dist-weibull-quantile..[para *1.00*][para (7.504e-007s)]]][[perf gcc-4_2-dist-weibull-R-quantile..[para 1.03][para (7.710e-007s)]]]]
]

[endsect]