mirror of https://github.com/boostorg/math.git synced 2026-01-26 18:52:10 +00:00
math/doc/distributions/dist_reference.qbk
Evan Miller 18ed616376 Kolmogorov-Smirnov distribution (#422)
* Kolmogorov-Smirnov distribution #421

Add a new distribution, kolmogorov_smirnov_distribution, which takes a
parameter that represents the number of observations used in a
Kolmogorov-Smirnov test. (The K-S test is a popular test for comparing
two CDFs, but the test statistic is not implemented here.)

This implementation includes Kolmogorov's original 1st order Taylor
expansion. There is a literature on the distribution's other
mathematical properties (higher order terms and exact version); this
literature is summarized in the main header file for anyone who may
want to expand the implementation later.

The CDF is implemented using a Jacobi theta function, and the PDF is a
hand-rolled derivative of that function. Quantiles plug the CDF and PDF
into a Newton-Raphson iteration. The mean and variance have nice
closed-form expressions, and the mode uses a dumb run-time maximizer.

This commit includes graphs, a ULP plotter for the PDF, and the usual
compilation and numerical tests. The test file is on the small side, but
it integrates the distribution from zero to infinity, and covers the
quantiles pretty well. As of now the numerical tests only verify
self-consistency (e.g. distribution moments and CDF-quantile relations),
so there's room to add some external checks.

* Implement skewness for K-S distribution [CI SKIP]

The third moment integrates nicely with the help of Apery's constant
(zeta_three). Verify the result via quadrature.

* Implement kurtosis for the K-S distribution

Verify the result via quadrature.
2020-09-04 08:48:51 -04:00


[section:dist_ref Statistical Distributions Reference]
[include non_members.qbk]
[section:dists Distributions]
[include arcsine.qbk]
[include bernoulli.qbk]
[include beta.qbk]
[include binomial.qbk]
[include cauchy.qbk]
[include chi_squared.qbk]
[include empirical_cdf.qbk]
[include exponential.qbk]
[include extreme_value.qbk]
[include fisher.qbk]
[include gamma.qbk]
[include geometric.qbk]
[include hyperexponential.qbk]
[include hypergeometric.qbk]
[include inverse_chi_squared.qbk]
[include inverse_gamma.qbk]
[include inverse_gaussian.qbk]
[include kolmogorov_smirnov.qbk]
[include laplace.qbk]
[include logistic.qbk]
[include lognormal.qbk]
[include negative_binomial.qbk]
[include nc_beta.qbk]
[include nc_chi_squared.qbk]
[include nc_f.qbk]
[include nc_t.qbk]
[include normal.qbk]
[include pareto.qbk]
[include poisson.qbk]
[include rayleigh.qbk]
[include skew_normal.qbk]
[include students_t.qbk]
[include triangular.qbk]
[include uniform.qbk]
[include weibull.qbk]
[endsect] [/section:dists Distributions]
[include dist_algorithms.qbk]
[endsect] [/section:dist_ref Statistical Distributions and Functions Reference]
[section:future Extras/Future Directions]
[h4 Adding Additional Location and Scale Parameters]
In some modelling applications we require a distribution
with a specific location and scale:
often this equates to a specific mean and standard deviation, although for many
distributions the relationship between these properties and the location and
scale parameters is non-trivial. See
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda364.htm http://www.itl.nist.gov/div898/handbook/eda/section3/eda364.htm]
for more information.
The obvious way to handle this is via an adapter template:

   template <class Dist>
   class scaled_distribution
   {
   public:
      scaled_distribution(
        const Dist& dist,
        typename Dist::value_type location,
        typename Dist::value_type scale = 1);
   };

The adapter would then need its own set of overloads for the non-member accessor functions.

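As a rough illustration of what those overloads might look like, the following self-contained sketch applies the usual location-scale identities: if X = location + scale * Z, then the CDF, PDF and quantile of X follow directly from those of Z. Nothing here is library code: `std_logistic` is a hypothetical stand-in distribution chosen for its closed forms, and the accessors `base()`, `location()` and `scale()`, the default scale of 1, and the positivity assertion are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a library distribution: the standard logistic
// distribution (location 0, scale 1), whose CDF, PDF and quantile all have
// closed forms.
struct std_logistic
{
   typedef double value_type;
};

inline double cdf(const std_logistic&, double x)
{
   return 1.0 / (1.0 + std::exp(-x));
}

inline double pdf(const std_logistic&, double x)
{
   // The density is symmetric about zero, so |x| avoids overflow in exp().
   const double e = std::exp(-std::fabs(x));
   return e / ((1.0 + e) * (1.0 + e));
}

inline double quantile(const std_logistic&, double p)
{
   return std::log(p / (1.0 - p));
}

// Adapter that shifts and stretches an existing distribution.
template <class Dist>
class scaled_distribution
{
public:
   typedef typename Dist::value_type value_type;

   scaled_distribution(const Dist& dist, value_type location, value_type scale = 1)
      : m_dist(dist), m_location(location), m_scale(scale)
   {
      assert(scale > 0); // assumed precondition: a scale must be positive
   }

   const Dist& base() const { return m_dist; }
   value_type location() const { return m_location; }
   value_type scale() const { return m_scale; }

private:
   Dist m_dist;
   value_type m_location;
   value_type m_scale;
};

// F_X(x) = F_Z((x - location) / scale)
template <class Dist>
typename Dist::value_type cdf(const scaled_distribution<Dist>& d,
                              typename Dist::value_type x)
{
   return cdf(d.base(), (x - d.location()) / d.scale());
}

// f_X(x) = f_Z((x - location) / scale) / scale
template <class Dist>
typename Dist::value_type pdf(const scaled_distribution<Dist>& d,
                              typename Dist::value_type x)
{
   return pdf(d.base(), (x - d.location()) / d.scale()) / d.scale();
}

// Q_X(p) = location + scale * Q_Z(p)
template <class Dist>
typename Dist::value_type quantile(const scaled_distribution<Dist>& d,
                                   typename Dist::value_type p)
{
   return d.location() + d.scale() * quantile(d.base(), p);
}
```

With this sketch, `scaled_distribution<std_logistic> s(std_logistic(), 10.0, 2.0);` describes a logistic distribution centred on 10 with scale 2, and `cdf(s, x)`, `pdf(s, x)` and `quantile(s, p)` can be called in the same way as for the unadapted distribution.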
[h4 An "any_distribution" class]
It is easy to add a distribution object that virtualises
the actual type of the distribution, and can therefore hold "any" object
that conforms to the conceptual requirements of a distribution:

   template <class RealType>
   class any_distribution
   {
   public:
      template <class Distribution>
      any_distribution(const Distribution& d);
   };

   // Get the cdf of the underlying distribution:
   template <class RealType>
   RealType cdf(const any_distribution<RealType>& d, RealType x);
   // etc....

Such a class would facilitate the writing of non-template code that can
function with any distribution type.
The [@http://sourceforge.net/projects/distexplorer/ Statistical Distribution Explorer]
utility for Windows is a usage example.
It is not yet clear whether there is a compelling use case, though.
Tests for goodness of fit might
provide one, but this needs more investigation.
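One possible way to virtualise the distribution type is classic type erasure, sketched below under stated assumptions. The wrapper stores the distribution behind a small virtual interface and forwards to the free function `cdf` found by argument-dependent lookup. The names `concept_t`, `model`, `do_cdf`, `unit_exponential`, `std_logistic` and `probability_between` are hypothetical and exist only for this illustration; only the CDF is forwarded, and a fuller version would forward the other accessors in the same way.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

// Two hypothetical stand-in distributions, each with a free function cdf().
struct unit_exponential {};
inline double cdf(const unit_exponential&, double x)
{
   return x < 0 ? 0.0 : 1.0 - std::exp(-x);
}

struct std_logistic {};
inline double cdf(const std_logistic&, double x)
{
   return 1.0 / (1.0 + std::exp(-x));
}

template <class RealType>
class any_distribution
{
public:
   // Wrap a copy of any object for which cdf(d, x) is a valid expression.
   template <class Distribution>
   any_distribution(const Distribution& d)
      : m_impl(std::make_shared<model<Distribution> >(d)) {}

   // Named do_cdf so that it does not hide the free function cdf().
   RealType do_cdf(RealType x) const { return m_impl->do_cdf(x); }

private:
   // Abstract interface: the only thing the wrapper knows about its content.
   struct concept_t
   {
      virtual ~concept_t() {}
      virtual RealType do_cdf(RealType x) const = 0;
   };

   // Concrete holder that remembers the real distribution type.
   template <class Distribution>
   struct model : concept_t
   {
      explicit model(const Distribution& d) : m_dist(d) {}
      RealType do_cdf(RealType x) const override { return cdf(m_dist, x); }
      Distribution m_dist;
   };

   // Copies of the wrapper share one immutable holder.
   std::shared_ptr<const concept_t> m_impl;
};

// Get the cdf of the underlying distribution:
template <class RealType>
RealType cdf(const any_distribution<RealType>& d, RealType x)
{
   return d.do_cdf(x);
}

// Non-template code that works with whatever distribution was wrapped.
inline double probability_between(const any_distribution<double>& d,
                                  double lo, double hi)
{
   assert(lo <= hi);
   return cdf(d, hi) - cdf(d, lo);
}
```

The price of this design is one virtual call per evaluation and a heap allocation per wrapped distribution, which is the usual trade-off for keeping calling code free of templates.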
[h4 Higher Level Hypothesis Tests]
Higher-level tests roughly corresponding to the
[@http://documents.wolfram.com/mathematica/Add-onsLinks/StandardPackages/Statistics/HypothesisTests.html Mathematica Hypothesis Tests]
package could be added reasonably easily, for example:

   template <class InputIterator>
   typename std::iterator_traits<InputIterator>::value_type
      test_equal_mean(
         InputIterator a,
         InputIterator b,
         typename std::iterator_traits<InputIterator>::value_type expected_mean);

Returns the probability (p-value) of observing data at least as extreme as the sequence \[a,b)
under the hypothesis that the true mean is /expected_mean/.
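A minimal sketch of how such a function might be written is given below. It is not a proposed implementation: to stay free of dependencies it swaps in a large-sample normal approximation for the two-sided tail probability, whereas an exact one-sample test would use the Student's t distribution with n - 1 degrees of freedom. The handling of fewer than two observations and of zero variance is also an assumption made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <limits>

template <class InputIterator>
typename std::iterator_traits<InputIterator>::value_type
   test_equal_mean(
      InputIterator a,
      InputIterator b,
      typename std::iterator_traits<InputIterator>::value_type expected_mean)
{
   typedef typename std::iterator_traits<InputIterator>::value_type value_type;

   // Single pass (Welford's update), so plain input iterators are sufficient.
   std::size_t n = 0;
   value_type mean = 0;
   value_type m2 = 0; // running sum of squared deviations from the mean
   for (; a != b; ++a)
   {
      const value_type x = *a;
      ++n;
      const value_type delta = x - mean;
      mean += delta / n;
      m2 += delta * (x - mean);
   }

   // Assumed convention: the sample variance needs at least two observations.
   if (n < 2)
      return std::numeric_limits<value_type>::quiet_NaN();

   const value_type variance = m2 / (n - 1);

   // Assumed convention for constant data: the hypothesis either holds exactly or not at all.
   if (variance == 0)
      return mean == expected_mean ? value_type(1) : value_type(0);

   const value_type standard_error = std::sqrt(variance / n);
   assert(standard_error > 0);
   const value_type z = (mean - expected_mean) / standard_error;

   // Normal approximation to the two-sided tail: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
   return std::erfc(std::fabs(z) / std::sqrt(value_type(2)));
}
```

A small return value suggests that the data are unlikely to have arisen from a population with mean /expected_mean/. For small samples the normal approximation understates the tail probability, which is why an exact version would use the Student's t distribution instead.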
[h4 Integration With Statistical Accumulators]
[@http://boost-sandbox.sourceforge.net/libs/accumulators/doc/html/index.html
Eric Niebler's accumulator framework] - also work in progress - provides the means
to calculate various statistical properties from experimental data. There is an
opportunity to integrate the statistical tests with this framework at some later date:

   // Define an accumulator, all required statistics to calculate the test
   // are calculated automatically:
   accumulator_set<double, features<tag::test_expected_mean> > acc(expected_mean=4);

   // Pass our data to the accumulator:
   acc = std::for_each(mydata.begin(), mydata.end(), acc);

   // Extract the result:
   double p = probability(acc);

[endsect] [/section:future Extras Future Directions]
[/ dist_reference.qbk
Copyright 2006, 2010 John Maddock and Paul A. Bristow.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]