[section:neg_binom_eg Negative Binomial Distribution Examples]
(See also the reference documentation for the __negative_binomial_distrib.)
[section:neg_binom_conf Calculating Confidence Limits on the Frequency of Occurrence for the Negative Binomial Distribution]
Imagine you have a process that follows a negative binomial distribution:
for each trial conducted, an event either occurs or it does not, the two outcomes
being referred to as "successes" and "failures". The frequency with which successes occur
is variously referred to as the
success fraction, success ratio, success percentage, occurrence frequency, or probability of occurrence.
If, by experiment, you want to measure the success fraction,
the best estimate is given simply
by /k/ \/ /N/, for /k/ successes out of /N/ trials.
However our confidence in that estimate will be shaped by how many trials were conducted,
and how many successes were observed. The static member functions
`negative_binomial_distribution<>::find_lower_bound_on_p` and
`negative_binomial_distribution<>::find_upper_bound_on_p`
allow you to calculate the confidence intervals for your estimate of the success fraction.
The sample program [@../../example/neg_binomial_confidence_limits.cpp
neg_binomial_confidence_limits.cpp] illustrates their use.
[import ../../example/neg_binomial_confidence_limits.cpp]
[neg_binomial_confidence_limits]
Let's see some sample output for a 1 in 10
success ratio, first for a mere 20 trials:
[pre'''______________________________________________
2-Sided Confidence Limits For Success Fraction
______________________________________________
Number of trials = 20
Number of successes = 2
Number of failures = 18
Observed frequency of occurrence = 0.1
___________________________________________
Confidence Lower Upper
Value (%) Limit Limit
___________________________________________
50.000 0.04812 0.13554
75.000 0.03078 0.17727
90.000 0.01807 0.22637
95.000 0.01235 0.26028
99.000 0.00530 0.33111
99.900 0.00164 0.41802
99.990 0.00051 0.49202
99.999 0.00016 0.55574
''']
As you can see, even at the 95% confidence level the bounds (0.012 to 0.26) are
really very wide, and very asymmetric about the observed value 0.1.
Compare that with the program output for a massive
2000 trials:
[pre'''______________________________________________
2-Sided Confidence Limits For Success Fraction
______________________________________________
Number of trials = 2000
Number of successes = 200
Number of failures = 1800
Observed frequency of occurrence = 0.1
___________________________________________
Confidence Lower Upper
Value (%) Limit Limit
___________________________________________
50.000 0.09536 0.10445
75.000 0.09228 0.10776
90.000 0.08916 0.11125
95.000 0.08720 0.11352
99.000 0.08344 0.11802
99.900 0.07921 0.12336
99.990 0.07577 0.12795
99.999 0.07282 0.13206
''']
Now even when the confidence level is very high, the limits (at 99.999%, 0.07 to 0.13) are really
quite close, and nearly symmetric about the observed value of 0.1.
[endsect][/section:neg_binom_conf Calculating Confidence Limits on the Frequency of Occurrence]
[section:neg_binom_size_eg Estimating Sample Sizes for the Negative Binomial.]
Imagine you have an event
(let's call it a "failure" - though we could equally well call it a success if we felt it was a 'good' event)
that you know will occur in 1 in N trials. You may want to know how many trials you need to
conduct to be P% sure of observing at least k such failures.
If the failure events follow a negative binomial
distribution (each trial either succeeds or fails)
then the static member function `negative_binomial_distribution<>::find_minimum_number_of_trials`
can be used to estimate the minimum number of trials required to be P% sure
of observing the desired number of failures.
The example program
[@../../example/neg_binomial_sample_sizes.cpp neg_binomial_sample_sizes.cpp]
demonstrates its usage.
[import ../../example/neg_binomial_sample_sizes.cpp]
[neg_binomial_sample_sizes]
[note Since we're calculating the /minimum/ number of trials required,
we'll err on the safe side and take the ceiling of the result.
Had we been calculating the
/maximum/ number of trials permitted to observe less than a certain
number of /failures/ then we would have taken the floor instead. We
would have called `find_maximum_number_of_trials` like this:
``
floor(negative_binomial::find_maximum_number_of_trials(failures, p, alpha[i]))
``
which would give us the largest number of trials we could conduct and
still be P% sure of observing /failures or less/ failure events, when the
probability of success is /p/.]
We'll finish off by looking at some sample output. First, suppose
we wish to observe at least 5 "failures" with a 50/50 (0.5) chance of
success or failure:
[pre
'''Target number of failures = 5, Success fraction = 50%
____________________________
Confidence Min Number
Value (%) Of Trials
____________________________
50.000 11
75.000 14
90.000 17
95.000 18
99.000 22
99.900 27
99.990 31
99.999 36
'''
]
So 18 trials or more would yield a 95% chance that at least our 5
required failures would be observed.
Compare that to what happens if the success ratio is 90%:
[pre'''Target number of failures = 5.000, Success fraction = 90.000%
____________________________
Confidence Min Number
Value (%) Of Trials
____________________________
50.000 57
75.000 73
90.000 91
95.000 103
99.000 127
99.900 159
99.990 189
99.999 217
''']
So now 103 trials are required to observe at least 5 failures with
95% certainty.
[endsect] [/section:neg_binom_size_eg Estimating Sample Sizes.]
[section:negative_binomial_example1 Negative Binomial example 1.]
The example program
[@../../example/negative_binomial_example1.cpp negative_binomial_example1.cpp (full source code)]
demonstrates a simple use: finding the probability of meeting a sales quota.
Based on [@http://en.wikipedia.org/wiki/Negative_binomial_distribution
a problem by Dr. Diane Evans,
Professor of Mathematics at Rose-Hulman Institute of Technology].
Pat is required to sell candy bars to raise money for the 6th grade field trip.
There are thirty houses in the neighborhood,
and Pat is not supposed to return home until five candy bars have been sold.
So the child goes door to door, selling candy bars.
At each house, there is a 0.4 probability (40%) of selling one candy bar
and a 0.6 probability (60%) of selling nothing.
What is the probability mass (density) function for selling the last (fifth)
candy bar at the nth house?
The Negative Binomial(r, p) distribution describes the probability of k failures
and r successes in k+r Bernoulli(p) trials with success on the last trial.
(A [@http://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli trial]
is one with only two possible outcomes, success or failure,
and p is the probability of success.)
Selling five candy bars means getting five successes, so successes r = 5.
The total number of trials n (in this case, the number of houses visited)
is the sum of successes and failures, n = k + r = k + 5.
The random variable we are interested in is the number of houses (n)
that must be visited to sell five candy bars,
so we substitute k = n - 5 into a negative_binomial(5, 0.4) mass (density) function
and obtain the mass (density) function of the distribution of houses (for n >= 5).
Obviously, the best case is that Pat makes sales on all the first five houses.
What is the probability that Pat finishes /on the tenth house/?
f(10) = 0.1003290624, or about 1 in 10
What is the probability that Pat finishes /on or before/ reaching the eighth house?
To finish on or before the eighth house,
Pat must finish at the fifth, sixth, seventh, or eighth house.
Sum those probabilities:
f(5) = 0.01024
f(6) = 0.03072
f(7) = 0.055296
f(8) = 0.0774144
sum {j=5 to 8} f(j) = 0.17367
What is the probability that Pat exhausts all 30 houses in the neighborhood,
and still doesn't sell the required 5 candy bars?
1 - sum{j=5 to 30} f(j) = 1 - incomplete beta (p = 0.4)(5, 30-5+1) =~ 1 - 0.99849 = 0.00151 = 0.15%.
See also [@http://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution]
and [@http://www.math.uah.edu/stat/bernoulli/Introduction.xhtml Bernoulli applications].
In this example, we will deliberately produce a variety of calculations
and outputs to demonstrate the ways that the negative binomial distribution
can be used with this library,
and the source is also deliberately over-commented.
[import ../../example/negative_binomial_example1.cpp]
[negative_binomial_eg1_1]
[endsect] [/section:negative_binomial_example1]
[section:negative_binomial_example2 Negative Binomial example 2.]
An example program showing how to output a table of values of the cdf and pdf for various numbers of failures /k/.
[import ../../example/negative_binomial_example2.cpp]
[neg_binomial_example2]
[neg_binomial_example2_1]
[endsect] [/section:negative_binomial_example2 Negative Binomial example 2.]
[section:negative_binomial_example3 Negative Binomial example 3.]
The example program
[@../../example/negative_binomial_example3.cpp negative_binomial_example3.cpp (full source code)]
demonstrates an example from K. Krishnamoorthy.
[import ../../example/negative_binomial_example3.cpp]
[neg_binomial_example3]
[neg_binomial_example3_1]
[endsect] [/section:negative_binomial_example3 Negative Binomial example 3.]
[endsect] [/section:neg_binom_eg Negative Binomial Distribution Examples]