Unexpected sample values from beta distribution for small parameters #999
Comments
I think I would start by plotting the beta distribution for
This fixes the issue for me, can you confirm that it makes sense?

diff --git a/rand_distr/src/gamma.rs b/rand_distr/src/gamma.rs
index ba8e4e0eb3..907be37d8f 100644
--- a/rand_distr/src/gamma.rs
+++ b/rand_distr/src/gamma.rs
@@ -495,7 +495,11 @@ where
     fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> N {
         let x = self.gamma_a.sample(rng);
         let y = self.gamma_b.sample(rng);
-        x / (x + y)
+        if x == N::from(0.) {
+            N::from(0.)
+        } else {
+            x / (x + y)
+        }
     }
 }
 
@@ -566,6 +570,15 @@ mod test {
         Beta::new(0., 0.).unwrap();
     }
 
+    #[test]
+    fn test_beta_small_param() {
+        let beta = Beta::<f64>::new(1e-3, 1e-3).unwrap();
+        let mut rng = crate::test::rng(206);
+        for _ in 0..1000 {
+            assert!(!beta.sample(&mut rng).is_nan());
+        }
+    }
+
     #[test]
     fn value_stability() {
         fn test_samples<N: Float + core::fmt::Debug, D: Distribution<N>>(
I'm not sure what you are plotting, but with such extreme parameters you should be getting either 0 or 1. Are you using some kind of kernel density estimation?
Yes, sorry, I am using a Gaussian kernel, but the samples are the crosses at the bottom: they are all 0 or 1, except for two samples between 0 and 0.2. The point is that replacing NaN samples with zero (effectively what you proposed) is biased. Thinking about the simulation using gamma variables, the issue arises when both x and y are zero. Then, x / (x + y) should be... zero, or one?
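To make the both-zero case concrete, here is a minimal sketch in plain f64 arithmetic. The function names `ratio_unpatched` and `ratio_patched` are illustrative only and are not part of rand_distr; they mirror the two bodies of `sample` shown in the diff, with the gamma draws passed in as arguments.

```rust
/// Ratio as computed before the patch: x / (x + y).
fn ratio_unpatched(x: f64, y: f64) -> f64 {
    x / (x + y)
}

/// Ratio with the guard from the proposed patch: 0 whenever x == 0.
fn ratio_patched(x: f64, y: f64) -> f64 {
    if x == 0.0 {
        0.0
    } else {
        x / (x + y)
    }
}

fn main() {
    // Both gamma draws underflowed to zero: the unpatched ratio is 0/0.
    assert!(ratio_unpatched(0.0, 0.0).is_nan());
    // The guard maps that case to 0 and never to 1.
    assert_eq!(ratio_patched(0.0, 0.0), 0.0);
    // When only one draw is zero, no guard is needed.
    assert_eq!(ratio_unpatched(0.0, 1e-300), 0.0);
    assert_eq!(ratio_unpatched(1e-300, 0.0), 1.0);
    println!("0/0 case: unpatched = NaN, patched = 0");
}
```

Since the guard always resolves the 0/0 case to 0, for symmetric parameters such as Beta(1e-3, 1e-3) it would shift mass toward 0, which is the bias raised above.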
Switching to another algorithm (that also performs better for small parameters) gets rid of the
Wow!! Great!! I confirm that this fixes my issues!
Background
The Beta distribution is implemented through the Beta struct, and each sample should be a number between zero and one. This distribution is known to be numerically delicate when both parameters (alpha and beta) are small.
The implementation of the sample method is through the following characterization. If X and Y are independent, X follows Gamma(alpha, theta), and Y follows Gamma(beta, theta), then X / (X + Y) follows Beta(alpha, beta).
For more such characterizations, see here.
Sampling from a beta distribution with both alpha and beta parameters small returns NaN samples. This is clear from the implementation, but is not expected by the user at all!
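One way to see how both gamma variables can end up exactly zero is sketched below. It assumes the common technique of sampling Gamma(a, 1) for a < 1 as a Gamma(a + 1, 1) draw multiplied by U^(1/a), with U uniform on (0, 1); whether gamma.rs does exactly this is an assumption here, and `boost_factor` and `underflow_threshold` are hypothetical helper names. The gamma draws are replaced by the stand-in value 1.0 so the example stays deterministic.

```rust
/// Factor U^(1/shape) that (by assumption) scales a Gamma(shape + 1, 1)
/// draw down to a Gamma(shape, 1) draw when shape < 1.
fn boost_factor(u: f64, shape: f64) -> f64 {
    u.powf(1.0 / shape)
}

/// Approximate value of u below which `boost_factor` rounds to exactly 0.0:
/// positive results below about 2^-1075 are not representable in f64.
fn underflow_threshold(shape: f64) -> f64 {
    (-1075.0 * std::f64::consts::LN_2 * shape).exp()
}

fn main() {
    let shape = 1.0e-3;

    // 0.4^1000 is about 1e-398, far below the smallest positive f64.
    assert_eq!(boost_factor(0.4, shape), 0.0);
    // 0.5^1000 = 2^-1000 is still representable.
    assert!(boost_factor(0.5, shape) > 0.0);

    // For shape = 1e-3 the threshold sits a little below 0.5.
    let t = underflow_threshold(shape);
    assert!(t > 0.47 && t < 0.48);

    // Stand-in gamma draws of 1.0, scaled by two underflowing factors.
    let x = 1.0 * boost_factor(0.4, shape);
    let y = 1.0 * boost_factor(0.3, shape);
    assert!((x / (x + y)).is_nan());

    println!("underflow threshold for shape {}: {:.4}", shape, t);
}
```

Under these assumptions, a large share of uniform draws push a single gamma variable to zero at shape 1.0e-3, so hitting 0 / (0 + 0) is not a rare event.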
By the way, values of 1.0e-3 are already small enough to easily get a NaN result. Just run the following code.
What is your motivation?
I was doing numerical simulations and needed to simulate beta samples as part of a rejection sampling algorithm. Running into NaN values was unexpected, but I could work around the issue using a particular symmetry present in my problem.
What type of application is this? (E.g. cryptography, game, numerical simulation)
Numerical simulation.
Feature request
I would like to contribute a more robust simulation method for the beta variable that takes such cases into account.
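As one possible direction, and not necessarily the algorithm eventually adopted, the ratio X / (X + Y) can be evaluated from ln X and ln Y, so that neither variable has to be representable as an f64. The helper `ratio_from_logs` below is a hypothetical name, and the log values are hand-picked stand-ins rather than real gamma draws.

```rust
/// Computes x / (x + y) from ln(x) and ln(y) without forming x or y.
/// Uses x / (x + y) = 1 / (1 + exp(ln_y - ln_x)); assumes finite inputs.
fn ratio_from_logs(ln_x: f64, ln_y: f64) -> f64 {
    1.0 / (1.0 + (ln_y - ln_x).exp())
}

fn main() {
    // Stand-in log values: ln(U) / shape for U = 0.4 and U = 0.3, shape = 1e-3.
    let shape = 1.0e-3_f64;
    let ln_x = 0.4_f64.ln() / shape; // about -916
    let ln_y = 0.3_f64.ln() / shape; // about -1204

    // Exponentiating either value underflows to zero, so the naive ratio is NaN.
    assert_eq!(ln_x.exp(), 0.0);
    assert_eq!(ln_y.exp(), 0.0);
    let naive = ln_x.exp() / (ln_x.exp() + ln_y.exp());
    assert!(naive.is_nan());

    // The log-space ratio stays finite and inside [0, 1].
    let r = ratio_from_logs(ln_x, ln_y);
    assert!(r.is_finite() && (0.0..=1.0).contains(&r));
    println!("log-space ratio: {}", r);
}
```

The design choice is to keep the comparison between X and Y in log space, where the difference ln Y - ln X decides whether the result lands near 0 or near 1 instead of collapsing to 0/0.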