Skip to content

Multi-dimensional distributions #16

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
dhardy opened this issue Feb 26, 2025 · 6 comments
Open

Multi-dimensional distributions #16

dhardy opened this issue Feb 26, 2025 · 6 comments

Comments

@dhardy
Copy link
Member

dhardy commented Feb 26, 2025

#14 argues the importance of a variable-length input Dirichlet distribution. rust-random/rand#1478 proposes a Multinomial distribution.

It is also questionable whether the existing Distribution trait is appropriate for these; it may be preferable to use a distinct trait like:

pub trait MultiDistribution<F> {
    fn sample<R: Rng + ?Sized>(&mut self, rng: &mut R, output: &mut [F]);
}

Although:

  • The above does not support error checking of the length of output, aside from panic. This may be the most appropriate option anyway, though I would expect at minimum that implementations do assert the length of output is exactly as expected.
  • There is probably no reason to add this trait over simply adding pub fn sample methods to the distribution types.

Questions:

  • Should we start a new rand_multi crate? Or maybe rand_distr::multi module?
  • Should we add a new MultiDistribution trait? Or simply use methods on the distributions without a trait?
@benjamin-lieser
Copy link
Member

I am in favor of a MultiDistribution trait, I might make it a bit more general:

pub trait MultiDistribution<F> {
    fn sample<R: Rng + ?Sized>(&mut self, rng: &mut R, output: &mut F);
}

So we can have MultiDistribution<[f64]> and MultiDistribution<[f64;N]> if needed.

I don't think we need a new crate, there should be space in rand_distr in a multi module.

The questions we have to answer is if we want to have Dyn and Const versions of the distributions. I would say a Dyn version is enough if it implements MultiDistribution<[f64]>

@dhardy
Copy link
Member Author

dhardy commented Feb 26, 2025

By Const you mean fixed-length? That should only be a property of the distribution type, not the trait.

Don't use F in the trait; that is used a few places for "floating point type". Maybe V for "vector".

You then want V: ?Sized in the trait to support MultiDistribution<[f64]> which is fine for the trait, but not for a struct field.

But I am wondering why we want this trait at all. Even trait Distribution is not really useful except to specify exactly what sample method an implementation must have. I don't recall seeing generic code of the type D: Distribution<f64> (other than fn Rng::sample, the Iter and Map adapters, and some benchmark code) — we could almost have made StandardUniform a trait instead).

@benjamin-lieser
Copy link
Member

With Const I mean const generic distribution types like our current Dirichlet, it currently implements Distribution<[F;N]> but it also could implement MultiDistribution<[F;N]>.

Fair, I forgot about the ? Sized thing and F is a misleading name.

I used the trait a decent number of times in my code, imagine functions which do monte carlo estimations of some statistic. Yes is a lot of cases the distributions you will put into it will come from the same family (same distribution type) but this does not have to be the case and then the Distribution trait is very useful. I am very sure someone will either write it themself or ask for it if we do not add it.

@dhardy
Copy link
Member Author

dhardy commented Mar 1, 2025

FYI I don't expect to have time to look at this for at least the next week. If you @benjamin-lieser (or anyone else) wish to make a start, please do, but no pressure.

I don't think we need a new crate, there should be space in rand_distr in a multi module.

I agree in that there does not seem to be much incentive to use a new crate. (Usage of a large new dependency or a lot of new code could be incentives.)

@benjamin-lieser benjamin-lieser mentioned this issue Mar 2, 2025
1 task
@zoonders
Copy link

Hi! I stumbled upon this discussion because I was interested in an implementation of sampling from a multivariate normal distribution. Is this something that could be implemented in rand_distr as MultiDistribution<[f64;N]> as well. Then I could start something. However, this would probably require the Cholesky Decomposition of ndarray-linalg. Is that a dependency that is wanted for rand_distr?

Cheers

@benjamin-lieser
Copy link
Member

Hi! I stumbled upon this discussion because I was interested in an implementation of sampling from a multivariate normal distribution. Is this something that could be implemented in rand_distr as MultiDistribution<[f64;N]> as well. Then I could start something. However, this would probably require the Cholesky Decomposition of ndarray-linalg. Is that a dependency that is wanted for rand_distr?

Cheers

Right now it's about getting the interface right. Then I would say the multidim normal would be a good addition and I would say it's fine to depend on either ndarray-linalg or nalgebra for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants