
Implement quantile(model(), u)? #2525

Open · mvsoom opened this issue Mar 31, 2025 · 4 comments

mvsoom commented Mar 31, 2025

Sampling from priors works perfectly:

using Turing # v0.21.10

@model function test()
    x ~ Normal()
    y ~ Uniform()
    return 0.0
end

rand(test())
# (x = 1.310427089277206, y = 0.2538675545828133)

I wonder if it would be possible to implement quantile to sample from the prior using the quantile transform:

u = [0.1, 0.5]
quantile(test(), u)
# MethodError: no method matching length(::DynamicPPL.Model{Main.var"workspace#336".var"#test#1", (), (), (), Tuple{}, Tuple{}, DynamicPPL.DefaultContext})
# The function `length` exists, but no method is defined for this combination of argument types.

The quantile transform simply maps a draw u ~ Uniform(0, 1) to the desired distribution, e.g.,

using Distributions
quantile(Normal(), 0.5)
# 0.0
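
As a sanity check, pushing uniform draws through quantile reproduces the target distribution; this uses only Distributions.jl as it exists today:

using Distributions, Statistics

# Inverse-transform sampling: quantile.(d, u) with u ~ Uniform(0, 1)
# reproduces draws from d
u = rand(10_000)
x = quantile.(Normal(), u)
mean(x), std(x) # ≈ (0.0, 1.0)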

Use cases:

- The hypercube [0,1]ᴺ parametrization u ↦ θ is used e.g. in nested sampling.
- It is effective in inference because fixed-length steps in u space result in prior-dependent step lengths in the original θ parameter space.
- It can unify inference over discrete and continuous variates (e.g. jaxns); see the snippet below.
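
On the last point, Distributions.jl already defines quantile for discrete distributions, so a single coordinate u ∈ [0, 1] can drive discrete and continuous parameters alike:

using Distributions

quantile(Poisson(3.0), 0.5)                 # 3
quantile(Categorical([0.2, 0.5, 0.3]), 0.9) # 3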

If this could be implemented, hooking up NestedSamplers.jl would be relatively simple:

# Adapted from the README of https://github.com/TuringLang/NestedSamplers.jl
using Turing
using NestedSamplers
using LinearAlgebra # for I

d = 10
@model function funnel()
    θ ~ Truncated(Normal(0, 3), -3, 3)
    z ~ MvNormal(zeros(d - 1), exp(θ) * I)
    return x ~ MvNormal(z, I)
end

model = funnel()
data = model()

# quantile(model, u) is the API proposed in this issue
prior_transform(X) = quantile(funnel(), X)
logl(X) = loglikelihood(funnel() | (x = data,), prior_transform(X))

NestedModel(logl, prior_transform)

Since this is how many random variates are generated computationally anyway, maybe implementing this would be relatively easy?
I could give it a try with some pointers, since I only recently picked up Julia again and don't know Turing.jl internals.

penelopeysm (Member) commented

Hi @mvsoom - how would this work with multivariate distributions?

mvsoom commented Apr 1, 2025

Hi, that's a good point, thanks for raising it! I thought about it and it made me realize a few things... I'd be delighted to discuss these with you!

how would this work with multivariate distributions?

Short answer: if the multivariate distribution can be "uncorrelated" into independent variables, then of course you can perform the inverse sampling transform on each of these variables separately and then tie them back together with a correlating transform.

For example, take the multivariate Gaussian in $d$ dimensions with zero mean and covariance $\Sigma = LL^T$: generate u = rand(d), then z = quantile.(Normal(), u); finally, x = L*z is distributed according to $N(0, \Sigma)$.
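
Here is that recipe spelled out, using only Distributions.jl and LinearAlgebra (the covariance matrix is an arbitrary example):

using Distributions, LinearAlgebra

Σ = [2.0 0.5; 0.5 1.0]     # example covariance
L = cholesky(Σ).L          # Σ = L * L'

u = rand(2)                # point in the hypercube [0, 1]²
z = quantile.(Normal(), u) # independent standard normals
x = L * z                  # x ~ N(0, Σ)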

This quantile.(Normal(), u) bit is how the code in NestedSamplers.jl currently works. And in nested sampling in general, prior information is typically quite vague and thus usually specified in this form; there are tools to uncorrelate posterior samples and arbitrary distributions in case it isn't.

So, for general multivariate distributions that can't be uncorrelated readily, the picture changes. The inverse sampling transform is not uniquely defined mathematically, and a variety of techniques like copulas are used.

But this is irrelevant in my mind. Any sample from a $d$-dim multivariate distribution depends one-to-one on an RNG instance which emits ... $d$ independent Uniform(0, 1) variables. So while we do not, strictly speaking, have a quantile transform for every multivariate distribution, we do have the means to parametrize everything in the hypercube [0,1]ᴺ.

So the name "quantile transform" is misleading, my bad. It is more about decorrelating and bringing everything back to a common parameter space [0,1]ᴺ, with unifying benefits such as independence, a single unified dimension of inverse probability, unified inference for discrete and continuous parameters, and distribution-dependent step lengths (larger in regions of low prior mass, smaller in regions of high prior mass; see the snippet below). AD takes care of the derivatives dL/du too.
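
To make the step-length point concrete (plain Distributions.jl; the helper name is made up):

using Distributions

# a fixed step of 0.01 in u space maps to prior-dependent steps in θ space
d = Normal()
step_in_θ(u) = quantile(d, u + 0.01) - quantile(d, u)
step_in_θ(0.50) # ≈ 0.025 (high prior density: small step)
step_in_θ(0.98) # ≈ 0.27  (low prior density in the tail: large step)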

Is it possible for me to take a Turing model and to somehow run inference in u space? Then I could hook it up to NestedSamplers.jl relatively easily I think.

penelopeysm commented Apr 3, 2025

This is interesting. Sorry for the delay in responding, I've been pondering it. It seems to me we would first need to introduce a QuantileBijector in Bijectors.jl, which obeys the following API (ish).

d = Normal()
q = QuantileBijector(d)

# converts points in the original space to [0, 1];
# specifically, if q(x) == y, then we require that cdf(d, x) == y
q(0) # == 0.5

# returns a distribution that behaves like Uniform(0, 1),
# i.e. rand(u) should be distributed the same as rand()
u = transformed(d, q)

# returns the log-Jacobian of the transform at the given point
logabsdetjac(q, 0)

This is probably not too difficult for univariate distributions, using the cdf. Getting the log-Jacobian of the transform also seems fairly trivial (it's just the logpdf of the original value under the original distribution). While I do see the fundamental point that rand(d) for any d (even multivariate) ultimately stems from calls to rand(), I don't immediately see a strategy for turning that into actual code though.
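
For the univariate case, a rough sketch of what this might look like (the type is hypothetical, and the method names, transform / logabsdetjac / Inverse, assume recent Bijectors.jl conventions; treat the signatures as assumptions rather than the final API):

using Distributions
using Bijectors

# hypothetical type, not part of Bijectors.jl: the forward map sends x in
# the support of dist to [0, 1] via the cdf; the inverse goes back via quantile
struct QuantileBijector{D<:UnivariateDistribution} <: Bijectors.Bijector
    dist::D
end

# forward: original space -> [0, 1]
Bijectors.transform(b::QuantileBijector, x::Real) = cdf(b.dist, x)

# inverse: [0, 1] -> original space
Bijectors.transform(ib::Bijectors.Inverse{<:QuantileBijector}, y::Real) =
    quantile(ib.orig.dist, y)

# log |d/dx cdf(dist, x)| = logpdf(dist, x), as noted above
Bijectors.logabsdetjac(b::QuantileBijector, x::Real) = logpdf(b.dist, x)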

After that we might worry about how to use this in DynamicPPL so that you can evaluate and sample from a model in this hypercube space. I think this is doable but quite tricky and might require unravelling some assumptions that we have inside DynamicPPL.

All in all, it feels like a nice thing to have but I'm not sure whether the core team will be able to implement it any time soon. Of course, if you wanted to take some of it on we would be happy to help.

Thoughts / corrections are welcome @TuringLang/maintainers

penelopeysm commented Apr 3, 2025

Oh, as it happens, there's already an existing PR for the univariate one 😄 (TuringLang/Bijectors.jl#273), although what I called QuantileBijector in my post above is CDFBijector in that PR, and the PR's QuantileBijector is the inverse of that, which makes more sense.
