Possibly confusing `.~` meaning #435
Comments
Not in Julia though 😕 Since it's column-major, extracting each observation uses an access pattern of
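To make the column-major point concrete, here is a minimal sketch (not from the thread; timings are machine-dependent and the first call includes compilation):

```julia
# Sketch: why "one observation per column" is the cache-friendly layout in Julia.
X = randn(1_000, 1_000)

# Sum each observation, then sum the results (the work is identical either way).
sum_over(observations) = sum(sum, observations)

sum_over(eachrow(X)); sum_over(eachcol(X))  # warm up / compile first

@time sum_over(eachrow(X))  # each X[i, :] strides through memory
@time sum_over(eachcol(X))  # each X[:, j] is a contiguous block
```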
I do agree that things are a bit confusing and ambiguous, and appreciate your input on this :) But IMO this is a broader issue and not something that we should begin addressing in

It essentially is just a question of how to represent "batches" of inputs, i.e. a collection of "independent" inputs in the sense that we want the computation to be performed on each of the elements in the collection independently, but we provide it as a

This is something we've given a lot of thought already (https://github.com/TuringLang/Bijectors.jl/discussions/178) and something I've started playing around with in a Draft PR over in Bijectors.jl. We essentially want to extend the

And I don't think we want to in any way specialize on Tables.jl in Turing.jl (this is just my personal opinion though!). This too should just be wrapped in something similar to
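For readers who haven't followed the linked Bijectors.jl discussion, the "wrap the batch in its own type" idea could look roughly like the sketch below. The `Batch` name and its methods are purely illustrative assumptions, not the API from that discussion or the draft PR:

```julia
# Hypothetical sketch only: a thin wrapper that marks a collection as a batch
# of independent inputs, regardless of how it is stored.
struct Batch{T}
    data::T  # vector of vectors, matrix columns, table rows, ...
end

Base.length(b::Batch) = length(b.data)
Base.iterate(b::Batch, state...) = iterate(b.data, state...)

# Downstream code only needs "apply f to each element independently".
apply_elementwise(f, b::Batch) = map(f, b.data)

# The same batch of three 2-element observations, stored two different ways:
cols = Batch(collect(eachcol([1.0 2.0 3.0; 4.0 5.0 6.0])))
rows = Batch([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
apply_elementwise(sum, cols) == apply_elementwise(sum, rows)  # true
```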
Is there a reason why? I think this ability to specialize on a standardized API and have all my packages "just work" together the way I expect them to is probably my favorite thing about Julia.
I'm not sure about this, either from the perspective of intuitiveness/compatibility with Julia Base or speed:
Yeah, I'm 100% with this! What I'm trying to say is that we should strive to make things work with more general inputs (e.g. iterators rather than demanding an array), rather than explicitly implementing support for Tables.jl :) But if a special impl would still be required, we could add some glue code in Turing.jl (the rest of the changes need to happen in DynamicPPL.jl) 👍
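As a sketch of the "general iterators instead of Tables.jl-specific code" point: if downstream code only assumes an iterator of observations, a Tables.jl source already fits via `Tables.rows`, with no special-casing. `total_loglik` below is a hypothetical stand-in, not a DynamicPPL/Turing function:

```julia
using Tables

# Hypothetical helper: works for any iterator of observations.
total_loglik(logpdf_one, observations) = sum(logpdf_one, observations)

table = (a = [1.0, 2.0, 3.0], b = [4.0, 5.0, 6.0])      # NamedTuple of columns is a Tables.jl table
obs_from_table  = Tables.rows(table)                     # iterator of rows
obs_from_matrix = eachcol([1.0 2.0 3.0; 4.0 5.0 6.0])    # iterator of columns

total_loglik(row -> row.a + row.b, obs_from_table)       # pretend these are per-observation log-densities
total_loglik(sum, obs_from_matrix)
```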
I agree this particular aspect is counter-intuitive (I was surprised to find that we handled this case, tbh, when I first saw it), but it's the sort of feature that no one will accidentally use, since it's only applicable if you do

Btw, the reason we've done the above is to replicate the way Distributions.jl handles collections of samples from
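For context, this is the Distributions.jl convention being referred to: for a multivariate distribution, a collection of samples is stored as a matrix with one sample per column:

```julia
using Distributions, LinearAlgebra

d = MvNormal(zeros(2), Diagonal(ones(2)))  # a 2-dimensional distribution
X = rand(d, 5)       # 2×5 Matrix: each *column* is one sample
logpdf(d, X)         # 5-element Vector: one log-density per column
loglikelihood(d, X)  # the sum over all columns
```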
The thing is, though, we don't support observations of mixed types, since Distributions.jl doesn't 😕 And so such an example would have to be handled properly anyway. The fact that we don't support Tables.jl, for example, is not an issue with
Once we allow that, extensions to Tables.jl and others will be muuuuch easier :) You could even just use

I started working on these two points a bit yesterday btw (#292).
@ParadaCarleton Something like this: TuringLang/AbstractPPL.jl#26?
I have no idea what that's doing but if you claim it helps I'll believe you. 😅
Haha, I was referring to the example of sampling a
From the previous issue on calculations for `pointwise_loglikelihood`:

I'm unsure if this is the most intuitive behavior. The problem here is that, in general, the rows of a table represent independent observations, while each column is a different attribute. My suggestion would be to use the following set of rules for `.~` broadcasting:

1. […] `pointwise_loglikelihood` and any similar functions. (Although internally, it should be treated as a broadcast over columns -- this should improve performance, since columns usually have only one type.)
2. `.~` will throw a dimension mismatch if the user meant to broadcast over rows. We can assume the user meant to broadcast over columns, because Julia is column-major, without risking too much.

This is annoying given Julia is column-major, but I think it's the best solution to avoid errors in data analysis.
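For concreteness, a rough sketch of the behaviour this issue is about, as I understand it; the exact `.~` semantics for multivariate right-hand sides may differ across DynamicPPL/Turing versions:

```julia
using Turing, LinearAlgebra

@model function demo(X)
    μ ~ Normal(0, 1)
    # With X a 2×N matrix, each *column* of X is treated as one observation of
    # the 2-dimensional distribution (the Distributions.jl convention), not each row.
    X .~ MvNormal(fill(μ, 2), Diagonal(ones(2)))
end

X = randn(2, 10)  # 10 observations stored column-wise
chain = sample(demo(X), NUTS(), 100)

# Per-observation log-likelihoods are then reported per column, e.g. via
# pointwise_loglikelihoods(demo(X), chain)
```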