-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Broadcasting as functor map #32081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Might there be ambiguities about which behavior to apply, eg. for a Would these new behaviors be up to the datatype to define, eg. |
With nonlinear datatypes, we do lose the "just iterate" capability, especially because there's no general way to construct an instance of the functor. However, yes, developers should be expected to define a new style which includes the appropriate mapping operation along with the datatype. (Think |
I must confess I'm very confused. There seem to be three different features you'd like here:
I understand you're uniting them all as abstract functor maps, but in practice they each act and behave very differently. And the latter two are quite outside of the generic expectation of what dots currently mean. I think it'd be hard to generically use broadcast if each data structure was given this much freedom. The dot syntax is very powerful — both for its concise syntax and its implementation, which holds a complete representation of the AST. But there's also a power in its unified meaning. |
The example with option types ( julia> string.(abs.(Set([-1, 1, 2])))
Set(["1", "2"]) With respect to the expectation of what dots mean, that's part of what this issue aims to change. As a community, it seems advantageous to broaden our view of what broadcasting means from a strictly array-based view to a mentality which includes other data structures. I agree, the dot syntax is extremely powerful. However, I disagree with the claim that it has a unified meaning. Currently, there is no formalism around what "broadcasting" means, and it's unclear which data structures should support this Julia-specific operation. If we take inspiration from category theory and its implementation in other languages, though, we could end up with a clear, unified, and precise description of which data structures support broadcasting: any parameterized type which satisfies the functor laws.
|
That example is issue #26359. In fact, I'd argue that I quite strongly disagree with the claim that broadcasting does not have a unified meaning. It applies functions elementwise, extruding singleton dimensions to match shapes between arguments as needed. It works on nearly all collections — including sets — by falling back to iteration. If you find its documentation lacking in formalism, let's try to address that. Now, I agree there's a challenge in getting the "right" collection type back — and we can definitely work to improve that. I'm not sure Sets are the poster child for that case, though, given the above issue. |
One issue here, I think, is that 1-argument
It's
I believe this is a rhetorical trick (and one you tend to see in the PL world) that conflates (1) what something does, and (2) whether there is a clear and precise description of it. Everybody wants things to be clear and precise, but those are general properties that might apply to lots of things. But the implication here is that only one, specific, preferred meaning can count as clear and precise. |
With regards to the other points, I think it would be possible to define auto-lifting for Some and not-Some. It might only be defined and valid in certain cases (such as 1-arg, or not mixed with other iterables), but I think creating definitions such that |
Ah, I was unaware of this definition; this does seem sufficiently precise. I've been reading the docs on broadcasting, and looks like I missed it:
I would still claim, though, that this is a rather restrictive definition, encouraging array-oriented programming where there seems to be room for additional flexibility in supported data structures. Design-wise, it seems like there's a consensus that aside from scalars, broadcasting over different types doesn't make much sense (unless there's a "coercion", such as repeating along singleton dimensions). Then, given In other words, At least from an API standpoint, I believe this design should support all existing array-based functionalities, while also being extensible to additional datatypes (trees, options, etc). |
Gah, that section of the documentation that you quoted is woefully out of date. We no longer default to "scalar" and are generally much more accepting of broadcasting over other data structures.
I don't see such a consensus at all. We have an entire system dedicated to supporting broadcast over heterogeneous arguments! We have an open issue talking about how broadcast might be extended to arbitrary key-value collections (like dictionaries) and how that might interoperate with mixtures of arrays (#25904).
I 100% agree. The very tricky part is reducing discontinuities (both in behavior and implementation) when you swap out one of your homogenous |
Although this debate could very easily descend into subjective opinion, I'd like to make a brief argument against this style. Supporting heterogeneous arguments (aside from isomorphism/coercion) seems like it adds significant complexity, to the point that it becomes unclear which pairwise combinations of types can be broadcasted together and unclear what each pair does. Additionally, when arguments become non-scalar and non-isomorphic, the inherent meaning of broadcasting seems to get lost.
Given a scalar, I believe the julia> zip(Node(Leaf(2),Node(Leaf(3),Leaf(4))), "r")
Node(Leaf((2,"r")),Node(Leaf((3,"r")),Leaf((4,"r")))) Otherwise, swapping an element of Overall, I rather like the meaning and design of broadcasting, and I think it's an incredibly powerful and elegant way to perform elementwise operations. However, I believe that it could become even more useful if we allowed the primary design to break away from array-like types to arbitrary containers, generalizing what it means to be an "element". |
Can |
In theory, I see nothing that prevents a package from experimenting with this. Say, So, are there some dispatches in Base that create ambiguity errors for you? We can maybe fix that. Are there simple things in the lowering that would need to work differently? It is unlikely that this can change, but maybe we can find a workaround. But is there any example of some new code that should live in Base instead of a package? |
With our current system, that's not at all accurate. Which pairwise combinations of types can be broadcasted together? All of them. Seriously. If a type works in broadcast by itself, it'll work with any other broadcastable thing. It's not ambiguous at all. Is it unclear what it does? Not at all. It applies the function over the arguments elementwise, repeating singleton dimensions as necessary to match shapes, and returns a new container with the elementwise results. Now, the part that is unclear is which container type it might return — but it's going to be something that has Now, were we to extend the meaning of dots to abstract functor maps, then yes, your above statement would be true. And of course the part of the sentence that I elided in the above quote — that supporting heterogeneous arguments adds complexity — is also very true. But it's also a very big feature. Folks definitely use heterogeneous array arguments quite frequently, with all sorts of combinations of |
@mbauman This is what I meant by "aside from isomorphism/coercion". I agree that supporting isomorphic heterogeneous arguments is very useful and reasonable, such as broadcasting over
I really like this, and the goal of this issue would be to extend it. The idea that broadcasting over array-like types produces a value of some type which is guaranteed to support a certain interface is powerful, and I often find myself wishing this was possible for non-array types, as well. |
This thread makes me think if there is a way to extend the notion of broadcasting to container with more complex "shape" than arrays. Also, can we make such extension useful even for arrays? First of all, by broadcasting, I mean "stretching singleton dimension to fit with other arrays". (I'm clarifying it here since I have an impression that @HarrisonGrodin is using "broadcasting" to only mean fused mapping and not in the sense of The idea I have is to represent "structural absence" by julia> xs = [1, 2];
ys = [10, 20, 30];
julia> xs .+ ys
ERROR: DimensionMismatch("arrays could not be broadcast to a common size")
Stacktrace:
... But I think we can formalize the (extended) broadcasting in such a way that an array julia> expand.(xs .+ ys)
3-element Array{Union{Nothing, Int64},1}:
11
22
nothing
julia> drop.(xs .+ ys)
2-element Array{Int64,1}:
11
22 (I think they are implementable by specializing Both variants would work nicely with trees with different shape:
As julia> expand.(string.(float.(Some(3))))
Some("3.0")
julia> expand.(string.(float.(nothing)))
nothing This feature would be useful for arrays as well because we can build a complex filter using it: onlypositive(x) = x > 0 ? x : nothing
drop.(onlypositive.(xs .- mean(xs)) .+ onlypositive.(ys .- mean(ys))) OK. The last example may seem to be better be done with Anyway, a minimal comment I want to make is that it can be done in external packages without new syntax (or even macros) because [1] I think |
Uh oh!
There was an error while loading. Please reload this page.
Currently, broadcasting is heavily specialized to work with array-like (i.e. linear, indexable, and finite) data structures. However, the notion of mapping a function over a parameterized datatype
F{T}
applies to many data structures, such as options (Some(3)
vsnothing
) and trees. In fact, this operation is exactly "functor map", which applies over all functor datatypes.Functors require that for a given type and its associated "mapping" function (often referred to as
fmap
or<$>
), the following laws hold (using Haskell notation):In other words, a functor must (a) ensure that mapping the identity function has no effect, and (b) ensure that mapping the function composition
f ∘ g
is the same as first mappingg
and then mappingf
.From this mathematical definition, we have derived the "broadcasting" functor map operation
f.(x)
withx::F{T}
for some typeF
which forms a functor. (Note that the second functor law guarantees that "fusion" is valid!)This issue proposes that broadcasting be extended to work more seamlessly with other (i.e. non-linear and non-indexable) datatypes, including the fusion feature.
There are a few edge cases to consider, such as combining multiple different functor type in a single broadcast call (e.g.
f.([1,2,3], Tree(...))
). Additionally, a slight redesign of broadcasting may be necessary, due to its design focus on axes.The text was updated successfully, but these errors were encountered: