Skip to content

Return Any[] from broadcast() for empty input when no method applies #20049

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

nalimilan
Copy link
Member

This is consistent with map().

This is one way of fixing #20033.

@kshyatt kshyatt added the broadcast Applying a function over a collection label Jan 15, 2017
@JeffBezanson JeffBezanson changed the title Return Any[] from broadcast() for empty input when no metod applies Return Any[] from broadcast() for empty input when no method applies Jan 16, 2017
@nalimilan nalimilan requested a review from pabloferz January 19, 2017 21:28
@@ -280,7 +280,10 @@ end
eltypestuple(a) = (Base.@_pure_meta; Tuple{eltype(a)})
eltypestuple(T::Type) = (Base.@_pure_meta; Tuple{Type{T}})
eltypestuple(a, b...) = (Base.@_pure_meta; Tuple{eltypestuple(a).types..., eltypestuple(b...).types...})
_broadcast_eltype(f, A, Bs...) = Base._return_type(f, eltypestuple(A, Bs...))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave this as is, and make the change here to return similar(Array{T === Union{} ? Any : T}, shape)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I could do that, but that would mean _broadcast_eltype doesn't actually return the type used by broadcast. This can be confusing, and possibly slightly annoying for packages which call this function for their broadcast implementation. What's the advantage of doing that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_broadcast_eltype does not return broadcast's return eltype, only a guess; in general determining broadcast's return eltype requires performing work equivalent to the broadcast operation itself. Please find additional discussion in #19937 (comment). Best!

@martinholters
Copy link
Member

While being consistent with map may have its merits, I find Union{}[] more correct than Any[] here. Is there any use case where this change would be beneficial?

I've briefly considered whether just throwing might be an alternative, but

err::Nullable{String}
error.(err)

to error iff err is not null might actually be useful.

@nalimilan
Copy link
Member Author

Union{} is more correct but makes it impossible to add any elements to the resulting array. That could be an advantage if we want to make it obvious this is equivalent to an error.

But the main issue with returning Array{Union{}} is the same as with throwing an error: that means that improving type inference in the future would break working code which relied on adding elements to the resulting array (since an Any[] input will always return Any[], not Union{}[]).

@martinholters
Copy link
Member

With that argument, broadcast should always return Any[] in the empty case: maybe push!(f.([]), "a") works now only because f cannot be inferred to always return Int. That's probably not the best idea. Ref #11034.

@nalimilan
Copy link
Member Author

That's true, though that doesn't mean we should create even more problematic cases if we can avoid it. I don't really have an opinion on this matter, I'm just trying to measure the consequences. Would there be any drawback in returning Union{}[] from map and from generators (which map uses)? I guess the 0.5 behavior was chosen for a reason?

Cc: @JeffBezanson

@pabloferz
Copy link
Contributor

If I recall correctly, the idea was to use the result of inference if the type was concrete or if the array/iterator was empty even if things are not optimal or predictable in this case. This is what Jeff has called the StevenGJ compromise. I guess we could add the additional rule of being able to push things to empty arrays, which means replacing Union{} by Any when the former is found by inference.

@nalimilan
Copy link
Member Author

Yes, that was the discussion in #7258. But at the time it seems that Any[] was chosen (at least in practice), and I don't know whether returning Union{}[] was considered or not.

@martinholters
Copy link
Member

Speaking of consistency:

julia> map((x) -> error(x), 1:0)
0-element Array{Union{},1}

julia> map((x) -> error(x), [])
0-element Array{Any,1}

@nalimilan
Copy link
Member Author

Please add it to the list at #20033.

@nalimilan nalimilan added this to the 0.6.0 milestone Feb 6, 2017
@nalimilan
Copy link
Member Author

More opinions? I've added the 0.6 milestone since the existing code on master is a behavior change from 0.5, so we should be sure we want to make it, and if so do it for both map and broadcast.

@tkelman
Copy link
Contributor

tkelman commented Feb 9, 2017

Sounds like we should go forward with making them consistent.

@tkelman
Copy link
Contributor

tkelman commented Feb 9, 2017

decision is whether map should return Union{} element type, or broadcast should return Any

@StefanKarpinski
Copy link
Member

The map over Any[] returning Any[] seems to be a vestigial special case. FWIW, returning Bottom{}[] for map(x->error(), T[]) and similarly for broadcast seems better to me.

@martinholters
Copy link
Member

Looks like the special casing is not only for empty arrays, though:

julia> map(identity, Integer[1])
1-element Array{Int64,1}:
 1

julia> map(identity, Any[1])
1-element Array{Any,1}:
 1

FWIW, deleting that line does not break any tests, but I do wonder whether any packages depend on that.

@TotalVerb
Copy link
Contributor

It seems that the line is not actually needed for bootstrap, so it should be removed.

@nalimilan
Copy link
Member Author

I've had a look at changing map to return Union{}[] when no method applies, and it appears the Any comes from _default_eltype, which itself comes from Core.Inference.return_type. Illustration of what _default_eltype does:

julia> h3(x::Float64...) = prod(x)
h3 (generic function with 1 method)

julia> itr=(h3(x) for x in Int[])
Base.Generator{Array{Int64,1},#h3}(h3, Int64[])

julia> Core.Inference.return_type(first, Tuple{typeof(itr)})
Any

The problem is that first applies to itr in theory, it's just that it throws an error. This fact cannot be detected by inference since it's dependent on the length of the iterator.

So we may have to find another way of computing _default_eltype. For example:

_default_eltype{T, F}(::Base.Generator{T, F}) = Core.Inference.return_type(F.instance, T)

Of course f.instance is a hack that won't work e.g. for type constructors, so we need something better.

@TotalVerb
Copy link
Contributor

instance will also fail for generators that close over variables in the enclosing scope.

Why wouldn't using a similar strategy to broadcast work, i.e. if the inferred type in the empty case is abstract, instead return Union{}[]?

@StefanKarpinski
Copy link
Member

This has been lingering for a rather long time. Can we make forward progress here?

@TotalVerb
Copy link
Contributor

I think the consensus has changed to change map to match broadcast instead of vice versa.

@nalimilan
Copy link
Member Author

IIUC this PR is obsolete, and #20866 should be merged instead. Or maybe we need more fixes. Cf. #20033.

@StefanKarpinski
Copy link
Member

The correct behavior here should be to return Vector{Union{}}() rather than Vector{Any}(). We should also make map do the same thing in these circumstances.

@JeffBezanson
Copy link
Member

#20866 is orthogonal to this. Whether inference returns Any or Union{} is unpredictable, and map and broadcast should agree independent of that.

@StefanKarpinski
Copy link
Member

The way forward here is to make both map and broadcast return arrays with element type Union{} when the function would not be applicable or can otherwise be inferred to always error.

@JeffBezanson
Copy link
Member

Ah, the difference seems to be due to calling return_type on the no-method function itself, which is what broadcast does, vs. on a function that calls such a function, which is what map does (since map is based on generators). Since the decision was made to use the result of return_type directly, we will be stuck with cases like this where inference doesn't return what you wanted.

@nalimilan nalimilan deleted the nl/broadcast branch March 16, 2017 17:06
@nalimilan
Copy link
Member Author

Could we change one of map or broadcast to use the other's approach? Looks like the only way to get consistent results in all cases.

@JeffBezanson
Copy link
Member

I can try to do this for map.

@TotalVerb
Copy link
Contributor

Alternatively, would it make sense to never return an abstractly typed empty array, and always simply return Union{}[] from empty map/broadcast? That would restrict the inference dependent behaviour to the empty case where the type can be inferred as a concrete type, which would prevent this and similar problems in the future.

@nalimilan nalimilan restored the nl/broadcast branch December 21, 2018 21:18
@nalimilan nalimilan deleted the nl/broadcast branch December 21, 2018 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants