reimplement afoldl
using recursion
#54478
base: master
Conversation
Replace the hack implementation with something more elegant and flexible.

FTR, this PR is part of a series of changes which (re)implement many of the operations on tuples using a new recursive technique. The ultimate goal is to hopefully increase the value of the loop-vs-recurse cutoff (`Any32`, sometimes hardcoded `32`) for tuple operations. As-is, this creates a performance regression for tuples with length just above the cutoff, e.g., 32-40. This shouldn't matter once the cutoff value is increased, IMO.

Demanding type inference example:

```julia-repl
julia> isconcretetype(Core.Compiler.return_type(foldl, Tuple{typeof((l,r) -> l => r),Tuple{Vararg{Int,30}}}))
true
```
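For readers unfamiliar with the general technique: a minimal sketch of a recursive left fold over a tuple (an illustration of the approach, not the PR's actual code; the name `fold_sketch` is made up) could look like:

```julia
# Sketch of the recursive-fold technique, not the PR's implementation.
# Each call peels one element off the front of the tuple, so the
# recursion depth equals the tuple length and the compiler can
# specialize and unroll it for concrete tuple types.
fold_sketch(op, acc, t::Tuple{}) = acc
fold_sketch(op, acc, t::Tuple) = fold_sketch(op, op(acc, first(t)), Base.tail(t))
```

For example, `fold_sketch(+, 0, (1, 2, 3))` evaluates as `((0 + 1) + 2) + 3`.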
I don't really see how the current one is a "hack implementation". It seems very straightforward compared to what replaces it here at least...
Edited PR description to be more clear and less inflammatory.
Perhaps, but note that the first half of the change is not fold-specific; it will hopefully be reused for other implementations. See, e.g., #54479.
This is going in the wrong direction. If the change is any good, it should perform better with a lower cutoff. Excessive unrolling as required by this PR is often a sign of an unreliable design.
This replaces an implementation where the unrolling is literally hardcoded, so I don't see how it's fair to say that my implementation is the one that requires unrolling. And I can't find any regressions other than the mentioned one. If the mentioned regression is really a problem, it'd be easy to just add another method for
Also, I don't understand this. Surely increasing the cutoff value is a worthy goal of its own.
```julia
struct _TupleViewFront end
struct _TupleViewTail end
const _TupleView = Union{_TupleViewFront,_TupleViewTail}
_tupleview_length_representation_impl(n::Int) = ((nothing for _ ∈ OneTo(n))...,)::Tuple{Vararg{Nothing}}
```
Is this really necessary? The decrement is done with arithmetic anyway, so would an integer just work?
I believe it's necessary, but not sure. Will check later.
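For context on what a unary length representation buys over a plain integer: a hypothetical sketch (the names `len_repr` and `take_sketch` are made up for illustration) of decrementing a count via dispatch rather than arithmetic:

```julia
# Hypothetical sketch: representing a count as a Tuple{Vararg{Nothing}}
# lets dispatch "decrement" it by peeling one element, so recursion on
# the length terminates via the empty-tuple method rather than via a
# run-time branch on an integer value.
len_repr(n::Int) = ntuple(_ -> nothing, n)

# E.g., take the first n elements of a tuple, with n given in unary.
take_sketch(::Tuple{}, t::Tuple) = ()
take_sketch(n::Tuple{Nothing,Vararg{Nothing}}, t::Tuple) =
    (first(t), take_sketch(Base.tail(n), Base.tail(t))...)
```

For example, `take_sketch(len_repr(2), (10, 20, 30))` gives `(10, 20)`, with the recursion depth visible to the compiler in the type domain.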
We have been through a couple implementations of this function and it has been risky to touch it in the past. This needs more motivation; what is the big benefit we are after?
This improved design is much closer to the current behavior, it shouldn't introduce any regressions.
My motivation (not sure if it's enough motivation for you devs) is to make it possible to increase the cutoffs where the tuple operations switch to looping, from the current limits at around thirty-ish to a (few?) hundred. The current

BTW, I think the new commit could make this PR more palatable.
The test that fails now is caused by some preexisting issue that was only triggered now, I think. With this PR,

```julia
function f()
    as = ntuple(_ -> rand(), Val(10))
    hypot(as...)
end

function g(as::Vararg{Float64,10})
    hypot(as...)
end
```

So giving the compiler extra information causes the compiler to generate worse code?
Okay, but why? This has no inherent value, so there is something missing here. Is it because you think the unrolled code will be faster to compile or execute?
I think there are three objectives when it comes to implementations of functions acting on tuples like

It should be noted that 2. and 3. should also be examined for non-concrete argument types. Ideally, any new implementation should improve at least one of those while not doing worse on any of them. Otherwise, it's a matter of discussion to find suitable trade-offs. I'm unclear how the proposed change fares in this regard.
The new implementation is more elegant and flexible, doing away with the code duplication and extreme amounts of constant-hardcoding.