Skip to content

Interest for an Iterators.nth(x, n) API? #54454

@ghyatzo

Description

@ghyatzo

Hello,

After searching far and wide both in issues, PR and on the discourse, I could not find any discussion about adding an Iterators.nth(x, n) API just for ease of use and simplicity. This is the only other reference about this possibility I could find.

I have played a little bit with it in the past during various projects and ended up with a slight evolution over the basic version mentioned by @stevengj in the linked post, which I am carrying around when needed:

_inbounds_nth(itr, n) = getindex(iterate(Base.Iterators.drop(itr, n-1)), 1)
_safe_nth(itr, n) = begin
	y = iterate(Base.Iterators.drop(itr, n-1))
	isnothing(y) ? nothing : getindex(y, 1)
end
nth(itr, n; skip_checks=false) = skip_checks ? _inbounds_nth(itr, n) : _safe_nth(itr, n)

simple_nth(itr, n) = first(Iterators.drop(itr, n-1))

which offers the ability to skip bounds checking at the expense of a crash (opposed to just returning nothing).

julia> itr = collect(1:10000)
julia> _safe_nth(itr, 10001)

julia> _inbounds_nth(itr, 10001)
ERROR: MethodError: no method matching getindex(::Nothing, ::Int64)
Stacktrace:
 [1] _inbounds_nth(itr::Vector{Int64}, n::Int64)
   @ Main .\REPL[198]:1
 [2] top-level scope
   @ REPL[205]:1

but that offers decent performance benefits, although we can't escape the O(n) complexity without extra assumptions (not that I know of at least)

julia> @btime _inbounds_nth(itr, 9999) setup=(itr=collect(1:10000))
  151.222 ns (0 allocations: 0 bytes)
9999

julia> @btime _safe_nth(itr, 9999) setup=(itr=collect(1:10000))
  4.400 μs (0 allocations: 0 bytes)
9999

julia> @btime simple_nth(itr, 9999) setup=(itr=collect(1:10000))
  4.414 μs (0 allocations: 0 bytes)
9999

(btw simple_nth also errors out when called out of bounds).

Instead of straight up opening a PR I wanted to check if there was any desire for this kind of little QOL pieces of code.
And more importantly, check with much more knowledgeable people a couple of doubts:

  • is it better to just throw an error or return nothing for these kinds of APIs?
  • maybe instead of a keyword argument skip_checks it is possible to "retrofit" the @inbounds macro to have something like
    @inbounds Iterators.nth(itr, n) kind of calls, is that even a good idea?
  • On the topic of the question above, is there a better way to avoid the branch in the call to begin with? is it automatically optimized away?
  • Should such API just return the whole iteration tuple with the state, and let the user deal with that (doesn't feel right to me though)?
  • I don't know enough about all possible edge cases of what "an iterable" is like, therefore my naive inbounds version only offers performance benefits in this particular case with a vector, so: is making such distinction even worth at all?
  • Maybe useful for make KeyIterator and ValueIterator more array-like #10092?

Metadata

Metadata

Assignees

No one assigned

    Labels

    featureIndicates new feature / enhancement requests

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions