Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unsigned ranges with large ("negative") step are (semi)empty #29801

Closed
PeterJacko opened this issue Oct 25, 2018 · 14 comments · Fixed by #43042
Closed

Unsigned ranges with large ("negative") step are (semi)empty #29801

PeterJacko opened this issue Oct 25, 2018 · 14 comments · Fixed by #43042
Labels
regression Regression in behavior compared to a previous version

Comments

@PeterJacko
Copy link

PeterJacko commented Oct 25, 2018

While signed integers can count backwards,

julia> collect( Int16( 10 ) : -Int16( 1 ) : Int16( 0 ) )
11-element Array{Int16,1}:
 10
  9
  8
  7
  6
  5
  4
  3
  2
  1
  0

unsigned integers can't :(

julia> collect( UInt16( 10 ) : -UInt16( 1 ) : UInt16( 0 ) )
0-element Array{UInt16,1}

I am using it in a for loop like

for n = UInt16( 10 ) : -UInt16( 1 ) : UInt16( 0 )
    ...
end

because I need "n" to be of type UInt16 inside the loop (to avoid type changes, to improve performance).

julia> versioninfo()
Julia Version 1.0.1
Commit 0d713926f8 (2018-09-29 19:05 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)
Environment:
  JULIA_EDITOR = "C:\JuliaPro-1.0.1.1\app-1.29.0\atom.exe" -a
  JULIA_NUM_THREADS = 2
  JULIA_PKG_SERVER = https://pkg.juliacomputing.com/
  JULIA_PKG_TOKEN_PATH = C:\Users\Peter\.julia\token.toml
@PeterJacko PeterJacko changed the title Unsigned Int collections - bug? Unsigned Int backward collections - bug? Oct 25, 2018
@miguelraz
Copy link
Contributor

miguelraz commented Oct 25, 2018 via email

@mcognetta
Copy link
Contributor

See also #29576

@PeterJacko
Copy link
Author

PeterJacko commented Oct 25, 2018

The - will promote to signed or crash. Does using reverse(UInt(1):UInt(16)) help?

The "-" actually loops through and instead of "-1" gives "65535" (the highest value of UInt16).

No, reverse doesn't help:

julia> collect(reverse(UInt16(0):UInt16(10)))
0-element Array{UInt16,1}

@mbauman mbauman changed the title Unsigned Int backward collections - bug? Unsigned ranges with negative step are empty Oct 25, 2018
@mbauman
Copy link
Member

mbauman commented Oct 25, 2018

collect(reverse(UInt16(0):UInt16(10))) is empty

I'd say that one is definitely a bug. What you're saying when you write UInt16( 2 ) : -UInt16( 1 ) : UInt16( 0 ) is actually: 0x0002:0xffff:0x0000, which I would indeed expect to be empty. Now you can write UInt16( 10 ) : -Int16( 1 ) : UInt16( 0 ) and get UInt16s back:

julia> collect( UInt16( 10 ) : -Int16( 1 ) : UInt16( 0 ) )
11-element Array{UInt16,1}:
 0x000a
 0x0009
 0x0008
 0x0007
 0x0006
 0x0005
 0x0004
 0x0003
 0x0002
 0x0001
 0x0000

@mbauman mbauman changed the title Unsigned ranges with negative step are empty Unsigned ranges with "negative" step are empty Oct 25, 2018
@chethega
Copy link
Contributor

0x0002:0xffff:0x0000, which I would indeed expect to be empty

Was there ever an exhaustive discussion of the semantics of unsigned ranges? Because one could argue that they should wrap, i.e. unsigned unit ranges should never be empty and should be covariant under addition (add x to both start and stop == add x to each element).

@mbauman
Copy link
Member

mbauman commented Oct 25, 2018

Oh, well the reverse bug is #29576. Let's use this issue to discuss those overflow semantics — although I'm still searching for others.

@mbauman
Copy link
Member

mbauman commented Oct 25, 2018

Supporting overflow here could indeed make sense, but there's a bad smell coming from the generic definition of length. It's basically just (stop - start + step) ÷ step. Of course, dividing by such a pseudo-negative step is always going to be zero. If we had a definition where this just "fell out" of the implementation, I'd be more on board here. I'm not sure there is such a definition without doing special signed/unsigned trickery.

@chethega
Copy link
Contributor

Ok, let's have this discussion. Just to be clear: The ultimate judge of desired semantics should be usefulness in practical contexts, not mathematical elegance or consistency. I am going to argue the mathematically consistent model, just for the sake of trying it on; I am honestly very unsure what semantics I would prefer.

Consider unitranges of UIntx first. What should it mean for a number x to lie in start:stop? It would mean that x-start <= stop-start. This would clearly be translation covariant. This is the correct model if we e.g. consider TCP windows: A stream starts at a random 32 bit sequence number and then wraps around. For anti-unit ranges, we want reverse(start:stop)=(stop:-1:start) (all numbers in the same UInt type). So, clearly x in start:-1:stop means x-stop <= start-stop. This is, again, translation covariant.

Now, step ranges. An natural viewpoint is to consider start+ n*step, as long as we are in the corresponding unit range. This has the advantage that, for small ranges and small absolute value numbers k we get (start:step:stop)*k = (start*k: step*k : stop*k), and the entire thing is also translation covariant.

Now what about length? In order to compute the length, we will need to distinguish between cases. If stop-start and step have different signs (after reinterpret to signed integer type), equivalently if step > stop-start, then the range has one element (start). Otherwise, use the same formula 1 + div((stop-start) %Intx, step % Intx) or 1 + div(stop-start) %Intx, -step%Intx).

The fact that we need to do "trickery" is nothing bad; consider the "trickery" performed in base/Int.jl to deliver the current implementation of div.

On the other hand, many UInt64 ranges will overflow length. This is OK and already happening in length(typemin(Int):typemax(Int)).

@mbauman
Copy link
Member

mbauman commented Oct 25, 2018

The ultimate judge of desired semantics should be usefulness in practical contexts, not mathematical elegance or consistency.

I'd argue that both must be considered and weighted against each other. That's why I simply say it's a bad smell — it's just one signal to consider. We in fact do do all sorts of signed/unsigned/negative/positive trickery for length in the case of XInt and XInt64 because of overflow.

If you take the stance that x is in start:stop iff x-start <= stop-start, then that means that -42 is in the range 0x00:0x10. So I don't really find that semantic helpful. You only argued implies and not the biconditional, but once you move to step ranges I think the argument needs the biconditional.

In terms of user expectation, I think it's valuable to note that we support translatability (including modN overflow) up until the resulting range straddles a wraparound. That is,

julia> (Int8(126):Int8(127)) .+ Int8(0)
126:127

julia> (Int8(126):Int8(127)) .+ Int8(1)
ERROR: InexactError: trunc(Int8, 128)

julia> (Int8(126):Int8(127)) .+ Int8(2)
-128:-127
# -------------------------------------------
julia> (Int8(63):Int8(64)) .* Int8(1)
63:1:64

julia> (Int8(63):Int8(64)) .* Int8(2)
ERROR: InexactError: trunc(Int8, 128)

julia> (Int8(63):Int8(64)) .* Int8(3)
-67:3:-64

I'd argue that 0x10:0xff:0x00 similarly straddles a wraparound — in fact it straddles many. This is important because if we allowed full translatability (or straddling wraparounds) then we could no longer express empty ranges.

More succinctly: 0xff:0xff:0x00 is just as empty as 0x01:0x00. And if 0xff:0xff:0x00 is empty, then so is 0x01:0xff:0x00.

@chethega
Copy link
Contributor

chethega commented Oct 26, 2018

If you take the stance that x is in start:stop iff x-start <= stop-start, then that means that -42 is in the range 0x00:0x10

julia> -42%UInt8 <= 0x10-0x00
false

So it's not in there.

The idea of this thing is: We have the set UIntx_, and integers acting on this set by addition. There is no translation invariant ordering on UIntx_. But there is a good ternary comparison operation on UIntx_, defined by (x-start) %UIntx <= (stop-start) %UIntx, i.e. ranges. That is because we have an orientation: Unless we are in the i1 case of a single bit, there is a difference between incrementing and decrementing.

then we could no longer express empty ranges.

That is correct, and an unfortunate side-effect of ranges being inclusive on both sides, which is an unfortunate side-effect of one-based indexing.

Btw, is the following a new issue or the same? Pointer ranges with negative step do make sense, after all, and there is no wrap happening.

julia> start_=reinterpret(Ptr{Nothing},20); stop_=reinterpret(Ptr{Nothing},10);
julia> collect(stop_:2:start_)
6-element Array{Ptr{Nothing},1}:
 Ptr{Nothing} @0x000000000000000a
...

julia> collect(reverse(stop_:2:start_))
ERROR: InexactError: check_top_bit(UInt64, 9223372036854775814)

Cf also #29810. I think the example of pointers makes my thinking clearer: The crucial property of ranges is a ternary comparison on the domain, plus an abelian group action of step.

@mbauman
Copy link
Member

mbauman commented Oct 26, 2018

-42%UInt8 <= 0x10-0x00

Well that's a different operation. We don't do conversions when checking in.

In any case, I'm now convinced we cannot support this meaning.

@chethega
Copy link
Contributor

But do we want to support reverse on ranges of unsigned integers? Because, apart from the segfault, something needs to be done, at least if we want collect and reverse to commute.

@mbauman
Copy link
Member

mbauman commented Oct 26, 2018

Yes, that's #29576 — which was immediately marked as a bug. It's an easy fix… I just need to write a few tests and push it.

@vtjnash
Copy link
Member

vtjnash commented Nov 11, 2021

I was going to close this, as we already have #29576, but it turns out the OP counter-example is now also broken:

julia> @test length(  UInt16( 10 ) : -UInt16( 1 ) : UInt16( 0 ) ) == 0
Test Failed at REPL[9]:1
  Expression: length(UInt16(10):-(UInt16(1)):UInt16(0)) == 0
   Evaluated: 1 == 0
ERROR: There was an error during testing

julia> @test isempty(  UInt16( 10 ) : -UInt16( 1 ) : UInt16( 0 ) )
Test Passed
  Expression: isempty(UInt16(10):-(UInt16(1)):UInt16(0))
   Evaluated: isempty(0x000a:0xffff:0x0009)

@vtjnash vtjnash added the regression Regression in behavior compared to a previous version label Nov 11, 2021
@vtjnash vtjnash changed the title Unsigned ranges with "negative" step are empty Unsigned ranges with "negative" step are (semi)empty Nov 11, 2021
@vtjnash vtjnash changed the title Unsigned ranges with "negative" step are (semi)empty Unsigned ranges with large ("negative") step are (semi)empty Nov 11, 2021
vtjnash added a commit that referenced this issue Nov 11, 2021
vtjnash added a commit that referenced this issue Nov 12, 2021
KristofferC pushed a commit that referenced this issue Nov 15, 2021
daviehh pushed a commit to daviehh/julia that referenced this issue Nov 16, 2021
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this issue Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression Regression in behavior compared to a previous version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants