-
-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixing KeyIterator used for JuMPArray to work also when index sets are non indexable. #836
Conversation
KeyIterator{JA}(d::JA) = KeyIterator{JA}(d) | ||
|
||
function Base.start(it::KeyIterator) | ||
it.state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow the 4-space convention used in JuMP
I haven't dug into the code but it seems reasonable. @joehuchette, do we use iteration in any performance-critical places? |
An alternative solution would be to keep the pervious iterator and apply |
I would like to do some benchmarks to compare the current KeyIterator, the one I posted and the improvement I am trying using My question is are we interested more by the performance with Julia 0.4 or the dev version 0.5? |
We shouldn't use iteration in performance-critical code, and if we do, we probably shouldn't. I would be interested to see the performance here; based on my reading, it doesn't look like like the new implementation for
I'd focus on 0.5, since the release is imminent, and we won't have a new JuMP release targeted at only 0.4. |
- now the functions start next done do not modify arguments - less allocation thanks to caching and @generated functions
This was not completely true. it works fine only for iterables that implement the iteration interface in a "clean" way. for more details see http://stackoverflow.com/questions/39181313/julia-iteration-start-next-done-side-effects/ Edge Iterator in LightGraphs for instance did not have a "clean" implementation but we fixed it in sbromberger/LightGraphs.jl#423 In the last version I have just committed:
On this small example:
The results with the old version of iterators that only worked for indexable collections
The results with the last version I submitted.
As you can see there are still more allocations unfortunately... EDIT: The tests were done with julia 0.5 as suggested by @joehuchette |
I have added the tupple wrapper Now it is used for keys(::JuMPArray) This wrapper is immutable so it doesn t impact the performance and the test above gives almost same result:
Hence the following two loops are now equivalent for
If you are ok with these modifications I have done on |
Current coverage is 86.85% (diff: 86.48%)
|
@joehuchette, @mlubin, if you prefer to keep the old good performance for indexable collections we can specialize the iteration process. But I'm not sure if we can check efficiently that a type has the function getindex() http://stackoverflow.com/questions/39410299/check-if-a-type-implements-an-interface-in-julia Any ideas? |
@@ -35,13 +35,26 @@ end | |||
end | |||
|
|||
Base.getindex(d::JuMPArray, ::Colon) = d.innerArray[:] | |||
|
|||
immutable JuMPKey | |||
jk::Tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does performance improve if this is instead
immutable JuMPKey{T<:Tuple}
jk::T
end
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it doesn't @joehuchette . The wrapper is the second modification I have made and it is the first one that had a negative impact on the performance, namely using iterators instead of indexes. Maybe I confused you, by not opening another PR. I thought it is related ...
Should I open a new PR for the JuMPKey wrapper?
0.143376 seconds (110.83 k allocations: 2.241 MB)
0.009154 seconds (60.00 k allocations: 937.559 KB)
0.009054 seconds (60.00 k allocations: 937.559 KB)
0.008394 seconds (60.00 k allocations: 937.559 KB)
0.008086 seconds (60.00 k allocations: 937.559 KB)
0.009080 seconds (60.00 k allocations: 937.559 KB)
0.008189 seconds (60.00 k allocations: 937.559 KB)
0.008068 seconds (60.00 k allocations: 937.559 KB)
0.008026 seconds (60.00 k allocations: 937.559 KB)
0.009072 seconds (60.00 k allocations: 937.559 KB)
0.008065 seconds (60.00 k allocations: 937.559 KB)
0.008093 seconds (60.00 k allocations: 937.559 KB)
0.009081 seconds (60.00 k allocations: 937.559 KB)
0.021784 seconds (60.00 k allocations: 937.559 KB)
0.016168 seconds (60.00 k allocations: 937.559 KB)
0.014673 seconds (60.00 k allocations: 937.559 KB)
0.012351 seconds (60.00 k allocations: 937.559 KB)
0.011812 seconds (60.00 k allocations: 937.559 KB)
0.010432 seconds (60.00 k allocations: 937.559 KB)
0.010156 seconds (60.00 k allocations: 937.559 KB)
when having indexable sets
Unfortunately I don't think there is a way to check if an iterator supports indexing besides using something like @IssamT could you provide a (practical) example where this behavior is a usability improvement? I'm a bit concerned about performance regressions here. |
@joehuchette, thanks for the followup. I don't know if you saw the last commit (of yesterday), but right now there is no regression. the old Let me summarize for you the current state: 1) I reproduce the same performance on the benchmark above, as before my patch, in the case where indexsets are indexable (the only case for which 2) The new specialized
3) This is somehow a related but separate modification (As I have mentioned in a previous comment I don't know if I should put it in a separate PR). Having
but now can also be replaced with the more intuitive version:
|
Ahh, ok, thanks for the detailed explanation. This is really nice, especially point 3. I agree that might be better suited as a separate PR, if it's not too difficult to extricate it. I'll have a deeper look over the code in the next few days. |
end | ||
end | ||
|
||
KeyIterator{JA}(d::JA) = KeyIterator{JA}(d) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method necessary with the inner constructor above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
deleting line 196 gives this error
ERROR: LoadError: MethodError: Cannot convert
an object of type JuMP.JuMPArray{Float64,2,Tuple{UnitRange{Int32},UnitRange{Int32}}} to an object of type JuMP.KeyIterator{JA<:JuMP.JuMPArray}
This may have arisen from a call to the constructor JuMP.KeyIterator{JA<:JuMP.JuMPArray}(...),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, strange. OK, thanks.
KeyIterator{JA}(d::JA) = KeyIterator{JA}(d) | ||
|
||
immutable IndexableIndexsets end | ||
immutable NotIndexableIndexsets end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using types here, a simple true/false should suffice. Using method_exists
below means indexability
will lead to dynamic function calls anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that traits are cleaner for dispatch. I agree that now it isn't impacting performance because the dispatch of the function next
is done using the type of k that is either a Tuple
for the new iterator or an Integer
for the old one. But if we manage to change the new iterator to use an Integer
state as well, traits will be needed. I don't mind deleting them right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I would prefer booleans, as this code is already a bit difficult for me to parse. We can always add complexity later on as-needed
cartesian_key = JuMPKey(_next(it.x, k)) | ||
pos = -1 | ||
for i in 1:it.dim | ||
if(!done(it.x.indexsets[i], next(it.x.indexsets[i], k[i+1])[2] ) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style point: it's preferred to use whitespace, not parentheses, in if statements:
if !done(it.x.indexsets[i], next(it.x.indexsets[i], k[i+1])[2])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that. I will fix it in my next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, just a minor point
cartesian_key, tuple(it.next_k_cache...) | ||
end | ||
|
||
function Base.done(it::KeyIterator, k::Tuple) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can just be
Base.done(it::KeyIterator, k::Tuple) = (k[1] == 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I will fix that as well on my next commit.
end | ||
end | ||
if pos == - 1 | ||
return cartesian_key, (1,1) #The second 1 is needed to ambiguity with ints |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain this further please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first element of the tuple is used for indicating that we are done. The second 1 here is added because otherwise the dispatch of the function n̶e̶x̶t̶ done
fails and the function n̶e̶x̶t̶ done
with Integer
state is called instead of the one with Tuple
state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Won't this mess with the type stability of the function? Could there a cleaner way to implement this without using a sentinel value return type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You re right. It s not very clean. The following modification doesn't change the performance (I tested it with several containers as index sets) but I will still do it to keep the exact return type everywhere
- return cartesian_key, (1,1) #The second 1 is needed to ambiguity with ints
+ it.next_k_cache[1] = 1
+ return cartesian_key, tuple(it.next_k_cache...)
- removed traits for indexability of indexsets - added a test with sets indexsets of JuMPArray
@joehuchette Is this ready for the merge? or would you like to add something more ? |
I'll take another look over it in the next few days, thanks for adding tests! Any chance you could run your performance benchmarks again on the final version? |
Here are the results when using the final version. 0.137093 seconds (78.45 k allocations: 1.692 MB) |
This is great, thanks for pushing it through, @IssamT! Do open the PR with the JuMPKey changes when you get a chance. |
You're welcome. My thanks to you for the followup |
This just a beginning to solving issues raised #646
I think it is better to not merge it yet. Maybe just give some feedback especially I m pretty sure the iterator methods could be implemented better.
The new
KeyIterator
works fine as long as the index sets are iterable (including Iterators andSet
s). They do not need to be indexable anymore