Fix to Read and Write Sets #1678
Conversation
As expected the CI now fails, with two errors.
Note: CI without commit 3748c03, i.e. the
Previously, the transformation could not be applied in this case, because of the behaviour of the `SDFGState.read_and_write_sets()` function. With the fix of [PR#1678](spcl#1678), the transformation becomes applicable.
I now modified the test to expect success, since the bug with the
…n does not change the behaviour of the output.
The only test that is still failing is the following. The initial SDFG looks as follows; the SDFG after the transformation looks like this: the map is now outside and the loop is inside a nested SDFG. I think that the test itself is wrong.
In either case, I added code to ensure that the SDFG produces the same outputs before and after the transformation.
Follow up on the discussion in #1678. Supports `src[expr] -> dst[expr]`, `src[expr] -> [expr]`, and `[expr] -> dst[expr]` initializations for memlets. Also improves memlet label printouts. @philip-paul-mueller @phschaad the expression mentioned in the other PR will now be printed as `[0, 0] -> B[0]` for clarity and can be reparsed.
The proposal on the second test seems reasonable to me; it appears to me too that there is nothing wrong with applying the transformation there. And already having
dace/sdfg/state.py
Outdated
# exists a memlet at the same access node that writes to the same region, then
# this read is ignored, and not included in the final read set, but only
# accounted for in the write set. See also note below.
# TODO: Handle the case when multiple disjoint writes are needed to cover the read.
I assume from the code that the failure mode here is 'conservative', i.e., we overapproximate something as a read if we have disjoint writes that would cover it?
Yes.
To be clear, let's say `e1` writes `0:10` and `e2` writes `10:20`, but at the same node `e3` reads `5:15`; then it would still be considered as a read and a write, and not just as a write.
In that sense it is an improvement.
I made the description a bit clearer.
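The conservative rule described here can be sketched as follows. This is only a hedged illustration with hypothetical names (`covered_by_single_write`), not DaCe code; subsets are modelled as plain `(start, stop)` tuples:

```python
# Sketch of the conservative rule discussed above (NOT the actual DaCe
# implementation): a read is dropped from the read set only if a SINGLE
# write subset at the same access node fully covers it.
def covered_by_single_write(read, writes):
    lo, hi = read
    return any(w_lo <= lo and hi <= w_hi for (w_lo, w_hi) in writes)

writes = [(0, 10), (10, 20)]   # e1 writes 0:10, e2 writes 10:20
read = (5, 15)                 # e3 reads 5:15 at the same node
# No single write covers 5:15, so the access stays in the read set,
# even though the union of the two writes would cover it.
print(covered_by_single_write(read, writes))  # False
```

Handling the multi-write case would require a union/cover computation over subsets, which is exactly what the TODO above defers.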
dace/sdfg/state.py
Outdated
# set because it was written first is still included if it is read
# in another subgraph
for data, accesses in rs.items():
    # TODO: This only works if every data descriptor appears only once in a path.
What is the failure mode here if it appears multiple times?
It took me some time to understand it again: essentially, each access node is processed individually and the results are combined.
An example: assume the first time the data is encountered, the range `0:10` is written and read, so the read set is empty and the write set is `0:10` for that access node.
The second time, the subset `10:20` is written and `2:6` is read, therefore the read set is `2:6` and the write set is `10:20`.
The final write set is `{0:10, 0:20}` and the read set is `{0:6}`.
However, in this case the read set could actually be determined as `{}`.
But the more I think about it, this could actually catch some edge case where that would be incorrect.
In the end it is an over-approximation.
I turned the TODO into a NOTE and added a better description.
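A minimal sketch of this per-access-node combination, with hypothetical names and subsets modelled as plain tuples (not DaCe code):

```python
# Each access node contributes its own (read, write) subsets; the final
# sets are the per-node unions, with no cross-node filtering. A read that
# was filtered at one node can therefore still survive via another node,
# which makes the result an over-approximation.
def combine(per_node_sets):
    reads, writes = set(), set()
    for node_reads, node_writes in per_node_sets:
        reads |= node_reads
        writes |= node_writes
    return reads, writes

# First access node: 0:10 written and read -> the read is filtered away.
# Second access node: 10:20 written, 2:6 read -> the read is kept.
reads, writes = combine([
    (set(), {(0, 10)}),
    ({(2, 6)}, {(10, 20)}),
])
print(sorted(reads))   # [(2, 6)]
print(sorted(writes))  # [(0, 10), (10, 20)]
```

In this sketch the surviving read `2:6` is covered by the combined writes, so a global analysis could drop it; the per-node scheme keeps it, matching the over-approximation described above.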
…more_than_a_map()` test. The test now assumes that it can be applied; as discussed on GitHub, this should be the correct behaviour. Furthermore, there is also a test which ensures that the same values are computed. This commit also adds an additional test (`tests/transformations/move_loop_into_map_test.py::test_more_than_a_map_4()`), which is essentially the same; however, one of its Memlets is a bit different and therefore the transformation cannot be applied.
I updated the PR. Regarding the test:
Thank you for the comments and changes! Looks good to me now if tests pass :-)
The tests do not pass.
Currently we only see the error in `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` if the auto optimizations are enabled.
This is _just_ a proposal.
As per discussion:
This is something I have posted in the Mattermost channel before. I had some time, so I tried to fix the last error inside the read and write sets, and I have traced the error back to the subsets shown below. In red you can see the initial subsets and in green what has changed (if nothing has changed it is still red). I have confirmed what the problem is, and I have come up with the following proposals.
If we want to keep the filtering, then we will have to fully define the behaviour of the function.
From the above suggestions I would choose the first one, i.e. removing the filtering.
To see what happens, I have now disabled the mentioned filtering.
Because of the removed filtering, `B` is now part of the read set as well.
Because of the removed filtering, `B` is no longer removed from the read set. However, the test has now become quite useless. We also see that the test was specifically constructed in such a way that the filtering applies; otherwise it would never have triggered.
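The effect of removing the filtering can be sketched like this. The `read_set` helper and the `(name, read, write)` triples are hypothetical, purely for illustration; this is not the DaCe implementation:

```python
# With filtering, a datum whose read is exactly covered by a write at the
# same access node is dropped from the read set; without filtering it stays.
def read_set(accesses, filtering):
    reads = set()
    for name, read, write in accesses:
        if read is None:
            continue  # write-only access
        if filtering and read == write:
            continue  # fully covered by the write at the same node
        reads.add(name)
    return reads

accesses = [
    ("B", (0, 10), (0, 10)),  # B: fully read and written
    ("A", (0, 10), None),     # A: read only
]
print(sorted(read_set(accesses, filtering=True)))   # ['A']
print(sorted(read_set(accesses, filtering=False)))  # ['A', 'B']
```

This also illustrates why a test built around the filtered case (`B`) loses its purpose once the filtering is removed.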
…n_a_map`. Because the filtering is now not applied, as was originally the case, the transformation no longer applies. However, this is the historically expected behaviour.
Because of the removed filtering, the transformation can no longer apply. However, I originally added these tests to demonstrate this inconsistent behaviour in the first place, so removing them is now the right choice. This commit also combines them into `prune_connectors_test.py::test_read_write`.
Ensured that the return value is always accessible in the same way. Also ensured that the `test_rna_read_and_write_sets_different_storage()` test verifies that it still gives the same result.
…oule_use` I realized that the transformation should not apply. The reason is that element `0` of all arguments to the nested SDFG is accessed. This means it cannot be adjusted.
Force-pushed from bf5e4aa to e103924.
I have now fixed the tests that were failing.
This looks good to me, please see the comment in code.
from dace.sdfg import utils  # Avoid cyclic import

# Ensures that the `{src,dst}_subset` are properly set.
# TODO: find where the problems are
This is a very ugly hack, but I can replicate / encounter the issue too. I am fine with leaving this in, but please make sure there is an issue that keeps track of this TODO somewhere.
During my work on the new map fusion I discovered a bug in `SDFGState._read_and_write_set()`. Originally I solved it there, but it was decided to move it into its own PR.

Let's look at the first, super silly example, that is not useful on its own. The main point here is that the `data` attribute of the Memlet does not refer to the source of the connection but to the destination. BTW: the web viewer outputs something like `B[0] -> [0, 0]`; however, the parser of the Memlet constructor does not understand this. It must be written as `B[0] -> 0, 0`, i.e. the second set of brackets must be omitted; this should be changed!

From the above we would expect the following sets:

Read set:
- `A`: `[Range (0, 0)]`
- `B`: should not be listed in this set, because it is fully read and written, thus it is excluded.

Write set:
- `B`: `[Range (0)]`
- `C`: `[Range (0, 0), Range (1, 1)]`

However, the current implementation gives us:
- Read set: `{'A': [Range (0)], 'B': [Range (1, 1)]}`
- Write set: `{'B': [Range (0)], 'C': [Range (1, 1), Range (0)]}`

The current behaviour is wrong because:
- `A` is a `2x2` array, thus the read set should also have two dimensions.
- `B` inside the read set: it is a scalar, but the range has two dimensions; furthermore, it should not be present at all.
- For `C`, the first member of the write set (`Range (1, 1)`) is correct, while the second (`Range (0)`) is horribly wrong.

The second example is even simpler. From the SDFG we expect the following sets:

Read set:
- `A`: `[Range (0, 0)]`

Write set:
- `B`: `[Range (0)]`

It is important that in the above example `other_subset` is `None` and `data` is set to `A`, so it is not one of these "crazy" non-standard Memlets we have seen in the first test. However, the current implementation gives us:
- Read set: `{'A': [Range (0, 0)]}`
- Write set: `{'B': [Range (0, 0)]}`

This clearly shows that whatever the implementation does is not correct.
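For illustration, the two label spellings discussed above can be produced by plain string formatting. `format_label` and its parameters are hypothetical helpers, not the actual DaCe printer or parser:

```python
# Illustration only: how the two Memlet-label spellings differ.
def format_label(data, subset, other_subset, bracket_other=True):
    label = f"{data}[{subset}]"
    if other_subset is not None:
        # The bracketed form is what the web viewer shows; the bracket-less
        # form is what the Memlet constructor currently accepts.
        tail = f"[{other_subset}]" if bracket_other else other_subset
        label = f"{label} -> {tail}"
    return label

print(format_label("B", "0", "0, 0", bracket_other=True))   # B[0] -> [0, 0]
print(format_label("B", "0", "0, 0", bracket_other=False))  # B[0] -> 0, 0
```

Making the parser accept the bracketed form (as the follow-up PR proposes) would let printed labels round-trip through the constructor.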