Fix to Read and Write Sets #1678

philip-paul-mueller · 2024-10-11T06:52:58Z

During my work on the new map fusion I discovered a bug in SDFGState._read_and_write_set().
Originally I solved it there, but it was decided to move it into its own PR.

Lets look at the first, super silly example, that is not useful on its own.
The main point here, is that the data attribute of the Memlet does not refer to the source of the connection but of the destination.

BTW: The Webviewer outputs something like B[0] -> [0, 0] however, the parser of the Memlet constructor does not understand this, it must be written as B[0] -> 0, 0, i.e. the second set of brackets must be omitted, this should be changed!

From the above we would expect the following sets:

Reads:
- A: [Range (0, 0)]
- B: Should not be listed in this set, because it is fully read and written, thus it is excluded.
Writes
- B: [Range (0)]
- C: [Range (0, 0), Range (1, 1)]

However, the current implementation gives us:

Reads: {'A': [Range (0)], 'B': [Range (1, 1)]}
Write: {'B': [Range (0)], 'C': [Range (1, 1), Range (0)]}

The current behaviour is wrong because:

A is a 2x2 array, thus the read set should also have two dimensions.
B inside the read set, it is a scalar, but the range has two dimensions, furthermore, it is present at all.
C the first member of the write set (Range(1, 1)) is correct, while the second (Range(0)) is horrible wrong.

The second example is even more simple.

From the SDFG we expect the following sets:

Reads:
- A: [Range(0, 0)]
Writes:
- B: [Range(0)]

It is important that in the above example other_subset is None and data is set to A, so it is not one of these "crazy" non standard Memlets we have seen in the first test.
However, the current implementation gives us:

Reads: {'A': [Range (0, 0)]}
Writes: {'B': [Range (0, 0)]}

This clearly shows, that whatever the implementation does is not correct.

philip-paul-mueller · 2024-10-11T09:19:07Z

As expected the CI now fails, with two errors.

tests/transformations/prune_connectors_test.py::test_read_write_2 This test was introduced by me and should show cast the problems the IF I removed in commit 3748c03 caused. To fix it, we just remove it.
tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map this error I know from my previous work on this topic. However, I am unable to fix it, as my knowledge is limited in this area. However, since the removal of the if just ensured that the function does what it is supposed to do (at least I think that) it is probably a bug that depends on the previous behaviour.

Note, CI without commit 3748c03, i.e. the if still there, passes. It is marked as failed since I cancelled the runs on Pauli.

@phschaad
@tbennun

Before in this case the transformation was not able to be applied. The reason was because of the behaviour of the `SDFGState.read_and_write_sets()` function. However, now with the fix of the [PR#1678](spcl#1678) the transformation became applicable.

philip-paul-mueller · 2024-10-11T09:32:27Z

I now modified the test to expect success, since the bug with the IF is now solved.
However, the second failure is still there and as mentioend above, I am still unable to fix it.

…n does not change the behaviour of teh output.

philip-paul-mueller · 2024-10-11T11:20:42Z

The only test that is still failing is tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map I took a look at it.
The error is that the transformation now applies, but the test expects that transformation not to apply.

The initial SDFG looks as follows.

The SDFG after the transformation looks like this

The map is now outside and the loop is inside a nested SDFG.
After some thinking I do not see a reason why this is not valid to do.
The mean reason is that before the map runs, the content of out is overwritten by A.
Further, we do not write directly into out but first into tmp.
Since neither A nor B is overwritten, the map will produce the same in every loop iteration.

I think that the test itself is wrong.
The description also says that "possible WR conflict".
So I propose the following things:

We change the test such that it expects that the transformation applies.
We modify the SDFG such that the transformation becomes invalid (but there is already such a test the _1 version).

In either case I added code to ensure that the SDFG outputs the same before and after the transformation.
Which is the case for me locally.

What are you guys thinking?
@phschaad
@tbennun

@philip-paul-mueller

Follow up on the discussion in #1678. Supports `src[expr] -> dst[expr]`, `src[expr] -> [expr]`, and `[expr] -> dst[expr]` initializations for memlets. Also improves memlet label printouts. @philip-paul-mueller @phschaad the expression mentioned in the other PR will now be printed as `[0, 0] -> B[0]` for clarity and can be reparsed.

@philip-paul-mueller

Follow up on the discussion in #1678. Supports `src[expr] -> dst[expr]`, `src[expr] -> [expr]`, and `[expr] -> dst[expr]` initializations for memlets. Also improves memlet label printouts. @philip-paul-mueller @phschaad the expression mentioned in the other PR will now be printed as `[0, 0] -> B[0]` for clarity and can be reparsed.

phschaad · 2024-10-14T09:40:50Z

The proposal on the second test seams reasonable to me, it appears to me too, that there is nothing wrong with applying the transformation there. And already having test_more_than_a_map_1 means we do not need to include the inverse of the test.

phschaad · 2024-10-14T09:45:10Z

dace/sdfg/state.py

+                #  exists a memlet at the same access node that writes to the same region, then
+                #  this read is ignored, and not included in the final read set, but only
+                #  accounted fro in the write set. See also note below.
+                # TODO: Handle the case when multiple disjoint writes are needed to cover the read.


I assume from the code that the failure mode here is 'conservative', i.e., we overapproximate something as a read if we have disjoint writes that would cover it?

Yes.
To be clear, lets say e1 writes 0:10 and e1 writes 10:20, but at the same node e3 reads 5:15 then it would still be considered as a read and a write and not just as a write.
In that sense it is an improvement.

I made the description a bit clearer.

phschaad · 2024-10-14T09:45:31Z

dace/sdfg/state.py

-            # set because it was written first is still included if it is read
-            # in another subgraph
-            for data, accesses in rs.items():
+            # TODO: This only works if every data descriptor is only once in a path.


What is the failure mode here if it appears multiple times?

It took my some time to understand it again, essentially each access node is processed individually and all are combined.
An example: Assume the first time the data is encountered the range 0:10 is written and read, so the read set is empty and the write set is 0:10 for that AN.
The second time, the subset 10:20 is written and 2:6 is read, therefore the read set is 2:6 and the write set is 10:20.
The final write set is {0:10, 0:20} and the read set is {0:6}.
However, in this case it could the read set could be determined as {}.

But the more I think about this could actually catch some edge case were this would be incorrect.
In the end it is an over approximation.

I turned the TODO into a NOTE and added a better description.

…more_than_a_map()` test. The test now assumes that it can be applied, as it was discussed on github this should be the correct behaviour. Furthermore, there is also a test which ensures that the same values are computed. This commit also adds an additional test (``tests/transformations/move_loop_into_map_test.py::test_more_than_a_map_4()`) which is essentially the same. However, in one of its Memlet it is a bit different and therefore the transformation can not be applied.

philip-paul-mueller · 2024-10-14T11:19:33Z

I updated the PR.
I mostly clarified the some comments.

Regarding the test:

I changed the test to expect that it works.
I added an additional test, with a different Memlet, this time it can not be applied.

phschaad

Thank you for the comments and changes! Looks good to me now if tests pass :-)

philip-paul-mueller · 2024-10-14T14:22:04Z

The tests do not pass.
The error is probably the same that is mentioned in issue#1643.
I am currently investigate it, but I do not have a lot of time.

Currently we only see the error in `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` if the auto optimizations are enabled.

This is _just_ a proposal.

phschaad

As per discussion

philip-paul-mueller · 2024-10-22T12:57:12Z

This is something I have posted in the Matermost channel before.
It is replicated here for completeness:

I had some time so I trying to fix the last error inside the read and write sets, I have traced the error back to RefineNestedAccess.
I now found the minimum example.

In red you can see the initial subsets and in green what has changed (if nothing has changed it is still red).

I have conformed that the problem is that T1 appear twice.
According to the definition of read and write sets the first T1 AccessNode is not considered as a read node since everything that is read is also written.
Because of the second AccessNode for T1 it is considered as a write access.

I confirmed that if the top T1 is replaced by another array, let's say T2 then it works.
Furthermore, if the filtering is disabled (i.e. T1 is part of the read set) it works as well.

I have come up with the following proposals.

Considering the fact that the filtering never really worked correctly in the first place (since we get the errors after I fixed them), I would propose to scrap it, without any replacement.
If the filtering should be kept, then we must precisely define what happens in cases such as above, i.e. there is a path in a subgraph that visits a data container multiple times.

If we want to keep the filtering, then we will have to fully define the behaviour of the function.
Since even now the function does not address all cases, such as

Multiple memlets are needed to cover a read (for we read the range 5:15; However, Memlet 1 writes 0:10 and Memlet 2 writes 10:20, thus the combination of both Memlets is needed to filter the range).
What happens if the a data container appears multiple time in a subgraph? (this is our problem.)
What happens if a data container appear multiple time, but each time in a different (concurrent) component?

From the above suggestions I would choose the first one, i.e. removing the filtering.
The main reason is, this is the traditional defacto behaviour.

philip-paul-mueller · 2024-10-22T12:58:20Z

To see what happens I have now disabled the mentioned filtering.
This was done in commit b546b07

Because of the removed filtering, `B` is now part of the read set as well.

Because of the removed filtering `B` is no longer removed from the read set. However, because now the test has become quite useless. We also see that the test was specifically constructed in such a way that the filtering applies. Otherwise it would never triggered.

…n_a_map`. Because the filtering is now not applied, like it was originally, the transformation does no longer apply. However, this is the historical expected behaviour.

Because of the removed filtering the transformat can no longer apply. However, I originally added these tests to demonstrate this inconsistent behaviour in the first place. So removing them is now the right choice. This commit also combines them to `prune_connectors_test.py::test_read_write`.

Ensured that the return value is always accassable in the same way. Also ensured that the `test_rna_read_and_write_sets_different_storage()` test verifies that it still gives the same result.

…oule_use` I realized that the transformation should not apply. The reason is because of all arguments to the nested SDFG element `0` is accessed. This means it can not be adjusted.

philip-paul-mueller · 2024-10-23T12:57:25Z

I have now fixed the tests that were failing.
Each test is fixed in its own commit and in the description you can found a justification for this.

phschaad

This looks good to me, please see the comment in code.

phschaad · 2024-10-23T14:55:49Z

dace/sdfg/state.py

+        from dace.sdfg import utils  # Avoid cyclic import
+
+        # Ensures that the `{src,dst}_subset` are properly set.
+        #  TODO: find where the problems are


This is a very ugly hack, but I can replicate / encounter the issue too. I am fine with leaving this in, but please make sure there is an issue that keeps track of this TODO somewhere.

philip-paul-mueller added 3 commits October 11, 2024 08:48

Added tests for the _read_and_write_sets().

145c0ea

Added the fix from my MapFusion PR.

38e748b

Now made read_and_write_sets() fully adhere to their own definition.

3748c03

philip-paul-mueller requested a review from phschaad October 11, 2024 06:57

Added code to test_more_than_a_map to ensure that the transformatio…

b4feddf

…n does not change the behaviour of teh output.

tbennun mentioned this pull request Oct 11, 2024

Improve memlet label and string initialization #1680

Merged

philip-paul-mueller added 2 commits October 14, 2024 09:26

Merge remote-tracking branch 'spcl/master' into read-write-sets

e1c25b2

Added the new memlet creation syntax.

70fa3db

phschaad reviewed Oct 14, 2024

View reviewed changes

philip-paul-mueller added 2 commits October 14, 2024 13:15

Modified some comments to make them clearer.

b187a82

philip-paul-mueller requested a review from phschaad October 14, 2024 11:19

phschaad approved these changes Oct 14, 2024

View reviewed changes

philip-paul-mueller added 3 commits October 22, 2024 08:53

Merge branch 'master' into read-write-sets

b5fc16f

Added a test to highlights the error.

b7fe242

Currently we only see the error in `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` if the auto optimizations are enabled.

I now removed the filtering inside the read and write set.

b546b07

This is _just_ a proposal.

phschaad requested changes Oct 22, 2024

View reviewed changes

philip-paul-mueller added 2 commits October 23, 2024 09:32

Fixed state_test.py::test_read_and_write_set_filter.

ae20590

Because of the removed filtering, `B` is now part of the read set as well.

philip-paul-mueller added 5 commits October 23, 2024 09:40

Fixed `move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_tha…

cb80f0b

…n_a_map`. Because the filtering is now not applied, like it was originally, the transformation does no longer apply. However, this is the historical expected behaviour.

General improvements to some tests.

f74d6e8

Ensured that the return value is always accassable in the same way. Also ensured that the `test_rna_read_and_write_sets_different_storage()` test verifies that it still gives the same result.

Updated `refine_nested_access_test.py::test_rna_read_and_write_sets_d…

e103924

…oule_use` I realized that the transformation should not apply. The reason is because of all arguments to the nested SDFG element `0` is accessed. This means it can not be adjusted.

philip-paul-mueller force-pushed the read-write-sets branch from bf5e4aa to e103924 Compare October 23, 2024 12:51

phschaad approved these changes Oct 23, 2024

View reviewed changes

phschaad enabled auto-merge October 23, 2024 14:56

phschaad added this pull request to the merge queue Oct 23, 2024

Merged via the queue into spcl:master with commit 380554f Oct 23, 2024
10 checks passed

philip-paul-mueller mentioned this pull request Oct 24, 2024

Invalid Cache State of Memlets #1703

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix to Read and Write Sets #1678

Fix to Read and Write Sets #1678

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024 •

edited

Loading

phschaad commented Oct 14, 2024

phschaad Oct 14, 2024

philip-paul-mueller Oct 14, 2024

phschaad Oct 14, 2024

philip-paul-mueller Oct 14, 2024

philip-paul-mueller commented Oct 14, 2024

phschaad left a comment •

edited

Loading

philip-paul-mueller commented Oct 14, 2024 •

edited

Loading

phschaad left a comment

philip-paul-mueller commented Oct 22, 2024

philip-paul-mueller commented Oct 22, 2024

philip-paul-mueller commented Oct 23, 2024

phschaad left a comment

phschaad Oct 23, 2024

Fix to Read and Write Sets #1678

Fix to Read and Write Sets #1678

Conversation

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024

philip-paul-mueller commented Oct 11, 2024 • edited Loading

phschaad commented Oct 14, 2024

phschaad Oct 14, 2024

Choose a reason for hiding this comment

philip-paul-mueller Oct 14, 2024

Choose a reason for hiding this comment

phschaad Oct 14, 2024

Choose a reason for hiding this comment

philip-paul-mueller Oct 14, 2024

Choose a reason for hiding this comment

philip-paul-mueller commented Oct 14, 2024

phschaad left a comment • edited Loading

Choose a reason for hiding this comment

philip-paul-mueller commented Oct 14, 2024 • edited Loading

phschaad left a comment

Choose a reason for hiding this comment

philip-paul-mueller commented Oct 22, 2024

philip-paul-mueller commented Oct 22, 2024

philip-paul-mueller commented Oct 23, 2024

phschaad left a comment

Choose a reason for hiding this comment

phschaad Oct 23, 2024

Choose a reason for hiding this comment

philip-paul-mueller commented Oct 11, 2024 •

edited

Loading

phschaad left a comment •

edited

Loading

philip-paul-mueller commented Oct 14, 2024 •

edited

Loading