Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix to Read and Write Sets #1678

Merged
merged 19 commits into from
Oct 23, 2024
Merged

Conversation

philip-paul-mueller
Copy link
Collaborator

During my work on the new map fusion I discovered a bug in SDFGState._read_and_write_set().
Originally I solved it there, but it was decided to move it into its own PR.

Lets look at the first, super silly example, that is not useful on its own.
The main point here, is that the data attribute of the Memlet does not refer to the source of the connection but of the destination.

test_1

BTW: The Webviewer outputs something like B[0] -> [0, 0] however, the parser of the Memlet constructor does not understand this, it must be written as B[0] -> 0, 0, i.e. the second set of brackets must be omitted, this should be changed!

From the above we would expect the following sets:

  • Reads:
    • A: [Range (0, 0)]
    • B: Should not be listed in this set, because it is fully read and written, thus it is excluded.
  • Writes
    • B: [Range (0)]
    • C: [Range (0, 0), Range (1, 1)]

However, the current implementation gives us:

  • Reads: {'A': [Range (0)], 'B': [Range (1, 1)]}
  • Write: {'B': [Range (0)], 'C': [Range (1, 1), Range (0)]}

The current behaviour is wrong because:

  • A is a 2x2 array, thus the read set should also have two dimensions.
  • B inside the read set, it is a scalar, but the range has two dimensions, furthermore, it is present at all.
  • C the first member of the write set (Range(1, 1)) is correct, while the second (Range(0)) is horrible wrong.

The second example is even more simple.

test_2

From the SDFG we expect the following sets:

  • Reads:
    • A: [Range(0, 0)]
  • Writes:
    • B: [Range(0)]

It is important that in the above example other_subset is None and data is set to A, so it is not one of these "crazy" non standard Memlets we have seen in the first test.
However, the current implementation gives us:

  • Reads: {'A': [Range (0, 0)]}
  • Writes: {'B': [Range (0, 0)]}

This clearly shows, that whatever the implementation does is not correct.

@philip-paul-mueller
Copy link
Collaborator Author

As expected the CI now fails, with two errors.

  • tests/transformations/prune_connectors_test.py::test_read_write_2 This test was introduced by me and should show cast the problems the IF I removed in commit 3748c03 caused. To fix it, we just remove it.
  • tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map this error I know from my previous work on this topic. However, I am unable to fix it, as my knowledge is limited in this area. However, since the removal of the if just ensured that the function does what it is supposed to do (at least I think that) it is probably a bug that depends on the previous behaviour.

Note, CI without commit 3748c03, i.e. the if still there, passes. It is marked as failed since I cancelled the runs on Pauli.

@phschaad
@tbennun

Before in this case the transformation was not able to be applied.
The reason was because of the behaviour of the `SDFGState.read_and_write_sets()` function.
However, now with the fix of the [PR#1678](spcl#1678) the transformation became applicable.
@philip-paul-mueller
Copy link
Collaborator Author

I now modified the test to expect success, since the bug with the IF is now solved.
However, the second failure is still there and as mentioend above, I am still unable to fix it.

…n does not change the behaviour of teh output.
@philip-paul-mueller
Copy link
Collaborator Author

philip-paul-mueller commented Oct 11, 2024

The only test that is still failing is tests/transformations/move_loop_into_map_test.py::MoveLoopIntoMapTest::test_more_than_a_map I took a look at it.
The error is that the transformation now applies, but the test expects that transformation not to apply.

The initial SDFG looks as follows.

more_than_map

The SDFG after the transformation looks like this

more_than_map_after

The map is now outside and the loop is inside a nested SDFG.
After some thinking I do not see a reason why this is not valid to do.
The mean reason is that before the map runs, the content of out is overwritten by A.
Further, we do not write directly into out but first into tmp.
Since neither A nor B is overwritten, the map will produce the same in every loop iteration.

I think that the test itself is wrong.
The description also says that "possible WR conflict".
So I propose the following things:

  • We change the test such that it expects that the transformation applies.
  • We modify the SDFG such that the transformation becomes invalid (but there is already such a test the _1 version).

In either case I added code to ensure that the SDFG outputs the same before and after the transformation.
Which is the case for me locally.

What are you guys thinking?
@phschaad
@tbennun

tbennun added a commit that referenced this pull request Oct 11, 2024
Follow up on the discussion in #1678.

Supports `src[expr] -> dst[expr]`, `src[expr] -> [expr]`, and `[expr] -> dst[expr]` initializations for memlets. Also improves memlet label printouts.

@philip-paul-mueller @phschaad the expression mentioned in the other PR will now be printed as `[0, 0] -> B[0]` for clarity and can be reparsed.
github-merge-queue bot pushed a commit that referenced this pull request Oct 12, 2024
Follow up on the discussion in #1678.

Supports `src[expr] -> dst[expr]`, `src[expr] -> [expr]`, and `[expr] ->
dst[expr]` initializations for memlets. Also improves memlet label
printouts.

@philip-paul-mueller @phschaad the expression mentioned in the other PR
will now be printed as `[0, 0] -> B[0]` for clarity and can be reparsed.
@phschaad
Copy link
Collaborator

The proposal on the second test seams reasonable to me, it appears to me too, that there is nothing wrong with applying the transformation there. And already having test_more_than_a_map_1 means we do not need to include the inverse of the test.

# exists a memlet at the same access node that writes to the same region, then
# this read is ignored, and not included in the final read set, but only
# accounted fro in the write set. See also note below.
# TODO: Handle the case when multiple disjoint writes are needed to cover the read.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume from the code that the failure mode here is 'conservative', i.e., we overapproximate something as a read if we have disjoint writes that would cover it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
To be clear, lets say e1 writes 0:10 and e1 writes 10:20, but at the same node e3 reads 5:15 then it would still be considered as a read and a write and not just as a write.
In that sense it is an improvement.

I made the description a bit clearer.

# set because it was written first is still included if it is read
# in another subgraph
for data, accesses in rs.items():
# TODO: This only works if every data descriptor is only once in a path.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the failure mode here if it appears multiple times?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took my some time to understand it again, essentially each access node is processed individually and all are combined.
An example: Assume the first time the data is encountered the range 0:10 is written and read, so the read set is empty and the write set is 0:10 for that AN.
The second time, the subset 10:20 is written and 2:6 is read, therefore the read set is 2:6 and the write set is 10:20.
The final write set is {0:10, 0:20} and the read set is {0:6}.
However, in this case it could the read set could be determined as {}.

But the more I think about this could actually catch some edge case were this would be incorrect.
In the end it is an over approximation.

I turned the TODO into a NOTE and added a better description.

…more_than_a_map()` test.

The test now assumes that it can be applied, as it was discussed on github this should be the correct behaviour.
Furthermore, there is also a test which ensures that the same values are computed.

This commit also adds an additional test (``tests/transformations/move_loop_into_map_test.py::test_more_than_a_map_4()`) which is essentially the same.
However, in one of its Memlet it is a bit different and therefore the transformation can not be applied.
@philip-paul-mueller
Copy link
Collaborator Author

I updated the PR.
I mostly clarified the some comments.

Regarding the test:

  • I changed the test to expect that it works.
  • I added an additional test, with a different Memlet, this time it can not be applied.

Copy link
Collaborator

@phschaad phschaad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comments and changes! Looks good to me now if tests pass :-)

@philip-paul-mueller
Copy link
Collaborator Author

philip-paul-mueller commented Oct 14, 2024

The tests do not pass.
The error is probably the same that is mentioned in issue#1643.
I am currently investigate it, but I do not have a lot of time.

Currently we only see the error in `tests/numpy/ufunc_support_test.py::test_ufunc_add_accumulate_simple` if the auto optimizations are enabled.
Copy link
Collaborator

@phschaad phschaad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per discussion

@philip-paul-mueller
Copy link
Collaborator Author

This is something I have posted in the Matermost channel before.
It is replicated here for completeness:

I had some time so I trying to fix the last error inside the read and write sets, I have traced the error back to RefineNestedAccess.
I now found the minimum example.

Nested_min_example

In red you can see the initial subsets and in green what has changed (if nothing has changed it is still red).

I have conformed that the problem is that T1 appear twice.
According to the definition of read and write sets the first T1 AccessNode is not considered as a read node since everything that is read is also written.
Because of the second AccessNode for T1 it is considered as a write access.

I confirmed that if the top T1 is replaced by another array, let's say T2 then it works.
Furthermore, if the filtering is disabled (i.e. T1 is part of the read set) it works as well.

I have come up with the following proposals.

  1. Considering the fact that the filtering never really worked correctly in the first place (since we get the errors after I fixed them), I would propose to scrap it, without any replacement.
  2. If the filtering should be kept, then we must precisely define what happens in cases such as above, i.e. there is a path in a subgraph that visits a data container multiple times.

If we want to keep the filtering, then we will have to fully define the behaviour of the function.
Since even now the function does not address all cases, such as

  • Multiple memlets are needed to cover a read (for we read the range 5:15; However, Memlet 1 writes 0:10 and Memlet 2 writes 10:20, thus the combination of both Memlets is needed to filter the range).
  • What happens if the a data container appears multiple time in a subgraph? (this is our problem.)
  • What happens if a data container appear multiple time, but each time in a different (concurrent) component?

From the above suggestions I would choose the first one, i.e. removing the filtering.
The main reason is, this is the traditional defacto behaviour.

@philip-paul-mueller
Copy link
Collaborator Author

To see what happens I have now disabled the mentioned filtering.
This was done in commit b546b07

Because of the removed filtering, `B` is now part of the read set as well.
Because of the removed filtering `B` is no longer removed from the read set.
However, because now the test has become quite useless.

We also see that the test was specifically constructed in such a way that the filtering applies.
Otherwise it would never triggered.
Because of the removed filtering `B` is no longer removed from the read set.
However, because now the test has become quite useless.

We also see that the test was specifically constructed in such a way that the filtering applies.
Otherwise it would never triggered.
…n_a_map`.

Because the filtering is now not applied, like it was originally, the transformation does no longer apply.
However, this is the historical expected behaviour.
Because of the removed filtering the transformat can no longer apply.
However, I originally added these tests to demonstrate this inconsistent behaviour in the first place.
So removing them is now the right choice.

This commit also combines them to `prune_connectors_test.py::test_read_write`.
Ensured that the return value is always accassable in the same way.
Also ensured that the `test_rna_read_and_write_sets_different_storage()` test verifies that it still gives the same result.
…oule_use`

I realized that the transformation should not apply.
The reason is because of all arguments to the nested SDFG element `0` is accessed.
This means it can not be adjusted.
@philip-paul-mueller
Copy link
Collaborator Author

I have now fixed the tests that were failing.
Each test is fixed in its own commit and in the description you can found a justification for this.

Copy link
Collaborator

@phschaad phschaad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, please see the comment in code.

from dace.sdfg import utils # Avoid cyclic import

# Ensures that the `{src,dst}_subset` are properly set.
# TODO: find where the problems are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very ugly hack, but I can replicate / encounter the issue too. I am fine with leaving this in, but please make sure there is an issue that keeps track of this TODO somewhere.

@phschaad phschaad enabled auto-merge October 23, 2024 14:56
@phschaad phschaad added this pull request to the merge queue Oct 23, 2024
Merged via the queue into spcl:master with commit 380554f Oct 23, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants