Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Unity][TVMScript] Avoid dangling reference when printing Call attrs #15923

Merged
merged 1 commit into from
Oct 13, 2023

Conversation

Lunderberg
Copy link
Contributor

Prior to this commit, the tvm::script::printer::AttrPrinter class took the attribute path as a const ObjectPath&. In both places where an AttrPrinter is called, the temporary object n_p->Attr("attrs") is passed for this argument. While binding a temporary object to a const reference can extend the lifetime of the temporary, this requires the const reference to be in the same scope as the temporary, and does not apply in this case (see this stackoverflow post). Therefore, this reference is only valid through the construction of AttrPrinter printer, and is invalid during its usage on the following line.

This dangling reference has caused segfaults in CI for unrelated changes (example), and can be reproduced with the following test case.

import pytest

from tvm.script import relax as R

@pytest.mark.parametrize("iter", range(10000))
def test_argmax_without_specified_axis(iter):
    @R.function
    def func(x: R.Tensor((1, 2, 3, 4), "float32")):
        return R.argmax(x)

    func.script(show_meta=True)

This test case is not included in this commit, as the reproduction is not consistent, with failure requiring on the order of 10k iterations to trigger. In addition, reproduction was sensitive to the following conditions.

  • The function being printed must contain at least one relax::Call node, with an operation that has attributes.

  • TVM must be built with optimization enabled. In gcc, the -ftree-dse optimization, which is part of -O1, is required to trigger the bug.

  • Python's default allocation must be used. If PYTHONMALLOC=malloc is set to instead use the system's malloc, the segfault was no longer triggered.

This commit updates AttrPrinter to accept the ObjectPath by value. With the change applied, the above test ran 100k times without error.

Prior to this commit, the `tvm::script::printer::AttrPrinter` class
took the attribute path as a `const ObjectPath&`.  In both places
where an `AttrPrinter` is called, the temporary object
`n_p->Attr("attrs")` is passed for this argument.  While binding a
temporary object to a const reference can extend the lifetime of the
temporary, this requires the const reference to be in the same scope
as the temporary, and does not apply in this case (see [this
stackoverflow post](https://stackoverflow.com/a/2784304)).  Therefore,
this reference is only valid through the construction of `AttrPrinter
printer`, and is invalid during its usage on the following line.

This dangling reference has caused segfaults in CI for unrelated
changes ([example](https://ci.tlcpack.ai/blue/organizations/jenkins/tvm-unity/detail/PR-15904/3/pipeline)),
and can be reproduced with the following test case.

```python
import pytest

from tvm.script import relax as R

@pytest.mark.parametrize("iter", range(10000))
def test_argmax_without_specified_axis(iter):
    @R.function
    def func(x: R.Tensor((1, 2, 3, 4), "float32")):
        return R.argmax(x)

    func.script(show_meta=True)
```

This test case is not included in this commit, as the reproduction is
not consistent, with failure requiring on the order of 10k iterations
to trigger.  In addition, reproduction was sensitive to the following
conditions.

* The function being printed must contain at least one `relax::Call`
  node, with an operation that has attributes.

* TVM must be built with optimization enabled.  In gcc, the
  `-ftree-dse` optimization, which is part of `-O1`, is required to
  trigger the bug.

* Python's default allocation must be used.  If `PYTHONMALLOC=malloc`
  is set to instead use the system's `malloc`, the segfault was no
  longer triggered.

This commit updates `AttrPrinter` to accept the `ObjectPath` by value.
With the change applied, the above test ran 100k times without error.
@masahi masahi merged commit 559e94a into apache:unity Oct 13, 2023
1 check passed
@Lunderberg Lunderberg deleted the unity_remove_dangling_reference branch October 13, 2023 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants