-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BugFix] Unexpected crash by tricky parameters in TVM's Python API #8988
Conversation
I have to agree that failing, even on adversarially constructed cases by a fuzzer, is actual failing. Thanks for pointing this out 😃🙏 On the other hand, I feel this particular change to the object system might be a bit aggressive, given in TVM nullability by default is allowed by design, and enforcing nullptr checks every time might be contradictory with the assumptions in some pre-historic pieces of the codebase (I'm sure there isn't many though). Just wanted to share some recent efforts on improving nullability related stuff:
However, the effort was paused because the support is not perfect for composite TVM objects, i.e. TVM objects that contain other TVM objects, and it's less prioritized task given it only helps with adversarially constructed cases, we haven't seen too much progress on this side. I am happy to chat more if you are interested! More fundamental contribution is warmly welcome ❤️ |
@junrushao1994 Thanks for the comments. Yes, they are just some tricky edge cases that users would not intentionally meet in most cases (though it is possible by API misuse). I think they are only useful in some extreme and rare security cases, e.g., some hostile server attack to model servers using tvm. As you said, a more fundamental and elegant workaround would be applying Since the conclusion seems to be clear (I know you guys are busy and such issues are minor and should not be prioritized), I think we can close this PR and simply record this proposal on the channel. Always tyty for the instant reply and patience ❤️ |
const ObjectName* operator->() const { \ | ||
auto ptr = static_cast<const ObjectName*>(data_.get()); \ | ||
ICHECK(nullptr != ptr) << "Calling `->` to <" #ObjectName ">(null)"; \ | ||
return ptr; \ | ||
} \ | ||
const ObjectName* get() const { return static_cast<const ObjectName*>(data_.get()); } \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TVM nullability by default is allowed by design, and enforcing nullptr checks every time might be contradictory with the assumptions...
BTW, I don't think it violates the assumption. Since here ptr->SomeThing
is called, crash happens if it's a nullptr, given that the macro here is TVM_DEFINE_OBJECT_REF_METHODS
not TVM_DEFINE_**NOTNULLABLE**_OBJECT_REF_METHODS
. The checking here can help convert the crash into a readable exception and this can already fix 95% crash resulted from unintentional parameters.
For TVM_DEFINE_NOTNULLABLE_OBJECT_REF_METHODS
objects, it is still the same (not violating the assumption by applying redundant checking). :-)
Background
This is a fixing proposal to Discussion#10988 and #8979 .
Approach
This PR fixed unexpected C++ crash by:
nullptr
before usingoperator->
(in this case, it must crash);None
containers (Array) being capable of range-based for; (otherwise crash);ICHECK(false) << ... << statement_causing_TVMError
will result in crash);This PR passed test cases contributed in #8979 (the crashes that can be trigger by giving
None
parameters in TVM's core python APIs)Comparison with other options
According to the discussion, I also tried other proposals but don't think they are good approaches:
(1)
NOTNULLABLE
andtvm::runtime::Optional
suggested by @junrushao1994 and @Hzfengsy ; -> still requires lots of code changes on legacy codes;(2) Add checkers for all those
.set_body*
statements; -> not fundamental that checkers will scale when new interface coming in;This PR fixed a fundamental issue causing invalid memory access crashes: avoid
nullptr
objects using->
operator because it must result in program crash; It only requires ~40 lines of edit and very few future engineering (since the fundamental issue is resolved).Known issue
Statements like
ICHECK(false) << ... << statement_causing_TVMError
will still crash so that we have to manually add new checkers before it. e.g., line 112 inBlockReadWriteDetector::operator()
.