-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: add guard against NodeStatus #22
Conversation
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
POST MORTEMNodes are defined as a map[string]NodeStatus. This combined with zero values returns resulted in the bug our customer encountered of container sets causing random failure. This confirms my (later stated) hunch, it's actually not the implementation of container sets that is causing issues but interactions of various components that weren't anticipated. Unfortunately this brings to the point I had made to @JPZ13 earlier, making changes to Argo Workflows does requires a near 1 to 1 mapping of the Workflows codebase into memory, at the size of the codebase right now, that is an impossible task for most (but almost certainly everyone involved in the project). To add to this as an example, I encountered a case where a bug prevented a bug from going through in the first place, that is they cancelled each other out, if you look at here, there was no mention of what to do if node was not present in the hashmap, but an invalid access resulted in an empty struct, this meant the boolean variables here get changed correctly. This is to me fairly terrifying because it is impossible to distinguish between bug and intentional behaviour without knowing what the original author intended. I think the code should be as obvious as possible, when an explicit check is made against a map, the author also asserts the intended behaviour explicitly. To generalise on the above comments, the core problem is that the codebase isn't as modular as it really should be, different sections of the codebase assume behaviour from other sections of the codebase. This is what makes the codebase hard to debug, hard to add features to and hard to change, ultimately I believe the container set issues were only just a symptom of the state of the codebase, the core issues lie within Argo Workflows itself. Putting time and effort into simplifying Workflows as much as possible will prevent future "whack-a-mole" type situations. What slowed me downI ended up chasing a wild goose of sorts, I assumed the error originated from boundaryID being "". This is actually an intended value for some nodes, but given zero values are a thing in Go, it is impossible to distinguish between:
In reality the actual strongly typed type of boundaryID is really Prevention of this particular type of issue
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't commented on every use of logrus, but they should all change.
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Signed-off-by: isubasinghe <[email protected]>
Fixes #TODO Please do not open a pull request until you have checked ALL of these: * [ ] Create the PR as draft . * [ ] Run `make pre-commit -B` to fix codegen and lint problems. * [ ] Sign-off your commits (otherwise the DCO check will fail). * [ ] Use [a conventional commit message](https://www.conventionalcommits.org/en/v1.0.0/) (otherwise the commit message check will fail). * [ ] "Fixes #" is in both the PR title (for release notes) and this description (to automatically link and close the issue). * [ ] Add unit or e2e tests. Say how you tested your changes. If you changed the UI, attach screenshots. * [ ] Github checks are green. * [ ] Once required tests have passed, mark your PR "Ready for review". If changes were requested, and you've made them, dismiss the review to get it reviewed again. --------- Signed-off-by: isubasinghe <[email protected]>
Fixes #TODO Please do not open a pull request until you have checked ALL of these: * [ ] Create the PR as draft . * [ ] Run `make pre-commit -B` to fix codegen and lint problems. * [ ] Sign-off your commits (otherwise the DCO check will fail). * [ ] Use [a conventional commit message](https://www.conventionalcommits.org/en/v1.0.0/) (otherwise the commit message check will fail). * [ ] "Fixes #" is in both the PR title (for release notes) and this description (to automatically link and close the issue). * [ ] Add unit or e2e tests. Say how you tested your changes. If you changed the UI, attach screenshots. * [ ] Github checks are green. * [ ] Once required tests have passed, mark your PR "Ready for review". If changes were requested, and you've made them, dismiss the review to get it reviewed again. --------- Signed-off-by: isubasinghe <[email protected]>
Fixes #TODO
Please do not open a pull request until you have checked ALL of these:
make pre-commit -B
to fix codegen and lint problems.If changes were requested, and you've made them, dismiss the review to get it reviewed again.