Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check whether prevOpts is nil in WipeoutBuildPathIfBuildOptionsChanged's Run #1118

Merged
merged 8 commits into from
May 10, 2021

Conversation

d-a-v
Copy link
Contributor

@d-a-v d-a-v commented Dec 29, 2020

Please check if the PR fulfills these requirements

  • The PR has no duplicates (please search among the Pull Requests
    before creating one)
  • The PR follows
    our contributing guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)
  • UPGRADING.md has been updated with a migration guide (for breaking changes)
  • What kind of change does this PR introduce?

arduino-1.8.10 is able to run esp8266/Arduino CI tests, but not anymore since 1.8.11..1.8.13 (example).
This PR fixes it.

  • What is the current behavior?
panic: runtime error: invalid memory address or nil pointer dereference
  • Other information:

I haven't tried to understand what happened between arduino-ide 1.8.10 and 1.8.11.

Copy link
Collaborator

@matthijskooijman matthijskooijman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, do you know when this situation occurs exactly? It looks like a non-existing build options JSON file is handled already:

if ctx.BuildOptionsJsonPrevious == "" {
return nil
}

So apparently the JSON contents is non-empty, but is invalid JSON or something, resulting in a nil prevOpts? Any idea what's in there?

In any case: handling invalid build options JSON by doing a cleanup seems like a good approach.

I would suggest printing a message when this happens, to help debugging weird situations like these, but it seems that currently no log messages are being printed from this context:

// FIXME: this should go outside legacy and behind a `logrus` call so users can
// control when this should be printed.
// logger.Println(constants.LOG_LEVEL_INFO, constants.MSG_BUILD_OPTIONS_CHANGED)

Maybe add a TODO comment about logging a message?

@d-a-v
Copy link
Contributor Author

d-a-v commented Dec 30, 2020

I can add a message in this PR when this occurs.

@d-a-v
Copy link
Contributor Author

d-a-v commented Dec 30, 2020

Hm, do you know when this situation occurs exactly?

esp8266/Arduino CI's builder script didn't recently receive significant change. We are able to reproduce the issue locally by replaying esp8266/Arduino github CI steps and one need to find which commit of arduino-builder or arduino-cli introduced the issue.

To answer your question, here's the loop over tested sketches that is run by our CI.

Maybe that this piece of script needs to be updated, and I think it should happen after arduino-builder is segfault-proof.

@matthijskooijman
Copy link
Collaborator

I can add a message in this PR when this occurs.

Cool! I like the structure of the code better as well like this (harder to accidentally introduce a nil-pointer reference again later). I'm not entirely sure about the message itself, "unusual state to investigate" is a bit vague and broad, maybe use something like:

const MSG_BUILD_OPTIONS_CHANGED = "build.options.json invalid, rebuilding all"

But maybe @cmaglie or one of the other maintainers have a more relevant opinion on that :-)

Maybe that this piece of script needs to be updated, and I think it should happen after arduino-builder is segfault-proof.

Yeah, this should certainly be fixed in arduino-cli, regardless of changes on your side, I agree.

Looking at your script, it does indeed mess around with build.options.json, so I suspect that the regex used indeed makes the JSON invalid (my first thought is that maybe the JSON is no longer prettyprinted with one property per line, which your regex seems to assume).

To fix this on your end, I would suggest looking at the jq command, which is a commandline JSON processor with its own command language, it should be rather trivial to write "Replace the sketchLocation field with some string" using jq. I recently wrote a script which does something similar, maybe it can serve as inspiration: https://github.com/meetjestad/mjs_boards/blob/cd96d9adadda8d22b338d1fe2b2fce82d498c44e/.github/workflows/publish-releases.yml#L128-L134.

@d-a-v
Copy link
Contributor Author

d-a-v commented Dec 30, 2020

Thanks alot. We'll certainly have a try with updating our script (cc @earlephilhower FYI).

I'm not entirely sure about the message itself, "unusual state to investigate" is a bit vague and broad,

Thanks for the suggestion. I just pushed an update with your proposal.

@per1234
Copy link
Contributor

per1234 commented Dec 31, 2020

Thanks d-a-v. This looks like a nice improvement to Arduino CLI!

The current approach:


assumes that a corrupted build.options.json will always result in prevOpts being nil, but it sounds like json.Unmarshal() is designed to populate v with as much of the data as it is able to unmarshal, and indicate there were problems by returning an error:
https://golang.org/pkg/encoding/json/#Unmarshal

If a JSON value is not appropriate for a given target type, or if a JSON number overflows the target type, Unmarshal skips that field and completes the unmarshaling as best it can. If no more serious errors are encountered, Unmarshal returns an UnmarshalTypeError describing the earliest such error. In any case, it's not guaranteed that all the remaining fields following the problematic one will be unmarshaled into the target object.

I wonder if it would be better practices to do something like this instead:

index 57be5983..d5052d54 100644
--- a/legacy/builder/wipeout_build_path_if_build_options_changed.go
+++ b/legacy/builder/wipeout_build_path_if_build_options_changed.go
@@ -42,7 +42,10 @@ func (s *WipeoutBuildPathIfBuildOptionsChanged) Run(ctx *types.Context) error {
        var opts *properties.Map
        var prevOpts *properties.Map
        json.Unmarshal([]byte(buildOptionsJson), &opts)
-       json.Unmarshal([]byte(previousBuildOptionsJson), &prevOpts)
+       if err := json.Unmarshal([]byte(previousBuildOptionsJson), &prevOpts); err != nil {
+               ctx.GetLogger().Println(constants.LOG_LEVEL_DEBUG, constants.MSG_BUILD_OPTIONS_INVALID, constants.BUILD_OPTIONS_FILE)
+               return doCleanup(ctx.BuildPath)
+       }

        // If SketchLocation path is different but filename is the same, consider it equal
        if filepath.Base(opts.Get("sketchLocation")) == filepath.Base(prevOpts.Get("sketchLocation")) {

Of course, it's unlikely that the non-nil prevOpts value resulting from a corrupted build.options.json would cause a problem and my approach would still result in a panic if build.options.json is a valid JSON document that unmarshals to nil, in which case json.Unmarshal() will not return an error:

null

@matthijskooijman
Copy link
Collaborator

@per1234 good catch, your proposal sounds good to me. The null case could also be caught explicitly, by checking for err != nil || prevOpts == nil in the if? The error message is still reasonable then: It's valid JSON, but not a valid build.options.json file.

@ubidefeo
Copy link

thank you @matthijskooijman and @per1234 for chiming in.
With our main Go devs on holiday it's very hard for me to approve a PR, but thankfully Per is getting good at this game and his knowledge of our platform I'm extremely happy he dug into the issue and found this one.

let's see how this develops :)

@d-a-v d-a-v requested a review from per1234 January 13, 2021 09:31
@@ -40,9 +42,16 @@ func (s *WipeoutBuildPathIfBuildOptionsChanged) Run(ctx *types.Context) error {
previousBuildOptionsJson := ctx.BuildOptionsJsonPrevious

var opts *properties.Map
if err := json.Unmarshal([]byte(buildOptionsJson), &opts); err != nil || opts == nil {
ctx.GetLogger().Println(constants.LOG_LEVEL_DEBUG, constants.MSG_BUILD_OPTIONS_INVALID, constants.BUILD_OPTIONS_FILE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this message is confusing:

build.options.json invalid

Because it's not the build.options.json file that's invalid, but rather the internal build options data.

Since this isn't something that should occur under any normal circumstances, I would lean toward just doing a panic(err). Probably @silvanocerza would be able to tell us what is the preferred approach though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine like this, it seems to me that ctx.BuildOptionsJson is populated with data from the build.options.json file only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build.options.json file is created from ctx.BuildOptionsJson:

ctx.BuildPath.Join(constants.BUILD_OPTIONS_FILE).WriteFile([]byte(ctx.BuildOptionsJson))

It's ctx.BuildOptionsJsonPrevious that is populated with data from the build.options.json file:
ctx.BuildOptionsJsonPrevious = string(bytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I leave it as it currently is,
or change the message,
or replace it by panic(constants.BUILD_OPTIONS_FILE + " is invalid") ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to get @silvanocerza's response to my previous reply just to make sure everyone is on the same page here. After that, I'd recommend going with whatever Silvano think is best.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot about this.
@per1234 is right, probably a message like build options are invalid might be less confusing, it's not exactly transparent like this either but at least it won't confuse the user if the build.options.json file doesn't exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@silvanocerza @per1234 I just proceeded with your suggestion.

@silvanocerza
Copy link
Contributor

@d-a-v could you write an integration test or a list of steps to reproduce the issue this PR fixes?

@d-a-v
Copy link
Contributor Author

d-a-v commented Jan 26, 2021

@silvanocerza I made a PR in esp8266/Arduino that will reproduce the original issue.
It can be watched in CI logs, or reproduced locally by checking it out and making it run.
esp8266/Arduino#7842

edit in this link, Arduino-IDE-nightly is used. Does it use arduino-cli nightly ?

@per1234
Copy link
Contributor

per1234 commented Jan 27, 2021

@silvanocerza here are minimal instructions to reproduce the issue:

mkdir -p /tmp/FooBuild
echo foo > /tmp/FooBuild/build.options.json  # Create an invalid JSON document that will result in error when unmarshalled
arduino-cli sketch new /tmp/FooSketch
arduino-cli compile -b arduino:avr:uno --build-path /tmp/FooBuild /tmp/FooSketch
echo null > /tmp/FooBuild/build.options.json  # Create a valid JSON document that will unmarshall to `nil`
arduino-cli compile -b arduino:avr:uno --build-path /tmp/FooBuild /tmp/FooSketch

Without the changes made in this PR, both the above arduino-cli compile commands would result in a panic and the situation would not be recoverable without clearning the build folder manually because the invalid data remains in build.options.json. After the changes in this PR, Arduino CLI recovers from an invalid build.options.json gracefully.

The above only exercises the changes made to the unmarshalling of ctx.BuildOptionsJsonPrevious. I don't know how an invalid ctx.BuildOptionsJson could be simulated via an integration test.

@silvanocerza
Copy link
Contributor

@d-a-v I pushed the tests and rebased the branch to solve some issues, you'll have conflicts for sure if you fetch the branch.

If all tests pass I'll merge it, if there are no issue you'll probably see it really soon in the 0.18.2 release.

@d-a-v
Copy link
Contributor Author

d-a-v commented May 10, 2021

Thanks !

@silvanocerza silvanocerza merged commit d1b1b03 into arduino:master May 10, 2021
@silvanocerza
Copy link
Contributor

Merged, thanks @d-a-v for the contribution!

silvanocerza added a commit that referenced this pull request May 10, 2021
…1118)

* check whether prevOpts is nil in WipeoutBuildPathIfBuildOptionsChanged's Run

* + debug message

* update from review: improve user message

* fix per review

* add check to buildOptionsJson too

* panic'ing instead of keeping on with mess

* fix use of a constant

Co-authored-by: per1234 <[email protected]>

* [skip changelog] Add compile with invalid build.options.json test

Co-authored-by: per1234 <[email protected]>
Co-authored-by: Silvano Cerza <[email protected]>
@d-a-v d-a-v deleted the checkprevopts branch May 10, 2021 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement Proposed improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants