
sfdx no longer repeats "pending" message #1839

Closed
AndrewRayCode opened this issue Dec 9, 2022 · 5 comments
Labels
bug Issue or pull request that identifies or fixes a bug investigating We're actively investigating this issue

Comments

@AndrewRayCode

The latest version of sfdx no longer repeats the "Pending" status message in its output.

Command:

sfdx force:source:deploy --wait 302 -u STAGING --testlevel RunLocalTests -x manifest/package.xml --checkonly

output:

=== Status: Pending


No components deployed

This breaks CI systems that kill jobs after a wait timeout with no output.

Upgrading sfdx should not break existing CI systems.
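(A common stopgap for the output-timeout problem described above is a generic heartbeat wrapper, sketched below in POSIX sh. This is not an sfdx feature; `sleep 3` stands in for the real deploy command, and the interval values are placeholders.)

```shell
# Heartbeat wrapper: run the long command in the background and print a
# line at a fixed interval so the CI output watchdog stays satisfied.
# `sleep 3` stands in for the real command, e.g.:
#   sfdx force:source:deploy --wait 302 -u STAGING --testlevel RunLocalTests \
#     -x manifest/package.xml --checkonly
sleep 3 &
cmd_pid=$!

beats=0
while kill -0 "$cmd_pid" 2>/dev/null; do
  echo "deploy still running... heartbeat $beats"
  beats=$((beats + 1))
  sleep 1          # use something like 60 in a real CI job
done

wait "$cmd_pid"    # POSIX shells preserve the exit status of background jobs
rc=$?
echo "command exited with code $rc"
```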

@AndrewRayCode AndrewRayCode added the investigating We're actively investigating this issue label Dec 9, 2022
@github-actions

github-actions bot commented Dec 9, 2022

Thank you for filing this issue. We appreciate your feedback and will review the issue as soon as possible. Remember, however, that GitHub isn't a mechanism for receiving support under any agreement or SLA. If you require immediate assistance, contact Salesforce Customer Support.

@AndrewRayCode

AndrewRayCode commented Dec 13, 2022

@cromwellryan I think this is another example of a breaking change (and a CI-breaking change) not in the release notes: https://github.com/forcedotcom/cli/blob/main/releasenotes/sfdx/README.md

This change is currently blocking our CI systems whenever tests are queued and pending for more than 10 minutes, which all of them are. We bumped the sfdx version to get the latest node/npm versions in the Docker image, so that we could run sfdx commands and ESLint linting from the same image for convenience.

@cromwellryan this sounds like an incident: a breaking change / bug. So we can understand the process for handling critical issues in the beta commands, can you share a post-mortem on how this breaking change got shipped?

@iowillhoit

Good morning @AndrewRayCode,
Thanks for reporting this, and thanks for your patience while waiting for a response. @cromwellryan, the team, and I all discussed this yesterday.

I completely understand this is frustrating. CI scripts are often not trivial to test, debug, or update, and when they suddenly break because of someone else’s code change, the burden falls on you.

This change was initially made as part of our multi-year transition to become 100% Open Source (you can read more here, here, and part 3 is coming soon as our journey comes to an end!). As we split commands out of our monorepo into individual plugins, we used it as an opportunity to do some housecleaning. This change was made in an effort to improve the user experience and not print pages and pages of noise.

The CLI team generally does not make any commitment to keep stdout consistent. We are constantly changing outputs to improve clarity and provide better error messaging. We do, however, try very hard to not break JSON contracts. We have a ton of integration tests that compare JSON output (see for yourself 😄) to prevent regressions.

That said, this change was more than a stdout update. It did in fact change the functionality of the CLI, even if that change was indirect. Sometimes it is very difficult to predict the downstream effects of seemingly simple changes.

We can add some sort of output as a compromise, perhaps a spinner or a message once a minute. That should appease CI timeouts without cluttering logs.

I do want to call out that in addition to our (ever increasing) suite of integration tests, we create a release candidate every week and will soon be releasing a nightly build. We have several internal and external customers that use the latest-rc build in the CI systems to help us detect breaking changes before they are released to the general public. A shoutout to @daveespo who recently caught a regression that would have had a large impact! We were able to patch this in latest-rc before it was shipped. This happens fairly often thanks to the watchful eye and care of these advocates.

I also want to take a moment to share a few suggestions that could help. We have taken advantage of some of these ourselves to improve the reliability of long-running CI scripts that have been flaky.

  • Take advantage of the JSON output and poll for the status.
    • Running deploy with -w 0 will return the Deploy ID and exit.
    • You could then poll the status with sfdx force:source:deploy:report -i THE_ID
    • Here is a similar example where we needed to poll the Communities API.
  • Implement retries into your CI scripts. Here is an example of how we retry our integration tests in GitHub Actions.
  • Bump timeouts in your CI system. Most CI systems will have an environment variable that will increase the wait time before killing a job due to lack of output.
  • Consider taking advantage of RC builds in your CI and help us make this product great for everyone ❤️
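(The first suggestion above, deploying with `-w 0` and polling `sfdx force:source:deploy:report`, can be sketched as a shell loop. In this sketch, `next_status` is a hypothetical stand-in that simulates the deploy finishing on the third poll; the real call and the `jq` parsing shown in the comment are an assumption about the JSON shape, not verbatim from this thread.)

```shell
# Poll-for-status sketch. next_status simulates the deploy report; in a
# real job it would run something like:
#   status=$(sfdx force:source:deploy:report -i "$DEPLOY_ID" --json \
#            | jq -r '.result.status')
calls=0
next_status() {
  calls=$((calls + 1))
  if [ "$calls" -lt 3 ]; then status="InProgress"; else status="Succeeded"; fi
}

status="Pending"
while [ "$status" != "Succeeded" ] && [ "$status" != "Failed" ]; do
  next_status
  echo "deploy status: $status"   # heartbeat line keeps CI logs moving
  sleep "${POLL_INTERVAL:-0}"     # set POLL_INTERVAL=30 or so in CI
done
echo "final deploy status: $status"
```

Because each iteration prints a line, this pattern also doubles as the heartbeat that output-timeout watchdogs need.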

At the end of the day, we’re human and mistakes will be made. But I assure you, this is a fantastic team that truly cares about the quality of our product. It is an honor to be a part of it.

@iowillhoit iowillhoit added the bug Issue or pull request that identifies or fixes a bug label Dec 16, 2022
@git2gus

git2gus bot commented Dec 16, 2022

This issue has been linked to a new work item: W-12234469

