
scripts/xtensa-build-zephyr.py: Allow for alternate toolchain versions #9736

Open · wants to merge 1 commit into main

Conversation

andyross (Contributor)

Currently the build scripts force a single "blessed" Cadence toolchain for each platform, which prohibits the use of
upgraded/fixed/standardized tooling by downstreams. Let XTENSA_TOOLS_VERSION be set in the environment and use the PlatformConfig value as a fallback if unspecified.
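The proposed lookup is essentially a one-line environment fallback. A minimal sketch, assuming names of my own choosing (the function and parameter names below are illustrative, not the actual identifiers in xtensa-build-zephyr.py):

```python
import os

def resolve_tools_version(platform_default: str) -> str:
    # Environment override wins; the hard-coded PlatformConfig
    # value remains the fallback when the variable is unset.
    # (Sketch only: names here are illustrative, not the script's own.)
    return os.environ.get("XTENSA_TOOLS_VERSION", platform_default)
```

Passing the existing PlatformConfig version as `platform_default` keeps current behavior unchanged for anyone who does not set the variable.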

@andyross (Contributor Author)

Originally submitted in #9731 and split out for further discussion

@kv2019i:

This is potentially a scary change. Whether hard coding the versions in this Python file is good or bad, it's been in use for a long time, specifically to ensure that all developers and the various build systems use the same configuration to build. More importantly, any change is done with a git commit that is tested with the same test process as other pull requests. If we now open a new getenv() path to modify the build, it's hard to ensure there are no CI builders out there that have foobar XTENSA_TOOLS_VERSION contents and, in the worst case, don't fail immediately but start generating slightly different binaries (which test runners then test).

So I'd like to make this more explicit. Use a different name for the environment variable, or at minimum print a prominent warning to the log file that a non-standard toolchain is used in the build. I don't think this is printed at all right now.

That's fair. For sure it makes sense to log the exact toolchain as part of the build; we should probably always have been doing that (Zephyr will log its SDK version in use, for example). We could rename it to something like "SOF_CADENCE_TOOLS_OVERRIDE" or somesuch if you like? Alternatively we could use a command line option like "--alt-toolchain=" or similar.

The use case here is that I'd like to be able to unify builds to a single tools version where practical, for validation and sanity reasons (also disk space, heh, these are very large). And in practice the "official toolchain" has always been a little ambiguous. Licenses don't transfer between partners, so different companies buy different things from Cadence. And occasionally we've seen respins of toolchains for the same hardware, leading to confusion.

Obviously there's always risk when tracking dependencies but in general it seems low and manageable.

@andyross (Contributor Author)

Also, just to be blunt: partners (ahem) tend to get stuck using extremely old software on a regular basis, for no particular reason except inertia. AMD Rembrandt is on a five year old toolchain, Tiger Lake builds with seven year stale tools (and is still on xcc). Panther Lake isn't remotely close to release and is already two years behind the current toolchain. Vendors will surely support what they sell, but they're likely to be puzzled at the resistance of customers to using their new features. I (vaguely) remember when @marc-hb was working on the deterministic builds feature there was a minor compiler (preprocessor?) bug where the response was basically "Why don't you just upgrade?"

Currently the build scripts force a single "blessed" Cadence toolchain
for each platform, which prohibits the use of
upgraded/fixed/standardized tooling by downstreams.  Let
XTENSA_TOOLS_VERSION be set in the environment and use the
PlatformConfig value as a fallback if unspecified.

Also log the resulting choice (and also XTENSA_TOOLS_ROOT, which is
likewise external input to the script) for build tracking, so any
deployment downstreams can reconstruct what was done.

Signed-off-by: Andy Ross <[email protected]>
@andyross (Contributor Author)

Pushed a new version that implements the logging bit, e.g.:

Building using Cadence Xtensa Tools
  XTENSA_TOOLS_VERSION = RJ-2024.3-linux
  XTENSA_TOOLS_ROOT = /xtensa/XtDevTools
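Output like the above can come from a trivial logging helper. A hedged sketch, assuming a helper of my own naming (the function name and exact wording are illustrative, not necessarily what the PR implements):

```python
import os

def log_toolchain(version: str) -> None:
    # Log the effective toolchain choice so downstream deployments
    # can reconstruct the build from its logs.
    # (Sketch: the helper name is illustrative, not the script's own.)
    root = os.environ.get("XTENSA_TOOLS_ROOT", "(unset)")
    print("Building using Cadence Xtensa Tools")
    print(f"  XTENSA_TOOLS_VERSION = {version}")
    print(f"  XTENSA_TOOLS_ROOT = {root}")
```

Logging XTENSA_TOOLS_ROOT alongside the version matters because it is likewise external input to the script.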

Let me know what you'd like to see for the API. I think regardless of whether we can agree that it's a good or bad idea to switch toolchain versions, we should all agree that switching should be easy, to permit upgrades and regression analysis.

@kv2019i (Collaborator) commented Dec 18, 2024

@andyross Thanks, let's see what the CI runs say. I can do some manual checks with the logging you added, and while doing that, let's give others opportunity to weigh in.

W.r.t. the toolchains, ack, this topic has popped up many times in the past and we've poked at the possibility of updating more often, but my personal take is that I don't see that happening currently. At least for Intel SOF platforms, the xtensa toolchain version is picked when we go public with a platform, and it stays the same until end-of-life. Not always ideal for us developers, but I don't currently see this changing.

I do agree it should be possible to override the toolchain version with a clear, deterministic interface, so this PR does make sense.

@marc-hb (Collaborator) commented Dec 18, 2024

@marc-hb was working on the deterministic builds feature there was a minor compiler (preprocessor?) bug where the response was basically "Why don't you just upgrade?"

That was:

And don't get me started on this one :-D

Also just to be blunt: partners (ahem) tend to get stuck using extremely old software on a regular basis, for no particular reason except inertia.

Actually, @singalsu tried a toolchain upgrade experiment for TGL and it did not go well. I think the firmware stopped booting... Oh, wait! Could this have been another instance of the rimage heisenbug #9340? Worth a new attempt, maybe.

regardless of whether we can agree that it's a good or bad idea to switch toolchain versions, we should all agree that it should be easy, to permit upgrades and regression analysis.

That makes sense to me too. Logging is critical though because you can never trust anything else.

Should the help text at line 200 be updated accordingly?

@marcinszkudlinski (Contributor) left a comment

LGTM. All we risk is a couple of questions like "why is it not compiling for me?"
We just need to ensure our build machines don't set this env variable for any (forgotten) reason.

@marc-hb (Collaborator) commented Dec 19, 2024

we just need to ensure ...

"Everyone should just..." = fail. There is always someone somewhere.

When I was young and very ignorant, I didn't think reproducing a build configuration and environment was a problem. Now I know it's one of the hardest things in the world. Especially in automation, where "CI secrets" and hacks are littered and hidden in every dark corner to make things work and pass at any cost. "Bonus" points when the automation is not even public, but visible only to a single company. Or worse: restricted even inside the company.

https://martinfowler.com/articles/continuousIntegration.html#PutEverythingInAVersionControlledMainline <= you wish!

Just think how hard it can be to reproduce a failure reported elsewhere.

So, every build or automation script I touch now LOGS absolutely everything. Build logs = the only thing you can rely on.

tl;dr: do not restrict configuration flexibility, because it's futile; people can always find a hack and a way around it. On the contrary: while giving flexibility, piggy-back on it and profusely LOG everything that happens. I barely looked at the change, but I think that is what this PR does.

That said, do not add flexibility that only one or two people will use; that's just extra maintenance for no good reason. They can just keep rebasing it.


A long and non-exhaustive list of places where I learned this:

#9340, #6920 and #7114 were also "interesting".
