-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scripts/xtensa-build-zephyr.py: Allow for alternate toolchain versions #9736
base: main
Are you sure you want to change the base?
Conversation
Originally submitted in #9731 and split out for further discussion
That's fair. For sure it makes sense to log the exact toolchain as part of the build, we should probably always have been doing that (Zephyr will log its SDK version in use, for example). We could renaming it something like "SOF_CADENCE_TOOLS_OVERRIDE" or somesuch if you like? Alternatively we could use a command line option like "--alt-toolchain=" or similar. The use case here is that I'd like to be able to unify builds to a single tools version where practical, for validation and sanity reasons (also disk space, heh, these are very large). And in practice the "official toolchain" has always been a little ambiguous. Licenses don't transfer between partners, so different companies buy different things from Cadence. And occasionally we've seen respins of toolchains for the same hardware, leading to confusion. Obviously there's always risk when tracking dependencies but in general it seems low and manageable. |
Also just to be blunt: partners (ahem) tend to get stuck using extremely old software on a regular basis, for no particular reason except inertia. AMD Rembrandt is on a five year old toolchain, Tiger Lake builds with seven year stale tools (and is still on xcc). Panther Lake isn't remotely close to release and is already two years behind the current toolchian. Vendors will surely support what they sell, but they're likely to be puzzled at the resistance of customers to use their new features. I (vaguely) remember when @marc-hb was working on the deterministic builds feature there was a minor compiler (preprocessor?) bug where the response was basically "Why don't you just upgrade?" |
Currently the build scripts force a single "blessed" Cadence toolchain for each platform, which prohibits the use of upgraded/fixed/standardized tooling by downstreams. Let XTENSA_TOOLS_VERSION be set in the environment and use the PlatformConfig value as a fallback if unspecified. Also log the resulting choice (and also XTENSA_TOOLS_ROOT, which is likewise external input to the script) for build tracking, so any deployment downstreams can reconstruct what was done. Signed-off-by: Andy Ross <[email protected]>
06e5011
to
ad155c9
Compare
Push a new version that implements the logging bit, e.g.:
Let me know what you'd like to see for the API. I think regardless of whether we can agree that it's a good or bad idea to switch toolchain versions, we should all agree that it should be easy, to permit upgrades and regression analysis. |
@andyross Thanks, let's see what the CI runs say. I can do some manual checks with the logging you added, and while doing that, let's give others opportunity to weigh in. W.r.t. the toolchains, ack, this topic has popped up many times in the past and we've poked at possibility to update more often, but currently, my personal take is that I don't see that happening. I.e. at least for Intel SOF platforms, the xtensa toolchain version is picked when we go public with a platform and it will stay the same until end-of-life. Not ideal for us developers always, but I don't currently see that this will change. I do agree it should be possible to override the toolchain version with a clear, deterministic interface, so this PR does make sense. |
That was: And don't get me started on this one :-D
Actually, @singalsu tried an toolchain upgrade experiment for TGL and it did not go well. I think the firmware stopped booting... Oh, wait! Could this have been another instance of the rimage heisenbug #9340? Worth a new attempt maybe.
That makes sense to me too. Logging is critical though because you can never trust anything else. Should the help text at line 200 be updated accordingly? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, all we do risk is couple of questions "why is not compiling for me"?
we just need to ensure our build machines don't set this env variable for any (forgotten) reason
"Everyone should just..." = fail. There is always someone somewhere. When I was young and very ignorant, I didn't think reproducing a build configuration and environment was a problem. Now I know it's one of the hardest thing in the world. Especially in automation where "CI secrets" and hacks are littered and hidden in every dark corner to make things work and pass at any cost. "Bonus" points when the automation is not even public but visible to a single company. Or worse: restricted even inside the company. https://martinfowler.com/articles/continuousIntegration.html#PutEverythingInAVersionControlledMainline <= you wish! Just think how hard it can be to reproduce a failure reported elsewhere. So, every build or automation script I touch now LOGS absolutely everything. Build logs = the only thing you can rely on. tl;dr: do not restrict configuration flexibility because it's futile, people can always find a hack and way around it. But on the contrary: while giving flexibility, piggy-back and LOG profusely everything that happens. I barely looked at the change but I think it is what this PR does. Now do not add flexibility that only one or two persons will use, that's just extra maintenance for no good reason. They can just keep rebasing that. Some long and non-exclusive list of places where I learned this: |
Currently the build scripts force a single "blessed" Cadence toolchain for each platform, which prohibits the use of
upgraded/fixed/standardized tooling by downstreams. Let XTENSA_TOOLS_VERSION be set in the environment and use the PlatformConfig value as a fallback if unspecified.