RFC: Add static code analyzer integration #51212
Conversation
Add a generic way to integrate any static code analyzer into the cmake build script.
- Writing the selected toolchain paths to the sca_export.json
- Reading the static code analyzer toolchain paths from a sca_import.json and replacing the CMAKE_C_COMPILER settings with the values read from the json
- Executing static code analyzer customized scripts updating the analyzer's configuration (with the information in sca_export.json)
- Support for an in-tree and an out-of-tree location, allowing analyzers to be integrated as part of Zephyr itself and as part of a customer environment using Zephyr as its base.
Signed-off-by: Roman Kellner <[email protected]>
As mentioned in the TSC, is there a reason not to use:
@carlescufi I feel there are several arguments against it:
From https://gitlab.inria.fr/sed-bso/heat/-/blob/master/CMakeLists.txt:
But instead of implementing such an integration ourselves, it might make sense to raise a change request toward CMake to provide a generic solution. I added a comment to: https://gitlab.kitware.com/cmake/cmake/-/issues/23945
@romkell thanks for this proposal, it surely would be beneficial to have better support for SCA tools in Zephyr. There are a couple of questions from me. First, why is it needed to start exporting CMake cache variables in a json format, when those settings are directly available inside CMake itself, and also in the CMakeCache for external tools to read? zephyr/cmake/sca_integration.cmake Line 45 in 0d98433
to the update tool. But a json format is really only useful for the python script, not for bat and bash, which need very customized handling to understand json. This again leads me to wonder, what exactly is the purpose of the update.[bat|sh|py] script? Can you give an example? You do write:
but why the complexity of going through an export json? Isn't such configuration quite static, meaning it can be prepared in advance? For the import file: https://github.com/zephyrproject-rtos/zephyr/blob/0d98433a73ac74b2c639a1a2d9e6a6c2e452035c/scripts/sca_integration/sca_import.json Already today we have an established infrastructure for toolchain / compiler / linker: did you ever consider making use of that infrastructure? Such an approach has the benefit of being aligned with existing toolchain handling and not relying on new config / json files. The main thing I'm unsure about is an actual example for your
@tejlmand what is the recommended way for external tools to read the CMakeCache? I searched the internet and only found deprecated ways. To sparse Zephyr code in production I ended up copying and pasting a bit of Zephyr CMake code, which felt very awkward:
I don't know if there exists any recommended ways, but the format of the cache follows the format of setting cache variables from the command line when using full signature, that is: One can of course read the CMakeCache directly as a txt file, but another alternative is to simply invoke: In Zephyr we have a dedicated python module which can read the CMake cache. zephyr/scripts/west_commands/zcmake.py Lines 113 to 129 in 1374415
The cache types can be seen here:
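The CMakeCache.txt entries mentioned above follow the `NAME:TYPE=VALUE` shape used on the command line, with comment lines starting with `#` or `//`. As a rough illustration (Zephyr's zcmake.py does this more robustly), a minimal parser could look like:

```python
# Minimal, illustrative parser for CMakeCache.txt entries of the form
# NAME:TYPE=VALUE.  Not Zephyr's actual implementation; see
# scripts/west_commands/zcmake.py for the real one.
def parse_cmake_cache(text):
    cache = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks and comments; the cache uses both '#' and '//' comments.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        if ":" not in line or "=" not in line:
            continue
        name_type, value = line.split("=", 1)
        name, _, ctype = name_type.partition(":")
        cache[name] = (ctype, value)
    return cache

sample = """\
# This is the CMakeCache file.
//C compiler
CMAKE_C_COMPILER:FILEPATH=/opt/zephyr-sdk/arm-zephyr-eabi/bin/arm-zephyr-eabi-gcc
CMAKE_BUILD_TYPE:STRING=
"""
print(parse_cmake_cache(sample)["CMAKE_C_COMPILER"][1])
```

This also handles empty values such as `CMAKE_BUILD_TYPE:STRING=`, one of the cases that makes parsing `cmake -L` output noisy.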
From bash,
sparse does not know or care about CMake but it (unfortunately) needs a value from Zephyr's
I looked at this option and then gave up because I didn't want to parse the output (filter out all the noise, strip the type, check for the weird NOT-FOUND and what not).
Very interesting; if I had seen this maybe I would have used it. However, it looks like this is only a library, meaning a new python script has to be created to import and use it. So this is still not at the level of
PS: I forgot which other "deprecated" method I found, sorry. Maybe I got confused.
Thanks for looking at it. The notion here is to provide a generic interface for a Zephyr OS customer/user to
without having to change anything in the cmake build system itself.
Since Zephyr OS selects the compiler via the defined board, it is rather dynamic and not static. It is different whether I compile native_posix, nucleo_g071rb or sifive_...
I did not know about CMakeCache.txt and its format. Is it valid to read it directly (CMakeCache.txt being some kind of database, whose format might change at Kitware's disposal, even if they might not have changed it in years)? Should one not use the CMake API set(), unset() and $CACHE{varname}, which are CMake internals? How should an external tool access it?
We could use find_program and the program name only. Providing the full path to the tool is unambiguous. I do not insist on absolute paths. Both could be allowed. The more options, the more complexity.
I used json because it seemed the only format that cmake supports out of the box (no yml, no ini ...) and it is a standard.
There are so many programming languages that have excellent support for json; bat and bash probably not. But I rather saw bat and bash as the most common scripts on Windows and Unixes, which can then very likely call any other "program" of the user's choice.
Since the cmake called compiler change with the selected board, the SCA tool needs to know that information before it executes. Eventually the configuration file for the SCA tool has to be adjusted accordingly. Every SCA tool will be different in that respect.
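To make the role of the update hook concrete: a hook could read the exported toolchain information and rewrite the analyzer's own configuration before the analysis runs. This is a hypothetical sketch only; the field names C_COMPILER / CXX_COMPILER follow the keys mentioned in this PR, while the config file layout is invented for illustration.

```python
import json

def update_sca_config(export_path, config_path):
    """Hypothetical update.py hook: read the toolchain info exported by the
    build (sca_export.json) and rewrite an analyzer config accordingly.
    The 'compiler=' key/value config format here is made up for the example."""
    with open(export_path) as f:
        exported = json.load(f)
    # Point the analyzer at the compiler CMake selected for the current board.
    config_lines = [
        "compiler=" + exported["C_COMPILER"],
        "cxx-compiler=" + exported["CXX_COMPILER"],
    ]
    with open(config_path, "w") as f:
        f.write("\n".join(config_lines) + "\n")
```

Because the board selection changes the compiler, such a hook would have to run after the build system has resolved the toolchain, which is exactly why the export happens at a late configuration stage.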
I saw all of that. From what I understood, it is the toolchain / compiler / linker etc. that the Zephyr OS project maintains and even bundles in the SDK. Meanwhile I feel that even calling it an "SCA" integration interface is narrowing it down too much, since you can integrate any tool, not only SCAs, with the option mentioned at the top of this response. I plan to attend the architecture WG meeting next week (22.11.22, unless cancelled) and would be happy to explain and discuss. Can we put that on the agenda?
Do you have examples of how this works with tools like cppcheck, clang-tidy, iwyu? I have used built-in support for such tools utilizing
Using the json part and scripting does seem to be awkward, and I agree with Torsten: there is already infrastructure in place to support different toolchains, linkers and so on, so adding a launcher configuration using the same model seems the right and obvious way to go.
I had a quick look at
I still feel we should not only support the integration of one SCA but as many supportive tools as the user likes, with an option to select one when building (west build, cmake).
Something as
Personally I do not like to clutter the env with all kinds of env variables. I prefer a config file (json, yml, ini, I do not care that much, as long as there is good support from script languages) which I can easily put under version control. JSON seems to be cmake's choice: https://cmake.org/cmake/help/latest/manual/cmake-presets.7.html.
JSON is also IPLD's choice https://ipld.io/docs/codecs/ and many others'. It's the standard, I don't think it has any serious competition in that particular niche, has it?
Environment variables are a "build code smell". Probably their biggest sin is being nearly invisible and making it very hard to reproduce issues because of unsuspected differences. Numerous tools clean the environment (sudo, docker, bitbake, ...). CMake cannot define build-time environment variables, see for instance sparse static analysis commit 3ebb18b
EDIT: yet another issue with environment variables: zephyrproject-rtos/west#613
Also, no one can ever tell where environment variables came from and no one knows who's going to read them. They're "super-global" variables. Environment variables are used when every other, proper solution is unavailable or has failed.
note, I'm not against json; my problem with json in this particular case and the proposal in this PR is that it is not following any of our existing ways of configuring the build system or integrating extra tools. For example we use variables for specifying toolchains; those can be set either in the environment or on the command line, such as
We use yaml files in Zephyr modules to integrate Zephyr modules with the build system. We have Kconfig to enable / disable features in the build and control compile options. Therefore I don't believe we should also start to use json for integrating SCA tools, especially not when I believe the existing build infrastructure can easily be extended to support such tools in a way that is aligned with what we already have in place.
Did anyone request this to be done using environment variables ?
Only the Zephyr SDK; Zephyr supports other toolchains besides that one: oneAPI, espressif, gnuarmemb, arcmwdt, etc. So users can already today choose one of the officially supported toolchains, or hook into the build system using CMake directly for any toolchain they want to use.
but if you integrate support for the SCA tool using CMake, then you have all this information available without the need for exporting to json, and you can much easier decide how to configure the SCA tool based on the board, arch, compiler, etc.
But CMake already has a lot of built-in stuff for generating files based on settings.
@carlescufi, @tejlmand, @marc-hb, @nashif: can we make that an agenda point in the architecture WG, possibly next week? I feel verbal communication is more productive than this chat conversation. Everyone has already written quite a lot. I guess we need to discuss the base concept before the details of how to do it.
Sure, let's discuss this in the Architecture WG.
Architecture WG:
As promised, see #52671 for an example of how this can be done. @romkell please take a look and see how this fits your needs.
PR closed in favor of #52671.
@romkell I wanted to integrate Klocwork in Zephyr. Where do I need to make the changes?
I withdrew my approach in favour of #52671. You can find https://docs.zephyrproject.org/latest/develop/sca/index.html in the documentation for Zephyr OS > 3.3.0.
FWIW we've been using https://docs.zephyrproject.org/latest/develop/sca/index.html in https://github.com/thesofproject/sof/blob/fe83601406f32f/.github/workflows/sparse-zephyr.yml for about a month now and we haven't found any issues with it.
How do I enable the Klocwork SCA in Zephyr version 3.0.99?
Brief
This is an unfinished draft implementation of a static code analyzer (SCA) integration into Zephyr's CMake build system. It is not a full integration but rather an interface:
Mechanism and architecture
The idea behind it is to replace the actual toolchain in the CMAKE_C_COMPILER etc. variables by the SCA executables at a late configuration stage of the build system (via the import json file). The actual toolchain information is exported (export json file) and a hook script is started, allowing the SCA to be configured accordingly.
Multiple SCAs can be added in the import json file. Their configuration is referenced via SCA_VARIANT=<sca-variant-name>.
The hook scripts are organized in a <sca-integration>/hooks/<sca-variant> subfolder.
By using json type files for the import and export format, the mechanism can easily be extended.
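For illustration, the import file could map each SCA variant to the executables that replace the compilers. The schema below is a guess based on the mechanism described above, not the draft's actual files; the variant name and paths are made up, while the C_COMPILER / CXX_COMPILER keys follow those mentioned elsewhere in this PR:

```python
import json

# Hypothetical sca_import.json content: one entry per SCA variant.
SCA_IMPORT_JSON = """
{
  "my-sca": {
    "C_COMPILER": "/opt/my-sca/bin/c-analyzer",
    "CXX_COMPILER": "/opt/my-sca/bin/cxx-analyzer"
  }
}
"""

def apply_sca_variant(cmake_vars, sca_import_text, variant):
    """Mirror the late-configuration swap: replace CMAKE_C_COMPILER etc.
    with the executables of the variant selected via SCA_VARIANT."""
    replacements = json.loads(sca_import_text)[variant]
    out = dict(cmake_vars)
    if "C_COMPILER" in replacements:
        out["CMAKE_C_COMPILER"] = replacements["C_COMPILER"]
    if "CXX_COMPILER" in replacements:
        out["CMAKE_CXX_COMPILER"] = replacements["CXX_COMPILER"]
    return out
```

Extending the mechanism then only means adding another top-level key to the import json (and a matching hooks subfolder), without touching the build scripts.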
Integration into build system
There is one main cmake script handling the sca_integration
This script needs to be hooked into the build system at the right position/location.
Configuration folders
There are two equally structured folders, which are considered by the script:
Open points / questions
Are the CMAKE_<xyz>_COMPILER variables empty (unless CMAKE_C_COMPILER) at the current location where the cmake file is currently called?
Call examples
Even though not really different, two examples for in and out of tree:
In tree
Out of tree
To make it even more convenient, a west option --sca / -s or similar could be implemented, avoiding the -- -DSCA_VARIANT=<sca-variant-name>.
Back porting
The implementation is intended to be back-ported to 2.7-auditable-branch.
Existing integrations
The suggestion would be to integrate any analyzer tool where the actual toolchain/compiler needs to be replaced by the analyzer in the build system, which then calls the toolchain/compiler before starting its own analysis.
The mechanism would also work for analyzers which intercept the call to the toolchain/compiler at OS level (as virus scanners do), by telling those analyzers which executable to intercept. In that case the import file section for such an analyzer remains basically empty (C_COMPILER, CXX_COMPILER etc. will be missing in the import json).
I would assume that sparse (#43776) could also be integrated with that mechanism.
Further information
Commit comment