Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optionally use the standard library module #2575

Closed
davidhunter22 opened this issue Nov 21, 2022 · 7 comments
Closed

Optionally use the standard library module #2575

davidhunter22 opened this issue Nov 21, 2022 · 7 comments
Labels
Building and Packaging Issues affecting build/packaging scripts and utilities

Comments

@davidhunter22
Copy link

davidhunter22 commented Nov 21, 2022

Recently Microsoft released their latest preview of VS which includes an implementation of the standard library modules std and std.compat. Other compilers/standard libraries will presumably follow suit, probably over the next year or so is my guess.
This feature request is to try to build catch2 using this standard library, and as a side effect allow users who are using the standard library module in their code to use catch2 headers. There are also benefits internally to catch itself, such as enforced code hygiene.

The goals

  1. Allow users who are using, or want to use, the standard library module to use catch2 without work arounds
  2. The changes should leave the library behaving by default exactly as it does now. You would have to explicitly opt into the new stuff.
  3. Keep the changes as minimal as possible
  4. See the sections below on using std versus std.module and why the current code can’t be used by consumers
  5. This is not a request to create a catch2 module. There is a discussion about that here Add modules support #2299
  6. This proposal does not suggest using module headers in any way I.E. import <string>
  7. This proposal does not address CI or other future support issues

Past experience

The following methodology is based on migrating a fairly large, ~100K LOC private code base and also the type_safe library at https://github.com/foonathan/type_safe. I would be extremely grateful if people suggest a better way, of improvements to this way of doing this, even if Catch rejects doing this at all. I will post a PR soon with the changes as proposed here so people can take a look or even try it out, my fork is https://github.com/davidhunter22/Catch2/tree/std_module.

What code changes

  1. Use a macro to def out inclusions of almost all standard library headers like <string> and import std; instead
  2. Fix usage of symbols that were previously global that now are not. For example change strncmp to std::strncmp. Note that this can de done selectively so it is possible to #include <stdint.h> instead. In practice the current code base is already fairly clean so there are not that many changes. Note using #include <stdint.h> is less tragic in an implementation file rather than a public header file.
  3. Some standard library headers are special cases, mostly because of macros, so still need to be included.
    • <cassert> asserts are macros. Ideally the code would never assert in header files via inline implementations, however catch does do this sometimes. My view is that implementations that assert should never be inline but I'm a pedant!
    • <version> feature test macros are also macros.
    • <cerrno> errno is a macro
    • <stdio.h> There are a number of functions like printf_s and gmtime_s. These are defined in Annex K of the C standard, https://open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm. They are explicity excluded from the std namespace and therefore not in the std module
  4. The current core library code is very macro clean, the test cases less so. I only found the following macros being used in the core library DBL_MAX_10_EXP, EOF. I left these and included the relevant standard library header.

The implementation in practice

The code is being built be generating a VS solution using the following command

cmake -DCATCH_DEVELOPMENT_BUILD=ON -DCATCH_BUILD_TESTING=ON -DCATCH_BUILD_EXAMPLES=ON -DCATCH_USE_STDLIB_MODULE=ON -DCATCH_ENABLE_WERROR=OFF path_to_src

The code has to be built with VS 2022, 17.5 Preview 1 or above, this is the firt version that supplies the std module,
The code base after the changes has optional code based on a single macro. The code, mostly the sections that do #include … are changed to look like the following

#if CATCH_USE_STDLIB_MODULE
    import std;
#else
    #include <chrono>
    #include <ratio>
#endif

This change effects almost every file. If Catch had a central header that ervy other header include the import std; could be done ther but it does not
The only other change are some decorations of types so uint32_t becomes std::uint32_t. There are very few of these, maybe 10 or so.

The CATCH_USE_STDLIB_MODULE macro is defined as follows.
The root CMakeList.txt text has the following added

option(CATCH_USE_STDLIB_MODULE "Build using standard library module rather than standard library headers i.e. <vector>" OFF)

This result in a macro being defined in the Catch2 project's CMakeLists.txt.

if(CATCH_USE_STDLIB_MODULE)
  target_compile_definitions(Catch2 PUBLIC CATCH_USE_STDLIB_MODULE)
endif()

The CATCH_USE_STDLIB_MODULE can be used by end users via the magic of CMake of manually if they want to use the std module version.

The Catch2 project's CMakeLists.txt also enforces C++23 usage as that is the minium require the the std module

if(CATCH_USE_STDLIB_MODULE)
  target_compile_features(Catch2 PUBLIC cxx_std_23)
endif()

Why use the std module rather than std.compat

Using std rather than std.compat allows you to be selective about what types are in the global namespace. If you use std.compat you get a lot of the garbage in the global namespace that old style headers, I.E. <vector> put there.
Note <vector> may not directly put things in the global namespace but it does transitively, for instance it uses size_t which comes from <cstddef> which then includes <stddef.h> etc… all with lots of fun macros being defined along the way.
Using std.compat at least controls the macro spam and avoids lots of internal types/functions/… being made visible. When using the std module, unlike std.compat you can choose to include certain headers, typically things like stdint.h if you don't want to change code like uint32_t to std::uint32_t. The choice to pollute the global namespace is then very explicit.
My view is that a library like Catch that is intended to be used by many consumers should try to minimize, as much as possible, the global/macro pollution that happens to users when they “#include <catch_....>. Also as its name suggests std.compatis a compatibility bridge, if it’s not much harder to do so why not go straight tostd`

Why doesn't it all just work now

Why macro guard headers and do other changes shouldn't it all just work as is?
If a consumer does an import std; in theory using catch2 as is should just work but as STL, of MS fame, mentions here microsoft/STL#3195 in a discussion about this

Mixing #include and import std; in any order should work, according to the Standard. However, this is an extremely stressful scenario for the compiler... I've been told by our modules dev @cdacamar to expect that this will totally fail until major compiler work happens

Currently when consumers of catch2 try to import std and then #include <catch_foo.h> they will get compiler errors because the #include catch_....h header will do things like #include <string>. I suspect mixing old style headers with standard library modules will be a source of bugs for many years so why not avoid the issue. Note that not all headers trigger an error <cstdint> does not while <cmath> does. In general my feeling is if you do import std; then use headers like <stdint.h> rather than <cstdint> if you want global namespace “C” thing like sprintf. You will already have the std::sprintf version via the module so why confuse the issue.

MS compiler issue with the global namespace

The Microsoft compiler has an issue where it will sometimes, under conditions that are not at all obvious, make some std namespace types visible in the global namespace without any using statements or declarations. See here https://developercommunity.visualstudio.com/t/Possible-bug-in-VS-2022-175-Previews-1/10203863 for a discussion.
So the following code compiles

import std;

void Foo( std::uint64_t );

uint64_t x = 0; // This line should not compile

On discord Cameron DaCamara, a Microsoft compiler dev, said that this is a compiler issue to do with modules. He commented

Yes, this is a problem the compiler has struggled with for quite some time now. The underlying technical problem cannot be easily solved without rearchitecting how name lookup works, which is an insanely risky change but necessary to fix the problem

So, this problem is unlikely to go away any time soon. This problem is actually not that bad, it just brings types like std::uint8_t into the global namespace. However, these types were already in the global namespace before via headers like . Using old style standard library headers it was pretty much impossible to avoid global namespace pollution. This compiler problem just means it is not always possible to clean up the usage of these global namespace types if that is even desired. When I say clean up, I mean the compiler should give you an error when you use a global type like ::uint8_t which you can then fix, this issue just mean that won’t always happen.

@davidhunter22
Copy link
Author

davidhunter22 commented Nov 21, 2022

A very recent check in, I believe this one, 291c502 broke the C++ 20 and above build in Visual Studio.

It would be a good idea to add a VS 2022 C++20 build to CI, I would help to do this but know nothing about how that magic works!

@horenmar
Copy link
Member

I see I never got back to this.

Thanks for the (significant) effort put into the linked PR, but I am not interested in maintaining the results to support buggy prototype of modules inside MSVC.

@horenmar horenmar added the Building and Packaging Issues affecting build/packaging scripts and utilities label Jan 23, 2023
@cdacamar
Copy link

Hi @horenmar,

Regarding this:

Thanks for the (significant) effort put into the linked PR, but I am not interested in maintaining the results to support buggy prototype of modules inside MSVC.

The only real bug in MSVC is being slightly to permissive when it comes to declarations in the global module, which should still retain all of the throughput benefit of C++ modules. I am also curious where 'prototype' came from? The modules implementation is far from a prototype, it adheres to everything specified in the standard (modulo a few bugs). The problems solved by the PR above are indeed problems that GCC and Clang will hit once their support for C++ modules matures.

Please consider taking a second look.

@horenmar
Copy link
Member

@cdacamar I was under the impression that mixing include and import in one file should be work, but does not, and that's why this PR exists: so that including catch2 does not toss in bunch includes of stdlib.

@davidhunter22
Copy link
Author

Just to add it appears that Clang is also going to have problems mixing import std; and #include <some_std_header>. As @cdacamar mentioned they would. My view is that even if a compiler claimed to support it why would I want my code to be a messy mixture of import and #include that will probably confuse users.

@kelteseth
Copy link

I have linked this issue to https://arewemodulesyet.org/ so we can track modules support in the c++ eco system.

@davidhunter22
Copy link
Author

Note https://arewemodulesyet.org/ is not a site about migrating to import std; it is about creating modules for existing libraries which is quite a different, and much larger, problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Building and Packaging Issues affecting build/packaging scripts and utilities
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants