-
Notifications
You must be signed in to change notification settings - Fork 251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
User-defined metafunctions based on dynamic library loading #1261
Conversation
Co-authored-by: Johel Ernesto Guerrero Peña <[email protected]>
Co-authored-by: Max Sagebaum <[email protected]>
Co-authored-by: Johel Ernesto Guerrero Peña <[email protected]>
Very internal to cppfront, to decide if we later want this to be usable by end users.
`get_alias` unused, but left for completeness in case its useful later. Co-authored-by: Edoardo Lolletti <[email protected]>
To Do
|
There's still plenty to do, but I figured I would open the PR now to start getting reviews, as the main ideas are implemented and the basic case works. Expect lots of rebases as I go (even tho they are extra annoying due to self-hosting), since I like having a clean history. I apologize for taking so long for this, after all, I made #907 (comment) ages ago by now. Going forward I should be able to contribute code more regularly, not just by participating in discussions. |
For this
I am not too sure how to proceed. I will probably not handle that for this PR. |
Regarding
Both directions have their pros and cons. First, for "move existing metafunctions to a separate header":
For "somehow load them via DLLs":
If its up to me, I would go with the former, seems like a more natural choice; The "standard library" is just code, and would only ever be included if you are authoring metafunctions, and I don't even think that the additional compile time would be that much. |
Nice to see the update. I would have a closer look, but I am currently on vecation. Maybe I will have time at the beginning of September. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked over it and most seems good to me.
Just two general remark that I also hinted at in some comments:
- I do not understand why we have to support
in
andinout
calling. Wouldn't beinout
calling enough. Is there a reason for this? - Some places are quite hardcoded for type and function metafunctions. If we want to support metafunctions on other objects like namespaces, variable declarations, using declarations, etc. Would it not be better to generalize this now. I suggested an enum and tuple approach.
- On that note. What if somebody wants to have a meta function for everything. He would need to implement all different interface function. How about a wildcard interface function?
@hsutter What is you opinion about these three points?
// | ||
// The verbose name is on purpose, lets try to give it a better name if it | ||
// can be generalized into something useful for all developers. | ||
foreign_interface_pseudovalue: (inout t : meta::type_declaration) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to have this new metafunction? A regular @interface should do the same. Cppfront can then extend from this interface and override everything.
What did I miss?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have explained this better in the PR summary, but this is not just for an interface, we are doing more than that.
The current API uses value semantics, not pointer/reference semantics. For example, when you call declaration.as_type()
, you are not getting a reference to a declaration, you are getting a value representation of the type declaration (type_declaration
, not type_declaration&
or type_declaration*
). This presents an issue since in order to have a full value instance like this, you must know the entire definition, which in turn would mean spilling all the implementation guts of the compiler to the users of the reflection API.
Instead, by using this specific metafunction, we avoid the issue entirely by making the definitions "wrappers" that point to the actual implementation, which lives within Cppfront (or anything else that might implement the reflection API). You can't do that with just a regular @interface
, since you'd need to cast to a pointer, and modify every metafunction to conform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I understand the problem. So the options are:
- Have this metafunction that basically is a value wrapped interface. (Maybe rename it to something like this.)
- Change the call syntax of metafunctions to either use pointers or
inout
style.
My own choice would be 2 with inout
style. But I can see also the advantages of option 1. Since the compiler services are already provided by pointer maybe we can move everything to pointer semantics.
I just thought about the reason why Herb might have chosen value semantics here. The objects are constructed on the fly. Using pointers or references would not work since the objects are only temporary and would be destroyed. This could only be solved with smartpointers, which we do not want to have for such an interface. @hsutter is this correct?
If so, your solution should be generalized.
// | ||
// meta - mark f as a metafunction and generate registration code for it | ||
// | ||
meta: (inout f: meta::function_declaration) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need both in
passing style and inout
passing style. inout
would also cover all cases. Since both exist it can not be because of security or speed reasons.
I would suggest to drop thee in
passing style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's currently a metafunction like print
that uses in
parameter style since it doesn't modify the declaration. whether or not we want to support those is up to Herb really, but since in
is the default parameter I feel like I can't leave it out for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know. I am just asking the question: Was the in
style because of some special thoughts or because it was just the minimal required definition for this function.
// * function_declaration or type_declaration (implicitly deduced to be the above), AND, | ||
// * either in or inout paramater passing style, AND, | ||
// * to be publicly accessible, AND, | ||
// * to not be on header file, AND, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They could be in a header file. If this header is just included in one translation unit it would be valid code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, tbh I am not sure if we should allow it or not, in any case we can detect multiple registration of the same function, which is a clear ODR violation, but it can be entirely avoided if we just prevent users from writing stuff in headers in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Maybe have extra file extension for defining metafunctions.
Registering a function twice should not be a problem if the function pointer is the same. Having the same name with a different function pointer should be a problem.
else { | ||
error( "unrecognized metafunction name: " + name ); | ||
error( "unrecognized metafunction name: '" + name + "' for type declaration" ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the detection logic should be changed. We should have two lists:
- Internal metafunctions
- External metafunctions
First look up the internal ones, afterwards the external ones. If both are not found create an error that lists all.
This would reduce the number of places which need to change when a new metafunction is implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we should dump every single metafunction that could be registered, for big programs the list could grow quite big. This does look first on internals and then externals second though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we print out a list of all metafunctions. Printing out all external metafunctions would be a logical consequence. It also helps the user to debug. If he does not find the metafunction in the list it is not loaded. He might had a typo and can see how the metafunction is actually called.
- Printing out everything could be enabled via a compiler flag
- Or a flag can force cppfront to print out all registered metafunctions.
// TODO(DyXel): Do full/better name look-up. | ||
auto search = functions.find(name); | ||
if(search == functions.end()) return false; | ||
auto& entry = search->second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leave an empty line here. The early out is hardly recognizeable. Maybe also add a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing it out, it could be great to have #1253 so I can stop worrying about stylistic choices. In any case, this code is throwaway since I need to implement full registration and look-up.
auto visitor = [&](auto&& v) -> bool { | ||
if constexpr(requires { v(decl); }) { | ||
v(decl); | ||
return true; | ||
} | ||
return false; | ||
}; | ||
if constexpr(std::is_same_v<meta::type_declaration, CPP2_TYPEOF(decl)>) | ||
return std::visit(visitor, entry.type); | ||
else | ||
return std::visit(visitor, entry.func); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty involved. Why can you not call entry.type
directly.
Would it be better to have two functions try_apply_type
and try_apply_function
. The lookup could be refactored into its own method and then the correct method could directly be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like I mentioned, this code is throwaway but I personally don't feel like its involved. This code is very tame compared to the lambda overload trick I had before 😛 (visit is just a mess to use, what can I say).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. ;)
friend meta::register_metafunction; | ||
|
||
void register_one(std::string name, auto* func) { | ||
// TODO(DyXel): Check if already registered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep in mind that a metafunction can be written for functions and types. A check should check only of the function pointer is null, not the existence of the entry in the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks for pointing it out, I am gonna delete most of this when I get full name look-up.
void register_one(std::string name, auto* func) { | ||
// TODO(DyXel): Check if already registered | ||
auto& entry = functions[name]; | ||
if constexpr(requires { entry.type = func; }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this requires { entry.type = func; }
hard to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it? this code just means "if we can assign to the type
variant, assign to it, otherwise assign to the func
variant. Again, the alternative would be a complicated lambda overload trick mess to assign the value to the correct variant. Same observation as above applies, this is all throwaway code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With respect to the lambda it is probably simpler to read. I might just need to get used to it.
struct function_entry | ||
{ | ||
std::variant< | ||
std::monostate, | ||
meta::mf_sign_type_in*, | ||
meta::mf_sign_type_inout* | ||
> type = {}; | ||
std::variant< | ||
std::monostate, | ||
meta::mf_sign_func_in*, | ||
meta::mf_sign_func_inout* | ||
> func = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach is very type safe, but requires all the other methods to distinguish between function and and type metafunctions. Would it be possible to have an enum for type, function, etc. We could then define a tuple that could be accessed via the enum value.
There could even be a consexpr lookup from the argument type of try_apply
to the index in the tuple.
This would remove a lot of code duplication and would make it simpler to add new metafunction for different kinds, e.g. using decl, variable decl, member function decl, member variable decl, namespace dcl, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we need a better approach what would avoid a lot of code duplication for what is essentially "overload handling". I'll keep it in mind when designing the full name look-up solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
[&]( | ||
std::string_view caller, | ||
std::string const& name, | ||
meta::type_declaration& decl | ||
) -> bool { return libraries.try_apply(caller, name, decl); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not directly provide the libraries object?
Maybe because reflect_libraries would then need to be defined in cpp2reflect_api.h
?
If so, would it make sense to move apply_metafunction
out of cpp2reflect_api into reflect_libraries
and just call a function once in cpp2reflect_api
which registers all the internal metafunctions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if I follow... The main reason we do this wrapping and pass a lambda instead of passing the object, is to entirely decouple the existence of reflect_libraries
with the rest of the implementation, it has nothing to do with exposing things in reflect_api or anything like that. Same thing is done for printing errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I can understand the decoupling argument.
Let me more broadly answer your questions Max, we still also need feedback from Herb of course:
Technically, we don't need it. When you call a metafunction you always have a mutable reference of the declaration available, and in fact, you wouldn't be able to have At the compiler code level, we can't cast a
I wouldn't be against a composition-over-inheritance approach, but this does mean big changes for the API, and subsequently, to the existing metafunctions. If we decide to go ahead with big API changes, I would also suggest completely decoupling
I am not sure a "metafunction for everything" is the right approach, we don't want to directly compete with the upcoming reflection support... To be discussed further I guess. Regarding "wildcard interface function", the |
Why would this be a big API change? We would only change the lookup logic. The interface of the metafunctions would not be changed.
This would then require pointer or reference semantics, I think. |
Thanks! As mentioned in #1287, I'm very interested in this kind of thing, but won't have the bandwidth to look at it for a while. So for now I'll close this, but do feel free to keep trying out the idea in a clone of the repo! Thanks again for understanding. |
This is a follow-up of #797, #907, #909 and all other related issues/discussions regarding metafunctions I might have missed. It is not necessary to have all of the previous context to follow this PR, although it would help a lot.
Special thanks to @JohelEGP for the original idea and implementation (see references above), @MaxSagebaum for #809 which is included in this PR and @edo9300 for helping with the DLL loading implementation.
Introduction
A metafunction is a special kind of function which is invoked on a declaration's reflection, and participates in defining its meaning (more in the documentation). This PR implements a mechanism that allows users to define and apply these kind of functions, as opposed to being only writable within cppfront's code.
The basic idea behind this design is that the definition of a metafunction can be compiled separately to a dynamic library (
.so
on Unix-like,.dll
on Windows), targeting the reflection API, which can then be loaded and used by cppfront when processing further code. The major advantage of doing it this way is that metafunctions are regular code, and thus are not limited to the usual C++ compile-time evaluation restrictions, this makes it possible to use third-party libraries or execute any arbitrary code (e.g.: Open and write files, call a remote service, etc.).Of course there are caveats as well, one in particular is that you can't define a metafunction and use it in the same compilation step (e.g.: In the same file), as the code would have needed to be compiled by the C++ compiler before cppfront can load it.
Basic Usage
Here is an example with GCC as base C++ compiler, build systems can automate the tedious parts:
metafunctions.cpp2
:main.cpp2
Steps
Compile
cppfront
with-rdynamic
:g++ cppfront.cpp -std=c++20 -rdynamic -o cppfront
Transpile
metafunctions.cpp2
:./cppfront metafunctions.cpp2
Compile
metafunctions.cpp
to a dynamic library namedlibmetafunctions.so
:Transpile
main.cpp2
:CPPFRONT_METAFUNCTION_LIBRARIES=./libmetafunctions.so ./cppfront main.cpp2
Compile
main.cpp
:g++ -std=c++20 -I../include/ main.cpp
Run the resulting binary (
./a.out
), which outputs:Implementation Details
Features
The main ingredients are a set of orthogonal features that combine to provide the full application:
Reflection API to generate non-member declarations
append_declaration_to_translation_unit
is added, which allows a metafunction to generate code outside of the declaration that it is being applied to. Useful for things like factory functions and more.Support applying metafunctions to functions and the
@meta
metafunctionThe ability to apply a metafunction to a function declaration is enabled, and a
@meta
metafunction is added to mark would-be metafunctions in order to perform automatic registration.Reflection API to obtain a declaration's fully qualified name
fully_qualified_name()
is added, which allows obtaining the unique and final identifier of any declaration node, which can be used to refer to a specific entity from any place in code.Public reflection API header and the internal foreign interface metafunction
The declaration of the reflection API is entirely moved to a public place that can be used by everybody including the compiler itself, it can be thought of as a "synopsis", but it is compilable code.
An internal compiler metafunction is added
@_internal_foreign_interface_pseudovalue
(obnoxious name on purpose), which automatically wraps the aforementioned declarations of the public header into something that allows for the full definition of the API to live somewhere else outside the library (i.e.: Inside cppfront).The compiler's internal reflection code is refactored to provide the implementation of the interfaces created by the processing of the public header, allowing complete decoupling while keeping value semantics for the API, which also means not having to modify existing metafunctions.
Dynamic library loading and look-up mechanism
A standalone header (
source/dll.h
) is added that permits loading dynamic libraries on demand. Additional code that ties the loading of the dynamic libraries with the registration of metafunctions, as well as looking up names when attempting to apply a metafunction is added.Putting All Together
When
@meta
is applied, it does a few things:cpp2reflect_api.h
.The generated entity is of type
register_metafunction
, which is declared withincpp2reflect_api.h
, and its constructor will be called when the library is loaded. The implementation of the overloads for this constructor are provided within cppfront.Before loading a library, cppfront sets itself up so when the implementations of
register_metafunction
are called, they dispatch to a internal object which does bookkeeping and ensures the metafunctions can be found and called when needed.Once cppfront tries to apply a metafunction, it will create the internal context (
compiler_services_base
) and a reflection API object, by filling in the interfaces with its internal implementation (provided by the renamed_impl
classes inreflect.h2
). In essence, the object is made up of "vtables", and the function pointers are all provided by cppfront at runtime. Prepped with this, it can proceed to do name look-up as before, it first attempts to use the internal metafunctions (the ones with nice name), followed by looking up in the loaded libraries.