Consider making generated TokenStream public #1916

Open · alexxbb opened this issue Nov 8, 2020 · 8 comments
alexxbb commented Nov 8, 2020

Judging by some open issues here, there's a need for an API for manipulating the generated items. For a library I'm working on, I run an extra step that generates some glue code on top of the bindgen output.

Unfortunately, I have to write out the bindings to a file first and then re-read them back, because the tokens are kept private here:
https://github.com/rust-lang/rust-bindgen/blob/master/src/lib.rs#L1914

Please consider making it public (perhaps behind a feature flag that re-exports proc_macro2::TokenStream). This is a safe change and would make it possible to manipulate the tokens (or just reuse them to generate some extra code).
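
For illustration, a minimal sketch of the round trip described above, assuming bindgen's Builder/Bindings API; the wrapper.h header and file paths are placeholders:

// build.rs sketch: write the bindings to disk, then re-read and re-parse them
// just to get something that can be post-processed.
use std::fs;

fn generate_and_reload() -> syn::File {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate()
        .expect("bindgen failed");
    bindings
        .write_to_file("src/bindings.rs")
        .expect("could not write bindings");
    let source = fs::read_to_string("src/bindings.rs").expect("could not re-read bindings");
    syn::parse_file(&source).expect("bindgen output should parse")
}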

alexxbb changed the title from "Consider making generated TokenTree public" to "Consider making generated TokenStreem public" on Nov 8, 2020
alexxbb changed the title from "Consider making generated TokenStreem public" to "Consider making generated TokenStream public" on Nov 8, 2020
emilio (Contributor) commented Nov 13, 2020

Not really opposed to this, I think... But why can't you use Bindings::to_string rather than writing to a file and back?

If that somehow doesn't work, I think a PR with either fn token_stream(&self) -> &TokenStream or fn into_token_stream(self) -> TokenStream would be reasonable.
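
A caller-side sketch of that suggestion, using only APIs mentioned in this thread (Bindings::to_string plus proc_macro2's FromStr impl); the wrapper.h header is a placeholder:

use std::str::FromStr;

// Skip the temporary file: stringify the bindings in memory and tokenize the
// string again. This is the workaround until a token accessor exists.
fn bindings_tokens() -> proc_macro2::TokenStream {
    let bindings = bindgen::Builder::default()
        .header("wrapper.h")
        .generate()
        .expect("bindgen failed");
    proc_macro2::TokenStream::from_str(&bindings.to_string())
        .expect("bindgen output should be valid Rust tokens")
}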

alexxbb (Author) commented Nov 14, 2020

I must have overlooked Bindings::to_string. I'll check it out and maybe submit a PR. Thanks!

Ciantic commented Feb 25, 2021

My use case for this is to convert all function declarations to types (for dynamic loading instead of linking):

extern "C" {
    pub fn Everything_SetSearchW(lpString: LPCWSTR);
}

Becomes:

pub type Everything_SetSearchW = unsafe extern "C" fn(lpString: LPCWSTR);

Currently I have to do this with string manipulation, since the TokenStream is not exposed.

P.S. I figured I wouldn't have to do this if there were a way to get the typeof of the function.
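
A rough sketch of that transformation done with syn instead of string manipulation, assuming the bindings are available as a string (today via Bindings::to_string) and syn is built with its "full" feature; the C ABI is hardcoded for brevity:

use quote::quote;
use syn::{ForeignItem, Item};

// Turn every `extern "C" { fn ... }` declaration into a function-pointer type
// alias, e.g. `pub type Everything_SetSearchW = unsafe extern "C" fn(lpString: LPCWSTR);`.
fn fn_decls_to_types(bindings_src: &str) -> syn::Result<proc_macro2::TokenStream> {
    let file = syn::parse_file(bindings_src)?;
    let mut out = proc_macro2::TokenStream::new();
    for item in file.items {
        if let Item::ForeignMod(module) = item {
            for foreign in module.items {
                if let ForeignItem::Fn(func) = foreign {
                    let name = &func.sig.ident;
                    let inputs = &func.sig.inputs;
                    let output = &func.sig.output;
                    out.extend(quote! {
                        pub type #name = unsafe extern "C" fn(#inputs) #output;
                    });
                }
            }
        }
    }
    Ok(out)
}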

ljbade commented Nov 1, 2022

This would also be useful for using the syn crate to manipulate the generated bindings, e.g. to implement custom traits and such for the generated types.
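
Another rough sketch under the same assumptions (bindings available as a string, syn with the "full" feature); FfiDefault is a hypothetical project-specific trait, not something bindgen or syn provides:

use quote::quote;
use syn::Item;

// Emit an extra `impl` block for every struct bindgen generated.
fn impl_trait_for_structs(bindings_src: &str) -> syn::Result<proc_macro2::TokenStream> {
    let file = syn::parse_file(bindings_src)?;
    let mut extra = proc_macro2::TokenStream::new();
    for item in &file.items {
        if let Item::Struct(s) = item {
            let name = &s.ident;
            extra.extend(quote! {
                impl FfiDefault for #name {
                    fn ffi_default() -> Self {
                        // Plausible for plain C structs; only an illustration.
                        unsafe { core::mem::zeroed() }
                    }
                }
            });
        }
    }
    Ok(extra)
}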

pvdrz (Contributor) commented Nov 9, 2022

I have two questions/comments about this topic:

  • Given that TokenStream implements FromStr, the motivation for exposing the TokenStream would be to avoid the performance overhead of turning the already existing TokenStream into a String just to tokenize it back, right?
  • If TokenStream were added to the public API, how would versioning work if someone wants a feature from a more recent proc-macro2 and the dependency needs to be updated?

ju1ius commented Jul 12, 2023

> This would also be useful for using the syn crate to manipulate the generated bindings, e.g. to implement custom traits and such for the generated types.

Another case where this would prove useful is when the bound library has online searchable documentation (an OpenGrok instance, etc.).

This would allow auto-linking the sys crate docs to the library's, by adding a doc comment on top of each item, e.g.:

extern "C" {
    /// See: <https://libfoo.example/docs/search?definition=libfoo_init>
    pub fn libfoo_init();
}

Currently, doing that requires reparsing thousands of lines of code, which hurts build times a lot…
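
A sketch of that doc-comment injection with syn (same assumptions as above); the search-URL format is just the one from the example:

use quote::ToTokens;
use syn::{parse_quote, ForeignItem, Item};

// Prepend a `#[doc = "..."]` attribute linking to the library's online docs
// to every foreign function, then print the modified file back out.
fn add_doc_links(bindings_src: &str, base_url: &str) -> syn::Result<String> {
    let mut file = syn::parse_file(bindings_src)?;
    for item in &mut file.items {
        if let Item::ForeignMod(module) = item {
            for foreign in &mut module.items {
                if let ForeignItem::Fn(func) = foreign {
                    let link = format!("See: <{}/docs/search?definition={}>", base_url, func.sig.ident);
                    func.attrs.push(parse_quote!(#[doc = #link]));
                }
            }
        }
    }
    // `to_string` loses formatting; run rustfmt (or prettyplease) over the result.
    Ok(file.into_token_stream().to_string())
}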

pvdrz (Contributor) commented Jul 12, 2023

Not opposed to implementing this, but maybe @emilio has opinions on it.

workingjubilee (Member) commented:

> Given that TokenStream implements FromStr, the motivation for exposing the TokenStream would be to avoid the performance overhead of turning the already existing TokenStream into a String just to tokenize it back, right?

Effectively, yes. This is probably extremely low-hanging fruit for build times for what must by now be hundreds of projects like pgrx that don't simply take the code as given but instead massage it with their own handling. We are generating north of 40 KLOC at build time, and that's after several efforts on my part to reduce the amount of code we emit (it was previously around 70 KLOC). That's a lot of code to reparse.
