diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 38bb16f20..7fe8d59c4 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -6,7 +6,7 @@ Copyright (c) Herb Sutter SPDX-License-Identifier: CC-BY-NC-ND-4.0 See [License](LICENSE) -[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](code_of_conduct.md) +[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg)](CODE_OF_CONDUCT.md) # Contributor Covenant Code of Conduct diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 70294cdf6..21dcdc5a4 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,15 +1,11 @@ ## Contributing to cppfront -cppfront is the sole personal project of Herb Sutter. +cppfront is an personal experimental project of Herb Sutter. -NO commercial use. NO forks/derivatives. +At this time, the license is limited to no commercial use and no distributed forks/derivatives. -Please DO NOT open issues or submit PRs on this repo at this time. -The following is for the future when I may accept PRs. - - ## Contributor License Agreement By contributing content to cppfront (i.e., submitting a pull request for inclusion in this repository): - You warrant that your material is original, or you have the right to contribute it. diff --git a/README.md b/README.md index 7343a1585..bf10f3e7e 100644 --- a/README.md +++ b/README.md @@ -26,7 +26,7 @@ I'm sharing this work because I hope to start a conversation about what could be ### History 2015-present I did most of the 'syntax 2' design work in 2015-16. Since then, every evolution proposal paper I've brought to ISO C++, and every conference talk I've given, has come from this work — just presented as a standalone proposal under today's syntax, usually with a separate standalone prototype implementation, to help validate and refine one part of the design, then another, then another. - + For a list of papers and conference talks that have come from this work, see below. 
I started writing this cppfront compiler in mid-2021 as another step to prototype all the parts together as a whole as originally intended, now including the alternative 'syntax 2' for C++ that enables their full designs including otherwise-breaking changes. This step is to let me try out the full set of coordinated improvements in one place, and free of concerns about breaking any existing code. @@ -34,13 +34,13 @@ I started writing this cppfront compiler in mid-2021 as another step to prototyp ## What's different about this experiment? This is one of many experiments going on across the industry looking at ways to accomplish a major C++ evolution. Several of those other experiments' designers have seen this 'syntax 2' work privately since 2016, and if they've found parts of this useful in their own experiments then I think that's great, we should learn from each other and I look forward to seeing how their experiments work out too. - + What makes this experiment different from the others? Two main things... ### 1) This is about C++20/23/... — not about something else For me, ISO C++ is the best tool in the world today to write the programs I want and need. I want to keep writing code in C++... just "nicer": - + - with less complexity to remember; - with fewer safety gotchas; and @@ -48,13 +48,13 @@ What makes this experiment different from the others? Two main things... - with the same level of tool support other languages enjoy. If you're a C++ programmer, this might resonate with you... I want "C++, the powerful and expressive parts" without "C++, the cumbersome and dangerous parts." That C++ is an awesome language. More like that please. - + We've already been improving C++'s safety and ergonomics with every ISO C++ release, but they have been "10%" improvements. We haven't been able to do a **"10x"** improvement primarily because we have to keep 100% syntax backward compatibility. 
What if we could have our compatibility cake, and eat it too — by having: - 100% seamless **link compatibility always** (no marshaling, no thunks, no wrappers, no generated 'compatibility modules' to import/export C++ code from/to a different world); and - + - 100% seamless **backward source compatibility always _available_**, including 100% SFINAE and macro compatibility, **but only pay for it when we use it**... that is, apply C++'s familiar "zero-overhead principle" also to backward source compatibility? What does this mean in practice in cppfront? That you have two option always available: @@ -64,14 +64,14 @@ What does this mean in practice in cppfront? That you have two option always ava - _Write only syntax 2 in a particular source file._ This gives you a whole source file that aims to let you program in a 10x simpler C++, where code is type-safe and memory-safe by construction (and in the future ideally with faster builds and better tooling tuned for the simpler language, if this project succeeds). You still have seamless interoperability with all C++ code via module `import,` but not via `#include` because that's part of the text preprocessor which is eliminated in pure syntax 2; you can still use `#include` and everything else in syntax 1, so to use those things just write a mixed 1/2 source file, and so you pay for it only if you do use it. I want to encourage us to look for ways to push the boundaries to bring C++ itself forward and double down on C++ — not to switch to something else. - + I want us to aim for major C++ evolution directed toward things that will make us better C++ programmers — not programmers of something else. 
- + ### 2) This is about improvements to safety, simplicity, and toolability — not about green-field design or random drive-by improvements - + My specific goal is to explore the question: Can we make C++ **10x safer, simpler, and more toolable** if C++ had an alternative "syntax #2," within which we could be completely free to **improve semantics** by applying 30 years' worth of C++ language experience without any backward source compatibility constraints? We want each proposed improvement to address a known C++ pain point, and in a measurable way (e.g., reduce a class of CVEs (vulnerabilities) by some quantifiable %, reduce the amount of guidance we have to teach by some quantifiable %). - + An alternative syntax would be a cleanly demarcated "bubble of new code" that would let us do things that we can never do in today's syntax without breaking the world, such as to: - fix defaults (e.g., make `[[nodiscard]]` the default); @@ -81,10 +81,10 @@ An alternative syntax would be a cleanly demarcated "bubble of new code" that wo - eliminate 90% of the guidance we have to teach about today's complex language (e.g., make common guidance the language default, eliminate irregular special cases through generalization, refactor the language into a smaller number of regular composable features); - make it easy to write a parser (e.g., have a context-free grammar); and - make it easy to write refactoring and other tools (e.g., have order-independent semantics). - + The cppfront compiler is an experiment to try to develop a proof of concept that evolution along these lines may be possible. For example, this repo's `parse.h` is a standalone context-free parser that is growing month by month as I implement more of the "syntax #2" experiment, and I hope for it to become a standalone context-free parser for a flavor of C++ that has full parity with the expressive power of today's C++. 
-> **Important disclaimer: This isn't about 'just a pretty syntax,' it's about fixing semantics.** The unambiguous alternative syntax is just a means to an end, a gateway that lets us access a new open space beyond it — and sure, if we build a gate, then the gate ought to look nice too, so we build it with good boards and paint it nice colors. But the gate is the doorway, the portal, not the goal... the real payoff is gaining access to that new open space in C++ that's free of backward source compatibility constraints where we can (finally) fix semantics — order-independence, great defaults, regular composable semantic meanings — as we see fit. +> **Important disclaimer: This isn't about 'just a pretty syntax,' it's about fixing semantics.** The unambiguous alternative syntax is just a means to an end, a gateway that lets us access a new open space beyond it — and sure, if we build a gate, then the gate ought to look nice too, so we build it with good boards and paint it nice colors. But the gate is the doorway, the portal, not the goal... the real payoff is gaining access to that new open space in C++ that's free of backward source compatibility constraints where we can (finally) fix semantics — order-independence, great defaults, regular composable semantic meanings — as we see fit. Scores of people have given valuable feedback and many are listed below, but I specifically want to thank Bjarne Stroustrup, Mads Torgersen, Anders Hejlsberg, Tim Sweeney, Joe Duffy, Andrew Sutton, Gabriel Dos Reis, Gor Nishanov, Chris McKinsey, Daniel Frampton, Jared Parsons, Walter Bright, and Andrei Alexandrescu for their help and valuable feedback on this work over the years — especially when they disagreed with me. Dave Abrahams, David Sankel, Lee Howes, Nathan Sidwell, Pavel Curtis, and others also contributed valuable feedback broadly; I apologize for the names I have forgotten. 
Many more people are listed below for their help with specific parts of the design and those proposals/prototypes. Thank you all, again. @@ -92,17 +92,17 @@ Scores of people have given valuable feedback and many are listed below, but I s Cppfront builds with any major C++20 compiler. - + #### MSVC build instructions - + cl cppfront.cpp -std:c++20 -EHsc - + #### GCC build instructions - + g++-10 cppfront.cpp -std=c++20 -o cppfront - + #### Clang build instructions - + clang++-12 cppfront.cpp -std=c++20 -o cppfront That's it. @@ -118,7 +118,7 @@ Just run `cppfront your.cpp2`, then run the generated `your.cpp` through any maj - Clang would be: `clang++-12 your.cpp -std=c++20` That's it. - + Of course, if your code wants to use features that are under additional switches provided by your compiler, then you'll need to add those switches. For example, if you're using post-C++20 features you may need to specify `-std=c++2a` or `-std:c++latest`, if you're using OpenMP you may need to specify something like `/openmp` or `-fopenmp`, and so forth. This especially comes up in mixed Cpp1/Cpp2 source files, because of course the Cpp1 code can use anything your C++ compiler understands, including nonstandard extensions, `#pragmas`, etc. @@ -130,15 +130,15 @@ Here's where to find out more about my 'syntax #2' experiment: - **My CppCon 2022 talk, "C++ simplicity, safety, and toolability: Can C++ be 10x simpler and safer ...?**" [link coming soon] — this is the primary documentation right now. See also every talk I've given and paper I've written since 2015, each of which details an individual part of this design experiment, but presented in today's C++ syntax as a standalone C++ evolution proposal. - **The [cppfront regression tests](https://github.com/hsutter/cppfront/tree/main/regression-tests/test-results)** which show dozens of working examples, each with a`.cpp2` file and the `.cpp` file it is translated to. 
Each filename briefly describes the language features the test demonstrates (e.g., contracts, parameter passing, bounds safety, type-safe `is` queries and `as` casts, initialization safety, and generalized value capture including in function expressions ('lambdas'), postconditions, and string interpolation). - + ## List of my papers and talks since 2015 (all from this work, but presented in today's syntax) All of the ISO C++ papers and CppCon conference talks I've given since 2015 have been derived from this work. I've spent the last seven years bringing each individual experimental design improvement of the 'syntax 2' experiment as a standalone proposal in today's syntax, to validate that the committee and community agreed with the problem to be solved and the solution direction, and to further flesh out each part individually... thanks very much to all of you who have given valuable feedback! - + Here is a list of those papers and talks, in the order that I brought each individual design forward. Most of the details in the following papers and talks are still current with only incremental updates, apart from the specific syntax of course. - + ### 2015: Lifetime safety - + - [**CppCon 2015**: "Writing good C++14... _by default_"](https://youtu.be/hEx5DNLWGgA) particularly [from 29:00 onward](https://youtu.be/hEx5DNLWGgA?t=1757) shows the Lifetime analysis with live demos in a Visual Studio prototype. - [**CppCon 2018**: "Thoughts on a more powerful _and_ simpler C++ (#5 of N)](https://youtu.be/80BZxujhY38): - [The section starting at 18:00](https://youtu.be/80BZxujhY38?t=1097) is an update on the Lifetime status with live demos in a Clang prototype. @@ -147,9 +147,9 @@ Here is a list of those papers and talks, in the order that I brought each indiv - [**P1179**: Lifetime Safety: Preventing common dangling](https://wg21.link/p1179) is the same analysis in the WG 21 paper list. 
Much of this Lifetime analysis has been implemented and shipped in Visual Studio and in CLion, and initial small parts have been upstreamed and shipped in Clang. The implementations have exposed bugs in existing C++ code that were not caught before. My experiment in 'syntax 2' is to make these safety rules the default and mandatory. (Note: Most of this is not yet implemented in cppfront.) - + I want to again thank many people, including especially Matthias Gehre, Gabor Horvath, Neil MacIntosh, and Kyle Reed for their help in implementing the Lifetime static analysis design in Visual Studio and a Clang fork. Thanks also to all of the following for their input and feedback on the specification: Andrei Alexandrescu, Steve Carroll, Pavel Curtis, Gabriel Dos Reis, Joe Duffy, Daniel Frampton, Anna Gringauze, Chris Hawblitzel, Nicolai Josuttis, Ellie Kornstaedt, Aaron Lahman, Ryan McDougall, Nathan Myers, Gor Nishanov, Andrew Pardoe, Jared Parsons, Dave Sielaff, Richard Smith, Jim Springfield, and Bjarne Stroustrup. - + ### 2016: Garbage-collected memory arena - [**CppCon 2016**: "Leak-freedom in C++... _by default_"](https://www.youtube.com/watch?v=JfmTagWcqoE) particularly [from 59:00 onward](https://youtu.be/JfmTagWcqoE?t=3558) where I show the strawman prototype I wrote of a tracing garbage-collection memory arena. @@ -158,18 +158,18 @@ I want to again thank many people, including especially Matthias Gehre, Gabor Ho I welcome a real GC expert to collaborate with on bringing this forward to become a "real" usable tracing GC memory arena that C++ code can opt into. 
As always, we still prefer scopes first (no tracking needed), and if that's not sufficient then `unique_ptr` (minimal tracking needed), then if that's not sufficient `shared_ptr` (more tracking needed), and then if that's not sufficient this tracing GC arena (suitable for cases where the existing smart pointers aren't enough, such as when you really cannot statically know enough about lifetimes to use the existing smart pointers). ### 2017: Spaceship operator for comparisons, `<=>` - + - [**CppCon 2017 (just the intro, first 6 minutes)**: "Meta: Thoughts on generative C++"](https://www.youtube.com/watch?v=4AfRAVcThyA). - [**P0515**: Consistent comparison](https://wg21.link/p0515) is the proposal in today's syntax that I proposed, and was adopted, for C++20. This is the first feature from my Cpp2 work that is now in the ISO C++ standard as part of C++20, and with legacy comparison interoperability improvements in C++23. I had not initially been planning to make this one an ISO C++ proposal yet, but after C++17 shipped the committee continued actively discussing ways to improve comparisons, so since I had a design in my back pocket I submitted it as a proposal and, to my surprise, it was approved in a single meeting. (But see the notes about bug fixes, two paragraphs below.) - + This is the only feature in the history of ISO C++ where we _added_ a feature to ISO C++ that made the whole standard _smaller_: It took about a dozen pages of core language wording to specify, but it was also applied throughout the standard library which reduced the C++ standard library's specification by about twice that many pages because we removed something like hundreds of comparison operators. I take this as a data point that validates the core hypothesis of 'syntax 2,' that it is possible to simplify today's C++ code (even the standard library's own specification) by thoughtfully adding the right kinds of features to the language. 
- + This is also the only feature from the Cpp2 work that I proposed without first having a prototype implementation, and so the proposal had bugs in two main areas that were discovered and fixed later: Keeping the `==` optimization, which was known but in the initial proposal was easy to lose accidentally; and smoother interoperation with existing pre-`<=>` types (which is important because there are _a lot_ of those). Thank you again to everyone who helped land this in the Standard in C++20 and improved in C++23, including: Walter Brown, Lawrence Crowl, Cameron DaCamara, Gabriel Dos Reis, Jens Maurer, Barry Revzin, Richard Smith, and David Stone. This shows the importance of prototype experience. ### 2017: Reflection, generation, and metaclasses - + - [**ACCU 2017**: "Thoughts on metaclasses"](https://www.youtube.com/watch?v=6nsyX37nsRs) is the first talk I gave about this. - [**CppCon 2017**: "Meta: Thoughts on generative C++"](https://www.youtube.com/watch?v=4AfRAVcThyA) from after the intro, [from 6:00 onward](https://youtu.be/4AfRAVcThyA?t=393). - [**CppCon 2018**: "Thoughts on a more powerful _and_ simpler C++ ("Simplifying C++" #5 of N)](https://youtu.be/80BZxujhY38): @@ -178,13 +178,13 @@ This is also the only feature from the Cpp2 work that I proposed without first h - [**P0707**: Metaclass functions: Generative C++](https://wg21.link/p0707) is the paper I brought to the ISO C++ committee. The ACCU talk started with something I've never done before: A live mini-"usability study" with unprepared subjects in front of a live audience. (It was not a proper usability study because of conference talk constraints; for example, to save time I allowed myself to use some leading questions. In a real usability study you wouldn't do that.) 
The reason I did this was because I had already run this design (and parameter passing, below) through actual usability studies with C++ programmers and saw how they consistently reacted to it, and I wanted the ACCU audience to see what I had already seen, namely how real C++ developers who have never seen it before react to it, and how quickly they can understand and learn it. This was a totally legit demonstration... the audience members who came on-stage really had never seen it before and I had never spoken with them about it before. - + cppfront does not yet have 'syntax 2' user-defined types (classes) or metaclasses. I look forward to starting to implement this in cppfront over the fall and winter... wish me luck! I anticipate that using the AST that cppfront has, which is much closer to a parse tree than a bound tree, is ideal for most metaclass applications which really are about a mental model of "generating source code"... some of what has made previous metaclass prototypes difficult was that they were working on fully bound trees which meant they had to remove work already done, whereas my original design of metaclasses was much closer to the source code level and that's what I aim to (try to) implement. - + I want to again thank Andrew Sutton and his colleagues Wyatt Childers and Jennifer Yao for their help in implementing the Clang-based prototypes of this proposal, and everyone else who contributed feedback on the design. ### 2018: Updates to Lifetime and Metaclasses (see above) - + ### 2019: Zero-overhead deterministic exceptions: Throwing values - [**ACCU 2019**: "De-fragmenting C++: Making exceptions more affordable and usable](https://www.youtube.com/watch?v=os7cqJ5qlzo). 
@@ -194,9 +194,9 @@ I want to again thank Andrew Sutton and his colleagues Wyatt Childers and Jennif I'll just say that when I brought this to the ISO C++ committee, I was amazed that in the Library subgroups a repeated reaction to some (not all) of the library-focused suggestions was "yup, that's a direction we've already decided we want the standard library to move toward..." Except possibly for `<=>` comparisons, this is the only time in my 25 years in WG 21 that I've made a proposal to the committee where I expected to have to do a lot of selling and suddenly had the feeling that I was pushing hard on an open door. (Disclaimer: In the Language subgroups there was more resistance, particularly to make sure pointers to functions would not be bifurcated in the type system. I believe I have an answer to that (thanks to input from Ville Voutilainen in particular), but I still need to prototype it in cppfront, and it still needs to be brought back to the committee to see if they find the results acceptable. There's real work still ahead and a possibility it might not pan out as expected... that's why we use the word "experiment.") Note: Besides `<=>`, this is the other of the Cpp2-derived proposals that has not yet been implemented, and implementation experience is important before standardizing something like this. I hope to gain experience with it in cppfront, though this will be the trickiest part of this work to implement in a Cpp2->Cpp1 compiler like cppfront because it needs to be coordinated with stack unwinding details deep inside the existing C++ compiler and the platform ABI; I think it's doable, but I realize I have work ahead of me here. 
- + ### 2020: Parameter passing - + - **ACCU autumn 2019**: "Quantifying accidental complexity: An empirical look at teaching and using C++" was my first public talk about this, but a "beta" version that was not recorded; you can find the description [here](https://accu.org/conf-previous/2019_autumn/sessions/#XQuantifyingAccidentalComplexityAnEmpiricalLookatTeachingandUsingC). - [**CppCon 2020**: "Quantifying accidental complexity: An empirical look at teaching and using C++"](https://www.youtube.com/watch?v=6lurOCdaj0Y): - The first half of the talk is about how to be rigorous and actually measure that we're making improvements, including to measure the percentage of today's C++ guidance that is about parameter passing and initialization. @@ -204,7 +204,7 @@ Note: Besides `<=>`, this is the other of the Cpp2-derived proposals that has no - [**d0708**: "Parameter passing -> guaranteed unified initialization and value setting](https://github.com/hsutter/708/blob/main/708.pdf) goes into more details than I had time for in the talk, in the second half of the paper. Note: this is a "d"-draft paper I haven't formally brought to ISO C++, because during the pandemic I didn't bring any updates to my major papers as I think those major proposals are best considered when the committee can meet in person. - [**Github.com/hsutter/708**](https://github.com/hsutter/708) is a repo with the paper and demo examples as used in the talk. - [**P2064**: "Assumptions"](https://wg21.link/p2064) is also related to this 'syntax 2' work, because this work includes a contracts design, and assumptions ought to be separate from that. This paper was making the technical argument why assumptions and assertions (contracts) are different things. - + The only change in cppfront is that I've split `in` into `in` (now the whitespace default in 'syntax 2') and `copy`, and implemented automatic-move-from-last-use for `copy` parameters. 
This is actually consistent with, and rediscovering, what we already teach today, including in my [CppCon 2014 talk at 55:17](https://youtu.be/xnqTKD8uD64?t=3317) where the parameter passing section already distinguishes "in" vs. "in+copy" parameters. _Plus ça change, plus c'est la même chose..._ This is basically all implemented in cppfront, except not the unified `operator=` experiment since I haven't implemented classes yet in 'syntax 2.' @@ -212,43 +212,43 @@ This is basically all implemented in cppfront, except not the unified `operator= ### 2020: "Bridge to NewThingia" In 2020 I also started socializing the ideas of: - + - How do you answer "why is your thing different when others that look like it have all failed"? I had specifically in mind "why is 'syntax 2' different when Cyclone, CCured, and many other efforts to make C or C++ safer have all failed"... the answer is that none of them have tried having seamless C++ compatibility, which goes hand-in-hand with evolving C++ itself (not designing something else and then trying to build a bridge later). - What does it take to be adoptable, including to enable incremental adoption? I had specifically in mind seamless compatibility (without which you always lose 10 years; see the talk for examples) and avoiding the Python 2/3 problem. The talk was "Bridge to NewThingia," presented at: - + - [**DevAroundTheSun**: "Bridge to Newthingia"](https://herbsutter.com/2020/06/14/talk-video-available-bridge-to-newthingia-devaroundthesun/), an initial 26-minute version of the talk that I gave in the early months of the pandemic lockdowns. - [**C++ on Sea**: "Bridge to NewThingia"](https://www.youtube.com/watch?v=BF3qw1ObUyo) which especially [at the end starting near 48:00](https://youtu.be/BF3qw1ObUyo?t=2883) had a slide that directly tackled the "C++ major evolution" scenario, and laid out what I think it would take to have credible answers to the key questions. 
- + ### 2021: `is`, `as`, and pattern matching - [**CppCon 2021**: "Extending and simplifying C++: Thoughts on pattern matching using `is` and `as`"](https://www.youtube.com/watch?v=raB_289NxBk). - [**P2392**: Pattern matching using `is` and `as`](https://wg21.link/p2392) is the ISO C++ committee paper. Like spaceship `<=>` comparisons, I brought this work to the committee because the committee was actively considering pattern matching proposals, and I had a design in my back pocket and so I asked if they would like to see it and they said yes, so I contributed it. This proposal is actually much more about having a general consistent type query (`is`) and a general consistent type cast (`as`) throughout the language, not just in pattern matching `inspect` statements, and then seeing how it could make the pattern matching also nice to use as well as make its usefulness broadly available instead of just inside `inspect`. - + Cppfront currently has basic support for `is` and `as`, including for dynamic typing in the language (subsuming "dynamic_cast" downcasts) and the standard library's `std::variant`, `std::optional`, and `std::any`. Just before CppCon 2022 I also implemented very basic `inspect` statements and expressions, and I plan to continue fleshing them out. (Side note: This is one of the places where I'm really glad to have C++20 as a baseline, because implementing `inspect` expressions and getting the types right would have been nearly impossible without C++14 generic lambdas, C++17 `if constexpr`, and C++20 concepts `requires` tests. Which might give away the implementation strategy I chose... :) ) - + I still need to validate the `is` and `as` implementations with more cases (I'm sure there are still some that break that I need to fix), flesh out allowing `is` and `as` in `requires`-clauses to constrain templates (as shown in the paper), and non-basic `inspect` pattern matching examples. 
### 2022: CppCon 2022 talk and cppfront - **CppCon 2022**: "C++ simplicity, safety, and toolability: Can C++ be 10x simpler and safer ...?" [link coming soon] - This repo. - + # Epilog: 2016 roadmap diagram - + Finally, let me show you a diagram I made in early 2016. A few words have changed, and some of the topics aren't mentioned (e.g., `<=>` was added after this), but it has remained amazingly stable and is still recognizably a roadmap of Cpp2's design approach today. - + I think this diagram is important because it attempts to write down how design decisions lead to each other and support each other... Cpp2 is not a gaggle of unrelated fixes, it is an attempt to do a coordinated refactoring of C++ so that it is still C++, but "simplified through generalization" into a smaller number of regular and combinable features, so we can remove special cases and duplications and confusions. That's almost a direct quote of how Bjarne Stroustrup stated it in the _ACM History of Programming Languages III_ (among other places): - + > "**10% the size of C++** in definition and similar in front-end compiler size. ... **Most of the simplification would come from generalization.**" (B. Stroustrup, ACM HOPL-III, 2007; emphasis added) - + ![image](https://user-images.githubusercontent.com/1801526/189503047-0b0a4f0f-c5e7-42b2-a17d-37d80bef3970.png) I haven't updated this roadmap diagram since 2016, but it shows many of the talks and papers that have come since then from this work, and it's still a pretty up-to-date roadmap of the major parts of Cpp2. As of this writing, cppfront implements much of the top part of this roadmap, and I plan for more to follow. - + I hope you enjoy reading about this personal experiment, and I hope that it might at least start a conversation about what could be possible _**within C++**_'s own evolution to make C++ 10x simpler, safer, and more toolable. 
- + diff --git a/include/cpp2util.h b/include/cpp2util.h index d8855064d..300e5058e 100644 --- a/include/cpp2util.h +++ b/include/cpp2util.h @@ -35,7 +35,7 @@ #ifndef _MSC_VER // This is the ideal -- note that we just voted "import std;" - // into draft C++23 in late July 2022, so implementers haven't + // into draft C++23 in late July 2022, so implementers haven't // had time to catch up yet. As of this writing (September 2022) // no compiler will take this path yet, but they're on the way... import std; @@ -204,9 +204,9 @@ namespace cpp2 { //----------------------------------------------------------------------- -// +// // contract_group -// +// //----------------------------------------------------------------------- // @@ -255,30 +255,30 @@ class contract_group { std::terminate(); } -auto inline Default = contract_group( - [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { +auto inline Default = contract_group( + [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { report_and_terminate("Contract", msg CPP2_SOURCE_LOCATION_ARG); } ); -auto inline Bounds = contract_group( - [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { +auto inline Bounds = contract_group( + [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { report_and_terminate("Bounds safety", msg CPP2_SOURCE_LOCATION_ARG); - } + } ); -auto inline Null = contract_group( - [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { +auto inline Null = contract_group( + [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { report_and_terminate("Null safety", msg CPP2_SOURCE_LOCATION_ARG); - } + } ); -auto inline Type = contract_group( - [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { +auto inline Type = contract_group( + [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { report_and_terminate("Type safety", msg CPP2_SOURCE_LOCATION_ARG); - } + } ); -auto inline Testing = contract_group( - [](CPP2_MESSAGE_PARAM msg 
CPP2_SOURCE_LOCATION_PARAM)noexcept { +auto inline Testing = contract_group( + [](CPP2_MESSAGE_PARAM msg CPP2_SOURCE_LOCATION_PARAM)noexcept { report_and_terminate("Testing", msg CPP2_SOURCE_LOCATION_ARG); - } + } ); constexpr auto contract_group::set_handler(handler h) -> handler { @@ -317,12 +317,12 @@ auto assert_in_bounds(auto&& x, auto&& arg CPP2_SOURCE_LOCATION_PARAM_WITH_DEFAU //----------------------------------------------------------------------- -// +// // Arena objects for std::allocators -// +// // Note: cppfront translates "new" to "cpp2_new", so in Cpp2 code // these are invoked by simply "unique.new" etc. -// +// //----------------------------------------------------------------------- // struct { @@ -346,13 +346,13 @@ template //----------------------------------------------------------------------- -// +// // in For "in" parameter -// +// //----------------------------------------------------------------------- // template<typename T> -using in = +using in = std::conditional_t < sizeof(T) < 2*sizeof(void*) && std::is_trivially_copy_constructible_v<T>, T const, @@ -361,19 +361,19 @@ using in = //----------------------------------------------------------------------- -// +// // Initialization: These are closely related... 
-// +// // deferred_init For deferred-initialized local or member variable // // out For out parameter -// +// //----------------------------------------------------------------------- // template class deferred_init { bool init = false; - union { + union { int i; T t; }; @@ -414,7 +414,7 @@ class out { } } - auto construct (auto ...args) -> void { + auto construct (auto ...args) -> void { if (has_t) { *t = T(args...); } @@ -427,7 +427,7 @@ class out { } } - auto construct_list(auto ...args) -> void { + auto construct_list(auto ...args) -> void { if (has_t) { *t = T{args...}; } @@ -443,9 +443,9 @@ class out { //----------------------------------------------------------------------- -// +// // CPP2_UFCS: Variadic macro generating a variadic lamba, oh my... -// +// //----------------------------------------------------------------------- // #define CPP2_UFCS(FUNCNAME,PARAM1,...) \ @@ -468,9 +468,9 @@ class out { //----------------------------------------------------------------------- -// +// // is and as -// +// //----------------------------------------------------------------------- // //------------------------------------------------------------------------------------------------------------- @@ -526,7 +526,7 @@ auto as( X const& x ) -> auto&& { } template< typename C, typename X > -auto as( X const& x ) -> auto +auto as( X const& x ) -> auto requires (!std::is_same_v && requires { C{x}; }) { return C{x}; @@ -640,18 +640,18 @@ constexpr auto as( X const& x ) -> auto&& //----------------------------------------------------------------------- -// +// // A variation of GSL's final_action_success and finally to run only on success // (based on a PR I contributed to Microsoft GSL) -// +// // final_action_success_success ensures something is run at the end of a scope // if no exception is thrown -// +// // finally_success is a convenience function to make a final_action_success_success -// +// //----------------------------------------------------------------------- 
// - + template class final_action_success { @@ -688,9 +688,9 @@ template //----------------------------------------------------------------------- -// +// // to_string for string interpolation -// +// //----------------------------------------------------------------------- // template @@ -725,18 +725,18 @@ using cpp2::cpp2_new; //----------------------------------------------------------------------- -// +// // A partial implementation of GSL features Cpp2 relies on, // to keep this a standalone header without non-std dependencies -// +// //----------------------------------------------------------------------- // namespace gsl { //----------------------------------------------------------------------- -// +// // An implementation of GSL's narrow_cast -// +// //----------------------------------------------------------------------- // template diff --git a/source/common.h b/source/common.h index 8a0817169..6e8d17f9d 100644 --- a/source/common.h +++ b/source/common.h @@ -31,9 +31,9 @@ namespace cpp2 { //----------------------------------------------------------------------- -// +// // source_line: represents a source code line -// +// //----------------------------------------------------------------------- // struct source_line @@ -43,7 +43,7 @@ struct source_line enum class category { empty, preprocessor, comment, import, cpp1, cpp2 } cat = category::empty; - auto prefix() const -> std::string + auto prefix() const -> std::string { switch (cat) { break;case category::empty: return "/* */ "; @@ -90,12 +90,12 @@ struct comment }; //----------------------------------------------------------------------- -// +// // error: represents a user-readable error message -// +// //----------------------------------------------------------------------- // -struct error +struct error { source_position where; std::string msg; @@ -105,10 +105,10 @@ struct error : where{w}, msg{m}, internal{i} { } - auto print(auto& o, std::string const& file) const -> void + auto print(auto& o, 
std::string const& file) const -> void { o << file ; - if (where.lineno > 0) { + if (where.lineno > 0) { o << "("<< (where.lineno); if (where.colno >= 0) { o << "," << where.colno; @@ -125,9 +125,9 @@ struct error //----------------------------------------------------------------------- -// +// // Digit classification, with '\'' digit separators -// +// //----------------------------------------------------------------------- // @@ -135,39 +135,39 @@ struct error //G 0 1 //G auto is_binary_digit(char c) -> bool -{ - return c == '0' || c == '1'; +{ + return c == '0' || c == '1'; } //G hexadecimal-digit: one of //G 0 1 2 3 4 5 6 7 8 9 A B C D E F -//G +//G auto is_hexadecimal_digit(char c) -> bool -{ +{ return isxdigit(c); } //G digit: one of //G 0 1 2 3 4 5 6 7 8 9 -//G +//G auto is_digit(char c) -> bool -{ - return isdigit(c); +{ + return isdigit(c); } //G nondigit: { a..z | A..Z | _ } //G auto is_nondigit(char c) -> bool -{ - return isalpha(c) || c == '_'; +{ + return isalpha(c) || c == '_'; }; //G identifier-start: //G nondigit //G auto is_identifier_start(char c) -> bool -{ - return is_nondigit(c); +{ + return is_nondigit(c); } //G identifier-continue: @@ -175,14 +175,14 @@ auto is_identifier_start(char c) -> bool //G nondigit //G auto is_identifier_continue(char c) -> bool -{ - return is_digit(c) || is_nondigit(c); +{ + return is_digit(c) || is_nondigit(c); } //G identifier: identifier-start { identifier-continue }* //G auto starts_with_identifier(std::string_view s) -> int -{ +{ if (is_identifier_start(s[0])) { auto j = 1; while (j < std::ssize(s) && is_identifier_continue(s[j])) { ++j; } @@ -195,20 +195,20 @@ auto starts_with_identifier(std::string_view s) -> int // Example: is_separator_or( is_binary_digit (c) ) // auto is_separator_or(auto pred, char c) -> bool -{ - return c == '\'' || pred(c); +{ + return c == '\'' || pred(c); } //----------------------------------------------------------------------- -// +// // String: A helper workaround for passing a 
string literal as a // template argument // //----------------------------------------------------------------------- // template -struct String +struct String { constexpr String(const char (&str)[N]) { @@ -251,9 +251,9 @@ auto strip_path(std::string const& file) -> std::string //----------------------------------------------------------------------- -// +// // Command line handling -// +// //----------------------------------------------------------------------- // @@ -280,7 +280,7 @@ class cmdline_processor callback handler; std::string synonym; - flag(int g, std::string_view n, std::string_view d, callback h, std::string_view s) + flag(int g, std::string_view n, std::string_view d, callback h, std::string_view s) : group{g}, name{n}, description{d}, handler{h}, synonym{s} { } }; @@ -301,7 +301,7 @@ class cmdline_processor for (auto flag2 = flag1+1; flag2 != flags.end(); ++flag2) { int i = 0; while ( - i < std::ssize(flag1->name) && + i < std::ssize(flag1->name) && i < std::ssize(flag2->name) && flag1->name[i] == flag2->name[i] ) @@ -350,8 +350,8 @@ class cmdline_processor help_requested = true; std::sort( - flags.begin(), - flags.end(), + flags.begin(), + flags.end(), [](auto& a, auto& b){ return a.group < b.group || (a.group == b.group && a.name < b.name); } ); @@ -380,8 +380,8 @@ class cmdline_processor auto add_flag(int group, std::string_view name, std::string_view description, callback handler, std::string_view synonym) { flags.emplace_back( group, name, description, handler, synonym ); - if (max_flag_length < std::ssize(name)) { - max_flag_length = std::ssize(name); + if (max_flag_length < std::ssize(name)) { + max_flag_length = std::ssize(name); } } struct register_flag { @@ -422,18 +422,18 @@ cmdline_processor::register_flag::register_flag(int group, std::string_view name cmdline.add_flag( group, name, description, handler, synonym ); } -static cmdline_processor::register_flag cmd_help ( +static cmdline_processor::register_flag cmd_help ( 0, "help", - 
"Print help", + "Print help", []{ cmdline.print_help(); }, "?" ); -static cmdline_processor::register_flag cmd_version( +static cmdline_processor::register_flag cmd_version( 0, - "version", - "Print version information", + "version", + "Print version information", []{ cmdline.print_version(); } ); diff --git a/source/cppfront.cpp b/source/cppfront.cpp index 66399b96d..9db16ee64 100644 --- a/source/cppfront.cpp +++ b/source/cppfront.cpp @@ -34,16 +34,16 @@ auto cmdline_processor::print(std::string_view s, int width) -> void //----------------------------------------------------------------------- -// +// // Stringingizing helpers -// +// //----------------------------------------------------------------------- auto pad(int padding) -> std::string_view { static std::string indent_str = std::string( 1024, ' ' ); // "1K should be enough for everyone" - if (padding < 1) { + if (padding < 1) { return ""; } @@ -55,48 +55,48 @@ auto pad(int padding) -> std::string_view //----------------------------------------------------------------------- -// +// // positional_printer: a Syntax 1 pretty printer -// +// //----------------------------------------------------------------------- // static auto flag_clean_cpp1 = false; -static cmdline_processor::register_flag cmd_noline( +static cmdline_processor::register_flag cmd_noline( 1, - "clean-cpp1", - "Emit clean Cpp1 without #line directives", + "clean-cpp1", + "Emit clean Cpp1 without #line directives", []{ flag_clean_cpp1 = true; } ); static auto flag_cpp2_only = false; -static cmdline_processor::register_flag cmd_cpp2_only( - 1, - "pure-cpp2", - "Allow Cpp2 syntax only", +static cmdline_processor::register_flag cmd_cpp2_only( + 1, + "pure-cpp2", + "Allow Cpp2 syntax only", []{ flag_cpp2_only = true; } ); static auto flag_safe_null_pointers = false; -static cmdline_processor::register_flag cmd_safe_null_pointers( - 2, - "null-checks", - "Enable null safety contract checks", +static cmdline_processor::register_flag 
cmd_safe_null_pointers( + 2, + "null-checks", + "Enable null safety contract checks", []{ flag_safe_null_pointers = true; } ); static auto flag_safe_subscripts = false; -static cmdline_processor::register_flag cmd_safe_subscripts( - 2, - "subscript-checks", - "Enable subscript bounds safety contract checks", +static cmdline_processor::register_flag cmd_safe_subscripts( + 2, + "subscript-checks", + "Enable subscript bounds safety contract checks", []{ flag_safe_subscripts = true; } ); static auto flag_use_source_location = false; -static cmdline_processor::register_flag cmd_enable_source_info( - 2, - "add-source-info", - "Enable source locations for contract checks", +static cmdline_processor::register_flag cmd_enable_source_info( + 2, + "add-source-info", + "Enable source locations for contract checks", []{ flag_use_source_location = true; } ); @@ -204,7 +204,7 @@ class positional_printer //----------------------------------------------------------------------- // Internal helpers - + // Start a new line if we're not in col 1 already // auto ensure_at_start_of_new_line() -> void @@ -231,7 +231,7 @@ class positional_printer } // Catch up with comment/blank lines - // + // auto flush_comments( source_position pos ) -> void { assert(pcomments); @@ -301,7 +301,7 @@ class positional_printer assert (curr_pos.lineno <= pos.lineno); curr_pos.lineno = pos.lineno; // re-sync } - + // Finally, align to the target column if (curr_pos.lineno == pos.lineno) { pos.colno = std::max( 1, pos.colno + pad_for_this_line ); @@ -455,7 +455,7 @@ class positional_printer // If the last line had a request for this colno, remember its actual offset constexpr int sentinel = -100; auto last_line_offset = sentinel; - for(auto i = 0; + for(auto i = 0; i < std::ssize(prev_line_info.requests) && prev_line_info.requests[i].requested <= pos.colno; ++i ) @@ -490,7 +490,7 @@ class positional_printer //----------------------------------------------------------------------- // Position override control 
functions - // + // // Use this position instead of the next supplied one // Useful when Cpp1 syntax is emitted in a different order/verbosity @@ -541,8 +541,8 @@ class positional_printer //----------------------------------------------------------------------- // Modal state control functions - // - + // + // In the first pass we will print only declarations (the default) // For the second pass this function enables printing definitions // @@ -587,9 +587,9 @@ class positional_printer //----------------------------------------------------------------------- -// +// // cppfront: a compiler instance -// +// //----------------------------------------------------------------------- // class cppfront @@ -611,21 +611,21 @@ class cppfront bool suppress_move_from_last_use = false; // For lowering - // + // positional_printer printer; bool in_definite_init = false; bool in_parameter_list = false; std::string function_return_name; std::vector function_returns; - parameter_declaration_list_node single_anon; + parameter_declaration_list_node single_anon; // special value - hack for now to note single-anon-return type kind in this function_returns working list std::vector function_requires_conditions; public: //----------------------------------------------------------------------- // Constructor - // + // // filename the source file to be processed // cppfront(std::string const& filename) @@ -696,7 +696,7 @@ class cppfront //----------------------------------------------------------------------- // lower_to_cpp1 - // + // // Emits the target file with the last '2' stripped -> .cpp // auto lower_to_cpp1() -> void @@ -734,7 +734,7 @@ class cppfront auto cpp2_found = false; for ( - lineno_t curr_lineno = 0; + lineno_t curr_lineno = 0; auto const& line : source.get_lines() ) { @@ -744,8 +744,8 @@ class cppfront // If it's a Cpp1 line, emit it if (line.cat != source_line::category::cpp2) { - if (flag_cpp2_only && - !line.text.empty() && + if (flag_cpp2_only && + !line.text.empty() 
&& line.cat != source_line::category::comment && line.cat != source_line::category::import ) @@ -847,14 +847,14 @@ class cppfront //----------------------------------------------------------------------- // // emit() functions - each emits a kind of node - // + // // The body often mirrors the node's visit() function, unless customization // is needed where Cpp1 and Cpp2 have different grammar orders // //----------------------------------------------------------------------- // try_emit - // + // // Helper to emit whatever is in a variant where each // alternative is a smart pointer // @@ -903,7 +903,7 @@ class cppfront // Scan back to find the matching ( auto paren_depth = 1; auto open = pos - 2; - + // "next" in the string is the "last" one encountered in the backwards scan auto last_nonwhitespace = '\0'; @@ -1014,7 +1014,7 @@ class cppfront bool add_std_forward = last_use && last_use->is_forward; - bool add_std_move = + bool add_std_move = !add_std_forward && (in_synthesized_multi_return || (last_use && !suppress_move_from_last_use)); @@ -1052,8 +1052,8 @@ class cppfront printer.print_cpp2(".value()", n.position()); } else if (!in_definite_init && !in_parameter_list) { - if (auto decl = sema.get_declaration_of(*n.identifier); - decl && + if (auto decl = sema.get_declaration_of(*n.identifier); + decl && // note pointer equality: if we're not in the actual declaration of n.identifier decl->identifier != n.identifier && // and this variable was uninitialized @@ -1114,11 +1114,11 @@ class cppfront //----------------------------------------------------------------------- // auto emit( - compound_statement_node const& n, + compound_statement_node const& n, std::vector const& function_prolog = {}, std::vector const& function_epilog = {}, colno_t function_indent = 1 - ) + ) -> void { auto pos = n.open_brace; @@ -1342,7 +1342,7 @@ class cppfront emit(*n.statement); printer.print_cpp2(" while ( ", n.position()); emit(*n.condition); - if (n.next_expression) { + if 
(n.next_expression) { // Gotta say, this feels kind of nifty... short-circuit eval // and smuggling work into a condition via a lambda, O my... printer.print_cpp2(" && [&]{ ", n.position()); @@ -1372,14 +1372,14 @@ class cppfront // If there's a next-expression, smuggle it in via a nested do/while(false) loop // (nested "continue" will work, but "break" won't until we do extra work to implement // that using a flag and implementing "break" as "__for_break = true; continue;") - if (n.next_expression) { + if (n.next_expression) { printer.print_cpp2(" { do ", n.position()); } assert(n.body->initializer); emit(*n.body->initializer); - if (n.next_expression) { + if (n.next_expression) { printer.print_cpp2(" while (false); ", n.position()); emit(*n.next_expression); printer.print_cpp2("; }", n.position()); @@ -1432,7 +1432,7 @@ class cppfront } first = false; assert(param->declaration->identifier); - + printer.emit_to_string(&stmt); emit(*param->declaration->identifier, true); printer.emit_to_string(); @@ -1501,7 +1501,7 @@ class cppfront if (n.expr.index() == primary_expression_node::declaration) { auto& decl = std::get(n.expr); - + // The usual non-null assertion, plus it should be an anonymous function assert(decl && !decl->identifier && decl->is(declaration_node::function)); @@ -1514,7 +1514,7 @@ class cppfront //----------------------------------------------------------------------- // - auto emit(postfix_expression_node& n, bool for_lambda_capture = false) -> void + auto emit(postfix_expression_node& n, bool for_lambda_capture = false) -> void // note: parameter is deliberately not const because we we will fill // in the capture .str information, and we may also adjust token // column positions when moving operators to prefix notation @@ -1541,7 +1541,7 @@ class cppfront } else { - if (n.ops.front().op->type() == lexeme::PlusPlus || + if (n.ops.front().op->type() == lexeme::PlusPlus || n.ops.front().op->type() == lexeme::MinusMinus || n.ops.front().op->type() == 
lexeme::LeftBracket ) { @@ -1599,12 +1599,12 @@ class cppfront // and if so use this path to convert it to UFCS if (// there's a single-token expression followed by . and ( n.expr->get_token() && // if the base expression is a single token - std::ssize(n.ops) >= 2 && // and we're of the form: + std::ssize(n.ops) >= 2 && // and we're of the form: n.ops[0].op->type() == lexeme::Dot && // token . id-expr ( expr-list ) n.ops[1].op->type() == lexeme::LeftParen && // and either there's nothing after that, or there's just a $ after that ( - std::ssize(n.ops) == 2 || + std::ssize(n.ops) == 2 || (std::ssize(n.ops) == 3 && n.ops[2].op->type() == lexeme::Dollar) ) ) @@ -1684,10 +1684,10 @@ class cppfront // Handle the Cpp2 postfix operators that are prefix in Cpp1 // - if (i->op->type() == lexeme::MinusMinus || - i->op->type() == lexeme::PlusPlus || - i->op->type() == lexeme::Multiply || - i->op->type() == lexeme::Ampersand || + if (i->op->type() == lexeme::MinusMinus || + i->op->type() == lexeme::PlusPlus || + i->op->type() == lexeme::Multiply || + i->op->type() == lexeme::Ampersand || i->op->type() == lexeme::Tilde ) { @@ -1725,7 +1725,7 @@ class cppfront } else if (i->op_close) { suffix.emplace_back( i->op_close->to_string(true), i->op_close->position() ); - } + } if (i->id_expr) { auto print = std::string{}; @@ -1763,8 +1763,8 @@ class cppfront // If this is an --, ++, or &, don't add std::move on the lhs // even if this is a definite last use (only do that when an rvalue is okay) - if (n.ops.front().op->type() == lexeme::MinusMinus || - n.ops.front().op->type() == lexeme::PlusPlus || + if (n.ops.front().op->type() == lexeme::MinusMinus || + n.ops.front().op->type() == lexeme::PlusPlus || n.ops.front().op->type() == lexeme::Ampersand ) { @@ -1884,7 +1884,7 @@ class cppfront violates_lifetime_safety = true; } else if ( - *n.terms.front().op == "+" || *n.terms.front().op == "+=" || + *n.terms.front().op == "+" || *n.terms.front().op == "+=" || *n.terms.front().op == "-" || 
*n.terms.front().op == "-=" ) { @@ -1894,7 +1894,7 @@ class cppfront ); violates_bounds_safety = true; } - } + } for (auto const& x : n.terms) { assert(x.op); @@ -1991,8 +1991,8 @@ class cppfront //----------------------------------------------------------------------- // auto emit( - statement_node const& n, - bool can_have_semicolon = true, + statement_node const& n, + bool can_have_semicolon = true, source_position function_body_start = {}, bool function_void_ret = false, std::vector const& function_prolog = {}, @@ -2184,9 +2184,9 @@ class cppfront // if (*n.kind == "post") { auto lambda_intro = build_capture_lambda_intro_for(n.captures, n.position()); - printer.print_cpp2( - "auto post_" + std::to_string(n.position().lineno) + "_" + - std::to_string(n.position().colno) + " = cpp2::finally_success(" + + printer.print_cpp2( + "auto post_" + std::to_string(n.position().lineno) + "_" + + std::to_string(n.position().colno) + " = cpp2::finally_success(" + lambda_intro + "{", n.position() ); @@ -2262,7 +2262,7 @@ class cppfront printer.print_cpp2( " -> void", n.position() ); } } - + else if (n.returns.index() == function_type_node::id) { printer.print_cpp2( " -> ", n.position() ); auto& r = std::get(n.returns); @@ -2289,7 +2289,7 @@ class cppfront { // If this is a function that has multiple return values, // first we need to emit the struct that contains the returns - if (printer.doing_declarations_only() && n.is(declaration_node::function)) + if (printer.doing_declarations_only() && n.is(declaration_node::function)) { auto& func = std::get(n.type); assert(func); @@ -2441,8 +2441,8 @@ class cppfront printer.ignore_alignment( false ); } - emit( - *n.initializer, + emit( + *n.initializer, true, func->position(), n.identifier && func->returns.index() == function_type_node::empty, function_return_locals, function_epilog, n.position().colno ); @@ -2553,7 +2553,7 @@ class cppfront } } - + //----------------------------------------------------------------------- // 
has_cpp1: pass through // @@ -2561,7 +2561,7 @@ class cppfront return source.has_cpp1(); } - + //----------------------------------------------------------------------- // has_cpp2: pass through // @@ -2581,10 +2581,10 @@ using namespace std; using namespace cpp2; static auto enable_debug_output_files = false; -static cmdline_processor::register_flag cmd_noline( +static cmdline_processor::register_flag cmd_noline( 9, - "debug", - "Emit compiler debug output files", + "debug", + "Emit compiler debug output files", []{ enable_debug_output_files = true; } ); diff --git a/source/lex.h b/source/lex.h index b16af9a06..c45b747e1 100644 --- a/source/lex.h +++ b/source/lex.h @@ -26,9 +26,9 @@ namespace cpp2 { //----------------------------------------------------------------------- -// +// // lexeme: represents the type of a token -// +// //----------------------------------------------------------------------- // @@ -169,19 +169,19 @@ auto as(lexeme l) //----------------------------------------------------------------------- -// +// // token: represents a single token -// +// // Note: by reference, thge test into the program's source lines -// +// //----------------------------------------------------------------------- // -class token +class token { public: - token( - char const* start, - auto count, + token( + char const* start, + auto count, source_position pos, lexeme type ) @@ -192,24 +192,24 @@ class token { } - operator std::string_view() const - { + operator std::string_view() const + { assert (start); - return {start, (unsigned)count}; + return {start, (unsigned)count}; } auto operator== (token const& t) const -> bool - { + { return operator std::string_view() == t.operator std::string_view(); } auto operator== (std::string_view s) const -> bool - { - return s == this->operator std::string_view(); + { + return s == this->operator std::string_view(); } - auto to_string( bool text_only = false ) const -> std::string - { + auto to_string( bool text_only = false ) 
const -> std::string + { auto text = std::string{start, (unsigned)count}; if (text_only) { return text; @@ -220,11 +220,11 @@ class token } friend auto operator<< (auto& o, token const& t) -> auto& - { - return o << std::string_view(t); + { + return o << std::string_view(t); } - auto position_col_shift( colno_t offset ) -> void { + auto position_col_shift( colno_t offset ) -> void { assert (pos.colno + offset > 0); pos.colno += offset; } @@ -283,7 +283,7 @@ auto lex_line( // auto peek = [&](int num) { return (i+num < std::ssize(line)) ? line[i+num] : '\0'; }; - auto store = [&](int16_t num, lexeme type) + auto store = [&](int16_t num, lexeme type) { tokens.push_back({ &line[i], @@ -316,9 +316,9 @@ auto lex_line( //G auto peek_is_hexadecimal_escape_sequence = [&](int offset) { - if (peek( offset) == '\\' && - peek(1+offset) == 'x' && - is_hexadecimal_digit(peek(2+offset))) + if (peek( offset) == '\\' && + peek(1+offset) == 'x' && + is_hexadecimal_digit(peek(2+offset))) { auto j = 3; while (peek(j+offset) && is_hexadecimal_digit(peek(j+offset))) @@ -334,7 +334,7 @@ auto lex_line( //G \u { hexadecimal-digit }4 //G \U { hexadecimal-digit }8 //G - auto peek_is_universal_character_name = [&](colno_t offset) + auto peek_is_universal_character_name = [&](colno_t offset) { if (peek(offset) == '\\' && peek(1 + offset) == 'u') { auto j = 2; @@ -351,7 +351,7 @@ auto lex_line( while (j <= 9 && is_hexadecimal_digit(peek(j+offset))) { ++j; } if (j == 10) { return j; } errors.emplace_back( - source_position(lineno, i+offset), + source_position(lineno, i+offset), "invalid universal character name (\\U must" " be followed by 8 hexadecimal digits)" ); @@ -363,8 +363,8 @@ auto lex_line( //G hexadecimal-escape-sequence //G simple-escape-sequence //G - auto peek_is_escape_sequence = [&](int offset) - { + auto peek_is_escape_sequence = [&](int offset) + { if (auto h = peek_is_hexadecimal_escape_sequence(offset)) { return h; } return peek_is_simple_escape_sequence(offset); }; @@ -374,7 
+374,7 @@ auto lex_line( //G escape-sequence //G basic-s-char //G - //G basic-s-char: + //G basic-s-char: //G any member of the basic source character set except " \ or new-line //G //G c-char: @@ -382,11 +382,11 @@ auto lex_line( //G escape-sequence //G basic-c-char //G - //G basic-c-char: + //G basic-c-char: //G any member of the basic source character set except ' \ or new-line //G - auto peek_is_sc_char = [&](int offset, char quote) - { + auto peek_is_sc_char = [&](int offset, char quote) + { if (auto u = peek_is_universal_character_name(offset)) { return u; } if (auto e = peek_is_escape_sequence(offset)) @@ -400,8 +400,8 @@ auto lex_line( //G any Cpp1-and-Cpp2 keyword //G one of: import module export is as //G - auto peek_is_keyword = [&]() - { + auto peek_is_keyword = [&]() + { // Cpp2 has a smaller set of the Cpp1 globally reserved keywords, but we continue to // reserve all the ones Cpp1 has both for compatibility and to not give up a keyword // Some keywords like "delete" and "union" are not in this list because we reject them elsewhere @@ -446,7 +446,7 @@ auto lex_line( // //----------------------------------------------------- - for ( ; i < ssize(line); ++i) + for ( ; i < ssize(line); ++i) { auto peek1 = peek(1); auto peek2 = peek(2); @@ -455,7 +455,7 @@ auto lex_line( //G encoding-prefix: one of //G u8 u //G - auto is_encoding_prefix_and = [&](char next) { + auto is_encoding_prefix_and = [&](char next) { if (line[i] == next) { return 1; } else if (line[i] == 'u') { if (peek1 == next) { return 2; } @@ -492,7 +492,7 @@ auto lex_line( else { //G token: //G identifier - //G keyword + //G keyword //G literal //G operator-or-punctuator //G @@ -507,7 +507,7 @@ auto lex_line( // /* and // comment starts //G /= / - break;case '/': + break;case '/': if (peek1 == '*') { current_comment = "/*"; current_comment_start = source_position(lineno, i+1); @@ -533,11 +533,11 @@ auto lex_line( //G <<= << <=> <= < break;case '<': - if (peek1 == '<') { + if (peek1 == '<') { if 
(peek2 == '=') { store(3, lexeme::LeftShiftEq); } else { store(2, lexeme::LeftShift); } } - else if (peek1 == '=') { + else if (peek1 == '=') { if (peek2 == '>') { store(3, lexeme::Spaceship); } else { store(2, lexeme::LessEq); } } @@ -546,11 +546,11 @@ auto lex_line( ////G >>= >> >= > //G >= > break;case '>': - //if (peek1 == '>') { + //if (peek1 == '>') { // if (peek2 == '=') { store(3, lexeme::RightShiftEq); } // else { store(2, lexeme::RightShift); } //} - //else + //else if (peek1 == '=') { store(2, lexeme::GreaterEq); } else { store(1, lexeme::Greater); } @@ -569,7 +569,7 @@ auto lex_line( //G ||= || |= | break;case '|': - if (peek1 == '|') { + if (peek1 == '|') { if (peek2 == '=') { store(3, lexeme::LogicalOrEq); } else { store(2, lexeme::LogicalOr); } } @@ -577,8 +577,8 @@ auto lex_line( else { store(1, lexeme::Pipe); } //G &&= && &= & - break;case '&': - if (peek1 == '&') { + break;case '&': + if (peek1 == '&') { if (peek2 == '=') { store(3, lexeme::LogicalAndEq); } else { store(2, lexeme::LogicalAnd); } } @@ -588,42 +588,42 @@ auto lex_line( // Next, all the other operators that have a compound assignment form //G *= * - break;case '*': + break;case '*': if (peek1 == '=') { store(2, lexeme::MultiplyEq); } else { store(1, lexeme::Multiply); } //G %= % - break;case '%': + break;case '%': if (peek1 == '=') { store(2, lexeme::ModuloEq); } else { store(1, lexeme::Modulo); } //G ^= ^ - break;case '^': + break;case '^': if (peek1 == '=') { store(2, lexeme::CaretEq); } else { store(1, lexeme::Caret); } //G ~= ~ - break;case '~': + break;case '~': if (peek1 == '=') { store(2, lexeme::TildeEq); } else { store(1, lexeme::Tilde); } //G == = - break;case '=': + break;case '=': if (peek1 == '=') { store(2, lexeme::EqualComparison); } else { store(1, lexeme::Assignment); } //G != - break;case '!': + break;case '!': if (peek1 == '=') { store(2, lexeme::NotEqualComparison); } //else { store(1, lexeme::Not); } //G ... . - break;case '.': + break;case '.': if (peek1 == '.' 
&& peek2 == '.') { store(3, lexeme::Ellipsis); } else { store(1, lexeme::Dot); } //G :: : - break;case ':': + break;case ':': if (peek1 == ':') { store(2, lexeme::Scope); } else { store(1, lexeme::Colon); } @@ -632,19 +632,19 @@ auto lex_line( //G { } ( ) [ ] ; , ? $ //G - break;case '{': + break;case '{': store(1, lexeme::LeftBrace); - break;case '}': + break;case '}': store(1, lexeme::RightBrace); - break;case '(': + break;case '(': store(1, lexeme::LeftParen); break;case ')': store(1, lexeme::RightParen); - break;case '[': + break;case '[': store(1, lexeme::LeftBracket); break;case ']': @@ -694,7 +694,7 @@ auto lex_line( } else { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "binary literal cannot be empty (0B must be followed by binary digits)" ); ++i; @@ -707,7 +707,7 @@ auto lex_line( } else { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "hexadecimal literal cannot be empty (0X must be followed by hexadecimal digits)" ); ++i; @@ -725,11 +725,11 @@ auto lex_line( //G decimal-literal: //G digit { ' | digit }* - //G + //G //G floating-point-literal: //G digit { ' | digit }* . 
digit { ' | digit }* //GTODO full grammar - //G + //G else if (is_digit(line[i])) { auto j = 1; while (is_separator_or(is_digit,peek(j))) { ++j; } @@ -754,10 +754,10 @@ auto lex_line( //G else if (auto j = is_encoding_prefix_and('\"')) { while (auto len = peek_is_sc_char(j, '\"')) { j += len; } - if (peek(j) != '\"') { + if (peek(j) != '\"') { errors.emplace_back( source_position(lineno, i), - "string literal \"" + std::string(&line[i+1],j) + "string literal \"" + std::string(&line[i+1],j) + "\" is missing its closing \"" ); } @@ -768,20 +768,20 @@ auto lex_line( //G else if (auto j = is_encoding_prefix_and('\'')) { auto len = peek_is_sc_char(j, '\''); - if (len > 0) { - j += len; - if (peek(j) != '\'') { + if (len > 0) { + j += len; + if (peek(j) != '\'') { errors.emplace_back( - source_position(lineno, i), - "character literal '" + std::string(&line[i+1],j) + source_position(lineno, i), + "character literal '" + std::string(&line[i+1],j) + "' is missing its closing '" - ); + ); } store(j+1, lexeme::CharacterLiteral); } else { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "character literal is empty" ); } @@ -799,23 +799,23 @@ auto lex_line( store(j, lexeme::Identifier); if (tokens.back() == "NULL") { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "'NULL' is not supported in Cpp2 - for a local pointer variable, leave it uninitialized instead, and set it to a non-null value when you have one" ); } if (tokens.back() == "union") { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "unsafe 'union's are not supported in Cpp2 - use std::variant instead" ); } if (tokens.back() == "delete") { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "'delete' and owning raw pointers are not supported in Cpp2" ); errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), " - use unique.new, shared.new, or gc.new instead (in that 
order)" ); } @@ -825,7 +825,7 @@ auto lex_line( // else if (!isspace(line[i])) { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), std::string("unexpected text '") + line[i] + "'" ); } @@ -844,13 +844,13 @@ auto lex_line( //----------------------------------------------------------------------- -// +// // tokens: a map of the tokens of a source file -// +// //----------------------------------------------------------------------- // -class tokens +class tokens { std::vector& errors; @@ -858,7 +858,7 @@ class tokens std::map> grammar_map; // All comment tokens go here, which are applied in the lexer - // + // // We could put all the tokens in the same map, but that would mean the // parsing logic would have to remember to skip comments everywhere... // simpler to keep comments separate, at the smaller cost of traversing @@ -868,7 +868,7 @@ class tokens public: //----------------------------------------------------------------------- // Constructor - // + // // errors error list // tokens( @@ -884,7 +884,7 @@ class tokens // // lines tagged source lines // - auto lex( + auto lex( std::vector const& lines ) -> void @@ -909,9 +909,9 @@ class tokens auto current_comment = std::string{}; auto current_comment_start = source_position{}; - for ( - ; - line != std::end(lines) && line->cat == source_line::category::cpp2; + for ( + ; + line != std::end(lines) && line->cat == source_line::category::cpp2; ++line, ++lineno ) { @@ -925,11 +925,11 @@ class tokens } } - + //----------------------------------------------------------------------- // get_map: Access the token map // - auto get_map() const -> auto const& + auto get_map() const -> auto const& { return grammar_map; } @@ -938,7 +938,7 @@ class tokens //----------------------------------------------------------------------- // get_comments: Access the comment list // - auto get_comments() const -> auto const& + auto get_comments() const -> auto const& { return comments; } @@ -947,14 +947,14 @@ 
class tokens //----------------------------------------------------------------------- // debug_print // - auto debug_print(std::ostream& o) const -> void + auto debug_print(std::ostream& o) const -> void { for (auto const& [lineno, entry] : grammar_map) { o << "--- Section starting at line " << lineno << "\n"; for (auto const& token : entry) { o << " " << token << " (" << token.position().lineno - << "," << token.position().colno << ") " + << "," << token.position().colno << ") " << as(token.type()) << "\n"; } diff --git a/source/load.h b/source/load.h index 81405ef11..0bf73e470 100644 --- a/source/load.h +++ b/source/load.h @@ -30,7 +30,7 @@ namespace cpp2 { //--------------------------------------------------------------------------- // move_next: advances i as long as p(line[i]) is true or the end of line -// +// // line current line being processed // i current index // p predicate to apply @@ -46,7 +46,7 @@ auto move_next(std::string const& line, int& i, auto p) -> bool //--------------------------------------------------------------------------- // peek_first_non_whitespace: returns the first non-whitespace char in line -// +// // line current line being processed // auto peek_first_non_whitespace(std::string const& line) -> char @@ -65,15 +65,15 @@ auto peek_first_non_whitespace(std::string const& line) -> char //--------------------------------------------------------------------------- // is_preprocessor: returns whether this is a preprocessor line starting // with #, and whether it will be followed by another preprocessor line -// +// // line current line being processed // first_line whether this is supposed to be the first line (start with #) // -struct is_preprocessor_ret { - bool is_preprocessor; - bool has_continuation; +struct is_preprocessor_ret { + bool is_preprocessor; + bool has_continuation; }; -auto is_preprocessor(std::string const& line, bool first_line) +auto is_preprocessor(std::string const& line, bool first_line) -> is_preprocessor_ret 
{ // see if the first non-whitespace is # @@ -88,7 +88,7 @@ auto is_preprocessor(std::string const& line, bool first_line) //--------------------------------------------------------------------------- // starts_with_import: returns whether the line starts with "import" -// +// // line current line being processed // auto starts_with_import(std::string const& line) -> bool @@ -108,7 +108,7 @@ auto starts_with_import(std::string const& line) -> bool //--------------------------------------------------------------------------- // starts_with_whitespace_slash_slash: is this a "// comment" line -// +// // line current line being processed // auto starts_with_whitespace_slash_slash(std::string const& line) -> bool @@ -126,10 +126,10 @@ auto starts_with_whitespace_slash_slash(std::string const& line) -> bool //--------------------------------------------------------------------------- // starts_with_whitespace_slash_star_and_no_star_slash: is this a "/* comment" line -// +// // line current line being processed // -auto starts_with_whitespace_slash_star_and_no_star_slash(std::string const& line) +auto starts_with_whitespace_slash_star_and_no_star_slash(std::string const& line) -> bool { auto i = 0; @@ -151,7 +151,7 @@ auto starts_with_whitespace_slash_star_and_no_star_slash(std::string const& line //--------------------------------------------------------------------------- // starts_with_identifier_colon: returns whether the line starts with an // identifier followed by one colon (not ::) -// +// // line current line being processed // auto starts_with_identifier_colon(std::string const& line) -> bool @@ -165,8 +165,8 @@ auto starts_with_identifier_colon(std::string const& line) -> bool // find identifier auto j = starts_with_identifier({&line[i], std::size(line)-i}); - if (j == 0) { - return false; + if (j == 0) { + return false; } i += j; @@ -188,21 +188,21 @@ auto starts_with_identifier_colon(std::string const& line) -> bool 
//--------------------------------------------------------------------------- // process_cpp_line: just enough to know what to skip over -// +// // line current line being processed // in_comment track whether we're in a comment // in_string_literal track whether we're in a string literal // -struct process_line_ret { - bool all_comment_line; - bool empty_line; +struct process_line_ret { + bool all_comment_line; + bool empty_line; }; auto process_cpp_line( - std::string const& line, - bool& in_comment, + std::string const& line, + bool& in_comment, bool& in_string_literal, std::vector& brace_depth, - lineno_t lineno, + lineno_t lineno, std::vector& errors ) -> process_line_ret @@ -262,7 +262,7 @@ auto process_cpp_line( errors.emplace_back( source_position(lineno, i), "closing } does not match a prior {" - ); + ); } else { brace_depth.pop_back(); @@ -272,7 +272,7 @@ auto process_cpp_line( break;case '*': if (!in_string_literal && prev == '/') { in_comment = true; } - break;case '/': + break;case '/': if (!in_string_literal && prev == '/') { in_comment = false; return r; } break;default: ; @@ -292,19 +292,19 @@ auto process_cpp_line( // - if ; we're done // - if { find matching } // - then there must be nothing else on the last line -// +// // line current line being processed // in_comment whether this line begins inside a multi-line comment // // Returns: whether additional lines should be inspected // auto process_cpp2_line( - std::string const& line, - bool& in_comment, - std::vector& brace_depth, + std::string const& line, + bool& in_comment, + std::vector& brace_depth, bool& found_semi, bool& found_openbrace, - lineno_t lineno, + lineno_t lineno, std::vector& errors ) -> bool @@ -324,12 +324,12 @@ auto process_cpp2_line( else { if (found_end && !isspace(line[i])) { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), std::string("unexpected text '") + line[i] + "' - after the closing ; or } of a definition, the rest" " of the line 
should contain only whitespace or comments" - ); + ); } switch (line[i]) { @@ -339,9 +339,9 @@ auto process_cpp2_line( break;case '}': if (std::ssize(brace_depth) < 1) { errors.emplace_back( - source_position(lineno, i), + source_position(lineno, i), "closing } does not match a prior {" - ); + ); } else { brace_depth.pop_back(); @@ -356,7 +356,7 @@ auto process_cpp2_line( break;case '*': if (prev == '/') { in_comment = true; } - break;case '/': + break;case '/': if (prev == '/') { in_comment = false; return false; } break;default: ; @@ -371,12 +371,12 @@ auto process_cpp2_line( //----------------------------------------------------------------------- -// +// // source: Represents a program source file -// +// //----------------------------------------------------------------------- // -class source +class source { std::vector& errors; std::vector lines; @@ -391,7 +391,7 @@ class source public: //----------------------------------------------------------------------- // Constructor - // + // // errors error list // source( @@ -407,7 +407,7 @@ class source //----------------------------------------------------------------------- // has_cpp1: Returns true if this file has some Cpp1/preprocessor lines // (note: import lines don't count toward Cpp1 or Cpp2) - // + // auto has_cpp1() const -> bool { return cpp1_found; } @@ -416,7 +416,7 @@ class source //----------------------------------------------------------------------- // has_cpp2: Returns true if this file has some Cpp2 lines // (note: import lines don't count toward Cpp1 or Cpp2) - // + // auto has_cpp2() const -> bool { return cpp2_found; } @@ -428,8 +428,8 @@ class source // filename the source file to be loaded // source program textual representation // - auto load( - std::string const& filename + auto load( + std::string const& filename ) -> bool { @@ -485,12 +485,12 @@ class source auto found_openbrace = false; while ( !process_cpp2_line( - lines.back().text, - in_comment, - brace_depth, - found_semi, - 
found_openbrace, - std::ssize(lines)-1, + lines.back().text, + in_comment, + brace_depth, + found_semi, + found_openbrace, + std::ssize(lines)-1, errors ) && in.getline(&buf[0], max_line_len) @@ -509,10 +509,10 @@ class source } else { auto stats = process_cpp_line( - lines.back().text, - in_comment, + lines.back().text, + in_comment, in_string_literal, - brace_depth, + brace_depth, std::ssize(lines) - 1, errors ); @@ -536,8 +536,8 @@ class source if (in.gcount() >= max_line_len-1) { errors.emplace_back( - source_position(lineno_t(std::ssize(lines)), 0), - std::string("source line too long - length must be less than ") + source_position(lineno_t(std::ssize(lines)), 0), + std::string("source line too long - length must be less than ") + std::to_string(max_line_len) ); return false; @@ -548,7 +548,7 @@ class source if (!in.eof()) { errors.emplace_back( - source_position(lineno_t(std::ssize(lines)), 0), + source_position(lineno_t(std::ssize(lines)), 0), std::string("unexpected error reading source lines - did not reach EOF") ); return false; @@ -562,8 +562,8 @@ class source unmatched_brace_lines = std::to_string(line) + " " + unmatched_brace_lines; } errors.emplace_back( - source_position(lineno_t(std::ssize(lines)), 0), - std::string("end of file reached with ") + source_position(lineno_t(std::ssize(lines)), 0), + std::string("end of file reached with ") + std::to_string(std::ssize(brace_depth)) + " missing } to match earlier { on line" + (std::size(brace_depth)>1 ? 
"s " : " " ) @@ -578,7 +578,7 @@ class source //----------------------------------------------------------------------- // get_lines: Access the source lines // - auto get_lines() const -> std::vector const& + auto get_lines() const -> std::vector const& { return lines; } @@ -587,7 +587,7 @@ class source //----------------------------------------------------------------------- // debug_print // - auto debug_print(std::ostream& o) const -> void + auto debug_print(std::ostream& o) const -> void { for (auto lineno = 0; auto const& line : lines) { // Skip dummy first entry diff --git a/source/parse.h b/source/parse.h index 5763c1251..4315a74a1 100644 --- a/source/parse.h +++ b/source/parse.h @@ -36,7 +36,7 @@ auto violates_lifetime_safety = false; //G one of not //G auto is_prefix_operator(lexeme l) -> bool -{ +{ return l == lexeme::Not; } @@ -45,7 +45,7 @@ auto is_prefix_operator(lexeme l) -> bool //G one of ++ -- * & ~ $ //G auto is_postfix_operator(lexeme l) -> bool -{ +{ switch (l) { break;case lexeme::PlusPlus: case lexeme::MinusMinus: @@ -64,7 +64,7 @@ auto is_postfix_operator(lexeme l) -> bool //G one of = *= /= %= += -= >>= <<= //G auto is_assignment_operator(lexeme l) -> bool -{ +{ switch (l) { break;case lexeme::Assignment: case lexeme::MultiplyEq: @@ -85,15 +85,15 @@ auto is_assignment_operator(lexeme l) -> bool //----------------------------------------------------------------------- -// +// // Parse tree node types -// +// //----------------------------------------------------------------------- // //----------------------------------------------------------------------- // try_emit -// +// // Helper to visit whatever is in a variant where each // alternative is a smart pointer // @@ -150,7 +150,7 @@ template< String Name, typename Term > -struct binary_expression_node +struct binary_expression_node { std::unique_ptr expr; @@ -311,7 +311,7 @@ struct postfix_expression_node { std::unique_ptr expr; - struct term + struct term { token const* op; @@ -1078,12 
+1078,12 @@ struct translation_unit_node //----------------------------------------------------------------------- -// +// // parser: parses a section of Cpp2 code -// +// //----------------------------------------------------------------------- // -class parser +class parser { std::vector& errors; @@ -1094,14 +1094,14 @@ class parser struct capture_groups_stack_guard { parser* pars; - capture_groups_stack_guard(parser* p, capture_group* cg) - : pars{p} - { + capture_groups_stack_guard(parser* p, capture_group* cg) + : pars{p} + { assert(p); assert(cg); pars->current_capture_groups.push_back(cg); } - ~capture_groups_stack_guard() { + ~capture_groups_stack_guard() { pars->current_capture_groups.pop_back(); } }; @@ -1113,7 +1113,7 @@ class parser public: //----------------------------------------------------------------------- // Constructor - // + // // errors error list // parser( @@ -1126,9 +1126,9 @@ class parser //----------------------------------------------------------------------- // parse - // + // // tokens input tokens for this section of Cpp2 source code - // + // // Each call parses this section's worth of tokens and adds the // result to the stored parse tree. 
Call this repeatedly for the Cpp2 // sections in a TU to build the whole TU's parse tree @@ -1159,7 +1159,7 @@ class parser //----------------------------------------------------------------------- // get_parse_tree - // + // // Get the entire parse tree, from the root (translation_unit_node) // auto get_parse_tree() -> translation_unit_node& @@ -1200,7 +1200,7 @@ class parser //----------------------------------------------------------------------- // visit // - auto visit(auto& v) -> void + auto visit(auto& v) -> void { parse_tree->visit(v, 0); } @@ -1208,9 +1208,9 @@ class parser private: //----------------------------------------------------------------------- // Error reporting: Fed into the supplied this->error object - // + // // msg message to be printed - // + // // include_curr_token in this file (during parsing)_ we normally want // to show the current token as the unexpected text // we encountered, but some sema rules are applied @@ -1219,8 +1219,8 @@ class parser // we detect and reject a "std::move" qualified-id, // it's not relevant to add "at LeftParen: (" // just because ( happens to be the next token) - // - auto error(char const* msg, bool include_curr_token = true) const -> void + // + auto error(char const* msg, bool include_curr_token = true) const -> void { auto m = std::string{msg}; if (include_curr_token) { @@ -1228,17 +1228,17 @@ class parser } errors.emplace_back( curr().position(), m ); } - - auto error(std::string const& msg, bool include_curr_token = true) const -> void - { - error(msg.c_str()); + + auto error(std::string const& msg, bool include_curr_token = true) const -> void + { + error(msg.c_str()); } //----------------------------------------------------------------------- // Token navigation: Only these functions should access this->token_ - // - auto curr() const -> token const& + // + auto curr() const -> token const& { if (done()) { throw std::runtime_error("unexpected end of source file"); @@ -1247,7 +1247,7 @@ class parser 
return (*tokens_)[pos]; } - auto peek(int num) const -> token const* + auto peek(int num) const -> token const* { assert (tokens_); if (pos + num >= 0 && pos + num < std::ssize(*tokens_)) { @@ -1256,14 +1256,14 @@ class parser return {}; } - auto done() const -> bool + auto done() const -> bool { assert (tokens_); assert (pos <= std::ssize(*tokens_)); return pos == std::ssize(*tokens_); } - auto next(int num = 1) -> void + auto next(int num = 1) -> void { assert (tokens_); pos = std::min( pos+num, as(std::ssize(*tokens_)) ); @@ -1272,7 +1272,7 @@ class parser //----------------------------------------------------------------------- // Parsers for unary expressions - // + // //G primary-expression: //G literal @@ -1281,7 +1281,7 @@ class parser //G unnamed-declaration //G inspect-expression //G - auto primary_expression() + auto primary_expression() -> std::unique_ptr { auto n = std::make_unique(); @@ -1305,7 +1305,7 @@ class parser curr().type() == lexeme::BinaryLiteral || curr().type() == lexeme::HexadecimalLiteral || curr().type() == lexeme::Keyword - ) + ) { n->expr = &curr(); next(); @@ -1369,7 +1369,7 @@ class parser //G postfix-expression ( expression-list? ) //G postfix-expression . 
id-expression //G - auto postfix_expression() + auto postfix_expression() -> std::unique_ptr { auto n = std::make_unique(); @@ -1468,7 +1468,7 @@ class parser //GTODO alignof ( type-id ) //GTODO throws-expression //G - auto prefix_expression() + auto prefix_expression() -> std::unique_ptr { auto n = std::make_unique(); @@ -1484,7 +1484,7 @@ class parser //----------------------------------------------------------------------- // Parsers for binary expressions - // + // // The general /*binary*/-expression: // /*term*/-expression { { /* operators at this predecence level */ } /*term*/-expression }* @@ -1498,7 +1498,7 @@ class parser IsValidOp is_valid_op, TermFunc term ) - -> std::unique_ptr + -> std::unique_ptr { auto n = std::make_unique(); if ( (n->expr = term()) ) { @@ -1519,14 +1519,14 @@ class parser } //G is-as-expression: - //G prefix-expression + //G prefix-expression //GTODO is-as-expression is-expression-constraint //GTODO is-as-expression as-type-cast //GTODO type-id is-type-constraint //G auto is_as_expression() { return binary_expression ( - [](token const& t){ + [](token const& t){ std::string_view s{t}; return t.type() == lexeme::Keyword && (s == "is" || s == "as"); }, @@ -1535,7 +1535,7 @@ class parser } //G multiplicative-expression: - //G is-as-expression + //G is-as-expression //G multiplicative-expression * is-as-expression //G multiplicative-expression / is-as-expression //G multiplicative-expression % is-as-expression @@ -1560,9 +1560,9 @@ class parser } //G shift-expression: - //G additive-expression - //G shift-expression << additive-expression - //G shift-expression >> additive-expression + //G additive-expression + //G shift-expression << additive-expression + //G shift-expression >> additive-expression //G auto shift_expression() { return binary_expression ( @@ -1572,7 +1572,7 @@ class parser } //G compare-expression: - //G shift-expression + //G shift-expression //G compare-expression <=> shift-expression //G auto compare_expression() { 
@@ -1760,7 +1760,7 @@ class parser //G expression //G id-expression //G - auto unqualified_id() -> std::unique_ptr + auto unqualified_id() -> std::unique_ptr { // Handle the identifier if (curr().type() != lexeme::Identifier && @@ -1814,7 +1814,7 @@ class parser } // Use the lambda trick to jam in a "next" clause while ( - curr().type() == lexeme::Comma && + curr().type() == lexeme::Comma && [&]{term.comma = curr().position(); next(); return true;}() ); // When this is rewritten in Cpp2, it will be: @@ -1844,7 +1844,7 @@ class parser //G member-name-specifier: //G unqualified-id . //G - auto qualified_id() -> std::unique_ptr + auto qualified_id() -> std::unique_ptr { auto n = std::make_unique(); @@ -1907,7 +1907,7 @@ class parser //G unqualified-id //G qualified-id //G - auto id_expression() -> std::unique_ptr + auto id_expression() -> std::unique_ptr { auto n = std::make_unique(); if (auto id = qualified_id()) { @@ -1930,20 +1930,20 @@ class parser //G expression ; //G expression //G - auto expression_statement(bool semicolon_required) -> std::unique_ptr + auto expression_statement(bool semicolon_required) -> std::unique_ptr { auto n = std::make_unique(); if (!(n->expr = expression())) { return {}; } - if (semicolon_required && curr().type() != lexeme::Semicolon && + if (semicolon_required && curr().type() != lexeme::Semicolon && peek(-1)->type() != lexeme::Semicolon // this last peek(-1)-condition is a hack (? or is it just // maybe elegant? I'm torn) so that code like // // callback := :(inout x:_) = x += "suffix"; ; - // + // // doesn't need the redundant semicolon at the end of a decl... 
// there's probably a cleaner way to do it, but this works and // it doesn't destabilize any regression tests @@ -1964,7 +1964,7 @@ class parser //G if constexpr-opt expression compound-statement //G if constexpr-opt expression compound-statement else compound-statement //G - auto selection_statement() -> std::unique_ptr + auto selection_statement() -> std::unique_ptr { if (curr().type() != lexeme::Keyword || curr() != "if") { return {}; @@ -1997,7 +1997,7 @@ class parser if (curr().type() != lexeme::Keyword || curr() != "else") { // Add empty else branch to simplify processing elsewhere // Note: Position (0,0) signifies it's implicit (no source location) - n->false_branch = + n->false_branch = std::make_unique( source_position(0,0) ); } else { @@ -2020,7 +2020,7 @@ class parser //G return-statement: //G return expression-opt ; //G - auto return_statement() -> std::unique_ptr + auto return_statement() -> std::unique_ptr { if (curr().type() != lexeme::Keyword || curr() != "return") { return {}; @@ -2062,11 +2062,11 @@ class parser //G for expression next-clause-opt do unnamed-declaration //G //G next-clause: - //G next assignment-expression + //G next assignment-expression //G - auto iteration_statement() -> std::unique_ptr + auto iteration_statement() -> std::unique_ptr { - if (curr().type() != lexeme::Keyword || + if (curr().type() != lexeme::Keyword || (curr() != "while" && curr() != "do" && curr() != "for") ) { @@ -2103,7 +2103,7 @@ class parser n->condition = std::move(x); return true; }; - + auto handle_compound_statement = [&]() -> bool { auto s = compound_statement(); if (!s) { @@ -2166,7 +2166,7 @@ class parser n->body = unnamed_declaration(curr().position()); auto func = n->body ? 
std::get_if(&n->body->type) : nullptr; - if (!n->body || n->body->identifier || !func || !*func || + if (!n->body || n->body->identifier || !func || !*func || std::ssize((**func).parameters->parameters) != 1 || (**func).returns.index() != function_type_node::empty ) @@ -2197,7 +2197,7 @@ class parser //G alt-name: //G unqualified-id : //G - auto alternative() -> std::unique_ptr + auto alternative() -> std::unique_ptr { auto n = std::make_unique(); @@ -2216,7 +2216,7 @@ class parser // } // next(); //} - + // Now we should be as "is" or "as" // (initial partial implementation, just "is/as id-expression") if (curr() != "is" && curr() != "as") { @@ -2261,7 +2261,7 @@ class parser //G alternative //G alternative-seq alternative //G - auto inspect_expression(bool is_expression) -> std::unique_ptr + auto inspect_expression(bool is_expression) -> std::unique_ptr { if (curr() != "inspect") { return {}; @@ -2361,7 +2361,7 @@ class parser //G iteration-statement //G inspect-expression //G let parameter-list statement - // + // //GTODO jump-statement //GTODO try-block //G @@ -2451,7 +2451,7 @@ class parser //G statement //G statement-seq statement //G - auto compound_statement(source_position equal_sign = source_position{}) + auto compound_statement(source_position equal_sign = source_position{}) -> std::unique_ptr { if (curr().type() != lexeme::LeftBrace) { @@ -2462,7 +2462,7 @@ class parser // In the case where this is a declaration initializer with // = { - // on the same line, we want to remember our start position + // on the same line, we want to remember our start position // as where the = was, not where the { was if (equal_sign.lineno == curr().position().lineno) { n->open_brace = equal_sign; @@ -2503,14 +2503,14 @@ class parser //G auto parameter_declaration( bool returns = false - ) - -> std::unique_ptr + ) + -> std::unique_ptr { auto n = std::make_unique(); n->pass = returns ? 
passing_style::out : passing_style::in; n->pos = curr().position(); - if (curr().type() == lexeme::Identifier) { + if (curr().type() == lexeme::Identifier) { if (curr() == "in") { if (returns) { error("a return value cannot be 'in'"); @@ -2553,7 +2553,7 @@ class parser } } - if (curr().type() == lexeme::Identifier) { + if (curr().type() == lexeme::Identifier) { if (curr() == "implicit") { n->mod = parameter_declaration_node::modifier::implicit; next(); @@ -2590,7 +2590,7 @@ class parser auto parameter_declaration_list( bool returns = false ) - -> std::unique_ptr + -> std::unique_ptr { if (curr().type() != lexeme::LeftParen) { return {}; @@ -2605,17 +2605,17 @@ class parser while ((param = parameter_declaration(returns)) != nullptr) { n->parameters.push_back( std::move(param) ); - if (curr().type() == lexeme::RightParen) { + if (curr().type() == lexeme::RightParen) { break; } - else if (curr().type() != lexeme::Comma) { + else if (curr().type() != lexeme::Comma) { error("expected , in parameter list"); return {}; } next(); } - if (curr().type() != lexeme::RightParen) { + if (curr().type() != lexeme::RightParen) { error("invalid parameter list"); next(); return {}; @@ -2711,7 +2711,7 @@ class parser //G contract //G contract-seq contract //G - auto function_type() -> std::unique_ptr + auto function_type() -> std::unique_ptr { auto n = std::make_unique(); @@ -2774,7 +2774,7 @@ class parser //G : id-expression-opt = statement //G : id-expression //G - auto unnamed_declaration(source_position pos, bool semicolon_required = true, bool captures_allowed = false) -> std::unique_ptr + auto unnamed_declaration(source_position pos, bool semicolon_required = true, bool captures_allowed = false) -> std::unique_ptr { auto deduced_type = false; @@ -2786,7 +2786,7 @@ class parser auto n = std::make_unique(); n->pos = pos; - auto guard = + auto guard = captures_allowed ? 
make_unique(this, &n->captures) : std::unique_ptr() @@ -2886,7 +2886,7 @@ class parser //G declaration: //G identifier unnamed-declaration //G - auto declaration(bool semicolon_required = true) -> std::unique_ptr + auto declaration(bool semicolon_required = true) -> std::unique_ptr { if (done()) { return {}; } @@ -2916,7 +2916,7 @@ class parser //G translation-unit: //G declaration-seq-opt // - auto translation_unit() -> std::unique_ptr + auto translation_unit() -> std::unique_ptr { auto n = std::make_unique(); for (auto d = declaration(); d; d = declaration()) { @@ -2929,9 +2929,9 @@ class parser //----------------------------------------------------------------------- -// +// // Common parts for printing visitors -// +// //----------------------------------------------------------------------- // struct printing_visitor @@ -2961,15 +2961,15 @@ struct printing_visitor //----------------------------------------------------------------------- -// +// // Visitor for printing a parse tree -// +// //----------------------------------------------------------------------- // class parse_tree_printer : printing_visitor { using printing_visitor::printing_visitor; - + std::vector current_expression_list_term = {}; public: @@ -3009,7 +3009,7 @@ class parse_tree_printer : printing_visitor if (current_expression_list_term.back() == nullptr) { assert(n.expressions.empty()); } - assert( + assert( current_expression_list_term.back() == nullptr || current_expression_list_term.back() == &n.expressions[0] + n.expressions.size() ); @@ -3179,13 +3179,13 @@ class parse_tree_printer : printing_visitor //----------------------------------------------------------------------- -// +// // Visitor for moving tokens that are to the right on the same line // and shifting their positions left 'n' spaces - used only at the // end when lowering to Cpp1, as a convenient way to adjust for other // positions shifts we create (e.g., moving some operators to prefix // notation, or inserting 
"std::move" prefixes)
-//
+//
 //-----------------------------------------------------------------------
 //
 class adjust_remaining_token_columns_on_this_line_visitor
diff --git a/source/sema.h b/source/sema.h
index 0f515dab1..873e9f971 100644
--- a/source/sema.h
+++ b/source/sema.h
@@ -24,9 +24,9 @@ namespace cpp2 {
 //-----------------------------------------------------------------------
-//
+//
 // Symbol/scope table
-//
+//
 //-----------------------------------------------------------------------
 //
 struct declaration_sym {
@@ -196,9 +196,9 @@ auto is_definite_last_use(token const* t) -> last_use const*
 //-----------------------------------------------------------------------
-//
+//
 // sema: Semantic analysis
-//
+//
 //-----------------------------------------------------------------------
 //
 class sema
@@ -212,11 +212,11 @@ class sema
     std::vector partial_decl_stack;;
     std::vector active_selections;
-
+
 public:
     //-----------------------------------------------------------------------
     // Constructor
-    //
+    //
     // errors error list
     //
     sema(
@@ -300,14 +300,14 @@ class sema
                 auto const& sym = std::get(s.sym);
                 assert (sym.identifier);
                 if (auto use = is_definite_last_use(sym.identifier)) {
-                    o << "*** " << sym.identifier->position().to_string()
+                    o << "*** " << sym.identifier->position().to_string()
                       << " DEFINITE LAST "
                       << (use->is_forward ? "FORWARDING" : "POTENTIALLY MOVING")
                       << "USE OF ";
                 }
                 if (is_definite_initialization(sym.identifier)) {
-                    o << "*** " << sym.identifier->position().to_string()
+                    o << "*** " << sym.identifier->position().to_string()
                       << " DEFINITE INITIALIZATION OF ";
                 }
                 else if (sym.assignment_to) {
@@ -358,7 +358,7 @@ class sema
         // It's an uninitialized variable (incl. named return values) if it's
         // a variable with no initializer and that isn't a parameter
         //
-        auto is_uninitialized_variable_decl = [&](symbol const& s)
+        auto is_uninitialized_variable_decl = [&](symbol const& s)
             -> declaration_sym const*
         {
             if (auto const* sym = std::get_if(&s.sym)) {
@@ -373,13 +373,13 @@ class sema
         // It's a local (incl. named return value or copy or move or forward parameter)
         //
-        auto is_potentially_movable_local = [&](symbol const& s)
+        auto is_potentially_movable_local = [&](symbol const& s)
             -> declaration_sym const*
         {
             if (auto const* sym = std::get_if(&s.sym)) {
-                if (sym->start && sym->declaration->is(declaration_node::active::object) &&
-                    sym->parameter &&
-                    (sym->parameter->pass == passing_style::copy ||
+                if (sym->start && sym->declaration->is(declaration_node::active::object) &&
+                    sym->parameter &&
+                    (sym->parameter->pass == passing_style::copy ||
                      sym->parameter->pass == passing_style::move ||
                      sym->parameter->pass == passing_style::forward
                     )
@@ -403,7 +403,7 @@ class sema
             //
             if (auto decl = is_uninitialized_variable_decl(symbols[sympos])) {
                 assert (decl->identifier && !decl->initializer);
-                ret = ret &&
+                ret = ret &&
                     ensure_definitely_initialized(decl->identifier, sympos+1, symbols[sympos].depth);
             }
@@ -487,7 +487,7 @@ class sema
     // Check that local variable *id is initialized before use on all paths
     // starting at the given position and depth in the symbol/scope table
-    //
+    //
     // TODO: After writing the first version of this, I realized that it could be
     // simplified a lot by using a sentinel value to represent the base case like
     // the others instead of as a special case. It's tempting to rewrite this now
@@ -510,7 +510,7 @@ class sema
             stack_entry(int p) : pos{p} { }
-            auto debug_print(std::ostream& o) const -> void
+            auto debug_print(std::ostream& o) const -> void
             {
                 o << "Stack entry: " << pos << "\n";
                 for (auto const& e : branches) {
@@ -526,8 +526,8 @@ class sema
             break;case symbol::active::declaration: {
                 auto const& sym = std::get(symbols[pos].sym);
                 if (sym.start && sym.identifier && *sym.identifier == *id) {
-                    errors.emplace_back(
-                        sym.identifier->position(),
+                    errors.emplace_back(
+                        sym.identifier->position(),
                         "local variable" + sym.identifier->to_string(true)
                         + " cannot have the same name as an uninitialized"
                         " variable in the same function");
@@ -539,8 +539,8 @@ class sema
                 assert (sym.identifier);
                 if (is_definite_initialization(sym.identifier)) {
-                    errors.emplace_back(
-                        sym.identifier->position(),
+                    errors.emplace_back(
+                        sym.identifier->position(),
                         "local variable " + id->to_string(true)
                         + " must be initialized before " + sym.identifier->to_string(true)
                         + " (local variables must be initialized in the order they are declared)"
@@ -558,8 +558,8 @@ class sema
                         definite_initializations.push_back( sym.identifier );
                     }
                     else {
-                        errors.emplace_back(
-                            sym.identifier->position(),
+                        errors.emplace_back(
+                            sym.identifier->position(),
                             "local variable " + sym.identifier->to_string(true)
                             + " is used before it was initialized");
                     }
@@ -576,8 +576,8 @@ class sema
                         definite_initializations.push_back( sym.identifier );
                     }
                     else {
-                        errors.emplace_back(
-                            sym.identifier->position(),
+                        errors.emplace_back(
+                            sym.identifier->position(),
                             "local variable " + sym.identifier->to_string(true)
                             + " is used in a condition before it was initialized");
                     }
@@ -604,8 +604,8 @@ class sema
                         definite_initializations.push_back( sym.identifier );
                     }
                     else {
-                        errors.emplace_back(
-                            sym.identifier->position(),
+                        errors.emplace_back(
+                            sym.identifier->position(),
                             "local variable " + sym.identifier->to_string(true)
                             + " is used in a branch before it was initialized");
                     }
@@ -644,7 +644,7 @@ class sema
                         // If this is not an implicit 'else' branch (i.e., if lineno > 0)
                         if (symbols[b.start].position().lineno > 0) {
                             (b.result ? true_branches : false_branches)
-                                += "\n branch starting at line "
+                                += "\n branch starting at line "
                                 + std::to_string(symbols[b.start].position().lineno);
                         }
                         else {
@@ -652,7 +652,7 @@ class sema
                                 += "\n implicit else branch";
                         }
                     }
-
+
                     // If none of the branches was true
                     if (true_branches.length() == 0)
                     {
@@ -686,17 +686,17 @@ class sema
                     // Else we found a missing initializion, report it and return false
                     else {
-                        errors.emplace_back(
-                            id->position(),
+                        errors.emplace_back(
+                            id->position(),
                             "local variable " + id->to_string(true)
                             + " must be initialized on both branches or neither branch");
-
+
                         assert (symbols[selection_stack.back().pos].sym.index() == symbol::active::selection);
                         auto const& sym = std::get(symbols[pos].sym);
-                        errors.emplace_back(
+                        errors.emplace_back(
                             sym.selection->identifier->position(),
                             "\"" + sym.selection->identifier->to_string(true)
-                            + "\" initializes " + id->to_string(true)
+                            + "\" initializes " + id->to_string(true)
                             + " on:" + true_branches
                             + "\nbut not on:" + false_branches
                             );
@@ -710,11 +710,11 @@ class sema
             break;case symbol::active::compound: {
                 auto const& sym = std::get(symbols[pos].sym);
-                // If we're in a selection
+                // If we're in a selection
                 if (std::ssize(selection_stack) > 0) {
                     // If this is a compound start with the current selection's depth
                     // plus one, it's the start of one of the branches of that selection
-                    if (sym.start &&
+                    if (sym.start &&
                         symbols[pos].depth == symbols[selection_stack.back().pos].depth+1
                         )
                     {
@@ -729,9 +729,9 @@ class sema
         }
-        errors.emplace_back(
-            id->position(),
-            id->to_string(true)
+        errors.emplace_back(
+            id->position(),
+            id->to_string(true)
             + " - variable must be initialized on every branch path");
         return false;
     }