The preprocessor should be standardized #65
I'd guess you are unaware that Fortran 2003 contained an optional Part 3 that defined conditional compilation. I'm not sure any vendor implemented it; there was little interest, and it got withdrawn in F2008. I really don't see us going back there. The reality is that people use cpp (or a variant), and that seems to work for almost everyone. The better question to ask is "what are the use cases for a preprocessor, and can better language design satisfy that need?" Look at the C interop features, for example: they eliminated a large swath of what preprocessors were used for. A proper generics feature would eliminate more.
I am aware of CoCo and how poorly it fared. The fact remains that C-like preprocessing is a real-world feature that is available in all compilers. Fortran would be more portable if the language acknowledged the existence of preprocessing and defined a standardized, portable subset of behavior.
@klausler thanks for bringing this up. Related to this is the fact that the default behavior is to not use a preprocessor for lowercase .f90 files. Why not standardize a subset of the current behavior, and automatically apply it to .F90 files? @sblionel's counterpoint is valid though: just like C++ is moving away from the preprocessor by adding language features, Fortran is moving in that direction too. So I think the counterpoint is to not standardize a preprocessor, but rather to improve language features so that the preprocessor is not needed. Besides templates, one common use case for a preprocessor that I have seen in many codes is a custom ASSERT macro.
Keep in mind that the Fortran standard knows nothing about .f90 files, or source files in general. It would be a broad expansion to try to legislate behavior based on file types. Note also that some operating systems are not case-sensitive for file names. I don't think that the current state of preprocessing is broken enough to warrant the standard trying to get involved.
The standard doesn't have any concept of source files, much less source file names. The f18 compiler always recognizes and follows preprocessing directives and applies macro replacement, ignoring the source file name suffix, since its preprocessing capabilities are built in to its source prescanning and normalization phase, and are essentially free when not used.

I absolutely agree that some common subset of preprocessing behavior should be standardized. This is the one major part of Fortran that every compiler provides that is not covered by the standard language document; but perhaps improving portability of Fortran programs across vendors, writing interoperable header files usable by Fortran and C++, or providing safe guarantees of portability into the future are no longer primary objectives.

As part of defining f18's preprocessing behavior, I performed a survey of various Fortran compilers and wrote a collection of tests to check how their preprocessors interacted with difficult Fortran features like line continuation, fixed form, &c. The current state of the art is far more fragmented than I expected to find (see my link above for details and a compiler comparison table), and none of the existing compilers seemed to stand out as a model to be followed.

EDIT: The standard does have a concept of source files in the context of INCLUDE lines.
@sblionel yes, the standard currently does not have a concept of source files, but perhaps it should. I see this GitHub repository as broader than what is strictly covered by the standard today, because we might decide in the future to include some such things (such as more details about source files) in the standard. And I must say I agree with @klausler on this. There is a lot that Fortran should standardize and improve. Perhaps it does not need to go into the standard itself, but then let there be a document that we all agree upon. We do not need to call it the "standard" (perhaps we can call it a "vendor recommendation"), but it would achieve what we want: improving portability across vendors.
I am not sure whether pre-processing must necessarily be implemented within the compiler, or be standardized at all. Using an appropriate interpreter language (e.g. Python), it is possible to implement a pre-processor satisfying all the requirements @klausler formulated (and much more) within a single file. You add this one file to your project, and you can build your project with all Fortran compilers, as the pre-processor on board makes sure that the compiler only sees standard-conforming source files. You will of course have to have the interpreter around whenever the project is built, but by choosing something as widespread as Python, that would be the case on almost all systems automatically. Disclaimer: I may be biased, as I wrote such a one-file pre-processor myself (Fypp), which has apparently found its way into several Fortran projects.
@aradi Thanks for the link, I wasn't aware of Fypp. Its syntax seems incompatible with the other preprocessors though. I can see that it has more features, so that's probably the reason. But I still think having the preprocessor syntax standardized is valuable.
@certik The syntax is different from the usual cpp-derived pre-processors to make sure nobody tries to run those on files meant to be processed by Fypp. 😉 (Also, it allows better escaping and better prevention of unwanted substitutions, which is sometimes tricky with cpp-based approaches.) @gronki Fypp was actually written to allow easy generation of templates, so yes, a pre-processor can help to work around (but not solve) many of the generic programming needs. Still, I am not sure whether it is a good idea to use "standardized pre-processor based workarounds" for generics, as we would then be stuck with them for the next few decades. 😄
@gronki, I never suggested a timeframe for generics. But there is a lot of resistance to adding features that paper over shorter-term problems. Since there is an existing preprocessor solution that works, why complicate matters by trying to wedge preprocessing into the standard? I'd prefer the energy and cycles be put into solving the language issues that make people reach for a preprocessor.
Depends whether you believe that full generics will make it into Fortran. I would rather cut out this absolute insanity of copy-pasting the same code, using some simple but decent preprocessing language, than wait 10 years.
@sblionel sure, you have a point. I agree that including it in the core standard could be a waste of resources. But what about a TS, just like it was with CoCo? Is there any idea or information on why CoCo "lost" to cpp (despite an implementation being available)? I didn't see anything wrong with it other than that it just didn't take off.
In terms of priorities, I think I agree with @sblionel that it makes sense to invest our efforts in getting generics into the standard, rather than prioritizing a short-term solution over the long term. That's a good point.
@gronki, CoCo was before my time on the committee. All I know is that vendors didn't implement it and users didn't ask for it - they continued to use cpp. CoCo was an optional part of the standard, not a TS. A TS has the expectation that it will be incorporated in a future standard largely unchanged. Our experience so far with optional standard parts is that neither users nor implementors are all that interested in them. I will also trot out my oft-used point that everything that goes into the standard (and a TS or optional part is no different) has a cost, in terms of the resources and time of the committee members.
Thank you for the great explanation! I did not recognize the difference between an optional part and a TS.
It's not either/or, and people have work to do today. Fortran compilers have preprocessors, people use them, and code portability would benefit from standardizing them to the extent that they can be.
The preprocessors of Fortran compilers are in fact standardized, by the C preprocessor standard (e.g. C18, ISO/IEC 9899:2018). I think this is more than sufficient.
Running Fortran through a "bare" cpp that doesn't know about Fortran commentary, line continuations, column-73 fixed-form line truncation, the CHARACTER concatenation operator, and built-in INCLUDE lines is not safe.
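A minimal made-up sketch of one such hazard (not taken from any compiler's documentation): a cpp running in C++ mode treats `//` as the start of a line comment, so Fortran's CHARACTER concatenation operator gets deleted along with the rest of the line.

```fortran
! Hypothetical free-form example; the variable names are invented.
program cpp_hazard
  character(len=12) :: s
  ! A C++-mode cpp sees // as a comment marker and truncates the next
  ! line, leaving the invalid statement  s = "hello, "
  s = "hello, " // "world"
  print *, s
  ! An apostrophe in a Fortran comment (e.g. in don't) can likewise be
  ! taken by a bare cpp as an unterminated character constant.
end program cpp_hazard
```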
Just for clarity, is this proposal focused on standardizing the behavior of existing preprocessors (cpp, fpp), rather than extending or adding new features for more robust code conversion or metaprogramming facilities (i.e., should the latter be posted elsewhere)?
I do not exclude the addition of new features to a standardized preprocessor in my concept, but I don't know of any particular new feature that I would want that isn't already implemented in at least one compiler. What do you have in mind?
I think that, in case generics do not make it into the standard, loop constructs would be useful for generating the various specific cases of a given template. We use that a lot for creating library functions with the same functionality but different data types/kinds (for example, wrapping MPI functions as in MPIFX). We currently use Fypp for that (shipping Fypp with each project), but would be more than happy to change to any other pre-processor language, provided we can be sure each compiler can deal with it.
As for code conversion, my immediate use case is to iterate over multiple symbols to generate code from a templated version. For example, I often have a pattern of near-identical lines to open or close files for each type component. Although those lines are the same except for the property names (like foo), I cannot write them conveniently at once (with standard Fortran/cpp/fpp). A similar situation occurs when doing some operation for all (or some of) the type components, e.g. scaling by some factor.
If a preprocessor supported iteration over symbols, I could write such loops directly (where I suppose "$" would be interpreted as in Bash). Fypp already has this facility, and Julia also uses such loops over symbols sometimes. I think it would be useful if standard Fortran or a preprocessor supported such a feature, given that Fortran generics (discussed on GitHub) may not cover it. Apart from the feature request, I am a bit concerned that extensive use of "#" or "$" can make code very noisy or cryptic (the worst case of which might be Pe*l??), which I hope can be avoided (if possible...). In particular, if cpp/fpp requires directives to start from column 1 with "#", the code may become less readable (as I often feel for codes with a lot of "#ifdef MPI").
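As a concrete sketch of the loop-over-symbols idea, here is roughly what it looks like in Fypp today (the component names `foo` and `bar` and the unit variables are invented for illustration):

```fortran
#:for comp in ['foo', 'bar']
  open (newunit=unit_${comp}$, file='${comp}$.dat', status='replace')
#:endfor
```

Fypp expands the `#:for` block once per list element, substituting `${comp}$` each time, producing one `open` statement per component; a standardized cpp-style preprocessor would need an analogous iteration directive to cover this use case.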
Another feature request for a preprocessor (seen on StackOverflow) might be support for multi-line output in processed sources (without using semicolons): "Fortran Preprocessor Macro with Newline".
The reason he or she does not want to use semicolons is fear of a 132-character line limit, which is something that a compiler with a built-in preprocessing stage should enforce before macro expansion, not after. |
Still, I can think of scenarios where passing multi-line arguments to macros would be useful. Thinking about macro-based unit test systems (as in Google Test or Catch for C++), you would need the ability to pass multi-line arguments to macros in Fortran.
Multi-line arguments are a different problem from multi-statement expansions. Both should work; specifically, Fortran line continuations should be usable within macro invocations. |
I'm not proposing anything for the standard here. I understand that you don't want to standardize preprocessing, and that you get to decide whether preprocessing is standardized or not. Fine. But preprocessing is still used in real codes by real users, it's part of every Fortran implementation, and I would like to provide the best implementation of preprocessing for them that I can in f18 in the absence of guidance from a standard.
No, I don't get to decide. I am just one vote among all of WG5. But as I have said, we already did standardize preprocessing (though this happened before I was on the committee) and it was ignored and has now been dropped from the standard. Nothing I or WG5 say will stop people from using preprocessing with the tools (cpp) they are already using.
Does any production Fortran compiler actually use cpp? It doesn't interact well with line continuation, line truncation, Hollerith, or (especially) INCLUDE. I don't know of a compiler that preprocesses with a stock cpp.

CoCo wasn't rejected by users and implementors because people didn't need or want preprocessing. CoCo was a failure because it was gratuitously different from the C-like preprocessing and local tooling that people were already using, and it wasn't a better solution (in fact, it's really weird and ugly).
I don't know what every compiler uses, but I often see cpp invoked with an option that better handles Fortran. ifort has its own fpp. My point was that people are already using cpp or a cpp-like preprocessor that they already have. I agree that CoCo was "gratuitously different", but what the users told us was that cpp (or cpp-like) was working for them.
And if the common subset of the behaviors of Fortran-aware preprocessors (and built-in preprocessing phases) were to be documented, then both users and implementors would know what's portable and what's not. This is exactly the sort of thing that should be in a de jure standard. But that's not going to happen, and the best I could do for f18 was to determine that common subset myself, figure out the most reasonable behavior in edge cases where compilers differ, and ask users for guidance. If you have better advice for an implementor, I'm all ears. EDIT: See here for a table of preprocessing behaviors of various compilers, using fixed and free form samples in this directory. As one can see, things are not terribly compatible today, but there is a common portable subset.
@klausler I agree with you and I think the best we can do is to get a community / vendor consensus on what should be supported and document it. Most production Fortran codes that I have seen use macros in some form, and thus compilers must support them. Thank you for taking the lead on that in the document you shared.
Here is a permanent link to the preprocessor documentation:
We are currently figuring out how to add preprocessor support to LFortran. I can see that in Flang it is integrated into the compiler. I don't know if it is feasible to pre-process ahead of time (decoupled from the compiler) and keep line numbers consistent; this is also relevant:
That would be my preferred approach, but I assume the downsides are worse error messages, and possibly it is slower?
In f18 the first phase of compilation is called prescanning. It reads the original source file, expands any INCLUDE or #include files, and normalizes the source in many ways (preprocessing directives, macro expansion, line continuation, comment removal, space insertion for fixed-form Hollerith, space removal / collapsing, case lowering, &c.) to construct one big contiguous string in memory. This string is what the parser parses, and it makes parsing so much easier. Each byte in that string can be mapped to its original source byte or macro expansion or whatever by means of an index data structure. In the parser and semantics we just use offsets into that string.
I think it would be much more important that all compilers accept and process line-marker directives emitted by external preprocessors.
I think the issue with an external preprocessor is that the compiler would then report errors against the preprocessed text rather than the original source lines.
No, if the source file is around, the compiler will show the right line. You can test it yourself. test.F90:
Executing
you obtain the error message:
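For readers without the original listing, a hypothetical reconstruction of the kind of ASSERT macro being tested (all names invented): it relies on the predefined `__FILE__` and `__LINE__` macros, so the preprocessor substitutes the file name and line number of each invocation site.

```fortran
! Hypothetical sketch; requires a preprocessing step (e.g. a .F90 suffix).
#define ASSERT(cond) if (.not. (cond)) call assert_fail(__FILE__, __LINE__)

program demo
  integer :: n = 3
  ASSERT(n > 0)
contains
  subroutine assert_fail(file, line)
    character(*), intent(in) :: file
    integer, intent(in) :: line
    print *, "assertion failed at ", file, ":", line
    error stop
  end subroutine assert_fail
end program demo
```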
My bad, you are right. The only issue will happen if there is a syntax error in the expanded ASSERT macro, won't it? Like this:
I would expect it to show an incorrect column number. |
If the error occurs in the expanded text, the error message can be indeed confusing. E.g.
with
results in
In this case, one would have to drop the line marker generation as with
to obtain
But this is independent of whether the pre-processor is external or built in to the compiler. Do you show the original line or the expanded line (or both) when the error occurs in expanded code? Whichever strategy one goes for, it can be realized equally well with built-in as with external pre-processors (provided they generate line-marker directives and the compiler understands them).
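For context, the line-marker directives being discussed are the cpp-style `# <line> "<file>"` lines that an external preprocessor can emit into its output; a made-up fragment of preprocessed output might look like:

```fortran
# 1 "test.F90"
program demo
  integer :: n = 3
# 6 "test.F90"
  if (.not. (n > 0)) call assert_fail("test.F90", 6)
end program demo
```

A compiler that understands these markers attributes any diagnostic on the expanded statement back to line 6 of test.F90, rather than to a line of the intermediate file.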
@aradi I am glad you posted here; I think you are right. Indeed, the compiler could know about the pre-processor as a black box, and it could show errors either in the expanded form or the unexpanded form, and either way it would show the correct line. How would it know the line comes from a macro expansion? Well, I guess once it found the line with the error in the expanded form, it could compare it with the unexpanded line (from the #line directive), and if they differ, it could show both, i.e. the error could look something like this:
If the line does not differ, then it can simply show the unexpanded form, as that will be the one users see in their files. I think this might be a very acceptable approach, with the advantage that we can use different pre-processors, such as Fypp. Summary of the black-box approach:
I can still see some potential advantages of integrating the pre-processor more deeply with the compiler:
But the black box approach is not bad, and one can implement both. |
@certik I fully agree. Yes, the column number will be incorrect in the unexpanded form. And yes, a tight integration can give even deeper insights. But that assumes the existence of a well-defined (standardized) pre-processor language which all Fortran compilers implement and follow, and which covers all the pre-processing needs people may come up with. In the meantime, the line directives can serve as a "bridging technology", allowing the use of custom pre-processors.
I created an issue at https://gitlab.com/lfortran/lfortran/-/issues/281 to implement this in LFortran. |
I think that it's necessary to have an integrated preprocessing facility in the same part of the compiler that's handling INCLUDE statements, line continuation, case normalization, &c. It's not hard to implement, and it should be standardized.
FYI: This doesn't solve the general question of standardizing a preprocessor, but it might be useful considering it is not standardized ... A trick some may or may not know with gfortran is that it can read code from stdin. That can be handy for issues like this. It might still be a problem if there are a lot of files doing the compiling that just have a "gfortran" command in them, or other issues:

```shell
#!/bin/bash
cat >x3.F90 <<\EOF
#define msg(x) print *, #x
program testit
implicit none
real :: w=1234.5678
write(*,*)w
write(*,*)__LINE__
write(*,*)__DATE__
write(*,*)__TIME__
msg('Hello')
end program testit
EOF
rm -f ./a.out
cpp x3.F90|gfortran -x f95 -
#cpp x3.F|gfortran -x f77 -
./a.out
exit
```

I have a little script that I call "gfortran+" that does several preconditioning steps but otherwise lets you call it like gfortran, taking advantage of that; it is particularly handy with fpm because I just set the environment variable FPM_FC to "gfortran+". Personally, I gave up and have my own preprocessor :>
Yeah, there are lots of workarounds. NWChem has been able to do a two-step to preprocess since forever because of bad Fortran compilers that can't preprocess correctly. The point here is that there is no excuse for users to have to do this. Every decent Fortran compiler can implement the thing users want. |
@urbanjost Neat, indeed. But tricks that only work with gfortran are not really useful for most projects, as people may want to compile them with other compilers too. @jeffhammond Yes, we did exactly the same in DFTB+ (post-processing a cpp-preprocessed source to make sure it is standard conforming) for quite a while. Then we decided to write our own preprocessor (Fypp), which has well-defined behavior on all platforms and emits standard-conforming source files. I think this is still the best option as long as there is no standardized pre-processor with well-defined, platform- and compiler-independent behavior that is implemented in all popular compilers. (So, at least for the next 10-20 years?)
The problem has not been a lack of pre-processors; from m4 to fypp to prep to fpp to cpp ... the problem has been standardization and universal availability. A bash shell makes an excellent preprocessor, for example, with the code in here documents; it ships with most platforms and takes a minute to install on others. Variable expansion, looping, conditionals, calling any system utility ... and many people are already familiar with the required syntax. It really does make a superb preprocessor.

The problem has been the lack of anything in the standard, and the fact that for most users the "ttuw" program (the thing users want) seems to be something very close to fpp, which is obviously close to cpp. With ISO_C_BINDING, interest in preprocessing had nearly vanished; now, with an uptick in interest in templating, it has grown again. Except when the preprocessor is itself written in Fortran (or C), the processors have depended on specific environments that have waxed and waned. Java, Ruby, Python, Perl ... have been assumed available (definitely not always the case; I have worked on multiple clusters where none were available), so nothing is likely to resolve the issue except an fpp(1) program defined as part of the language; and then it will lack some capability someone desires, and the cycle will continue.

So all I depend on the compiler to do is compile standard Fortran, and I try to pick preprocessing tools that are readily available on any platform I am likely to need. As there are fewer and fewer environments, and it becomes easier to just have a little portable environment you can use like an app (i.e. VMs, containers, ...), the problem is basically turned on its head, but it will continue unless/until the language defines it. And remember, most languages that require preprocessing are always trying to get rid of it. Look no further than C to see where freely supporting pre-processing leads.
Preprocessor standardization was the most common item on Fortran 202Y wish lists (except maybe for templates, which are already on the docket). |
Yes. There has been a bit of lobbying for this to promote code portability. Current discussion in JoR is that it might be treated as a new "Part 2" to the standard, but a mandatory, not optional, "companion processor". There may be a separate subgroup established to focus on this and make substantive progress between meetings. Stay tuned. |
I have encountered two specific preprocessing issues where standardization might help. Both are connected to C interop, where the symbols are ambiguous (for example, small differences between glibc on Linux and BSD's libc) and must be determined at compile time. We use autoconf and CPP macros to assign the internal names of these symbols; currently we assign the macros as actual strings. A second case involves local platform-specific macros. Although we don't like to rely on these sorts of platform-specific flags, they help to provide useful defaults when autoconf is unavailable or when using a legacy build system. I already saw stringizing in @klausler's document, so no new information here, but I thought that a specific example might help support the effort.
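A sketch of the standard cpp idiom for the first case (macro names invented): stringization must go through two levels so that a symbol supplied on the command line, e.g. -DSYMBOL_NAME=something, is macro-expanded before being turned into a character literal. Note that not every Fortran-integrated preprocessor supports "#" stringization (gfortran's traditional mode notably does not), which is itself an argument for standardization.

```fortran
! Hypothetical sketch; SYMBOL_NAME would normally come from autoconf
! via -DSYMBOL_NAME=... on the compile line.
#define STR_(x) #x
#define STRINGIFY(x) STR_(x)

#ifndef SYMBOL_NAME
#define SYMBOL_NAME environ
#endif

program demo
  ! Expands to the spelling of SYMBOL_NAME as a character literal.
  character(len=*), parameter :: name = STRINGIFY(SYMBOL_NAME)
  print *, "bound symbol: ", name
end program demo
```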
The link in the original post to the preprocessing document is broken. The current link appears to be: https://github.com/llvm/llvm-project/blob/main/flang/docs/Preprocessing.md |
Most Fortran compilers support some form of source code preprocessing using a syntax similar to the preprocessing directives and macro references in C/C++. The behavior of the preprocessing features in the various Fortran compilers varies quite a bit (see https://github.com/flang-compiler/f18/blob/master/documentation/Preprocessing.md for a summary of the situation). To improve code portability, the Fortran standard should accept the existence of preprocessing, and standardize the behaviors that are common and/or most useful.
@gklimowicz edit: The more recent link is Preprocessing.md.