Scala Preprocessor / Conditional Compilation #640
Comments
Looks really good, Stefan! Do you think you could expand a bit on what is meant by Rust's cfg attribute and macro behaviour? Either just describe it or better yet with examples. Thanks! |
Yes, very nice writeup! Thanks for doing the hard work and not just dumping out some syntax ideas :-) |
The `@cfg` annotation:

```scala
@cfg(""" binaryVersion = "2.13" """)
def foo: Int = ... // 2.13 version

@cfg(""" not(binaryVersion = "2.13") """)
def foo: Int = ... // 2.11/2.12 version
```

The `cfg` macro:

```scala
println("The binary version is " + cfg("binaryVersion"))
```

Values produced by the macro are expanded into literals at compile time. |
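A hedged illustration of that expansion (assuming the configured `binaryVersion` is `2.13`; illustrative only, not verbatim compiler output):

```scala
// Before preprocessing:
println("The binary version is " + cfg("binaryVersion"))
// After expansion, the compiler would see a plain literal:
println("The binary version is " + "2.13")
```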
A possible way to avoid the namer issue (especially at the top level) without too much complexity would be a new annotation-like syntax like |
Could this express, for example
Do we need / want that? :-) |
|
Thanks! Can you think of cases where the annotation-based syntax would not work well enough? My example above is a superclass, which could be worked around with a type alias. But for example if I want to add a parent conditionally (and not extend anything in the other case), I don't see how that could be done (except making two copies of the entire class). |
You can always extend |
Here's my prototype so far: https://github.com/szeiger/scala/tree/wip/preprocessor I'm not quite happy with the set-based key/value checks. It doesn't feel correct with Scala syntax. Supporting imports will need a bit of refactoring in the parser. It's not as straightforward to add as I had hoped. I wanted to try it with collections-compat but discovered that annotations do not work for package objects. This is also a limitation of the parser, so it affects my pseudo-annotations as well. I'm not sure if this is intentional or a bug. Fixing it should be on the same order of difficulty as supporting imports. Except for these limitations it should be fully functional. |
The patch has
so supporting annotation ascriptions is planned, right? |
I assume it's trivial to implement but didn't get around to testing it yet. |
Looks like the restriction disallowing annotations in general for package objects is intentional: https://www.scala-lang.org/files/archive/spec/2.13/09-top-level-definitions.html#compilation-units. But since |
The latest update supports imports, package objects and annotated expressions. |
Here's a version of scala-collection-compat that does all the conditional compilation with the preprocessor: https://github.com/szeiger/scala-collection-compat/tree/wip/preprocessor-test. This shows the limits of what is possible. In practice I would probably keep 2.13 completely separate but use conditional compilation for the small differences between 2.11 and 2.12. |
What are the concrete use cases for this? IMO proposals should always start with a set of use cases, and their design should be driven and guided by how well they solve those use cases. |
Thanks for the detailed write up! Some quick questions. How do you envision that the code editing and navigation experience would work in IDEs for conditionally compiled statements? Can you maybe elaborate on the goal below with an example situation where conditional source files have been insufficient in practice?
I am concerned that preprocessing introduces one more way to solve the same problem that conditional source files already solve. Conditional source files have their flaws but they work mostly OK with IDE tooling. |
I would love this. The biggest pain as library maintainers is having to keep (mostly) redundant branches because we can't do conditionals based on the current Scala version
The conditionals should use the value that corresponds to the current compiler version that is set by the IDE?
Migrating to the new Scala collections is a major one. https://github.com/mdedetrich/scalajson/blob/master/build.sbt#L98 is another example |
The same way that different source folders work. An IDE that imports the sbt build (like IntelliJ currently does) would also see the config options that are passed to scalac (and computed in the build in the same way as the source folders). |
The motivating use case is indeed the collections migration where we see the need for separate source files in many cases. I neglected to put that into the proposal because the proposal "we should have a preprocessor for conditional compilation" already existed when I adopted it to create a design. Here is a version of scala-collection-compat that takes my current preprocessor prototype to the limit: https://github.com/szeiger/scala-collection-compat/tree/wip/preprocessor-test. Note that this is not a style I would recommend. For collection-compat, assuming that 2.11 and 2.12 already had the preprocessor, I would have used the preprocessor to combine and simplify the 2.11 and 2.12 versions (which are very similar) and kept the entirely different 2.13 version separate. |
I am personally somewhat doubtful of this. Cross-version sources have worked well enough, are supported by every build tool (SBT, Mill, our Bazel build at work), and encourage the best practice of keeping your version-specific stuff encapsulated in a small number of files rather than scattering if-defs all over the codebase.
Not mentioned in this proposal is Scala.js. The Scala.js community has been working with platform-specific source folders forever. It's worked well. I don't think I've heard any great groundswell of demand for |
Here's a summary of my AST-based preprocessor prototype (https://github.com/szeiger/scala/tree/wip/preprocessor). It's the same approach that Rust uses for the `cfg` attribute.

Syntax

Conditional compilation is done with a pseudo-annotation called `@if`. Note that the source code must be completely parseable into an AST before preprocessing. For example, this is allowed:

```scala
@if(scala213) val x = 1
@if(!scala213) val x = 0
```

Whereas this is not:

```scala
val x = (1: @if(scala213))
        (0: @if(!scala213))
```

Configuration Options

Configuration options consist of string keys associated with a set of string values. They are passed to scalac with

```scala
scalacOptions ++= Seq("-Cfeature=foo", "-Cfeature=bar")
```

This gives the config option `feature` the values `foo` and `bar`.

Preprocessor Predicates

Predicates for the `@if` pseudo-annotation support simple flags (`scala213`), negation (`!scala213`), and set-based key/value checks.

Preprocessing

The preprocessor runs in a new compiler phase directly after parsing.

Reifying Configuration Options

The call

```scala
val features = sys.cfg("feature")
```

is expanded into

```scala
val features = Set[String]("foo", "bar")
```
|
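To tie the pieces together, a hedged end-to-end sketch; the key/value predicate form is extrapolated from the `binaryVersion = "2.13"` examples earlier in this thread and is not verbatim prototype syntax:

```scala
// Build configured with: scalacOptions ++= Seq("-Cfeature=foo", "-Cfeature=bar")

// Hypothetical set-membership predicate: true if "foo" is among the
// values of the "feature" config option.
@if(feature = "foo")
def describeFeatures(): String =
  "enabled: " + sys.cfg("feature").mkString(", ") // reified to Set("foo", "bar")
```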
I don’t think having a preprocessor is a good idea. It adds another meta-language layer above the Scala language, which increases the cognitive load for reading source code. Also, unless such a meta-language is as powerful as a general-purpose language (which is something we don’t want!), we will still need to rely on the build system to accommodate some specific needs that are not supported by the few preprocessor directives of the system. I believe such a system would have very limited applicability compared to its cost. |
@szeiger can you spill a bit more ink on that motivating use case? And are there others? For instance,
|
In my experience, cross-version sources have resulted in massive amounts of code duplication. You basically have to duplicate the entire source file(s) minus the difference you are targeting. In some cases you can get around this by using traits, but that then opens other problems |
@som-snytt I don't mean to come across that way, but shouldn't said research be done before making such a significant investment? @mdedetrich that's interesting. Can you explain why separate directories results in so much duplication? Part of why I'm asking is because the deeper you understand a problem, the more you understand the solution space. There could be solutions that haven't been thought of or explored fully. But partially, it's because if we don't write these things down, people won't appreciate the direction being taken. Anyway I seem to recall such a case mentioned recently by @sjrd on gitter that may have pointed in this direction. I'm not sure it generalizes though. |
I've tried this, it's not a great solution. You lose all sorts of things working in multiple git branches vs. just having separate folders:
Using git branches for cross-versioning always sounds great initially, but you are giving up a lot of commonly-used functionality and workflows in order to get there. Version-specific source files are pretty easy, and they are the de facto convention throughout the entire community. Matthew doesn't specify why he doesn't like splitting things into version-specific traits, but I've done it for years over a million lines of cross-versioned code across three different cross axes and it's never been particularly difficult. I don't think it would be an exaggeration to say I maintain more cross-built Scala code than anyone else in the community, even at the extremes, like when Ammonite has to deal with incompatibilities in path-dependent nested-trait-cakes. IMO this proposal as stated suffers from the same problem as many others: the proposed solution is clearly stated, well analyzed and thoroughly studied, but the problem it is trying to solve is relegated to a single throwaway sentence without elaboration. That seems very much like the wrong way of approaching things, and is an approach very prone to ending up with a solution looking for a problem |
Well, for starters, it's a workaround. Traits are a language-level abstraction for structuring your code; they're not designed for dealing with breaking differences between different Scala versions. They're used this way because it's the only sane way to handle differences between platforms and Scala versions if you are able to do so (apart from git branches, which you already expanded upon). They are kind of like pollution: they wouldn't normally be there if it wasn't for the fact that you were targeting another Scala version/platform which doesn't support feature X. There are also difficulties in using traits; they can cause issues with binary compatibility in non-trivial circumstances. And they also don't abstract over everything cleanly, as shown in the repo I linked earlier. |
I've pushed some bug fixes to the prototype. And I converted the existing cross-building setup of akka-http to use the preprocessors:
Some observations from converting akka-http:
Overall, I'm coming around to preferring the lexical syntax, but the annotation-based syntax is a lot simpler in terms of surface area, implementation complexity, and interaction with other language features. And then there's the elephant in the room: Dotty's new indentation-based syntax. IMHO this would rule out the lexical preprocessor. You should be able to use indentation to structure your code between preprocessor directives, but this indentation would then interfere with Scala's own indentation-based parser. |
I believe this proposal is a serious step backwards with regards to tooling, and I don't think the advertised benefits have been adequately motivated. There's a significant difference between this proposal and conditional source files. Consider:

```scala
@if(scala213)
def foo = ...

@if(!scala213)
def foo = ...
```

The classpath is different for the 2.13 and non-2.13 variants. Conditional compilation based on source files has its drawbacks, but I believe it still is the best solution to address the problem statement of this issue. |
Given the number of negative comments and reactions on this thread, I’d like to know better what is the plan. How far does Lightbend want to experiment with this idea? |
Of course these changes are subject to a decision by the SIP committee. Stefan has prepared a few prototypes and a proposal. Complications to tooling are an important consideration in evaluating the complexity of this proposal. We are willing to keep refining until we find consensus (or it gets voted down). I hear your concerns, but I'd also like to remind everyone that tooling authors are outnumbered by users by a large factor. How big is the effort to implement this compared to the benefits to a large user base? Currently our users are forced to maintain duplicated code in different source files -- an approach not taken by any other language they are likely familiar with. We've gotten feedback that large shops have implemented their own pre-processor. Technically speaking, I think the tooling burden is limited if we treat "inactive" code as comments: you can still edit them, but you don't get the same level of code assist. Definitions in skipped fragments shouldn't show up in find symbol, IMO. This means the tool/IDE only needs to know which properties are passed to the compiler (I'd imagine those are canonically specified in the build and can be exported using BSP) and either use the official parser or implement the same logic that skips parts of the code for which the predicates do not hold. |
Re tooling, one issue for many refactorings is that they should ideally act on all variants. And they need to modify sources before preprocessing, based on preprocessing results. Tooling authors are fewer, but the real issue is whether all users get worse tools.
|
That's a good point, but again -- I'm not convinced of the hypothesis that tools should act on all possible combinations. I would say they only act on the currently active one. Since we already have logic in sbt for cross-versioning, they could easily be run in sequence on the small selection of valid combinations (usually I'd expect you only key on the targeted Scala version). |
To be clear, that reduces the complexity to our current one, where each valid combination is represented by a version-name-mangled folder that contains the stuff that varies between combinations, as well as everything that happened to be in the same file without being sensitive to different configurations. Our preprocessor proposal inverts this: only the things that change are duplicated, and we keep the same source folder structure. Your build determines a sequence of valid combinations, and your tools will have to run on all of those, just like the build already runs for different target versions of Scala (cross-building). |
Personally I hope that Scala 3.x will be backwards-compatible (in the language and the standard library) so users can write their code in the minimum Scala 3.x version their project supports. Thus dropping the need for variant source files/directories or a source file preprocessor. |
FYI: I'm currently working on a SIP document based on my earlier posts and other information in this thread. The main proposal will be the lexical preprocessor that I did 2nd. |
The potential differences you'd want to abstract over with a preprocessor are not just about syntactic changes. Migration to libraries that make incompatible API changes is probably the most common, and that's often just one method call that needs to be written in two ways. |
No, we were not pressured or incentivized financially to propose/implement this. 🤷♂ How about the simpler explanation -- that I'm genuinely trying to make Scala a language with wide appeal across the industry!? PS: As soon as our VCs ask us to implement Scala features, I'll be sure to let you know! |
If Scala had conditional compilation, the library update that I've been working on for the last several days would have only taken a few hours, and I would not have had to drop support for older versions of Scala, and the library would not have had duplicated code. Right now about 80% of the code must be duplicated for each major Scala release. I would love to see conditional compilation in Scala, ASAP, and I believe it is possible to implement such that reasoning about types is deterministic. |
I also needed CoCo recently. The OP omits the other Scala idiom
I'd assume it's sufficient to slip Adriaan something under the table? It's probably even more efficient to slip Adriaan under the table. |
An assumption I had was that the use-case for this feature was primarily cross-building against different (versions of) dependencies, particularly the stdlib. It has been pointed out to me that this may not actually be the case. That raises the question: what is the actual problem that conditional compilation aims to solve? I somewhat fear that the answer has to do with encoding business logic in compile-time conditionals as some sort of dependency injection framework, but that's still just guesswork. So what are we looking at exactly? What are the cases that people want to use a pre-processor for? And does a plain boolean conditional compilation preprocessor adequately solve the problems of those use cases? |
The org that I've talked to most that has experience with an (in-house) preprocessor actually bans encoding business logic or debug/release modes. It's purely about managing migration in the presence of breaking changes external to the current project. |
Here is a section of `build.sbt` for the project I mentioned:

```scala
libraryDependencies ++= scalaVersion {
  case sv if sv.startsWith("2.13") =>
    Seq(
      "javax.inject" % "javax.inject" % "1" withSources(),
      "com.typesafe.play" %% "play" % "2.7.3" % Provided,
      "com.typesafe.play" %% "play-json" % "2.7.4" % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "4.0.3" % Test,
      "ch.qos.logback" % "logback-classic" % "1.2.3"
    )
  case sv if sv.startsWith("2.12") =>
    Seq(
      "com.typesafe.play" %% "play" % "2.6.23" % Provided,
      "com.typesafe.play" %% "play-json" % "2.7.4" % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "3.1.2" % Test,
      "ch.qos.logback" % "logback-classic" % "1.2.3"
    )
  case sv if sv.startsWith("2.11") =>
    Seq(
      "com.typesafe.play" %% "play" % "2.5.16" % Provided,
      "com.typesafe.play" %% "play-json" % "2.7.4" % Provided,
      "org.scalatestplus.play" %% "scalatestplus-play" % "2.0.1" % Test,
      "ch.qos.logback" % "logback-classic" % "1.2.3"
    )
  case sv if sv.startsWith("2.10") =>
    Seq(
      "com.typesafe.play" %% "play" % "2.2.6" % Provided,
      "org.scalatestplus" %% "play" % "1.5.1" % Test
    )
}.value
```

From this I observe:

1. Dependencies might demand imports that vary and type definitions that vary between major releases of the Scala compiler and/or dependencies. Conditional compilation would help smooth the differences out.
2. The use of multiway branching would probably dominate over binary choices (if/then) as code bases mature.
3. The build system (in this case SBT) should be considered together with the Scala program code when designing conditional compilation. Conditional compilation is really just a new type of meta-project.
And a question arises: should conditional compilation be accomplished by a tool that "knows" Scala, as opposed to a dumb generic macro processor such as provided in C? I think conditional compilation should be performed by a proxy with a well-defined interface that could be driven by the build system. This proxy could be a new phase of the Scala compiler, or it might be a Scala-knowledgeable precompiler. Conditional compilation might include:

1. Injection of generated code
2. New or conditionally derived type definitions
3. Responding to dependency versions
4. Responding to type definitions in dependencies
It would also be wonderful if the build system made more use of expressions and fewer statements. Perhaps the design of conditional compilation might drive that change. |
It seems like a lot of this is a band-aid for something that could be fixed by TASTy and/or Fury. It doesn't look like it's mostly "here is how we solve the problem on 2.11, and here is how we solve it on 2.12", but "we need library X but X has no version that's published against all the Scala versions that we need, so to work around that we need to maintain code against a bunch of different versions of library X."

Fury could help by building the library from source for all needed versions of Scala. And if TASTy were released then libraries could publish binary artifacts that are agnostic of the Scala version.
|
Here's the PR for the SIP: scala/docs.scala-lang#1541 |
In my experience, cross-building is hard and error-prone. Cross-version source files have made cross-building possible but have also introduced bugs in every project where we used them so far (every project has only a low single-digit number of files cross-built). The reason is that maintaining near-copies of files just does not work in practice. People always forget to cross-check to make sure all instances of fixes have been applied everywhere. You could say that this would be detected early because of test coverage, and that's somewhat right, but only if you have 100% test coverage of all branches, versions, platforms, OSs, etc. Even if you have that, you probably won't cover all the potential performance issues you might have (this is amplified by the fact that cross-building is often necessary because of Scala collection API incompatibilities, where you then run into subtle Scala collection performance incompatibilities uncovered only much later). In general, the less code is duplicated the better. However, because of other constraints, some files cannot be split up. In that case, some preprocessor would help a lot. In akka-http, we introduced some macros as proposed here. By now, I'm not sure that is the best idea because:
For similar reasons, I find that support in Scala itself might not necessarily be the best solution. I will now experiment with a preprocessor plugin on the sbt level which will apply the preprocessor to select files in special source directories and just generate final sources into `sourceManaged`. |
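As a rough illustration of that idea, a minimal sketch of an sbt-level generator (the `//#if-2.13` marker syntax and directory names here are hypothetical, not the actual plugin):

```scala
// Hypothetical sketch: copy files from a special source directory into
// sourceManaged, dropping marker-suffixed lines based on the Scala version.
Compile / sourceGenerators += Def.task {
  val in    = (Compile / sourceDirectory).value / "scala-pp" // hypothetical dir
  val out   = (Compile / sourceManaged).value / "pp"
  val is213 = scalaVersion.value.startsWith("2.13")
  (in * "*.scala").get.map { f =>
    val target = out / f.getName
    val kept = IO.readLines(f).filterNot { line =>
      (line.endsWith("//#if-2.13") && !is213) ||     // hypothetical markers
      (line.endsWith("//#if-not-2.13") && is213)
    }
    IO.writeLines(target, kept)
    target
  }
}.taskValue
```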
It seems that would give a bad user experience when editing the original source files (with the conditional compilation statements) in the IDE. This solution would also be build-tool dependent. Conditional compilation would not be the only feature where the compiler flags need to be known to be able to compile the source files (classpath, Yimports, Xsource). We can maybe find a good way to persist this information. |
As a C++ programmer who recently started using Scala, I needed to write a library that cross-builds against multiple versions of other libraries and thought this kind of feature would be useful. Personally, I hate having multiple Git branches because it's actually duplicating a lot of code in different branches and it increases the maintenance burden. Currently I'm using version-specific source directories but I think something like enable_if/is_detected in C++ will be useful because sometimes it's awkward to extract the small differences into separate source files and a small amount of duplication is inevitable with version-specific sources. Note that enable_if/is_detected are implemented with C++ templates (exploiting the SFINAE "feature"), not with preprocessor macros. Maybe this is already possible with Scala's macros? |
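On that last question, a loose analogue of `enable_if`/`is_detected` is possible in Scala today with implicit prioritization rather than a preprocessor; a minimal sketch with illustrative names:

```scala
// A specific instance wins when it applies; otherwise the low-priority
// fallback is selected, loosely mirroring SFINAE-style detection.
trait Encoder[A] { def encode(a: A): String }

trait LowPriorityEncoders {
  implicit def fallback[A]: Encoder[A] =
    new Encoder[A] { def encode(a: A): String = a.toString }
}

object Encoder extends LowPriorityEncoders {
  implicit val intEncoder: Encoder[Int] =
    new Encoder[Int] { def encode(a: Int): String = s"Int($a)" }
}

object Demo extends App {
  def show[A](a: A)(implicit e: Encoder[A]): String = e.encode(a)
  println(show(42))   // "Int(42)", via the specific instance
  println(show("hi")) // "hi", via the fallback
}
```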
That's what we do in akka-http in places: https://github.com/akka/akka-http/blob/master/akka-parsing/src/main/scala/akka/http/ccompat/pre213macro.scala However, how it's done is more of an implementation detail. The important thing would be to have a standardized solution that avoids duplicating code.
I agree that standardization would be good for that reason. That said, given that IDEs usually don't have a concept of compiling/analyzing the same code under different configurations, I don't expect a big opportunity for improvement here. Existing solutions like version-based directories or letting sbt generate files into `sourceManaged` seem good enough in that regard. |
I've implemented something - https://eed3si9n.com/ifdef-macro-in-scala/ |
I have a quick question. Why are we not using something like |
I have something more - ifdef in Scala via pre-typer processing
I guess because no one has implemented it. I wasn't sure if conditional compilation is possible at all, so I opted to use simple String checking. |
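For readers who haven't followed the links, a purely hypothetical use-site sketch of an annotation-driven ifdef (this is not the API from the linked post, and actually removing the unselected definitions would require the pre-typer support discussed here):

```scala
import scala.annotation.StaticAnnotation

// Hypothetical marker; a pre-typer phase would remove any definition whose
// key does not match a value passed to scalac (e.g. via a custom flag).
class ifdef(key: String) extends StaticAnnotation

class CollectionCompat {
  @ifdef("scala2.13")
  def newImpl(): String = "2.13 code path"

  @ifdef("!scala2.13")
  def oldImpl(): String = "2.11/2.12 code path"
}
```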
Previous SIP for reference (also linked above in this ticket): scala/docs.scala-lang#1541 |
@eed3si9n Interesting work!! I think it would be good if you either revive the previous SIP linked by @lrytz or start a new one (or maybe start a thread on Contributors?) to have a discussion on the implications of ifdef for tooling etc. Also, if I understood @sjrd and @smarter right from a discussion at today's SIP meeting, it is a bug that it is possible to run compiler plugins before the typer outside of nightly versions? So, for this to fly, there needs to be some explicit support from the compiler/language, if I have understood this correctly. |
Scala has done quite well so far without any preprocessor, but in some situations it would be quite handy to just drop an `#ifdef` or `#include` into the source code. Let's resist this temptation (of using cpp) and focus instead on solving the actual problems that we have without adding too much complexity.

Goals
Non-goals
Status quo in Scala
All of these require build tool support. Conditional source files are supported out of the box (for simple cross-versioning in sbt) or relatively easy to add manually. sbt-buildinfo is also ready to use. Code generation is more difficult to implement. Different projects use various ad-hoc solutions.
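For reference, the conditional-source-files approach mentioned above is usually wired up in sbt along these lines (a common pattern; the directory names are conventional, not mandated):

```scala
// Adds a version-specific source directory, e.g. src/main/scala-2.13+,
// alongside the shared src/main/scala.
Compile / unmanagedSourceDirectories += {
  val base = (Compile / sourceDirectory).value
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, n)) if n >= 13 => base / "scala-2.13+"
    case _                       => base / "scala-2.13-"
  }
}
```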
Conditional compilation in other languages
C
Using the C preprocessor (cpp):
HTML
Conditional comments:
Rust
Built-in conditional compilation:
- `cfg` attribute (annotation in Scala): allowed where other attributes are allowed
- `cfg_attr`: generates attributes conditionally
- `cfg` macro: includes config values in the source code

Java

`static final boolean` flags can be used for conditional compilation of well-typed code.

Haskell

Conditional compilation is supported by Cabal:
Design space
At which level should conditional compilation work?
1. Before parsing: This keeps the config language separate from Scala. It is the most powerful option that allows arbitrary pieces of source code to be made conditional (or replaced by config values), but it is also difficult to reason about and can be abused to create very unreadable code.
2. After lexing: This option is taken by cpp (at least conceptually, by using the same lexer as C, even when implemented by a separate tool). It avoids some of the ugly corner cases of the first option (like being able to make the beginning or end of a comment conditional) while still being very flexible. An implementation for Scala would probably be limited to the default tokenizer state (i.e. no conditional compilation within XML expressions or string interpolation). Tokenization rules do not change very often or very much, so cross-compiling to multiple Scala versions should be easy.
3. After parsing: This is the approach taken by Rust. It limits what can be made conditional (e.g. only single methods but not groups of multiple methods with a single directive) and requires valid syntax in all conditional parts. It cannot be used for version-dependent compilation that requires new syntax not supported by the older versions. An additional concern for Scala is the syntax. Using annotations like in Rust is possible, but it would break existing Scala conventions that annotations must not change the interpretation of source code. It is also much harder to justify now (rather than from the beginning when designing a new language) because old tools would misinterpret source code that uses this new feature.
4. After typechecking: This is too limiting in practice and can already be implemented (either using macros or with Scala's optimizer and compile-time constants, just like in Java); a sketch of this follows below.
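A minimal sketch of that fourth option using only existing features (a literal-typed `final val` is a compile-time constant, and the optimizer can drop the dead branch; both sides must still typecheck, which is exactly the limitation described above):

```scala
object BuildConfig {
  // A literal-typed final val is a compile-time constant and is inlined
  // at use sites, so the dead branch can be eliminated under -opt.
  final val Scala213 = false
}

object Compat {
  def description: String =
    if (BuildConfig.Scala213) "running the 2.13 code path"
    else "running the 2.11/2.12 code path"
}
```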
From my experience of cross-compiling Scala code and using conditional source directories, I think that option 3 is sufficiently powerful for most use cases. However, if we have to add a new syntax for it anyway (instead of using annotations), option 2 is worth considering.
Which features do we need?
Rust's `cfg` attribute + macro combination looks like a good solution for most cases. I don't expect a big demand for conditional annotations, so we can probably skip `cfg_attr`. The `cfg` macro can be implemented as a (compiler-intrinsic) macro in Scala; the attribute will probably require a dedicated syntax.

Sources of config options
Conditions for conditional compilation can be very complex. There are two options where this complexity can be expressed:

1. In the build definition, which evaluates the complex conditions and passes the resulting simple config options to the compiler.
2. In the preprocessor's own predicate language.
I prefer the first option. We already have a standard build tool which allows arbitrary Scala code to be run as part of the build definition. Other build tools have developed scripting support, too. The standalone `scalac` tool would not have to support anything more than allowing configuration options to be set from the command line. We should consider some predefined options, but even in seemingly simple cases (like the version number) this could quickly lead to a demand for a more complex predicate language.