Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to configure locations of generated directories #8

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
138 changes: 138 additions & 0 deletions 0000-specify-gen-directory-locations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
- Start Date: 2024-03-01
- RFC PR: [lf-lang/rfcs#8](https://github.com/lf-lang/rfcs/pull/8)
- Tracking Issue(s): [lf-lang/lingua-franca#2203](https://github.com/lf-lang/lingua-franca/issues/2203)

# Abstract

[abstract]: #abstract

This RFC proposes multiple changes which when combined can address a single problem, which is conflicts between generated directories and directories that may have been created by the user.

1. Make surface-level UX improvements to address the pitfall described in [this issue](https://github.com/lf-lang/lingua-franca/issues/2203). This is not inherently a breaking change but is partially dependent on item 3, which is breaking.
2. Make the locations of generated directories configurable. This is also not inherently a breaking change.
3. Change the default locations of some generated directories. This is a breaking change.

# Motivation

[motivation]: #motivation

Currently, `lfc` will silently delete directories that have the same name as generated directories without any warning to the user. For example, if the user is already maintaining a directory called `include` that contains source files, then the current behavior will cause that directory to be overwritten or even deleted, which in the worst case can result in lost work.

Furthermore, the only way to address such a directory naming conflict is for the user to change the names of the directories that they maintains.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Furthermore, the only way to address such a directory naming conflict is for the user to change the names of the directories that they maintains.
Furthermore, the only way to address such a directory naming conflict is for the user to change the names of the directories that they maintain.


# Proposed Implementation

[proposed-implementation]: #proposed-implementation

In the following discussion, the "project root" associated with a given LF file with absolute canonical path `f` is the parent of the most recent ancestor of `f` that is named `src`. The project root is also the location of the applicable `Lingo.toml` file, if such a file exists. Having a directory called `src` inside the root `src` directory is undefined behavior. A `Lingo.toml` file is not applicable to an LF file whose absolute canonical path is not a child of a directory named `src` that is a sibling of that `Lingo.toml` file.

High-level context: In this RFC, there is a bizarre two-way interaction/dependency between `lfc` and `lingo`. This is because of the history of our project: Instead of tolerating a complex-to-use CLI for `lfc` with the understanding that we would later make it easy to use by providing a package manager like `lingo`, we chose to design everything so that `lfc` would work properly even when run with very few arguments. For example, this is why build configurations currently live inside LF files. Now, we have users who invoke `lfc` directly, and so it may not be acceptable to add functionality to `lingo` without easing the transition by replicating some of it in `lfc`.

## 1: Non-breaking UX improvements

1. Detect, on a best-effort basis, whether a directory has been generated by `lfc` before overwriting it. This may involve generating a file in the directory that marks it as having been generated by `lfc`. If the directory has not been generated by `lfc`, warn the user and request confirmation to overwrite the directory. Note that this is much more difficult to do when the directory is only indirectly generated by `lfc`. For example, if `lfc` invokes CMake and CMake generates various directories in the project root, as is the current behavior in the C++ target, then in order to work properly our tools would need to make assumptions about the behavior of CMake in order to recognize these directories as being indirectly generated by `lfc`. This problem should be resolved by requiring directories generated by CMake to be placed in an `install` directory that is generated by `lfc`.

- Users may (unwisely) add their own files inside generated directories. We can use timestamps (provided by the operating system) to detect this scenario and warn the user about this.

2. Add a `lingo clean` functionality that reads Lingo.toml and deletes all directories generated by `lfc`. This makes it practical to use multiple generated directories, such as `fed-gen` and `include-gen`, that are siblings of `src-gen`, without making the process of deleting those directories excessively burdensome to the user.
<!-- 3. Pack parts of `lingo` into the `lfc` JAR so that even users who have not yet migrated to `lingo` can benefit from these UX improvements. Note that this can only work if we add `wasmtime` as a dependency of our toolchain and structure the `lingo` code base such that side-effectful operations such as file system writes are passed in to the core as parameters; this architecture is analogous to concepts used in capability-based security where the potential side effects of a procedure are apparent based on what capabilities (such as functions to do file system writes) are passed into it. -->

## 2: Make locations of generated directories configurable

Introduce optional properties to be included in `Lingo.toml`:

1. `src-gen`: The location for code that is generated by `lfc`
1. `install`: The location for any directories generated by CMake that are not in `src-gen`, and for the `bin` directory regardless of whether the `bin` directory is generated by CMake
1. `fed-gen`: The location for code that is generated for individual federates
1. `include-gen`: The location for user-facing C/C++ header files that are generated by `lfc`

Additionally, add the following property:

1. `bin`: Override the location of the generated `bin` directory so that it is not inside the `install` directory.

Any relative paths shall be resolved relative to the project root.

## 3: Change default locations of generated directories

The generated directory `include` which currently appears in the project root for the C target should be renamed to `include-gen` by default. It is possible that for many users, this will not be a breaking change because we will generate CMake code such that `include-gen` is on the user's include path.

The files generated by CMake in the case of the C++ target should be moved to a directory called `install`. As explained above (in the "UX improvements" section), we should not make it possible for the user to configure their `Lingo.toml` such that these directories stay in the project root, where they are now.

The generated directory `bin` should be moved to be inside whatever the `install` directory is.

To mitigate the problems that could result from these breaking changes, the release notes should include a link to a short Markdown file that presents and discusses an example `Lingo.toml` file that will minimize the amount of change in behavior.

<!-- The proposed solution is to generate the headers in `src-gen/include` only. Because user-provided C files will be included directly in `${LF_MAIN_TARGET}`, this directory will be on the include path for user-provided C files. Here is an example of how an `#include` directive might look in a user-provided C file given the proposed new behavior:

```c
#include "Count/Count.h"
```

For completeness, this is for a C file that is included in the LF project using the following target specification:

```
target C {
cmake-include: ["../../c/count.cmake"],
files: ["../../c"]
}
```

And the following contents of `count.cmake`:

```cmake
target_sources(${LF_MAIN_TARGET} PRIVATE c/count.c)
```

Note the lack of any `target_include_directories` statements in `count.cmake`. This shows how the generated CMakeLists.txt takes care of putting the generated headers on the include path. -->

# Drawbacks

[drawbacks]: #drawbacks

A drawback that applies to this entire RFC is that it will be complex to implement the RFC in such a way that the desired behavior is unambiguous. If the behavior is different depending on whether the user is invoking `lfc` or `lingo`, then we will have to enforce that `lfc` will not be directly invoked without `lingo` when a `Lingo.toml` file is present. This may be difficult to enforce without some hack, such as an environment variable that says whether `lfc` is being invoked by `lingo`.

This proposal consists of a large amount of work, not all of which is strictly necessary in the near term; however, it could be awkward for users to adapt to the changes in this RFC if the RFC is only partially implemented before the next release.

This RFC introduces complexity which users would have to learn about. It may be too early to introduce such complexity because we may realize later that we need to change the behavior again.

## 1.1: Detect whether a directory has been generated by `lfc` before overwriting it

Any request for user confirmation will cause problems in headless runs of `lfc`.

## 3. Change default locations of generated directories

This is a breaking change.

<!-- ## 1.3: Pack parts of `lingo` into `lfc`

This introduces a dependency on `wasmtime` or some other CLI-based WASM executor. `lfc` already has a large number of dependencies, but every dependency is a burden. -->

<!-- One potential drawback of this proposal is that users may not be fully aware of the user-facing include directory, or it may not be as clear that the interfaces defined there are considered stable and safe to depend on. Putting both user-facing and non-user-facing files in `src-gen` could blur the lines between the two kinds of files. This can be addressed by carefully documenting which files are safe to depend on. -->

# Rationale and alternatives

[rationale-and-alternatives]: #rationale-and-alternatives

For a competing alternative, see [RFC #7](https://github.com/lf-lang/rfcs/pull/7). Note that placing the generated headers inside the `src-gen` directory is not compatible in any obvious way with the idea of making the location of generated headers configurable. This is because it will be complex and error-prone for us to allow users to configure distinct generated directories to be children of each other.

A potential drawback of this alternative is that users may not be fully aware of the user-facing include directory, or it may not be as clear that the interfaces defined there are considered stable and safe to depend on. Putting both user-facing and non-user-facing files in `src-gen` could blur the lines between the two kinds of files.

<!-- One alternative is to put these headers in a sibling directory of the `src-gen` directory called `include-gen`, and to either add that directory to the include path or copy it into src-gen and put the copied directory on the include path. The disadvantage of this alternative is that it violates the principle that generated files will go in src-gen; this principle is simple and easy to explain, so it is worth keeping. -->

# Unresolved questions

[unresolved-questions]: #unresolved-questions

Should `lingo clean` read the contents of LF files, or should we forbid generated directory locations from being specified in LF files? The latter is preferable because we wish to encourage users to migrate to `lingo` and because in order to implement the former without having to maintain two distinct LF parsers, we would need `lingo clean` to invoke `lfc` to obtain an AST, which will incur a JVM startup time delay even for users that do not use this target property.

How can the proposed UX improvements be made available to users who have not yet migrated to `lfc`? Should we introduce `lingo` as a dependency of `lfc`, and have `lfc` invoke `lingo`? Or should we introduce `wasmtime` as a dependency of `lfc`, and pack a WASM bin of `lingo` into `lfc`? Or should we ignore the problem and accept that people who do not use `lingo` will not benefit from these changes?

Is it adequate to require all paths to be either absolute, or relative to the project root? Or, will it be necessary to introduce CMake-like variables in `Lingo.toml`, such as `${LF_PROJECT_ROOT}` and `${LF_INSTALL_DIR}`?

To what extent is the location of the `bin` directory the responsibility of `lingo` versus the responsibility of CMake? It is complex to implement the installation of binaries and appealing to let CMake take care of it because CMake is a mature tool that can address all of the corner cases that users are likely to encounter. However, it is also easy to implement a solution in `lingo` that is convenient to use and that works most of the time, and it is necessary for our tools (including our test framework and our VS Code extension) to know where binaries are installed; otherwise they cannot work properly. Besides, we generate executables, such as bash scripts, which are not managed by CMake. It is possible that we can have the best of both worlds by letting our tools query CMake in order to find out where executables will be found and then copy the executables to other locations as needed.

# Future possibilities

[future-possibilities]: #future-possibilities

Let `lingo` automatically add generated directories to a `.gitignore`.