Skip to content
This repository has been archived by the owner on Dec 13, 2021. It is now read-only.

Update to codespan and codespan-reporting 0.6 #160

Merged
merged 5 commits into from
Jan 6, 2020

Conversation

etaoins
Copy link
Owner

@etaoins etaoins commented Jan 4, 2020

This is a fairly big change due to the different representation of Span in codespan 0.4.

Previously all files were represented in a contiguous 32bit space as if they were concatenate in to a single file. This is also how the Rust compiler and Arret's ancient bespoke system used to work. Now, there is a separate FileId that identifies the file while the Span is an offset within the file.

This makes the codespan API cleaner but presents a couple of disadvantages:

  1. In some cases we don't have a FileId. This comes up a lot in tests and also in the REPL when we want to parse for syntax highlighting. In these cases we never report diagnostics so we don't need a FileId. Previously we would just create our own Spans that were offset from 0 and hope we didn't accidentally diagnose with them.

    We could create a codespan::Files every time we parse but this adds overhead, especially because codespan isn't thread safe so it requires locking.

    We use Option<FileId> in our custom Span type to get around the cases where we don't want to commit to a full file.

  2. This increases the size of our Span type from 8 bytes to 12 bytes (start + end + Option<FileId>).

On the plus side this brings us back onboard with codepan which should help with LSP support. There also seems to be better rendering of diagnostics across multiple files.

This is a fairly big change due to the different representation of
`Span` in `codespan` 0.6.

Previously all files were represented in a contiguous 32bit space as if
they were concatenate in to a single file. This is also how the Rust
compiler and Arret's ancient bespoke system used to work. Now, there is
a separate `FileId` that identifies the file while the `Span` is an
offset within the file.

This makes the `codespan` API cleaner but presents a couple of
disadvantages:

1. In some cases we don't have a `FileId`. This comes up a lot in tests
   and also in the REPL when we want to parse for syntax highlighting.
   In these cases we never report diagnostics so we don't need a
   `FileId`. Previously we would just create our own `Span`s that were
   offset from 0 and hope we didn't accidentally diagnose with them.

   We could create a `codespan::Files` every time we parse but this adds
   overhead, especially because `codespan` isn't thread safe so it
   requires locking.

   We use `Option<FileId>` in our custom `Span` type to get around the
   cases where we don't want to commit to a full file.

2. This blows up the size of our `Span` type from 8 bytes to 13 bytes
   (start + end + `FileId` + `Option` overhead). This will be rounded up
   to 16 bytes in most situations due to alignment and padding.

   This can be reduced to 12 bytes by making `FileId` `NonZeroU32`
   upstream.

There also the complication of `Files` no longer being parameterized on
string types. This comes with two other downsides:

1. We can no longer use `Cow<str>` to directly share the `&'static str`
   from our loaded RFI libraries.

2. Because we now need the `FileId` to build our `Span` we can't parse
   the input until we add it to `Files`. This means we need to borrow
   the source back from `Files` after we add it. However, because we
   can't use `Arc` this means we need to hold a read lock on `Files`
   while we parse.

   This effectively means we can only load + parse a single file at a
   time across all threads. For our integration tests that need to parse
   a lot of files up front this causes contention.

Additionally, there is no way to create a `codespan::Span` in a const
context anymore. This required replacing the `EMPTY_SPAN` constant with
an `empty_span()` function.

On the plus side this brings us back onboard with `codepan` which should
help with LSP support. There also seems to be better rendering of
diagnostics across multiple files.
This is enabled by brendanzab/codespan#143 which requires us to take a
Git dependency until `codespan` 0.7 is released.
This allows us to parameterise `Files` with a new `SourceText` that can
both avoid cloning RFI type names and holding a read lock on `Files`
during parsing. This is the last of the major regressions since
`codespan` 0.3 besides the unavoidable increase in the size of `Span`.
@etaoins etaoins merged commit d640978 into master Jan 6, 2020
@etaoins etaoins deleted the update-to-codespan-0.6 branch January 6, 2020 08:34
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant