louis-migrate-yaml

A tool to migrate liblouis YAML files to a new normalized format.

Why

The existing liblouis YAML file format is very succinct, but it is not valid according to the YAML spec: liblouis uses the same key multiple times within one mapping, whereas the spec requires the keys of a mapping to be unique.

This is also why Serde, the standard mechanism for reading YAML files in Rust, cannot be used to read the liblouis YAML files.
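
To make the problem concrete, here is a minimal sketch that uses only the Rust standard library. It assumes a test file whose top level repeats the keys `table` and `tests`; the key layout and the file names are illustrative assumptions, not copied from a real liblouis test. A map can hold each key only once, so a map-based data model cannot represent such a file, while a plain list of key/value pairs in document order can.

```rust
use std::collections::BTreeMap;

/// Top-level entries of a hypothetical liblouis-style test file, in document
/// order. The key names and values are illustrative assumptions.
fn entries() -> Vec<(&'static str, &'static str)> {
    vec![
        ("table", "first.ctb"),
        ("tests", "tests for first.ctb"),
        ("table", "second.ctb"),
        ("tests", "tests for second.ctb"),
    ]
}

/// Load the entries into a map, as a map-based data model would have to.
/// Inserting a key that already exists overwrites the previous value.
fn as_map(entries: &[(&'static str, &'static str)]) -> BTreeMap<&'static str, &'static str> {
    let mut map = BTreeMap::new();
    for &(key, value) in entries {
        map.insert(key, value);
    }
    map
}

fn main() {
    let entries = entries();
    let map = as_map(&entries);
    println!("entries in document order: {}", entries.len());
    println!("entries left in the map:   {}", map.len());
}
```

The first `table` and `tests` entries are lost once everything is forced into a map, which is why the files need a different structure before a map-based reader can be used.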

The goal of this tool is to migrate the liblouis YAML format to a new valid YAML format.

How

The original C-based YAML parser can handle the liblouis YAML because it is event-based: it reports each key as it encounters it and therefore has no problem with non-unique keys in mappings.

So, in theory, we could enhance the C-based YAML tool to convert the YAML tests.

Instead, we decided to write the tool in Rust, built on the Rust libyaml bindings.
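
The sketch below illustrates why an event-based approach is unaffected by repeated keys. It does not use the libyaml API: the `Event` enum, the `Group` struct and the `group_by_table` function are hypothetical stand-ins, and the `table`/`tests` key names are assumptions. The point is only that events arrive in document order, so every occurrence of a key can be handled when it appears, for example by starting a new group each time `table` is seen.

```rust
/// A drastically simplified stand-in for the events an event-based YAML
/// parser emits. Hypothetical: real parsers report many more event kinds.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    MappingStart,
    Scalar(String),
    MappingEnd,
}

/// One table together with the tests that follow it (hypothetical shape).
#[derive(Debug, PartialEq)]
struct Group {
    table: String,
    tests: Vec<String>,
}

/// Walk a flat mapping of scalar keys and values and start a new group
/// every time the `table` key appears. No key is ever overwritten.
fn group_by_table(events: &[Event]) -> Vec<Group> {
    // Only scalars carry keys and values in this simplified model.
    let scalars: Vec<&String> = events
        .iter()
        .filter_map(|event| match event {
            Event::Scalar(text) => Some(text),
            _ => None,
        })
        .collect();

    let mut groups: Vec<Group> = Vec::new();
    // Scalars alternate between key and value inside a flat mapping.
    for pair in scalars.chunks_exact(2) {
        let (key, value) = (pair[0].as_str(), pair[1]);
        match key {
            "table" => groups.push(Group { table: value.clone(), tests: Vec::new() }),
            "tests" => {
                if let Some(group) = groups.last_mut() {
                    group.tests.push(value.clone());
                }
            }
            _ => {} // other keys are ignored in this sketch
        }
    }
    groups
}

/// Events for a mapping that repeats the keys `table` and `tests`.
fn sample_events() -> Vec<Event> {
    let scalar = |text: &str| Event::Scalar(text.to_string());
    vec![
        Event::MappingStart,
        scalar("table"), scalar("first.ctb"),
        scalar("tests"), scalar("tests for first.ctb"),
        scalar("table"), scalar("second.ctb"),
        scalar("tests"), scalar("tests for second.ctb"),
        Event::MappingEnd,
    ]
}

fn main() {
    for group in group_by_table(&sample_events()) {
        println!("{}: {:?}", group.table, group.tests);
    }
}
```

A sequence of groups like this is one possible way to make the keys unique; it is not meant to describe the exact format the tool produces.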

Why not integrate this in the main liblouis Rust implementation?

Instead of converting the liblouis YAML we could just use this implementation to run the YAML tests. Why separate it into a different tool?

The main reason is the dependency on libyaml. We’d like to stick to pure Rust so that liblouis can be compiled everywhere, including WebAssembly.

Granted, the newest version of the bindings uses unsafe-libyaml, a version of libyaml that was transpiled to Rust, so the bindings no longer depend on the C library. But it is probably still a big pile of unsafe code, not something you desperately want to depend on.

If we keep the dependency in a separate tool and migrate the YAML test files to a new format, we can keep louis-rs in pure Rust.

Depend on libyaml after all, but just for checking the YAML tests?

We could, as a provisional measure, make louis-rs depend on libyaml and interpret the original YAML files directly to run the tests. This would certainly simplify the rewrite in Rust. We could argue that only the checkyaml functionality really depends on libyaml. We could also, in theory, make this functionality optional and hide it behind a feature. Then only the checking of YAML files would perhaps not be so easily portable. I might be willing to make that compromise.
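
Hiding the functionality behind a feature could look roughly like the sketch below. The feature name `checkyaml`, the function names and the file name are assumptions made for illustration; the feature would also have to be declared in the `[features]` section of `Cargo.toml`, with the libyaml bindings as an optional dependency. When the feature is off, none of the libyaml-dependent code is compiled, so the default build stays pure Rust.

```rust
/// Report whether this build includes the optional YAML checking code.
/// `checkyaml` is a hypothetical feature name.
pub fn yaml_checking_enabled() -> bool {
    cfg!(feature = "checkyaml")
}

/// Compiled only when the hypothetical `checkyaml` feature is enabled.
/// The libyaml-based test runner would live here; this sketch does not
/// implement it and says so in the error.
#[cfg(feature = "checkyaml")]
pub fn check_yaml(path: &str) -> Result<(), String> {
    Err(format!("checking {path} is not implemented in this sketch"))
}

/// Fallback for the default, pure Rust build without the feature.
#[cfg(not(feature = "checkyaml"))]
pub fn check_yaml(path: &str) -> Result<(), String> {
    Err(format!("cannot check {path}: built without the `checkyaml` feature"))
}

fn main() {
    println!("YAML checking enabled: {}", yaml_checking_enabled());
    // `example.yaml` is a placeholder file name.
    if let Err(message) = check_yaml("example.yaml") {
        eprintln!("{message}");
    }
}
```

With this layout, targets such as WebAssembly would simply build without the feature, and only the YAML checking would be missing there.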