refactor: take review into account
Conaclos committed Nov 10, 2023
1 parent 959db97 commit c9ed86a
Showing 41 changed files with 881 additions and 218 deletions.
151 changes: 76 additions & 75 deletions crates/biome_analyze/CONTRIBUTING.md
Expand Up @@ -299,17 +299,44 @@ just ready

### Rule configuration

Some rules may allow customization using configuration.
Biome tries to introduce a minimum of the rule configuration.
Before adding an option discuss that.
Some rules may allow customization using options.
We try to keep rule options to a minimum and add them only when needed.
Before adding an option, it's worth a discussion.
Options should follow our [technical philosophy](https://biomejs.dev/internals/philosophy/#technical).

The first step is to create the data representation of the rule's configuration.
Let's assume that the rule we implement supports the following options:

- `behavior`: a string among `"A"`, `"B"`, and `"C"`;
- `threshold`: an integer between 0 and 255;
- `behaviorExceptions`: an array of strings.

We would like to set the options in the `biome.json` configuration file:

```json
{
  "linter": {
    "rules": {
      "recommended": true,
      "nursery": {
        "my-rule": {
          "behavior": "A",
          "threshold": 30,
          "behaviorExceptions": ["f"]
        }
      }
    }
  }
}
```

The first step is to create the Rust data representation of the rule's options.

```rust,ignore
#[derive(Debug, Default, Clone)]
pub struct GreatRuleOptions {
main_behavior: Behavior,
extra_behaviors: Vec<Behavior>,
pub struct MyRuleOptions {
behavior: Behavior,
threshold: u8,
behavior_exceptions: Vec<String>
}
#[derive(Debug, Default, Clone)]
Expand All @@ -321,24 +348,18 @@ pub enum Behavior {
}
```

You also need to picture the equivalent data representation in _JSON_:

```json
{
"mainBehavior": "A",
"extraBehaviors": ["C"]
}
```
To allow deserializing instances of the types `MyRuleOptions` and `Behavior`,
they have to implement the `Deserializable` trait from the `biome_deserialize` crate.

So, you have to implement the `Deserializable` trait for these two types.
An implementation can reuse an existing type that implements `Deserializable`.
For example, we could deserialize `Behavior` by first deserializing a string,
and then checking that the string is either `A`, `B`, or `C`.
This is what we do in the following code snippet.
Note that, instead of using `String`, we use `TokenText`.
This avoids a string allocation.
In the following code, we implement `Deserializable` for `Behavior`.
We first deserialize the input into a `TokenText`.
Then we validate the retrieved text by checking that it is one of the allowed string variants.
If it is an unknown variant, we emit a diagnostic and return `None` to signal that the deserialization failed.
Otherwise, we return the corresponding variant.

```rust,ignore
use biome_deserialize::{Deserializable, DeserializableValue, DeserializationDiagnostic};
use biome_rowan::TokenText;
impl Deserializable for Behavior {
fn deserialize(
value: impl DeserializableValue,
Expand All @@ -348,9 +369,9 @@ impl Deserializable for Behavior {
let range = value.range();
let value = TokenText::deserialize(value, diagnostics)?;
match value.text() {
"A" => Some(Behavior.A),
"B" => Some(Behavior.B),
"C" => Some(Behavior.C),
"A" => Some(Behavior::A),
"B" => Some(Behavior::B),
"C" => Some(Behavior::C),
_ => {
diagnostics.push(DeserializationDiagnostic::new_unknown_value(
value.text(),
Expand All @@ -364,57 +385,58 @@ impl Deserializable for Behavior {
}
```

Implementing `Deserializable` for `GreatRuleOptions` requires more work,
because we cannot rely on an existing deserializable type.
We have to use a _deserialization visitor_.
We create a visitor by creating a zero-sized `struct` that implements `DeserializationVisitor`.
A visitor must specify the type that it produces in its associated type `Output`.
Here the visitor produces a `GreatRuleOptions`.
It must also specify which type is expected with the associated constant `EXPECTED_TYPE`.
Here we deserialize an object (a _map_ of string-value pairs).
Thus, it expects a `ExpectedType::MAP`.
So we implement `visit_map` that traverses the key-value pairs,
deserializes every key as a string (a token text to avoid allocating a string),
and deserializes the value based on the key.
To implement `Deserializable` for `MyRuleOptions`,
we cannot reuse an existing deserializer because a `struct` has custom fields.
Instead, we delegate the deserialization to a visitor.
We implement a visitor by implementing the `DeserializationVisitor` trait from the `biome_deserialize` crate.
The visitor traverses every field (key-value pair) of our object and deserializes it.
If an unknown field is found, we emit a diagnostic.

```rust,ignore
impl Deserializable for GreatRuleOptions {
use biome_deserialize::{DeserializationDiagnostic, Deserializable, DeserializableValue, DeserializationVisitor, VisitableType};
impl Deserializable for MyRuleOptions {
fn deserialize(
value: impl DeserializableValue,
diagnostics: &mut Vec<DeserializationDiagnostic>,
) -> Option<Self> {
value.deserialize(GreatRuleOptionsVisitor, diagnostics)
value.deserialize(MyRuleOptionsVisitor, diagnostics)
}
}
struct GreatRuleOptionsVisitor;
impl DeserializationVisitor for GreatRuleOptionsVisitor {
type Output = GreatRuleOptions;
struct MyRuleOptionsVisitor;
impl DeserializationVisitor for MyRuleOptionsVisitor {
type Output = MyRuleOptions;
const EXPECTED_TYPE: ExpectedType = ExpectedType::MAP;
const EXPECTED_TYPE: VisitableType = VisitableType::MAP;
fn visit_map(
self,
members: impl Iterator<Item = (impl DeserializableValue, impl DeserializableValue)>,
_range: TextRange,
diagnostics: &mut Vec<DeserializationDiagnostic>,
) -> Option<Self::Output> {
const ALLOWED_KEYS: &[&str] = &["mainBehavior", "extraBehavior"];
const ALLOWED_KEYS: &[&str] = &["behavior", "threshold", "behaviorExceptions"];
let mut result = Self::Output::default();
for (key, value) in members {
let key_range = key.range();
let Some(key) = TokenText::deserialize(key, diagnostics) else {
continue;
};
match key.text() {
"mainBehavior" => {
if let Some(strict_case) = Deserialize::deserialize(value, diagnostics) {
result.main_behavior = value;
"behavior" => {
if let Some(behavior) = Deserializable::deserialize(value, diagnostics) {
result.behavior = behavior;
}
}
"threshold" => {
if let Some(threshold) = Deserializable::deserialize(value, diagnostics) {
result.threshold = threshold;
}
}
"extraBehavior" => {
if let Some(enum_member_case) = Deserialize::deserialize(value, diagnostics) {
result.extra_behavior = enum_member_case;
"behaviorExceptions" => {
if let Some(exceptions) = Deserializable::deserialize(value, diagnostics) {
result.behavior_exceptions = exceptions;
}
}
_ => diagnostics.push(DeserializationDiagnostic::new_unknown_key(
Expand All @@ -431,53 +453,32 @@ impl DeserializationVisitor for GreatRuleOptionsVisitor {

Once done, you can set the associated type `Options` of the rule:


```rust,ignore
impl Rule for GreatRule {
impl Rule for MyRule {
type Query = Semantic<JsCallExpression>;
type State = Fix;
type Signals = Vec<Self::State>;
type Options = GreatRuleOptions;
type Options = MyRuleOptions;
...
}
```

This allows the rule to be configured inside `biome.json` file like:

```json
{
"linter": {
"rules": {
"recommended": true,
"nursery": {
"greatRule": {
"level": "error",
"options": {
"mainBehavior": "A"
}
}
}
}
}
}
```

A rule can retrieve its options with:

```rust,ignore
let options = ctx.options();
```

The compiler should warn you that `GreatRuleOptions` does not implement some required types.
The compiler should warn you that `MyRuleOptions` does not implement some required traits.
We currently require implementing _serde_'s traits `Deserialize`/`Serialize` and _Bpaf_'s parser trait.
You can simply use derive macros:

```rust,ignore
#[derive(Debug, Default, Clone, Serialize, Deserialize, Bpaf)]
#[cfg_attr(feature = "schemars", derive(JsonSchema))]
#[serde(rename_all = "camelCase", deny_unknown_fields)]
pub struct GreatRuleOptions {
pub struct MyRuleOptions {
#[bpaf(hide)]
#[serde(default, skip_serializing_if = "is_default")]
main_behavior: Behavior,
Expand Down
Expand Up @@ -31,7 +31,7 @@ configuration ━━━━━━━━━━━━━━━━━━━━━━
```block
biome.json:6:17 deserialize ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
× Incorrect type, expected a string.
× Incorrect type, expected a string, but received a boolean.
4 │ },
5 │ "javascript": {
Expand Down
89 changes: 89 additions & 0 deletions crates/biome_deserialize/README.md
@@ -0,0 +1,89 @@
# `biome_deserialize`

`biome_deserialize` is a framework for deserializing Rust data structures generically.

The crate consists of data structures that know how to deserialize themselves along with data formats that know how to deserialize data.
It provides the layer by which these two groups interact with each other,
allowing any supported data structure to be deserialized using any supported data format.

`biome_deserialize` is designed for textual data formats.
It assumes that every supported data format supports the following types:

- null-like values;
- boolean;
- number -- integers and floats;
- string;
- array;
- maps of key-value pairs (covers objects).

This crate is inspired by [serde](https://serde.rs/).
The only supported data format is JSON.

## Design overview

The crate provides three traits:

- `Deserializable`;
- `DeserializableValue`;
- `DeserializationVisitor`.

A data structure that knows how to deserialize itself is one that implements the `Deserializable` trait.

`DeserializableValue` is implemented by data formats such as _JSON_.

Simple implementations of `Deserializable` can reuse other deserializable data structures.
For instance, an enumeration that corresponds to a string among A, B, and C, can first deserialize a string and then check that the string is one of its values.
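
The enumeration case can be sketched with the standard library alone. This is a simplified illustration, not the crate's API: `Diagnostic` and `behavior_from_str` are hypothetical names, and the input is assumed to be a string that has already been deserialized.

```rust
/// Behaviors accepted by a hypothetical option.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Behavior {
    A,
    B,
    C,
}

/// Simplified stand-in for a deserialization diagnostic (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Diagnostic {
    pub message: String,
}

/// Checks an already-deserialized string against the allowed variants.
/// On an unknown value, records a diagnostic and returns `None`.
pub fn behavior_from_str(text: &str, diagnostics: &mut Vec<Diagnostic>) -> Option<Behavior> {
    const ALLOWED: &[&str] = &["A", "B", "C"];
    match text {
        "A" => Some(Behavior::A),
        "B" => Some(Behavior::B),
        "C" => Some(Behavior::C),
        unknown => {
            diagnostics.push(Diagnostic {
                message: format!(
                    "Unknown value `{unknown}`; expected one of: {}",
                    ALLOWED.join(", ")
                ),
            });
            None
        }
    }
}
```

Returning `None` while pushing a diagnostic lets the caller keep going and report several problems in one pass instead of stopping at the first error.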

Data structures that cannot directly reuse another deserializable data structure use a visitor.
A visitor is generally a zero-sized data structure that implements the `DeserializationVisitor` trait.
A [visitor](https://en.wikipedia.org/wiki/Visitor_pattern) is a well-known design pattern.
It allows selecting an implementation based on the deserialized type without having to deal with data format details.
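
To make the pattern concrete, here is a minimal sketch that uses only the standard library. It does not reproduce the crate's traits: `Value`, `Visitor`, `Options`, and `OptionsVisitor` are hypothetical stand-ins, and errors are plain strings rather than diagnostics.

```rust
/// Toy value tree standing in for parsed input (hypothetical, not the crate's types).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Bool(bool),
    Str(String),
    Map(Vec<(String, Value)>),
}

/// A visitor declares what it produces and overrides only the shapes it accepts.
/// Every default method rejects the value and records an error.
pub trait Visitor: Sized {
    type Output;

    fn visit_bool(self, _value: bool, errors: &mut Vec<String>) -> Option<Self::Output> {
        errors.push("unexpected boolean".to_string());
        None
    }

    fn visit_str(self, _value: &str, errors: &mut Vec<String>) -> Option<Self::Output> {
        errors.push("unexpected string".to_string());
        None
    }

    fn visit_map(
        self,
        _members: &[(String, Value)],
        errors: &mut Vec<String>,
    ) -> Option<Self::Output> {
        errors.push("unexpected map".to_string());
        None
    }
}

impl Value {
    /// Dispatches to the visitor method that matches the shape of the value.
    pub fn accept<V: Visitor>(&self, visitor: V, errors: &mut Vec<String>) -> Option<V::Output> {
        match self {
            Value::Bool(value) => visitor.visit_bool(*value, errors),
            Value::Str(value) => visitor.visit_str(value, errors),
            Value::Map(members) => visitor.visit_map(members, errors),
        }
    }
}

/// Target data structure with custom fields.
#[derive(Debug, Default, PartialEq)]
pub struct Options {
    pub strict: bool,
    pub name: String,
}

/// Zero-sized visitor that builds `Options` from a map.
pub struct OptionsVisitor;

impl Visitor for OptionsVisitor {
    type Output = Options;

    fn visit_map(
        self,
        members: &[(String, Value)],
        errors: &mut Vec<String>,
    ) -> Option<Self::Output> {
        let mut result = Options::default();
        for (key, value) in members {
            match (key.as_str(), value) {
                ("strict", Value::Bool(strict)) => result.strict = *strict,
                ("name", Value::Str(name)) => result.name = name.clone(),
                ("strict" | "name", _) => errors.push(format!("incorrect type for key `{key}`")),
                _ => errors.push(format!("unknown key `{key}`")),
            }
        }
        Some(result)
    }
}
```

Calling `value.accept(OptionsVisitor, &mut errors)` on a map fills the known fields and reports unknown keys, while any other shape is rejected by the default methods. The visitor itself never inspects how the input was parsed.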

## Usage examples

### Deserializing common types

`biome_deserialize` implements `Deserializable` for common Rust data structures.

In the following example, we deserialize a boolean, an array of integers, and an unordered map of string-integer pairs.

```rust
use biome_deserialize::json::deserialize_from_json_str;
use biome_deserialize::Deserialized;
use biome_json_parser::JsonParserOptions;

let json = "false";
let Deserialized {
    deserialized,
    diagnostics,
} = deserialize_from_json_str::<bool>(&json, JsonParserOptions::default());
assert_eq!(deserialized, Some(false));

let json = "[0, 1]";
let Deserialized {
    deserialized,
    diagnostics,
} = deserialize_from_json_str::<Vec<u8>>(&json, JsonParserOptions::default());
assert_eq!(deserialized, Some(vec![0, 1]));

use std::collections::HashMap;
let json = r#"{ "a": 0, "b": 1 }"#;
let Deserialized {
    deserialized,
    diagnostics,
} = deserialize_from_json_str::<HashMap<String, u8>>(&json, JsonParserOptions::default());
assert_eq!(deserialized, Some(HashMap::from([("a".to_string(), 0), ("b".to_string(), 1)])));
```

### Custom integer range

...WIP...

Sometimes you want to deserialize an integer and ensure that it is between two given integers.

For instance, let's assume we want to deserialize a percentage represented by an integer between 0 and 100.
We can use the new-type pattern in Rust:

```rust
pub struct Percentage(u8);
```
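
Since this section is still a work in progress, the following is only a sketch of the range check that such a new type could carry. It uses the standard library alone, and the `new` and `get` methods are hypothetical names. Wiring the check into a `Deserializable` implementation is left out.

```rust
/// New type wrapping a `u8` that is guaranteed to lie between 0 and 100.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Percentage(u8);

impl Percentage {
    /// Builds a `Percentage`, rejecting values above 100.
    /// The lower bound needs no check because `u8` cannot be negative.
    pub fn new(value: u8) -> Result<Self, String> {
        if value <= 100 {
            Ok(Percentage(value))
        } else {
            Err(format!(
                "{value} is out of range; expected an integer between 0 and 100"
            ))
        }
    }

    /// Returns the wrapped integer.
    pub fn get(self) -> u8 {
        self.0
    }
}
```

Keeping the inner field private means the only way to obtain a `Percentage` is through `new`, so every instance satisfies the range invariant.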