Optimize escape_ascii. #125340

Closed


reitermarkus
Contributor

@reitermarkus reitermarkus commented May 20, 2024

Follow-up to #124307. CC @joboet

Alternative/addition to #125317.

Based on #124307 (comment), it doesn't look like this function is the cause for the regression, but this change produces even fewer instructions (https://rust.godbolt.org/z/nebzqoveG).

@rustbot
Collaborator

rustbot commented May 20, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 20, 2024
@reitermarkus
Contributor Author

r? @Kobzol

@rustbot rustbot assigned Kobzol and unassigned Mark-Simulacrum May 20, 2024
@Kobzol
Contributor

Kobzol commented May 20, 2024

I'm probably not the best person to review this, but I can try. I have the same question as here though - do you have some (micro)benchmarks to show that this is an improvement? :)

@reitermarkus
Contributor Author

@Kobzol, what's the best way to do a benchmark for this? Just create a standalone crate with two versions of this function, or is there a recommended way to test against different commits in this repo?

@Kobzol
Contributor

Kobzol commented May 22, 2024

Well, that depends. From the microbenchmark side, you could show e.g. on godbolt that this produces "objectively" better assembly. From the macrobenchmark side, you would probably bring some program that is actually improved by this change.

Usually people have some explicit motivation for doing these kinds of optimizations, which is demonstrated by some change either in codegen or an improvement for some real-world code.

@reitermarkus
Contributor Author

reitermarkus commented May 23, 2024

> e.g. on godbolt

I have updated the Godbolt link in the PR description to reflect the current changes, i.e. 3 fewer jumps and 7 fewer instructions.

I have also done a micro benchmark using criterion:

Source
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(let_chains)]
#![feature(inline_const)]
#![feature(const_option)]

use core::ascii;
use core::ops::Range;

use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, PlotConfiguration};

const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();

#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 2) };
    let mut output = [ascii::Char::Null; N];
    output[0] = ascii::Char::ReverseSolidus;
    output[1] = a;
    (output, 0..2)
}

#[inline]
const fn escape_ascii_before<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    match byte {
        b'\t' => backslash(ascii::Char::SmallT),
        b'\r' => backslash(ascii::Char::SmallR),
        b'\n' => backslash(ascii::Char::SmallN),
        b'\\' => backslash(ascii::Char::ReverseSolidus),
        b'\'' => backslash(ascii::Char::Apostrophe),
        b'\"' => backslash(ascii::Char::QuotationMark),
        byte => {
            let mut output = [ascii::Char::Null; N];

            if let Some(c) = byte.as_ascii()
                && !byte.is_ascii_control()
            {
                output[0] = c;
                (output, 0..1)
            } else {
                let hi = HEX_DIGITS[(byte >> 4) as usize];
                let lo = HEX_DIGITS[(byte & 0xf) as usize];

                output[0] = ascii::Char::ReverseSolidus;
                output[1] = ascii::Char::SmallX;
                output[2] = hi;
                output[3] = lo;

                (output, 0..4)
            }
        }
    }
}

#[inline]
const fn escape_ascii_after<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    let mut output = [ascii::Char::Null; N];

    // NOTE: This `match` is roughly ordered by the frequency of ASCII
    //       characters for performance.
    match byte.as_ascii() {
        Some(
            c @ ascii::Char::QuotationMark
            | c @ ascii::Char::Apostrophe
            | c @ ascii::Char::ReverseSolidus,
        ) => backslash(c),
        Some(c) if !byte.is_ascii_control() => {
            output[0] = c;
            (output, 0..1)
        }
        Some(ascii::Char::LineFeed) => backslash(ascii::Char::SmallN),
        Some(ascii::Char::CarriageReturn) => backslash(ascii::Char::SmallR),
        Some(ascii::Char::CharacterTabulation) => backslash(ascii::Char::SmallT),
        _ => {
            let hi = HEX_DIGITS[(byte >> 4) as usize];
            let lo = HEX_DIGITS[(byte & 0xf) as usize];

            output[0] = ascii::Char::ReverseSolidus;
            output[1] = ascii::Char::SmallX;
            output[2] = hi;
            output[3] = lo;

            (output, 0..4)
        }
    }
}

pub fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("escape_ascii");

    group.sample_size(1000);

    for i in [b'a', b'Z', b'\"', b'\t', b'\n', b'\xff'] {
        let i_s = if let Some(c) = i.as_ascii() {
            format!("{c:?}")
        } else {
            format!("'\\x{i:02x}'")
        };

        group.bench_with_input(BenchmarkId::new("before", &i_s), &i, |b, i| {
            b.iter(|| escape_ascii_before::<4>(*i));
        });
        group.bench_with_input(BenchmarkId::new("after", &i_s), &i, |b, i| {
            b.iter(|| escape_ascii_after::<4>(*i));
        });
    }

    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
Output
escape_ascii/before/'a' time:   [1.6945 ns 1.7047 ns 1.7170 ns]
Found 21 outliers among 1000 measurements (2.10%)
  8 (0.80%) low mild
  4 (0.40%) high mild
  9 (0.90%) high severe
escape_ascii/after/'a'  time:   [427.36 ps 428.23 ps 429.15 ps]
Found 23 outliers among 1000 measurements (2.30%)
  2 (0.20%) high mild
  21 (2.10%) high severe
escape_ascii/before/'Z' time:   [1.6944 ns 1.6971 ns 1.6996 ns]
escape_ascii/after/'Z'  time:   [430.95 ps 431.52 ps 432.06 ps]
Found 372 outliers among 1000 measurements (37.20%)
  230 (23.00%) low severe
  37 (3.70%) high mild
  105 (10.50%) high severe
escape_ascii/before/'"' time:   [1.3287 ns 1.3308 ns 1.3328 ns]
Found 1 outliers among 1000 measurements (0.10%)
  1 (0.10%) high mild
escape_ascii/after/'"'  time:   [429.44 ps 430.54 ps 431.73 ps]
Found 9 outliers among 1000 measurements (0.90%)
  2 (0.20%) high mild
  7 (0.70%) high severe
escape_ascii/before/'\t'
                        time:   [1.3326 ns 1.3369 ns 1.3413 ns]
Found 99 outliers among 1000 measurements (9.90%)
  80 (8.00%) high mild
  19 (1.90%) high severe
escape_ascii/after/'\t' time:   [1.3184 ns 1.3215 ns 1.3246 ns]
Found 308 outliers among 1000 measurements (30.80%)
  158 (15.80%) low mild
  10 (1.00%) high mild
  140 (14.00%) high severe
escape_ascii/before/'\n'
                        time:   [1.3336 ns 1.3377 ns 1.3419 ns]
escape_ascii/after/'\n' time:   [1.3033 ns 1.3057 ns 1.3080 ns]
Found 223 outliers among 1000 measurements (22.30%)
  210 (21.00%) low mild
  9 (0.90%) high mild
  4 (0.40%) high severe
escape_ascii/before/'\xff'
                        time:   [1.5074 ns 1.5116 ns 1.5168 ns]
Found 7 outliers among 1000 measurements (0.70%)
  3 (0.30%) high mild
  4 (0.40%) high severe
escape_ascii/after/'\xff'
                        time:   [444.86 ps 456.22 ps 469.96 ps]
Found 51 outliers among 1000 measurements (5.10%)
  8 (0.80%) high mild
  43 (4.30%) high severe

Graph (unfortunately Y-axis is not sorted by input):

[Image: violin plot of the benchmark results]

@Kobzol
Contributor

Kobzol commented May 24, 2024

Your benchmark was executed on a single byte input? It would be good to also see how it behaves on something larger, e.g. a short/medium size/long byte slice, to see the effects in practice.

Could you describe the motivation for this change? If I understand your comment correctly, "frequency of ASCII characters" means how often do given characters appear in the input. It makes sense to me to optimize for the common case, which I would expect is that the input does not need to be escaped at all. So my intuition would be to start with first checking if it's an alphabetic ASCII character, and then continue from there. So this optimization seems reasonable, in general. I just wonder if you have some use-case where this escaping is an actual bottleneck and we could actually see some wins in practice?

Btw, in general, the fact that there are fewer instructions doesn't necessarily mean that the code will be faster. In microarchitecture simulation (llvm-mca), the original code seems to have better IPC (https://rust.godbolt.org/z/3qKeohGjs), although in this case it's hard to decide based on that, because this function is very data dependent.
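
The "check the common case first" ordering discussed in this comment can be sketched on stable Rust roughly as follows. This is only an illustration under stated assumptions, not the code from this PR: `escape_common_first` is a hypothetical name, and it writes into a plain `[u8; 4]` buffer instead of the unstable `ascii::Char` type. The compiler may still lower the `match` in a different order, so any gain would have to be measured.

```rust
/// Hypothetical helper (not from the PR): escapes one byte, testing the
/// most common case (printable ASCII that needs no escaping) first.
/// Returns the output buffer and the number of bytes used in it.
pub fn escape_common_first(byte: u8) -> ([u8; 4], usize) {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    match byte {
        // Printable ASCII, except the three characters that need a backslash.
        b' '..=b'~' if !matches!(byte, b'"' | b'\'' | b'\\') => ([byte, 0, 0, 0], 1),
        // Quotes and backslash are escaped with a leading backslash.
        b'"' | b'\'' | b'\\' => ([b'\\', byte, 0, 0], 2),
        // The three control characters that have a short escape.
        b'\n' => ([b'\\', b'n', 0, 0], 2),
        b'\r' => ([b'\\', b'r', 0, 0], 2),
        b'\t' => ([b'\\', b't', 0, 0], 2),
        // Everything else (other control characters, non-ASCII): `\xNN`.
        _ => ([b'\\', b'x', HEX[(byte >> 4) as usize], HEX[(byte & 0xf) as usize]], 4),
    }
}
```

For every `u8`, the bytes `buf[..len]` returned here should match what the stable `std::ascii::escape_default` iterator yields, which makes an exhaustive 256-value comparison an easy sanity check.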

@clarfonthey
Contributor

Hmm.

Omitting the non-ASCII case, perhaps this could be done with a lookup table? You could squeeze it down to just 128 bytes if you use the eighth bit to determine if there should be a backslash, since the escaped character will only need 7 bits. This way, you don't need to worry about ordering things by prevalence. Have no idea what the current codegen looks like so I dunno if it'd be much faster, but that feels like the best route to me.
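
A minimal sketch of the lookup-table idea from this comment, on stable Rust with plain `u8` values. The names `TABLE` and `escape_with_table` are hypothetical, and using `0x80` (an "escaped NUL", which never occurs as a real short escape) as the marker for hex escapes is an assumption of this sketch. The table covers only the ASCII range, one entry per 7-bit code point; non-ASCII bytes are handled separately.

```rust
/// Hypothetical table for the ASCII range only (not code from this PR).
/// Bit 7 set  => emit a backslash followed by the low 7 bits.
/// 0x80 alone => no short escape exists; fall back to a `\xNN` hex escape.
static TABLE: [u8; 128] = {
    let mut table = [0u8; 128];
    let mut i = 0;
    while i < 128 {
        let b = i as u8;
        table[i] = match b {
            b'\t' => 0x80 | b't',
            b'\r' => 0x80 | b'r',
            b'\n' => 0x80 | b'n',
            b'\\' | b'\'' | b'"' => 0x80 | b,
            0x00..=0x1f | 0x7f => 0x80, // remaining control characters: hex escape
            _ => b,                     // printable: emitted verbatim
        };
        i += 1;
    }
    table
};

/// Escapes one byte; returns the buffer and the number of bytes used in it.
pub fn escape_with_table(byte: u8) -> ([u8; 4], usize) {
    const HEX: &[u8; 16] = b"0123456789abcdef";
    let hex = [b'\\', b'x', HEX[(byte >> 4) as usize], HEX[(byte & 0xf) as usize]];
    // Non-ASCII bytes are outside the table and are always hex-escaped.
    let Some(&entry) = TABLE.get(byte as usize) else {
        return (hex, 4);
    };
    match entry {
        0x80 => (hex, 4),
        e if e & 0x80 != 0 => ([b'\\', e & 0x7f, 0, 0], 2),
        e => ([e, 0, 0, 0], 1),
    }
}
```

The order of the arms no longer matters for the per-byte work, since every ASCII input costs one table load plus a small decode. Whether that beats a branch-based `match` would still have to be benchmarked.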

@reitermarkus
Contributor Author

I have made some further changes and updated the Godbolt link in the PR description. The instruction count is again slightly lower, and LLVM-MCA now also shows fewer instructions and better IPC and throughput.

I re-ran the previous benchmark with larger inputs (a 100MB file with random data, and a 100MB JSON file). The results show no difference between the two functions:

[Images: three violin plots]

I also ran LLVM-MCA locally for Cortex M4, and it shows ~25% fewer instructions with ~35% higher throughput:

LLVM-MCA (Cortex M4) - before

cargo asm --features before --lib --target thumbv7em-none-eabihf --att --mca --mca-arg=-mcpu=cortex-m4

    Finished release [optimized] target(s) in 0.03s

Iterations:        100
Instructions:      6900
Total Cycles:      6901
Total uOps:        6900

Dispatch Width:    1
uOps Per Cycle:    1.00
IPC:               1.00
Block RThroughput: 69.0


Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)

[1]    [2]    [3]    [4]    [5]    [6]    Instructions:
 1      1     1.00                        mvn	r2, #8
 1      1     1.00                        uxtab	r3, r2, r1
 1      1     1.00                        uxtb.w	r12, r1
 1      1     1.00                        cmp	r3, #30
 1      1     1.00                  U     bhi	.LBB0_4
 1      1     1.00                  U     tbb	[pc, r3]
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #29788
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        cmp.w	r12, #92
 1      1     1.00                  U     bne	.LBB0_6
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #23644
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        cmp.w	r12, #128
 1      1     1.00                        mov	r3, r12
 1      1     1.00                  U     it	hs
 1      1     1.00                        movhs	r3, #128
 1      1     1.00                        sxtb	r2, r1
 1      1     1.00                        cmp	r2, #0
 1      1     1.00                  U     bmi	.LBB0_9
 1      1     1.00                        cmp.w	r12, #32
 1      1     1.00                  U     blo	.LBB0_9
 1      1     1.00                        cmp.w	r12, #127
 1      1     1.00                  U     itttt	ne
 1      1     1.00                        movne	r1, #1
 1      1     1.00           *            strbne	r1, [r0, #5]
 1      1     1.00                        movne	r1, #0
 1      1     1.00           *            strne.w	r1, [r0, #1]
 1      1     1.00                  U     itt	ne
 1      1     1.00           *            strbne	r3, [r0]
 1      1     1.00                  U     bxne	lr
 1      1     1.00                        movw	r3, :lower16:.L__unnamed_1
 1      1     1.00                        mov.w	r2, #1024
 1      1     1.00                        and	r1, r1, #15
 1      1     1.00           *            strh	r2, [r0, #4]
 1      1     1.00                        movt	r3, :upper16:.L__unnamed_1
 1      1     1.00                        lsr.w	r2, r12, #4
 1      2     1.00    *                   ldrb	r2, [r3, r2]
 1      2     1.00    *                   ldrb	r1, [r3, r1]
 1      1     1.00                        movw	r3, #30812
 1      1     1.00           *            strh	r3, [r0]
 1      1     1.00           *            strb	r1, [r0, #3]
 1      1     1.00           *            strb	r2, [r0, #2]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #28252
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #29276
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #8796
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr
 1      1     1.00                        mov.w	r1, #512
 1      1     1.00           *            strh	r1, [r0, #4]
 1      1     1.00                        movw	r1, #10076
 1      1     1.00           *            str	r1, [r0]
 1      1     1.00                  U     bx	lr


Resources:
[0]   - M4Unit


Resource pressure per iteration:
[0]    
69.00  

Resource pressure by instruction:
[0]    Instructions:
1.00   mvn	r2, #8
1.00   uxtab	r3, r2, r1
1.00   uxtb.w	r12, r1
1.00   cmp	r3, #30
1.00   bhi	.LBB0_4
1.00   tbb	[pc, r3]
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #29788
1.00   str	r1, [r0]
1.00   bx	lr
1.00   cmp.w	r12, #92
1.00   bne	.LBB0_6
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #23644
1.00   str	r1, [r0]
1.00   bx	lr
1.00   cmp.w	r12, #128
1.00   mov	r3, r12
1.00   it	hs
1.00   movhs	r3, #128
1.00   sxtb	r2, r1
1.00   cmp	r2, #0
1.00   bmi	.LBB0_9
1.00   cmp.w	r12, #32
1.00   blo	.LBB0_9
1.00   cmp.w	r12, #127
1.00   itttt	ne
1.00   movne	r1, #1
1.00   strbne	r1, [r0, #5]
1.00   movne	r1, #0
1.00   strne.w	r1, [r0, #1]
1.00   itt	ne
1.00   strbne	r3, [r0]
1.00   bxne	lr
1.00   movw	r3, :lower16:.L__unnamed_1
1.00   mov.w	r2, #1024
1.00   and	r1, r1, #15
1.00   strh	r2, [r0, #4]
1.00   movt	r3, :upper16:.L__unnamed_1
1.00   lsr.w	r2, r12, #4
1.00   ldrb	r2, [r3, r2]
1.00   ldrb	r1, [r3, r1]
1.00   movw	r3, #30812
1.00   strh	r3, [r0]
1.00   strb	r1, [r0, #3]
1.00   strb	r2, [r0, #2]
1.00   bx	lr
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #28252
1.00   str	r1, [r0]
1.00   bx	lr
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #29276
1.00   str	r1, [r0]
1.00   bx	lr
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #8796
1.00   str	r1, [r0]
1.00   bx	lr
1.00   mov.w	r1, #512
1.00   strh	r1, [r0, #4]
1.00   movw	r1, #10076
1.00   str	r1, [r0]
1.00   bx	lr
LLVM-MCA (Cortex M4) - after

cargo asm --features after --lib --target thumbv7em-none-eabihf --att --mca --mca-arg=-mcpu=cortex-m4

    Finished release [optimized] target(s) in 0.02s

Iterations:        100
Instructions:      5100
Total Cycles:      5301
Total uOps:        5100

Dispatch Width:    1
uOps Per Cycle:    0.96
IPC:               0.96
Block RThroughput: 51.0


Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)

[1]    [2]    [3]    [4]    [5]    [6]    Instructions:
 1      1     1.00           *      U     push	{r4, r6, r7, lr}
 1      1     1.00                  U     add	r7, sp, #8
 1      1     1.00                        uxtb	r4, r1
 1      1     1.00                        movw	r12, :lower16:.L__unnamed_1
 1      1     1.00                        and	r3, r1, #15
 1      1     1.00                        movt	r12, :upper16:.L__unnamed_1
 1      1     1.00                        lsrs	r2, r4, #4
 1      1     1.00                        cmp	r4, #126
 1      2     1.00    *                   ldrb.w	lr, [r12, r3]
 1      2     1.00    *                   ldrb.w	r12, [r12, r2]
 1      1     1.00                  U     bhi	.LBB0_9
 1      1     1.00                        movs	r2, #92
 1      1     1.00                        movs	r3, #2
 1      1     1.00                        cmp	r4, #34
 1      1     1.00                  U     beq	.LBB0_4
 1      1     1.00                        cmp	r4, #39
 1      1     1.00                  U     beq	.LBB0_4
 1      1     1.00                        cmp	r4, #92
 1      1     1.00                  U     bne	.LBB0_5
 1      1     1.00                        mov	r4, r1
 1      1     1.00                        b	.LBB0_10
 1      1     1.00                        cmp	r4, #31
 1      1     1.00                  U     bls	.LBB0_7
 1      1     1.00                        movs	r4, #120
 1      1     1.00                        movs	r3, #1
 1      1     1.00                        mov	r2, r1
 1      1     1.00                        b	.LBB0_10
 1      1     1.00                        subs	r1, #9
 1      1     1.00                        uxtb	r2, r1
 1      1     1.00                        cmp	r2, #4
 1      1     1.00                  U     bhi	.LBB0_9
 1      1     1.00                        movw	r2, :lower16:.Lswitch.table.after.1
 1      1     1.00                        sxtb	r1, r1
 1      1     1.00                        movt	r2, :upper16:.Lswitch.table.after.1
 1      2     1.00    *                   ldrb	r4, [r2, r1]
 1      1     1.00                        movw	r2, :lower16:.Lswitch.table.after
 1      1     1.00                        movt	r2, :upper16:.Lswitch.table.after
 1      2     1.00    *                   ldrb	r3, [r2, r1]
 1      1     1.00                        movs	r2, #92
 1      1     1.00                        b	.LBB0_10
 1      1     1.00                        movs	r4, #120
 1      1     1.00                        movs	r2, #92
 1      1     1.00                        movs	r3, #4
 1      1     1.00                        movs	r1, #0
 1      1     1.00           *            strb	r3, [r0, #5]
 1      1     1.00           *            strb	r1, [r0, #4]
 1      1     1.00           *            strb.w	lr, [r0, #3]
 1      1     1.00           *            strb.w	r12, [r0, #2]
 1      1     1.00           *            strb	r4, [r0, #1]
 1      1     1.00           *            strb	r2, [r0]
 1      2     1.00    *             U     pop	{r4, r6, r7, pc}


Resources:
[0]   - M4Unit


Resource pressure per iteration:
[0]    
51.00  

Resource pressure by instruction:
[0]    Instructions:
1.00   push	{r4, r6, r7, lr}
1.00   add	r7, sp, #8
1.00   uxtb	r4, r1
1.00   movw	r12, :lower16:.L__unnamed_1
1.00   and	r3, r1, #15
1.00   movt	r12, :upper16:.L__unnamed_1
1.00   lsrs	r2, r4, #4
1.00   cmp	r4, #126
1.00   ldrb.w	lr, [r12, r3]
1.00   ldrb.w	r12, [r12, r2]
1.00   bhi	.LBB0_9
1.00   movs	r2, #92
1.00   movs	r3, #2
1.00   cmp	r4, #34
1.00   beq	.LBB0_4
1.00   cmp	r4, #39
1.00   beq	.LBB0_4
1.00   cmp	r4, #92
1.00   bne	.LBB0_5
1.00   mov	r4, r1
1.00   b	.LBB0_10
1.00   cmp	r4, #31
1.00   bls	.LBB0_7
1.00   movs	r4, #120
1.00   movs	r3, #1
1.00   mov	r2, r1
1.00   b	.LBB0_10
1.00   subs	r1, #9
1.00   uxtb	r2, r1
1.00   cmp	r2, #4
1.00   bhi	.LBB0_9
1.00   movw	r2, :lower16:.Lswitch.table.after.1
1.00   sxtb	r1, r1
1.00   movt	r2, :upper16:.Lswitch.table.after.1
1.00   ldrb	r4, [r2, r1]
1.00   movw	r2, :lower16:.Lswitch.table.after
1.00   movt	r2, :upper16:.Lswitch.table.after
1.00   ldrb	r3, [r2, r1]
1.00   movs	r2, #92
1.00   b	.LBB0_10
1.00   movs	r4, #120
1.00   movs	r2, #92
1.00   movs	r3, #4
1.00   movs	r1, #0
1.00   strb	r3, [r0, #5]
1.00   strb	r1, [r0, #4]
1.00   strb.w	lr, [r0, #3]
1.00   strb.w	r12, [r0, #2]
1.00   strb	r4, [r0, #1]
1.00   strb	r2, [r0]
1.00   pop	{r4, r6, r7, pc}
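
The approximate percentages quoted in this comment can be re-derived from the two llvm-mca summaries (6900 vs. 5100 instructions, Block RThroughput 69.0 vs. 51.0, 6901 vs. 5301 total cycles). The helper names below are hypothetical and exist only to make the arithmetic explicit.

```rust
/// Relative reduction from `before` to `after`, in percent.
pub fn percent_fewer(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// Relative throughput gain in percent, given reciprocal throughputs
/// (cycles per block, so lower is better).
pub fn percent_higher_throughput(rthroughput_before: f64, rthroughput_after: f64) -> f64 {
    (rthroughput_before / rthroughput_after - 1.0) * 100.0
}
```

With the numbers above, `percent_fewer(6900.0, 5100.0)` is about 26.1, `percent_higher_throughput(69.0, 51.0)` is about 35.3, and `percent_fewer(6901.0, 5301.0)` is about 23.2. The simulated cycle count drops slightly less than the instruction count because IPC falls from 1.00 to 0.96.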

@Kobzol
Contributor

Kobzol commented Jun 3, 2024

I suspect that in the grand scheme of things (escaping strings, rather than chars), this might not have such a large effect (btw https://lemire.me/blog/2024/05/31/quickly-checking-whether-a-string-needs-escaping/ might be interesting to you). The code looked a bit more readable before, but no strong opinion on my side.

r? libs
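
The string-level point made in this comment (decide first whether the input needs escaping at all) could look roughly like this with `escape_ascii`'s rules on stable Rust. This is a sketch under assumptions, not code from the linked post or from this PR; `needs_escape` and `escape_if_needed` are hypothetical names.

```rust
use std::borrow::Cow;

/// Hypothetical predicate: true if `escape_ascii` would change this byte,
/// i.e. it is not printable ASCII, or it is a quote or a backslash.
pub fn needs_escape(byte: u8) -> bool {
    !matches!(byte, b' '..=b'~') || matches!(byte, b'"' | b'\'' | b'\\')
}

/// Borrows the input when nothing needs escaping and allocates only otherwise.
pub fn escape_if_needed(bytes: &[u8]) -> Cow<'_, [u8]> {
    if bytes.iter().copied().any(needs_escape) {
        // Slow path: fall back to the standard per-byte escaping.
        Cow::Owned(bytes.escape_ascii().collect())
    } else {
        // Fast path: the input is already its own escaped form.
        Cow::Borrowed(bytes)
    }
}
```

For input that is mostly plain text, the scan is a tight loop with no output writes, so the per-byte escape function is only reached when at least one byte actually requires it.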

@rustbot rustbot assigned joboet and unassigned Kobzol Jun 3, 2024
@joboet
Member

joboet commented Jun 20, 2024

The current version really isn't particularly readable, so I don't think I can accept it.

However I found an even better version (at least according to llvm-mca) that is even more readable than the old one: https://rust.godbolt.org/z/8bfWP9aP8 (the top one)

Do you want to try that?

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 20, 2024
@Dylan-DPC
Member

@reitermarkus any updates on this? thanks

@JohnCSimon
Member

@reitermarkus
Ping from triage:

I'm closing this due to inactivity because the PR hasn't been touched by the author in a few months.
If you want to continue on this PR, please reopen before committing to the branch. Thank you.

@rustbot label: +S-inactive

@JohnCSimon JohnCSimon closed this Oct 6, 2024
@rustbot rustbot added the S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. label Oct 6, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 13, 2024
Optimize `escape_ascii` using a lookup table

Based upon my suggestion here: rust-lang#125340 (comment)

Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.

The generated assembly isn't clearly better (although it has fewer branches), so I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine (an old i7-8700).

But, if you want to try my benchmarking code for yourself:

<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>

```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};

const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();

#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 2) };

    let mut output = [ascii::Char::Null; N];

    output[0] = ascii::Char::ReverseSolidus;
    output[1] = a;

    (output, 0..2)
}

#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    let mut output = [ascii::Char::Null; N];

    let hi = HEX_DIGITS[(byte >> 4) as usize];
    let lo = HEX_DIGITS[(byte & 0xf) as usize];

    output[0] = ascii::Char::ReverseSolidus;
    output[1] = ascii::Char::SmallX;
    output[2] = hi;
    output[3] = lo;

    (output, 0..4)
}

#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 1) };

    let mut output = [ascii::Char::Null; N];

    output[0] = a;

    (output, 0..1)
}

/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    match byte {
        b'\t' => backslash(ascii::Char::SmallT),
        b'\r' => backslash(ascii::Char::SmallR),
        b'\n' => backslash(ascii::Char::SmallN),
        b'\\' => backslash(ascii::Char::ReverseSolidus),
        b'\'' => backslash(ascii::Char::Apostrophe),
        b'\"' => backslash(ascii::Char::QuotationMark),
        0x00..=0x1F => hex_escape(byte),
        _ => match ascii::Char::from_u8(byte) {
            Some(a) => verbatim(a),
            None => hex_escape(byte),
        },
    }
}

/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    /// Lookup table helps us determine how to display character.
    ///
    /// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
    /// indicate whether the result is escaped or unescaped.
    ///
    /// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
    /// escaped NUL will not occur.
    const LOOKUP: [u8; 256] = {
        let mut arr = [0; 256];
        let mut idx = 0;
        loop {
            arr[idx as usize] = match idx {
                // use 8th bit to indicate escaped
                b'\t' => 0x80 | b't',
                b'\r' => 0x80 | b'r',
                b'\n' => 0x80 | b'n',
                b'\\' => 0x80 | b'\\',
                b'\'' => 0x80 | b'\'',
                b'"' => 0x80 | b'"',

                // use NUL to indicate hex-escaped
                0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',

                _ => idx,
            };
            if idx == 255 {
                break;
            }
            idx += 1;
        }
        arr
    };

    let lookup = LOOKUP[byte as usize];

    // 8th bit indicates escape
    let lookup_escaped = lookup & 0x80 != 0;

    // SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
    let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };

    if lookup_escaped {
        // NUL indicates hex-escaped
        if matches!(lookup_ascii, ascii::Char::Null) {
            hex_escape(byte)
        } else {
            backslash(lookup_ascii)
        }
    } else {
        verbatim(lookup_ascii)
    }
}

fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
    let mut vec = Vec::new();
    for b in bytes {
        let (buf, range) = f(*b);
        vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
    }
    vec
}

pub fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("escape_ascii");

    group.sample_size(1000);

    let rand_200k = &mut [0; 200 * 1024];
    thread_rng().fill(&mut rand_200k[..]);
    let cat = include_bytes!("/bin/cat");
    let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");

    group.bench_function("old_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
    });
    group.bench_function("new_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
    });

    group.bench_function("old_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_old));
    });
    group.bench_function("new_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_new));
    });

    group.bench_function("old_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
    });
    group.bench_function("new_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
    });

    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```

</details>

My benchmark results:

```
escape_ascii/old_rand   time:   [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
  4 (0.40%) high mild
  18 (1.80%) high severe
escape_ascii/new_rand   time:   [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
  38 (3.80%) high mild
escape_ascii/old_bin    time:   [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
  17 (1.70%) high mild
  22 (2.20%) high severe
escape_ascii/new_bin    time:   [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
  43 (4.30%) high mild
  64 (6.40%) high severe
escape_ascii/old_cargo_toml
                        time:   [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
  21 (2.10%) high mild
  183 (18.30%) high severe
escape_ascii/new_cargo_toml
                        time:   [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
  56 (5.60%) high mild
  32 (3.20%) high severe
```

Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Oct 14, 2024
Optimize `escape_ascii` using a lookup table

Based upon my suggestion here: rust-lang/rust#125340 (comment)

Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.

The generated assembly isn't clearly better (although it has fewer branches), so I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine (an old i7-8700).

But, if you want to try my benchmarking code for yourself:

<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>

```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};

const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();

#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 2) };

    let mut output = [ascii::Char::Null; N];

    output[0] = ascii::Char::ReverseSolidus;
    output[1] = a;

    (output, 0..2)
}

#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    let mut output = [ascii::Char::Null; N];

    let hi = HEX_DIGITS[(byte >> 4) as usize];
    let lo = HEX_DIGITS[(byte & 0xf) as usize];

    output[0] = ascii::Char::ReverseSolidus;
    output[1] = ascii::Char::SmallX;
    output[2] = hi;
    output[3] = lo;

    (output, 0..4)
}

#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 1) };

    let mut output = [ascii::Char::Null; N];

    output[0] = a;

    (output, 0..1)
}

/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    const { assert!(N >= 4) };

    match byte {
        b'\t' => backslash(ascii::Char::SmallT),
        b'\r' => backslash(ascii::Char::SmallR),
        b'\n' => backslash(ascii::Char::SmallN),
        b'\\' => backslash(ascii::Char::ReverseSolidus),
        b'\'' => backslash(ascii::Char::Apostrophe),
        b'\"' => backslash(ascii::Char::QuotationMark),
        0x00..=0x1F => hex_escape(byte),
        _ => match ascii::Char::from_u8(byte) {
            Some(a) => verbatim(a),
            None => hex_escape(byte),
        },
    }
}

/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
    /// Lookup table helps us determine how to display character.
    ///
    /// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
    /// indicate whether the result is escaped or unescaped.
    ///
    /// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
    /// escaped NUL will not occur.
    const LOOKUP: [u8; 256] = {
        let mut arr = [0; 256];
        let mut idx = 0;
        loop {
            arr[idx as usize] = match idx {
                // use 8th bit to indicate escaped
                b'\t' => 0x80 | b't',
                b'\r' => 0x80 | b'r',
                b'\n' => 0x80 | b'n',
                b'\\' => 0x80 | b'\\',
                b'\'' => 0x80 | b'\'',
                b'"' => 0x80 | b'"',

                // use NUL to indicate hex-escaped
                0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',

                _ => idx,
            };
            if idx == 255 {
                break;
            }
            idx += 1;
        }
        arr
    };

    let lookup = LOOKUP[byte as usize];

    // 8th bit indicates escape
    let lookup_escaped = lookup & 0x80 != 0;

    // SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
    let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };

    if lookup_escaped {
        // NUL indicates hex-escaped
        if matches!(lookup_ascii, ascii::Char::Null) {
            hex_escape(byte)
        } else {
            backslash(lookup_ascii)
        }
    } else {
        verbatim(lookup_ascii)
    }
}

fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
    let mut vec = Vec::new();
    for b in bytes {
        let (buf, range) = f(*b);
        vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
    }
    vec
}

pub fn criterion_benchmark(c: &mut Criterion) {
    let mut group = c.benchmark_group("escape_ascii");

    group.sample_size(1000);

    let rand_200k = &mut [0; 200 * 1024];
    thread_rng().fill(&mut rand_200k[..]);
    let cat = include_bytes!("/bin/cat");
    let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");

    group.bench_function("old_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
    });
    group.bench_function("new_rand", |b| {
        b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
    });

    group.bench_function("old_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_old));
    });
    group.bench_function("new_bin", |b| {
        b.iter(|| escape_bytes(cat, escape_ascii_new));
    });

    group.bench_function("old_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
    });
    group.bench_function("new_cargo_toml", |b| {
        b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
    });

    group.finish();
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```

</details>

My benchmark results:

```
escape_ascii/old_rand   time:   [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
  4 (0.40%) high mild
  18 (1.80%) high severe
escape_ascii/new_rand   time:   [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
  38 (3.80%) high mild
escape_ascii/old_bin    time:   [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
  17 (1.70%) high mild
  22 (2.20%) high severe
escape_ascii/new_bin    time:   [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
  43 (4.30%) high mild
  64 (6.40%) high severe
escape_ascii/old_cargo_toml
                        time:   [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
  21 (2.10%) high mild
  183 (18.30%) high severe
escape_ascii/new_cargo_toml
                        time:   [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
  56 (5.60%) high mild
  32 (3.20%) high severe
```

Random: 1.7006ms => 1.6953ms (<1% less time)
Binary: 225.40µs => 165.63µs (26% less time)
Text: 23.699µs => 16.438µs (30% less time)
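
The percentages above are relative reductions in run time, computed from Criterion's point estimates. A small sketch of that arithmetic, with hypothetical helper names `time_reduction` and `intervals_overlap`, is below. The overlap check is only a rough reading of the reported intervals, not a statistical test.

```rust
/// Fractional reduction in run time going from `old` to `new` (same units).
fn time_reduction(old: f64, new: f64) -> f64 {
    (old - new) / old
}

/// True if two closed intervals (lo, hi) overlap.
fn intervals_overlap(a: (f64, f64), b: (f64, f64)) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}

fn main() {
    // (label, old point estimate, new point estimate) from the results above.
    let cases = [
        ("Random", 1.7006, 1.6953), // ms
        ("Binary", 225.40, 165.63), // µs
        ("Text", 23.699, 16.438),   // µs
    ];
    for (label, old, new) in cases {
        println!("{label}: {:.1}% less time", 100.0 * time_reduction(old, new));
    }

    // Criterion's reported intervals for the random input, in ms.
    let old_rand = (1.6965, 1.7053);
    let new_rand = (1.6749, 1.7158);
    println!(
        "random intervals overlap: {}",
        intervals_overlap(old_rand, new_rand)
    );
}
```

For the random input the old and new intervals overlap, so that case is best read as "no measurable change"; the binary and text inputs show clearly separated intervals.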
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Oct 17, 2024
Optimize `escape_ascii` using a lookup table

Labels
S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.