[Flight] Optimize Large Strings by Not Escaping Them (#26932)
This introduces a Text row (T) which is essentially a string blob and
refactors the parsing to now happen at the binary level.

```
RowID + ":" + "T" + ByteLengthInHex + "," + Text
```

Today, we encode all row data in JSON, which conveniently never contains
newline characters, so we use a newline as the row terminator. We can't
do that if we pass arbitrary Unicode without escaping it. Instead, the
leading header for this row tag carries the byte length (in hexadecimal),
followed by a comma.
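
As a rough illustration of the layout above, a Text row could be assembled like this. This is a minimal sketch, not the actual Flight encoder: `encodeTextRow` is a hypothetical helper, and writing the row ID in hexadecimal is an assumption (the message only says the ID is kept human readable).

```javascript
// Hypothetical helper for illustration only; not the real Flight encoder.
const textEncoder = new TextEncoder();

function encodeTextRow(id, text) {
  // Encode first so the header can carry the UTF-8 byte length,
  // not the string length.
  const body = textEncoder.encode(text);
  // Assumption: the row ID is written in hex, like the length.
  const header = textEncoder.encode(
    id.toString(16) + ":T" + body.byteLength.toString(16) + ","
  );
  const row = new Uint8Array(header.byteLength + body.byteLength);
  row.set(header, 0);
  row.set(body, header.byteLength);
  return row;
}

// The text may contain raw newlines and non-ASCII characters.
const exampleRow = encodeTextRow(1, "h\u00e9llo\n");
console.log(new TextDecoder().decode(exampleRow));
```

Note that no newline terminator is appended: the reader knows where the row ends from the length in the header.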

We could be clever and use fixed or variable-length binary integers for
the row id and length, but that isn't worth making debugging harder, so
we keep these as human-readable text.

Before this PR, we decoded the binary stream into UTF-8 strings before
parsing them. This is inefficient because sometimes the slices end up
having to be copied, so it's better to decode the bytes directly into
the target format. The follow-up to this is to add support for binary
data, at which point we can't assume the entire payload is UTF-8 anyway.
So this refactors the parser to parse the rows in binary and then decode
the result into UTF-8. It does add some per-row decoding overhead,
though.
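
The idea of scanning the bytes first and decoding each row afterwards might look roughly like the following. It is a simplified sketch under stated assumptions, not the real Flight parser: `parseRows` is a hypothetical name, the input is assumed to hold only complete rows (no chunk boundaries mid-row), and only untagged JSON rows and `T` rows are handled.

```javascript
// Simplified sketch; not the real Flight parser.
const rowDecoder = new TextDecoder();

const COLON = 58; // ":"
const COMMA = 44; // ","
const NEWLINE = 10; // "\n"
const TAG_TEXT = 84; // "T"

function parseRows(bytes) {
  const rows = [];
  let i = 0;
  while (i < bytes.length) {
    // The row ID is ASCII text up to the first ":".
    const colon = bytes.indexOf(COLON, i);
    if (colon === -1) break;
    const id = rowDecoder.decode(bytes.subarray(i, colon));
    i = colon + 1;
    if (bytes[i] === TAG_TEXT) {
      // Text row: hex byte length, a comma, then exactly that many bytes.
      const comma = bytes.indexOf(COMMA, i);
      if (comma === -1) break;
      const length = parseInt(rowDecoder.decode(bytes.subarray(i + 1, comma)), 16);
      const start = comma + 1;
      const text = rowDecoder.decode(bytes.subarray(start, start + length));
      rows.push({ id, tag: "T", value: text });
      i = start + length;
    } else {
      // JSON row: never contains a raw newline, so "\n" terminates it.
      let end = bytes.indexOf(NEWLINE, i);
      if (end === -1) end = bytes.length;
      const json = rowDecoder.decode(bytes.subarray(i, end));
      rows.push({ id, tag: "", value: JSON.parse(json) });
      i = end + 1;
    }
  }
  return rows;
}
```

Because `subarray` creates a view rather than a copy, only the final per-row `decode` call materializes a string.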

Since we do this, we need to encode the byte length that we want to
decode, not the string length. This requires clients to receive binary
data, which is why I had to delete the string option.

It also means that I had to add a way to get the byteLength from a
chunk, since chunks are not always binary. For Web streams it's easy
since they're always typed arrays. For Node streams it's trickier, so we
use the byteLength helper, which may not be very efficient. It might be
worth eagerly encoding them to UTF-8, perhaps only for this case.
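
To make the byte length versus string length distinction concrete, here is a minimal sketch of measuring a chunk that may be either a string or a typed array. `byteLengthOfChunk` is a hypothetical name, and using Node's `Buffer.byteLength` for the string case is an assumption about what the helper does.

```javascript
// Hypothetical helper for illustration; assumes a Node.js environment.
function byteLengthOfChunk(chunk) {
  return typeof chunk === "string"
    ? Buffer.byteLength(chunk, "utf8") // UTF-8 size, without keeping a buffer
    : chunk.byteLength; // typed arrays already know their size
}

const sample = "h\u00e9llo";
console.log(sample.length); // UTF-16 code units
console.log(byteLengthOfChunk(sample)); // UTF-8 bytes
console.log(byteLengthOfChunk(new Uint8Array(3)));
```

The eager alternative mentioned above would be to call `Buffer.from(chunk, "utf8")` once and reuse the resulting buffer for both measuring and writing.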

DiffTrain build for commit db50164.
sebmarkbage committed Jun 13, 2023
1 parent 42d25ac commit d13c641
Showing 7 changed files with 9 additions and 9 deletions.
@@ -23922,7 +23922,7 @@ function createFiberRoot(
return root;
}

-var ReactVersion = "18.3.0-canary-ce6842d8f-20230610";
+var ReactVersion = "18.3.0-canary-db50164db-20230612";

// Might add PROFILE later.

@@ -8617,7 +8617,7 @@ var devToolsConfig$jscomp$inline_1031 = {
throw Error("TestRenderer does not support findFiberByHostInstance()");
},
bundleType: 0,
-version: "18.3.0-canary-ce6842d8f-20230610",
+version: "18.3.0-canary-db50164db-20230612",
rendererPackageName: "react-test-renderer"
};
var internals$jscomp$inline_1230 = {
@@ -8648,7 +8648,7 @@ var internals$jscomp$inline_1230 = {
scheduleRoot: null,
setRefreshHandler: null,
getCurrentFiber: null,
-reconcilerVersion: "18.3.0-canary-ce6842d8f-20230610"
+reconcilerVersion: "18.3.0-canary-db50164db-20230612"
};
if ("undefined" !== typeof __REACT_DEVTOOLS_GLOBAL_HOOK__) {
var hook$jscomp$inline_1231 = __REACT_DEVTOOLS_GLOBAL_HOOK__;
@@ -9043,7 +9043,7 @@ var devToolsConfig$jscomp$inline_1073 = {
throw Error("TestRenderer does not support findFiberByHostInstance()");
},
bundleType: 0,
-version: "18.3.0-canary-ce6842d8f-20230610",
+version: "18.3.0-canary-db50164db-20230612",
rendererPackageName: "react-test-renderer"
};
var internals$jscomp$inline_1271 = {
@@ -9074,7 +9074,7 @@ var internals$jscomp$inline_1271 = {
scheduleRoot: null,
setRefreshHandler: null,
getCurrentFiber: null,
-reconcilerVersion: "18.3.0-canary-ce6842d8f-20230610"
+reconcilerVersion: "18.3.0-canary-db50164db-20230612"
};
if ("undefined" !== typeof __REACT_DEVTOOLS_GLOBAL_HOOK__) {
var hook$jscomp$inline_1272 = __REACT_DEVTOOLS_GLOBAL_HOOK__;
@@ -27,7 +27,7 @@ if (
}
"use strict";

-var ReactVersion = "18.3.0-canary-ce6842d8f-20230610";
+var ReactVersion = "18.3.0-canary-db50164db-20230612";

// ATTENTION
// When adding new symbols to this file,
@@ -642,4 +642,4 @@ exports.useSyncExternalStore = function (
);
};
exports.useTransition = useTransition;
-exports.version = "18.3.0-canary-ce6842d8f-20230610";
+exports.version = "18.3.0-canary-db50164db-20230612";
@@ -645,7 +645,7 @@ exports.useSyncExternalStore = function (
);
};
exports.useTransition = useTransition;
-exports.version = "18.3.0-canary-ce6842d8f-20230610";
+exports.version = "18.3.0-canary-db50164db-20230612";

/* global __REACT_DEVTOOLS_GLOBAL_HOOK__ */
if (
@@ -1 +1 @@
-ce6842d8f528977119b80d969306c8475099f66e
+db50164dbac39d7421c936689a5c026e9fd2f034
