-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add browser support #63
Add browser support #63
Conversation
I hope you don't mind if I ask some questions and comment. The goal is to gather some understanding of the platform and corresponding challenges. Maybe there is something that would be appropriate to do slightly differently on the blst side, more specifically in blst.hpp... |
// BLS12_381_G1: P1_Affine; | ||
// BLS12_381_NEG_G1: P1_Affine; | ||
// BLS12_381_G2: P2_Affine; | ||
// BLS12_381_NEG_G2: P2_Affine; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest not aiming for providing access to these. There are generator()
methods that serve the purpose. Though one can make a case for constant generators. I mean generator()
methods create mutable copies... Which can be viewed as an extra burden, because as far as I understand you're responsible for freeing all the objects explicitly. One can reformulate the question as following. Are there cases when constant generator points would be ultimately useful? One common case is the PT
constructor, but it should be noted that the underlying subroutine, blst_miller_loop
, actually replaces null references with generators...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one has me in a bit of an awkward position. 😅 My original intention was to replace the native binding, leaving the API layer unchanged. I don't know enough to have strong opinion on this but perhaps @dapplion can weigh in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, sometimes one has to recognize the limitations. I mean webasm interface is purely functional and one can't share constant data in the same way. As already mentioned, there are generator()
methods and they are available even in native bindings. So that it's possible to adapt the existing code to avoid the limitation of one [platform]... But sure, let's first hear how bad is it. In worst case one can define them as constants, and have C++ side recognize them and replace with pointers to constant data...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there are way to execute some code upon module load? There should be... It looks like it would be possible to instantiate the objects by having contructors recognize the constants and returning pointers to constant data... Though problem would be that the data won't be actually constant. I mean unlike the native bindings where attempt to manipulate would trigger a crash, here a buggy program would get away with otherwise erroneous modification, which would incur hard-to-find bugs later on... Again, generator()
methods would be preferable... But again, let's first hear...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see Module.onRuntimeInitialized. I am using it in post.js:982-988.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though problem would be that the data won't be actually constant... hard-to-find bugs later on...
On closer examination maybe it's not that bad. In sense that the constant points in question are affine and the affine class doesn't have methods that would modify the data. In other words an application won't have an easily accessible way to wreck havoc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(additional lifecycle hooks are also available)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one has me in a bit of an awkward position. sweat_smile My original intention was to replace the native binding, leaving the API layer unchanged. I don't know enough to have strong opinion on this but perhaps @dapplion can weigh in here?
Since this library is eventually wrapped by chainsafe/bls I don't think it necessary that bindings and wasm API are the same. If wasm exports only a functional API that fits consumer ergonomics, then that's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(additional lifecycle hooks are also available)
Could you provide some pointers? Like a real-life usage example and a specification or something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, no. Usage of the Module.onRuntineInitialized
hook and the emscripten docs that I referenced above contain the full extent of my understanding at the moment.
case "Uint8Array": | ||
return [ensureUint8(val), val.length]; | ||
// TODO: | ||
// case BigInt: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one would go something like this
if (val < 0) {
throw new Error(`expecting unsigned value`);
}
var temp = [];
while (val != 0) {
temp.push(Number(val & 255n));
val >>= 8n;
}
return [ensureUint8(temp), temp.length];
[As you can tell, I don't have a firm grasp on how things get to the wasm memory. I only know that it's separate, that Javascript can read and write it, while wasm can not read, let alone write, the Javascript memory. Right?]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This binary logic seems very on-brand 😜
I don't think webpack will like the bigint literals, so maybe don't be surprised if BigInt(...)
s come around later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In essence the idea would be to fully harmonize with blst.hpp.ts.
var out = ensureCache.alloc(new Uint8Array(32), HEAPU8); | ||
_emscripten_bind_SecretKey_to_lendian_1(self, out); | ||
return new Uint8Array(HEAPU8.subarray(out, out + 32)); | ||
// TODO: free `out`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that Uint8Array(wasm-pointer, ...) provides direct access to wasm memory. My question is [if] HEAP8.subarray() makes a copy of the data from wasm memory to the Javascript pool. (I apologize if the question is stupid, I'm not that familiar with interactions between wasm and Javascript.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Internet suggests that one might just as well return a pointer to the webasm stack, which you then immediately copy to the Javascript pool. I suppose this means that data sits safely on the stack till the next call from to the corresponding webasm module. This way you don't need to allocate a temporary buffer and figure out when it can be freed...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question is [if] HEAP8.subarray() makes a copy of the data from wasm memory to the Javascript pool.
From what I gather .subarray
doesn't, but in the context new Uint8Array
does the copy. As for freeing out
. My understanding is that ensureCache
allocates a chunk of wasm memory and reuses it specifically for argument passing. It has no free method.
if (ikm && typeof ikm === 'object') { | ||
if (ikm instanceof Uint8Array) { | ||
ikm_len = ikm.length; | ||
ikm = ensureUint8(ikm, ikm.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensureUint8
doesn't take length. So passing ikm.length
is redundant, right? What might be appropriate is to double-check that the length is not shorter than 32.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ensureUint8 doesn't take length. So passing ikm.length is redundant, right?
Haha, so it would seem. 🤦
} | ||
} else { | ||
ikm_len = ikm.length; | ||
ikm = ensureString(ikm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formally speaking the ikm_len
can be deceptive in the context, String. The [String's] length method returns amount of characters, while what should reach C is array of bytes. The lengths of the two are not the same if there are non-ASCII characters. BTW, I don't see the intArrayFromString
implementation and it doesn't seem to be standard. Where is it? In some dependency? Just in case, expectation is that String would be encoded as UTF-8...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the
intArrayFromString
implementation
It's provided by emscripten as part for the current module interface, right? As for ikm_len
, I gather it should be ikm_len = lengthBytesUTF8(ikm);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intArrayFromString
is part of emscripten's preamble.js API. When emcc builds the WASM module, it also dumps the preamble into the accompanying JavaScript (before dead-code elimination).
Just in case, expectation is that String would be encoded as UTF-8...
Correct, is there a potential pitfall there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case, expectation is that String would be encoded as UTF-8...
This came out wrong. Expectation is that all strings, not "String" as in "Javascript String", that are given to blst would be encoded as UTF-8. And the trouble is that "Javascript String"-s are stored as UTF-16. intArrayFromString does the UTF-16->UTF-8 conversion, but lengths' meanings are different. On the Javascript side it's amount of UTF-16 characters, while on blst side it's the amount of bytes it takes to encode the string in question in UTF-8. To give an example. In Javascript "Straße".length
is 6, while the amount of bytes corresponding UTF-8 string takes is 7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thanks for clarifying 👍
var self = this.ptr; | ||
ensureCache.prepare(); | ||
if (typeof input == 'object') { input = ensureInt8(input); } | ||
_emscripten_bind_SecretKey_from_bendian_1(self, input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who deallocates input
? BTW, why is it copied to stack in the bind method? Is it actually required? BTW, I would argue that a buffer length check would be appropriate here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm used to believing that webasm can't access the Javascript pool. But now I see things about emscripten::val
and it looks like one can create a memoryView of webasm memory in C++ and call its "set" method to copy data from the Javascript pool. Do you have any experience with that? Idea would be that instead of allocating input
from Javascript one would pass the Uint8Array Javascript object, have C++ copy data to stack and deserialize it into SecretKey. Same technique can be applied to all deserialization methods...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who deallocates
input
?
No need, it's in ensureCache, it will be reused next time one passes arguments to a method. Length check would still be more than appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emscripten::val
It doesn't seem to be possible to pass them as arguments.
Is it correct understanding that the glue code was first auto-generated and then manually fixed up? If so, maybe it would be appropriate to exercise approach similar to blst/bindings/go/generate.py. I.e. instead of applying same fix-ups to P1 and P2, one would put together P1 and then transliterate it to P2? So the each time P1 is modified, P2 is modified accordingly with zero effort. It's even possible to keep it in one single file. If there is interest, I can make more concrete suggestion... |
One can even get more creative and interleave C++ and Javascript in the same file, method by method, for better overview... |
Just to give an idea:
|
My suggestion would be to provide an empty .idl file that simply lists the classes:
This will be used to re-generate empty bindings and all those Javascript helpers [in case it's deemed necessary]. Then one would use a script that interleaves C++/Javascript snippets to append manually written methods to previously generated files. And override the constructors. As for P1->P2 transliteration. A better way to do it, better than bindings/go/generate.py that is, is to do something along following:
[Just in case, I'm using this method to generate C# bindings.] |
I love this idea 😍! It's as if you've done this before 😉 |
@bryanchriswhite so happy to see this PR here in a such a great form, I sincerely appreciate your perseverance ❤️ Also thank you @dot-asm for jumping in already with valuable suggestions. Your help was instrumental in delivering Will do an in-depth review latter this week |
// fs.createReadStream(inputPath); | ||
const data = fs.readFileSync(inputPath); | ||
const js_contents = ` | ||
import { Buffer } from "buffer"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the buffer library already used for some other purpose or only imported here to convert to base64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for base64 decoding. Is there a more browser-idiomatic way you're aware of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've been using this library https://www.npmjs.com/package/uint8arrays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, my thinking was that Buffer was already available via the webpack polyfill. Looking closer, I'm noticing that maybe all but one usage of Buffer is in tests.
"P1.test.ts", | ||
"P2.test.ts", | ||
"Pairing.test.ts", | ||
"SecretKey.test.ts", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it necessary to list test files explicitly here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, probably no reason. My workflow involved commenting and uncommenting various sets of tests while debugging.
describe("bls lib", () => { | ||
const n = 3; | ||
// TODO: more robust wait for `Module` to initialize | ||
setTimeout(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use the maybeWaitForRuntime pattern too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, nice catch!
@@ -66,3 +68,11 @@ export function runInstanceTestCases<InstanceType extends {[key: string]: any}>( | |||
} | |||
} | |||
} | |||
|
|||
export function maybeWaitForRuntime(callback: () => void): void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To run some steps before running any test code mocha has some strategies, can you check if these ones play nice with web? Would definitely make the setup simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, I was unaware!
I'm gonna think more what's the best way to provide both bindings and wasm in the same library. @bryanchriswhite do you think wasm has utility in NodeJS or may only be used in web? CC @wemeetagain please take a look, specially how this integrates into chainsafe/bls and esm modules |
@dapplion I believe that node supports the same WebAssembly API as the browser these days and emscripten's preamble is intended to support both. So I'm pretty confident that one could use the wasm binding in node as well. |
This last part, overriding the constructors, is not possible. Hence https://github.com/dot-asm/blst/tree/emscripten/bindings/emscipten, as a starting point... |
@bryanchriswhite ping, reached out via email from your Github profile |
There was typo in directory name, fixed. And runnable.js is equivalent of node.js/runnable.js now... |
https://github.com/dot-asm/blst/tree/emscripten/bindings/emscripten is essentially ready to roll. "Essentially" is an invitation to check it out and provide feedback. Once it's done, it can go to the main blst repo. build.py creates blst.wasm and blst.js in current working directory. And it passes through additional command line flags to emcc, e.g. |
Closing this PR. Browser support is provided via https://github.com/supranational/blst/tree/master/bindings/emscripten and also via https://github.com/ChainSafe/bls . This repos now hosts napi async bindings for multi-threaded verification and is node specific. There is some discussion about building multi-threaded wasm or emscripten bindings but that is still a future "wish list" thing. If this work becomes relevant again we can open a fresh PR and refer back to this one for the comment history. Thank you for your work on this @bryanchriswhite. If you have any questions or concerns please feel free to reach out to us on our discord and we will be happy to chat 😄 |
Summary
This PR re-organizes the work from fetchai/blst-ts#1 in an attempt to make it easier to review and to revert unnecessary changes. This work has also been released on npm as @fetchai/blst-ts.
While all applicable tests are passing in a browser environment, I'm reluctant to say that this "closes" #18 because it is missing bindings to the constants:
BLS12_381_G1
BLS12_381_NEG_G1
BLS12_381_G2
BLS12_381_NEG_G2
The rest of the bindings were to functions, which is well documented, but I was neither able to to find documentation on nor come up with a way to bind to C/C++ constants in the time I've spent on this so far. I would be extremely receptive to any help in this area! 😄
Theblst
submodule is pointed at the current HEAD of supranational/blst#121 but should be updated to the main branch when possible.Dependencies
supranational/blst#121