-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large file is is not read correctly when using UseWasmSharedBuffer
#161
Comments
Thanks for the bug report! Probably some quirk in the unsupported APIs I rely on. |
|
This has sent me down the compatibility/upgrade hole. I'm most likely not going to support 5.0.0-preview.8, that's going to be too much of an effort, there are breaking changes in 5.0.0-rc1, which will be my current target. |
@Tewr Do you know what exactly is causing this issue ? Is it a regression ? Anything on the runtime side that is not working as expected ? |
The buffer byte[] passed InvokeUnmarshalled is corrupted -- or I won't get the actual referenced buffer back from dotnet. I need to do some more testing before I'm sure, it's a bit hard to debug. But my first impression is that its the "wrong" buffer. If I copy the entire stream to a Interesting parts of faulty code:
not sure if a regression as I havent tested much reading streams as strings, at least not that large. |
Ok so the problem is not really corruption. I have managed to reproduce this a bit quicker, by simply running the working version quickly after. Basically, after a while (325 calls, to be more exact) the shared memory mechanism stops working. The correct buffer instance is sent over, I tagged it with some data to read it on the js side. But for some reason, the data that I write (edit: clarification: on the js side) is no longer being written to the buffer (no longer visible in the c# side). If I do a second call ("retry"), it works. But there is, of course, no way for me to know that the call has failed unless I use the slow, stable marshalled method to compare. The repro is here: https://github.com/Tewr/BlazorFileReader/tree/JsonStringBugin50rc1 This is the line that no longer "works":
I implemented this a slightly differently way in a version from last year. I used a normal call and then an explicit callback to allocate a buffer, I think it was heavily inspired by how HttpClient does it. It was a bit slower but I may have to go back to that one, I'm really clueless on why this happens. |
From what I understand, C# sends over a buffer to JS, but JS doesn't see the real data in the buffer that was sent over. This would suggest that the GC did its own thing with the buffer, either moved or collected it. I don't have any experience with the JS interop layer. @kg Do you want to take a look at this and see if you spot anything fishy in the library ? Or maybe we are doing something wrong ? cc @lewing |
If the native heap grows all your existing typed arrays will become detached, because growing the native heap creates a new bigger array buffer to replace the old one. Any existing typed array instances (i.e. Uint8Array) will become dead. If you're keeping typed array instances alive across calls (like malloc) that can grow the native heap, your code is broken. That's likely the problem here. This is unfortunately very hard to debug because the design of the typed arrays spec is wholly deficient in this area. |
I read through the library C# and typescript and didn't see any obvious red flags, but I don't completely understand all of what's going on in there, in part because I have very little experience with JS async programming or blazor. My guess is still that this is breaking due to the native heap growing, which explains why retrying will succeed. You can see here that when the heap grows, emscripten needs to reallocate all the arrays and views it exposes: https://github.com/emscripten-core/emscripten/blob/9456db2b95771f0d85de4b78cc0beeeb4848dd6e/src/library.js#L628 Unfortunately emscripten is not designed to notify applications that the heap has grown, so there is no trivial way to ensure all of your buffers are current other than not allowing them to persist across calls that could enlarge the native heap. In practice this means it is unsafe to use typed arrays in most circumstances unless you allocate them fresh and then immediately discard them. |
Thanks a lot for the explanation, really helpful! |
Minimum buffersize, while working better, doesn't work all the time, so it's not a viable solution. Having read this sentence again
I interpret this as following So for correct usage, I should launch the reader, and then create the typed array when the async call returns. Or find some other solution but I have to do all my processing while staying sync, or I risk loosing my array if the heap is under pressure. @kg does this sound right? |
I'm not completely sure what the right solution here is, but if it helps, think of it this way: Any time control transitions from JavaScript to C or C# (wasm), any typed arrays alive at that point are potentially invalid. Once you transition back to JS, you're fine. The async "pause" itself isn't the issue, but almost certainly some C or C# is going to run between the start of the pause and the end unless you find a way to do all of that processing in JS. The async nature of it isn't itself the problem, if you do this synchronously you will still encounter issues when calling into C/C# because those method calls can grow the heap and invalidate your typed arrays. |
Ok, during the async pause control goes back to c# which will execute What is behind that line is quite complex and that is probably what will trigger the problem. If I allocate the buffer after the pause, there will be no more c# executing before I'm done writing to the buffer so it should be safe. I'll give it a go! |
@erikthysell can you confirm the latest release resolves your problem? v2.1 if you're using blazor 3.2 or v3.0 for blazor 5rc1 |
Describe the bug
Large file is is not read correctly when using
UseWasmSharedBuffer
To Reproduce
Testdata-2018-08-09-15.json
fileExpected behavior
Deserialization completes successfully
Project type
CSB
Environment
Additional context
The data provided by
ReadAllText
is incorrect whenUseWasmSharedBuffer
is enabled. This later leads to a crash in json deserialization. A simple checksum can be used instead to verify the data.The text was updated successfully, but these errors were encountered: