Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Bugs with Reading Remote Records and Repeated Record Data Access #327

Merged
merged 14 commits into from
Dec 4, 2023

Conversation

frankhinek
Copy link
Contributor

@frankhinek frankhinek commented Dec 4, 2023

Summary

This PR primarily focuses on fixing several bugs reported by community members that occur when interacting with records retrieved from remote DWNs and subsequent attempts to consume the data payload. The approach taken to resolve these issues additionally simplifies and improves the Web and Node Stream handling.

Context

A number of community members and TBD developers had reported that errors were returned when attempting to update, delete, and read data from Record instances. Further troubleshooting revealed that the root causes were related to records that had been initially retrieved from a remote DWN via web5.dwn.records.query() or that had been retrieved via web5.dwn.records.read(). A secondary bug was introduced if the record exceeded the size limit that DWNs use to determine whether a RecordsQuery should included the data payloads Base64 URL encoded.

During the troubleshooting process, a tangentially related issue was raised that initially obscured the true root cause: repeated attempts to access a record's data via the record.data.* getter resulted in hangs/timeouts. This was caused by the underlying Readable stream being consumed by the initial read and subsequent requests failing.

Changes

@web5/common

  • Added a new Stream class to provide common functionality needed when working with Web Streams API objects, most notably ReadableStream. Implemented in an isomorphic manner that should work across Node.js, browser, and hopefully React Native runtime environments.
  • Added a new NodeStream class to provide common functionality needed when working with Node.js Stream objects, most notably Readable. Implemented in an isomorphic manner that should work across Node.js, browser, and hopefully React Native runtime environments.
  • Improve test organization and coverage.

@web5/api

  • Fixes a bug that occurs when a record that was initially retrieved in the query results from a remote DWN is interacted with. The Record class now keeps track of whether the record's source is a remote target so that any requests to fetch the data payload are executed on the correct DWN.
  • Fixes a bug that would cause subsequent attempts to access a Record's data payload via record.data.* to fail as a result of the underlying Readable stream having already been consumed.
  • Completely refactored the internals of the Record.data getting to remove confusing and error-prone handling of data. In particular, the conditional handling of async method calls and unresolved Promises was removed.
  • Adopted use of the new @web5/common stream capabilities that were introduced as part of the resolution of the aforementioned issues.
  • Adds the bytes() method to the Record.data getter. This enables developers to easily access the data payload of a record as a byte array in Uint8Array format.
  • Updates the README to include details about both the blob() and bytes() methods.
  • Removes the record.delete() instance method from Record objects. This method is the only one that is a duplicate of the identically named web5.dwn.records.* method and introduces unintuitive and error prone behavior when the record was initially retrieved from a remote DWN. Another class of issues is then introduced depending on whether that record already exists locally. In order to remove this "foot gun" and keep the DWN Records API provided by @web5/api free of duplicates that behavior differently depending on unintuitive context, the instance method was removed.
  • Improve documentation of Record class methods.
  • Improve test coverage for edge cases.

@frankhinek frankhinek added the bug Something isn't working label Dec 4, 2023
@frankhinek frankhinek added this to the @web5/api 0.8.3 milestone Dec 4, 2023
@frankhinek frankhinek self-assigned this Dec 4, 2023
Copy link

codesandbox bot commented Dec 4, 2023

Review or Edit in CodeSandbox

Open the branch in Web EditorVS CodeInsiders

Open Preview

Copy link
Contributor

github-actions bot commented Dec 4, 2023

TBDocs Report

✅ No errors or warnings

@web5/api

  • Project entry file: packages/api/src/index.ts

TBDocs Report Updated at 2023-12-04T19:14:39Z e98b786

@frankhinek frankhinek force-pushed the fix-remote-record-author branch from 81af53e to ede18ae Compare December 4, 2023 18:59
Copy link

codecov bot commented Dec 4, 2023

Codecov Report

Merging #327 (e98b786) into main (ae975ea) will increase coverage by 0.45%.
The diff coverage is 99.46%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #327      +/-   ##
==========================================
+ Coverage   91.77%   92.23%   +0.45%     
==========================================
  Files          73       75       +2     
  Lines       15765    16771    +1006     
  Branches     1448     1567     +119     
==========================================
+ Hits        14469    15469    +1000     
- Misses       1270     1276       +6     
  Partials       26       26              
Components Coverage Δ
api 96.94% <96.77%> (+0.24%) ⬆️
common 97.78% <100.00%> (+2.77%) ⬆️
credentials 94.32% <ø> (ø)
crypto 100.00% <ø> (ø)
dids 88.46% <100.00%> (-0.30%) ⬇️
agent 88.08% <ø> (ø)
identity-agent 56.81% <ø> (ø)
proxy-agent 58.43% <ø> (ø)
user-agent 55.22% <ø> (ø)

Copy link
Contributor

@thehenrytsai thehenrytsai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is massive... would take me days to grasp all the context and intricacies, so right now if it's good enough for you then it's good enough for me.

@frankhinek frankhinek merged commit acc0020 into main Dec 4, 2023
30 checks passed
@frankhinek frankhinek deleted the fix-remote-record-author branch December 4, 2023 19:49
// type is Base64 URL encoded string if the Record object was instantiated by dwn.records.query().
// If it is a string, we need to Base64 URL decode to bytes and instantiate a Blob.
this._encodedData = (typeof options.encodedData === 'string') ?
new Blob([Convert.base64Url(options.encodedData).toUint8Array()], { type: this.dataFormat }) :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed that in DWN and affiliated libs we use base64url rather than base64 (atob/btoa). Is there a good reason for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shamilovtim Base64 URL encoding is the format used in a variety of other related standards (e.g., JWK, DIDs, VCs, etc.).

Chat GTP answer:
The key difference between base64url and regular base64 encoding lies in the characters used for encoding. In regular base64 encoding (as used by atob and btoa in JavaScript), the characters + and / are used. However, these characters can cause issues in URLs and filenames since they have special meanings in those contexts.

base64url encoding addresses this by replacing + with - and / with _. This slight modification makes it URL and filename safe. The padding character = is also typically omitted in base64url to further enhance URL-friendliness.

Copy link
Contributor

@shamilovtim shamilovtim Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's where I'm confused. I assume the DWN data is never used in a URL as its intended for file / data transfer. So should we consider the disadvantages of using pure JS implementations of base64 over browser native (atob/btoa implemented in C++) implementations of base64?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants