Define a max size limit for JSON-LD VCs #379
Comments
This should go to trace interop, no? |
We should adopt the MongoDB convention, then add some padding and apply this to "Certificate" types and "TraceablePresentations". |
This should happen in the vocabulary; it's a data format issue. |
Simple Google search: 16 MB |
Suggest we set 16 MB as the max credential and presentation size limit. |
TL;DR: We need better justification for taking this action, with clearer presentation of the reasoning behind the limit(s) we're contemplating imposing. Appealing to a debatable "authority" is not sufficient.

I'm wondering why we're imposing one (MongoDB) storage implementation's size limit (which appears not to be absolute, given the comment about GridFS) on VCs and VPs. This seems especially odd given the likelihood of a CBOR-LD spec coming from the new VCWG. Being a compressed format, CBOR-LD VCs will be able to hold much more data within the same 16 MB document size limit than JSON-LD VCs, and suddenly we've lost the assurance that CBOR-LD VCs can be round-tripped with JSON-LD VCs.

I do not like imposing this arbitrary document size limit, especially because it's based on one implementation's arbitrary (and work-aroundable) limitation. At a minimum, I want more justification for imposing this limit on JSON-LD VCs before we do it.

All that said: this is the Traceability Vocab work item. We are not chartered to impose VC document size limits. Even if we include the Traceability Interop work item, we are still not chartered to impose VC document size limits. Even a recommendation of this sort feels wrong to me, given the current lack of foundational justification. |
See https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html. CBOR-LD is not currently used in this document (neither is CBOR). I don't think document constraints need to be set in stone, but it's wise to test the limits and add a safety margin in any engineering system. |
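For illustration only, here is a minimal TypeScript sketch of the kind of pre-parse size check the OWASP guidance describes. The constant name, function name, and the 16 MB figure are drawn from this discussion rather than from any specification:

```typescript
// Hypothetical sketch: reject a VC/VP payload that exceeds a configured
// byte limit before any JSON or JSON-LD processing happens. The 16 MB
// figure is the value under discussion, not a normative limit.
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

function assertWithinSizeLimit(rawBody: string): void {
  // Measure the UTF-8 encoded size, not the JS string length, since
  // multi-byte characters make the two differ.
  const byteLength = new TextEncoder().encode(rawBody).length;
  if (byteLength > MAX_DOCUMENT_BYTES) {
    throw new Error(
      `payload is ${byteLength} bytes; limit is ${MAX_DOCUMENT_BYTES} bytes`
    );
  }
}

// Usage: run the check before JSON.parse, JSON-LD expansion, or
// canonicalization ever sees the payload.
// assertWithinSizeLimit(requestBody);
```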
Why "must" there be? This really doesn't seem to me like a limitation that is necessary nor even desirable at this stage of the game, if ever, and certainly not in a vocabulary. It might be relevant for traceability-interop, but I'm not convinced there's a need for this recommendation at all.
That's a long page, of which it appears that two bullets within a single small subsection may be relevant. (It would be EXTREMELY helpful if you could provide more specific links in cases like this. Linking just to the whole page says that the time you save by not finding and providing the deeper link is more valuable than the cumulative time all your readers must invest in finding the tiny relevant segment of the linked page.) Those two bullets:
These are not about imposing limits on the size of files, only about common-sense tests relative to users uploading files to a server of some kind, which can help prevent (though not absolutely eliminate) disk and memory overrun. Sure, people who are deploying atop MongoDB may want or need to impose a 16MB (decompressed?) filesize limit, or at least know what to do when a submitted file exceeds that size (e.g., fall back to GridFS storage) — but these limits are not relevant if deploying atop Virtuoso or various other datastores, so why should these limits be imposed on those deployers? |
Is the problem that the limit is too small? Or that you think interoperability is achievable without setting limits? |
@OR13 -- You're pasting great big chunks of irrelevant material. That doesn't help further your argument.

It especially doesn't help when the size limits discussed in the irrelevant material you choose to quote are a minimum of 2 GB, 125x the 16 MB size limit you initially proposed imposing on JSON-LD VCs.

Even more, you seem not to have considered the reasons for the limits on the file systems whose descriptions you quoted. Those limits were originally due to the 16-bit (FAT) and later 32-bit (FAT32) and 64-bit (NTFS) numbers used to implement those systems, which were the largest available on the computer systems originally (or theoretically) meant to be supported by those file systems.

Interop may (but does not always!) require setting limits, on document sizes among other things. However, plucking a document size from all-but-thin-air, based only on one data store implementation's limitation (which doesn't appear to limit the size of the user's stored document, only the size of each "document" used by that implementation to store it at the back end, somewhat like a gzip may be broken up into 100 gz.## files, each ~1/100 of the original gzip file size, in order to store that gzip across a number of floppies when you don't have a suitable HDD or similar), with no further justification nor basis you can apparently state, is not a good way of setting such limits. |
When is the last time you tried signing a 16TB document using RDF dataset normalization?
|
I don't see the point of your question. |
Guidance is better than restriction, here. "Keep your Verifiable Credentials as small as possible, and only as large as necessary." |
@TallTed says we should have guidance instead of restriction here. @BenjaminMoe There is a practical limit due to the cost of RDF canonicalization. I think a section that says to keep documents as small as possible because of canonicalization times, and that outlines best practices, would be appropriate. |
@msporny — You wanted to comment on this. |
@brownoxford On interop, a hard max is a good idea; it is very common to do so at the API side. I agree with the 16 MB suggested above as a safe upper bound. That is likely too large for LD + RDF canonicalization, though, so we will want a much smaller max size there to avoid potential denial of service around verification. |
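As an illustration of enforcing a hard max at the API side, here is a sketch assuming an Express-based verification endpoint. The route path and the smaller 1 MB figure are illustrative placeholders, not values agreed in this thread:

```typescript
import express from "express";

const app = express();

// body-parser's `limit` option rejects oversized request bodies with
// HTTP 413 before the JSON is parsed, so no canonicalization or
// signature-verification work is spent on a payload that would be
// refused anyway. The 1 MB value is only a placeholder.
app.use("/presentations/verify", express.json({ limit: "1mb" }));

app.post("/presentations/verify", async (req, res) => {
  // Verification (and any RDF canonicalization) only runs on payloads
  // that already passed the size gate above.
  res.status(501).json({ error: "verification not implemented in this sketch" });
});

app.listen(3000);
```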
@brownoxford I personally think that we should ban RDF processing prior to signature verification (e.g., no LD proofs) in the future for security reasons, but I would like to see where standardization in the VC 2.0 working group lands before we give any guidance in this regard. |
I also agree the profile should not endorse RDF processing prior to signing or verifying. I think it's fine to do RDF or schema processing after you check the signature, or before you issue a credential, as long as the processing is not "part of the proofing algorithm". |
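A sketch of the ordering being argued for here: verify the proof first, and run RDF or schema processing only afterwards. The three helper functions are hypothetical placeholders, not APIs from this repository or any particular library:

```typescript
type Credential = Record<string, unknown>;

// Placeholders for real tooling: a proof check that involves no RDF
// processing (e.g., a JWT/JWS-style check), JSON Schema validation,
// and RDF canonicalization used only after verification.
declare function verifyProof(credential: Credential): Promise<boolean>;
declare function validateSchema(credential: Credential): Promise<void>;
declare function canonicalizeForAudit(credential: Credential): Promise<string>;

async function receiveCredential(rawBody: string): Promise<void> {
  const credential = JSON.parse(rawBody) as Credential;

  // 1. Check the signature first, keeping RDF processing out of the
  //    proofing step.
  if (!(await verifyProof(credential))) {
    throw new Error("proof verification failed");
  }

  // 2. Only after verification, run the comparatively expensive schema
  //    and RDF processing.
  await validateSchema(credential);
  await canonicalizeForAudit(credential);
}
```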
I also wonder about setting "a max size limit for JSON-LD VCs" in the Traceability Vocab, rather than in the VCDM or VC Data Integrity spec. This just seems the wrong place for it. |
@TallTed I think it would be wise to set a max here: https://github.com/w3c/json-ld-syntax, and then let profiles (like this repo) further restrict the allowed size of conforming documents. |
There must be some recommendation we would make on this front.