Consider using "JSON Lines" for large TDs #93
I am not sure if a line-by-line approach works out for JSON-LD processing (without any further requirements). There are some strong requirements, e.g., that
From the home page, it seems that it is not applicable to our use case (i.e. Big TDs). I quote:
This means that the format is meant to be used with a list of JSON values, like a list of Objects, Arrays, or strings. It wouldn't work with a big JSON object.
I also don't see the use case very well. Even outside of the JSON-LD related features, a TD has interdependencies like the
So I would say that processing a JSON Lines document does not make a lot of sense, but transmitting it chunk by chunk before processing does. However, wouldn't it make more sense to rely on the transport mechanism for that?
The point is that I think we had a misunderstanding. JSON Lines does not seem to split big JSON objects; it sends each one as a whole. For example:

```
{
  /* super big TD */
} // send the whole object
```

While here:

```
{/* super big TD */} // send this first
{/* super big TD */} // then this one
```
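The distinction above can be sketched in Python (the sample TDs here are hypothetical stand-ins; real TDs would carry `@context`, `title`, forms, etc.):

```python
import json

# Hypothetical sample TDs standing in for real Thing Descriptions.
tds = [{"title": "Lamp"}, {"title": "Sensor"}]

# A single JSON document: the whole array must be received and parsed at once.
single_doc = json.dumps(tds)

# JSON Lines: one complete JSON value per line, each parseable independently.
jsonl_doc = "\n".join(json.dumps(td) for td in tds)

# A consumer can decode each line on its own as it arrives.
decoded = [json.loads(line) for line in jsonl_doc.splitlines()]
```

Note that JSON Lines helps only when the payload is a *list* of values; any single value (one big TD) still occupies exactly one line and is sent whole.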
Generally speaking, yes. HTTP can handle big files easily. However, originally we thought that big TDs could occupy TDD resources and cause DoS problems. Moreover, I am not sure that every protocol binding can handle big files. Does CoAP have such a capability? Finally, I think this might be an optimization, but we could leave it out of the spec. I mean, it does not have the highest priority in my mind.
I see. Just to answer the small question :)
Yes -> https://tools.ietf.org/html/rfc7959 and w3c/wot-binding-templates#49
I agree, this was suggested in the wrong context. It does not solve the "super big TD" problem. It can be used to deliver TDs one-by-one, as mentioned by @relu91:
allowing the clients to consume them one at a time and interrupt at any time, instead of:
This is similar to paginating with a page size of one, except that the client doesn't need to make a new request for each consecutive TD. JSON Lines responses can be requested through content negotiation. A use case is, e.g., querying several TDs and stopping after you receive an expected TD, or before you run out of memory.
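The early-termination use case can be sketched as follows (the response lines and titles are hypothetical; a real client would read lines from a streamed HTTP response body):

```python
import json

def iter_tds(lines):
    """Yield TDs one at a time from an iterable of JSON Lines."""
    for line in lines:
        if line.strip():
            yield json.loads(line)

# Hypothetical streamed response body, one TD per line.
response_lines = [
    '{"title": "Lamp"}',
    '{"title": "Thermostat"}',
    '{"title": "Camera"}',
]

consumed = []
for td in iter_tds(response_lines):
    consumed.append(td)
    if td["title"] == "Thermostat":  # found the TD we were looking for
        break  # stop early: remaining lines are never parsed
```

Because each line is decoded lazily, the client's memory use is bounded by one TD at a time rather than the full result set.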
Ah I see, makes a lot of sense like this :) |
Well, if I were designing a system to send TDs incrementally, I would do something like a recursive approach, e.g. send the JSON with elements down to some maximum depth, with detailed sub-elements replaced with references that would then be sent later. The problem with this is that it's still hard to limit the maximum size of each chunk.

It might be easier to just encode the TD as a string or binary blob, and then send that in chunks (which should be easy to define). The query would still return a JSON outer wrapper, but the TD itself would be encoded as a string value that would have to be unpacked. Note that for signed and/or encrypted TDs we may have to deal with this use case anyway.

Returning chunked string-encoded TDs could be an option on the filter. If the consumer is not concerned about incoming size, it could be dropped. On the server side, though, if someone tried to read a really large TD, they might get an error if it exceeds some maximum size, but the error could indicate that that particular TD can only be read in "chunked" mode. If a query returns multiple TDs, then if any TD exceeds the maximum size, the entire query would have to return that error.
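A minimal sketch of the string-encoding idea (the chunk size, function names, and sample TD below are all hypothetical, not part of any spec):

```python
import json

CHUNK_SIZE = 64  # hypothetical maximum chunk size, in characters

def chunk_td(td, size=CHUNK_SIZE):
    """Encode a TD as a compact JSON string and split it into fixed-size chunks."""
    blob = json.dumps(td, separators=(",", ":"))
    return [blob[i:i + size] for i in range(0, len(blob), size)]

def reassemble(chunks):
    """Concatenate the chunks and decode the TD once it is complete."""
    return json.loads("".join(chunks))

# Hypothetical oversized TD standing in for a "super big TD".
big_td = {"title": "BigThing",
          "properties": {f"p{i}": {"type": "number"} for i in range(50)}}
chunks = chunk_td(big_td)
restored = reassemble(chunks)
```

Unlike depth-based splitting, every chunk here has a hard size bound by construction, which is what makes the scheme easy to define; the trade-off is that the consumer cannot interpret anything until the full string is reassembled.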
I propose closing this issue and continuing the discussion on #117.
From Discovery call:
A protocol for returning JSON line-by-line (or rather chunk-by-chunk) which may be useful for returning large TDs. Suggested by @farshidtz
See https://jsonlines.org/