SEARCH and/or GET+Query method #229
Having said that, I have recently been arguing for the GET method on the IETF mailing list.
@RubenVerborgh how do you see such a Query HTTP header fitting Linked Data Fragments, and in particular the possibility of using Triple Pattern Fragments there alongside SPARQL? seeAlso:
SEARCH, or GET with a query, would allow any type of query language to be used. GET with a query body or header would of course be cacheable. I am not sure why SEARCH should not be cacheable either; that was the question I had about the SEARCH proposal.
@bblfish wouldn't use of URI Templates cover many of the common cases? See: http://www.hydra-cg.com/spec/latest/triple-pattern-fragments/#controls
@elf-pavlik URI Templates increase the number of resources, are bad for caches, have no semantics so they require a mapping to some semantics, are less flexible, etc. So no. If you want to do things correctly then SEARCH/GET+Query is much better. Mind you, having said that, it is already in the SoLiD proposal as GET + Header, so I don't think we need to keep this issue open.
It's up to us to write them. Because of CORS we have to go through proxies anyway, and our local clients need to have caches of remote graphs too. It is much more complex to write caches if each resource has a million different URLs, when each of those is just a partial representation of the same resource. This is just an extension of what HTTP/1.1 partial content does, for which there is even a dedicated status code (206 Partial Content). By the way, the neat thing about GET+Header or GET+Body is that if the server does not know about it, it returns the full representation of the resource. You can try it out: this actually works on all current servers, which are told to ignore bodies on a GET since the semantics for that is not yet defined.
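A sketch of what "try it out" could look like (the resource URL is hypothetical; the point is that a server unaware of query bodies just serves the full document):

```
GET /card HTTP/1.1
Host: example.org
Content-Type: application/sparql-query
Content-Length: 14

DESCRIBE <#me>
```

A server that ignores GET bodies responds as if no query were present:

```
HTTP/1.1 200 OK
Content-Type: text/turtle
```

So a client can always send the query opportunistically and fall back to filtering the full representation locally.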
The Oracle/IBM/... SEARCH proposal mentioned allows for a Location header, for referenceability if needed.
It clearly does: every URL refers to a resource, and there is no way for caches to know that two URLs are partial representations of the same resource. As a result, each query URL ends up cached as a distinct entry.
You can do that by restricting the query language too.
That is where the SEARCH method is useful, as it does not have that effect.
Clearly if it can't understand SPARQL, it can't understand SPARQL; if it can, it can. I am not sure where you are going there. Also, SPARQL need not be the only query language supported. And this does not exclude the current way of doing things.
In the case of template queries the number of URLs is obviously larger: one for each URL matching the template. So instead of one URL you can have millions or more. I don't really see how you can fail to see that, or how you can fail to see that intermediaries may not know that these all map to the same resource.
In the SEARCH/GET+Query case, the query language comes with a well-defined MIME type, which makes interpretation of it possible. URL templates do not have MIME types. In fact, it is not a recommended part of web architecture to look into URLs to guess meanings. You could clearly do something to map URL templates to SPARQL queries by following links, but that means extra HTTP requests, which one always wants to avoid. There is a use case for that. But there is also a use case for SEARCH.
Note also that GET+Query currently does the right thing: if the server does not understand the relevant headers it returns the full content. And you always have the same URL for the resource.
I don't know if the same is made explicit for query URLs. Should query URLs always be ignored by the server if it does not understand them?
The difference is simple: you create nearly infinitely more URLs in the template case. I suppose you don't see this because you must be thinking of all URLs with attribute-value pairs after the query as being the same URL without those attribute-value pairs, i.e. you must be counting the following all as one URL:
But they are different URLs: in fact there are 4 URLs there. And they usually refer to different things (unless the resource describes them all as being owl:sameAs each other). In any case, for the purpose of most caches they refer to different things. Also, there are queries that you simply can't put in a query URL because of URL length limitations.
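To make the cache-key point concrete (with hypothetical URLs, since the original examples are not quoted here): a cache keyed on the full URL treats each template expansion as a separate entry, while a SEARCH or GET+Query on the one resource URL keeps a single cache key.

```python
# Hypothetical triple-pattern query URLs for one resource, /card.
# Each expansion of the template is a distinct cache key.
template_urls = {
    "http://example.org/card?predicate=foaf%3Aname",
    "http://example.org/card?predicate=foaf%3Ambox",
    "http://example.org/card?subject=%23me",
    "http://example.org/card?subject=%23me&predicate=foaf%3Aknows",
}

# With SEARCH or GET+Query the query travels in the body or a header,
# so the cache key stays the resource URL itself.
search_urls = {"http://example.org/card"}

print(len(template_urls))  # 4 distinct cache entries
print(len(search_urls))    # 1 cache entry
```

A generic cache cannot know that the four template URLs are all partial views of `/card`; it would need out-of-band knowledge of the template semantics.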
You can do that. But that creates many different URLs that are not tied to the original resource, and keeping that tie is exactly what I am interested in with SEARCH.
Can you refer to an RFC for that?
They are; that's basic Web Architecture, and a question of epistemology. A server needs special information to deduce that two different URLs refer to the same resource. In the examples given above, the URLs furthermore do refer to different resources: different parts of a specific resource, if you wish. I suggest you look up Range Requests first and ask yourself why those were put in place.
@RubenVerborgh wrote:
It's best not to ask me there but to go to the original document, draft-snell-search-method-00.
That is, the Location header is optional. That is a way to bridge both worlds, if you wish.
In the GET+Query case, the request looks like this:

```
GET /card HTTP/1.1
Host: example.org
Content-Type: application/sparql-query
Accept: text/turtle
Content-Length: 14

DESCRIBE <#me>
```

with the response:

```
HTTP/1.1 206 Partial Content
Content-Type: text/turtle
Content-Length: ...
```

All other queries using GET+Query work the same way: the URL remains `/card`. In the template case you have instead the following URLs:
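For illustration (these URLs are hypothetical, in the style of the Hydra Triple Pattern Fragments controls linked earlier), the template case might produce URLs such as:

```
http://example.org/card?subject=%23me
http://example.org/card?subject=%23me&predicate=foaf%3Aname
http://example.org/card?predicate=foaf%3Aknows
```

Each expansion is a new URL, distinct from `/card` as far as any generic intermediary is concerned.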
etc. In the GET+Query case, each request necessarily operates on the one resource, `/card`; the template URLs do not have that necessity.
The question is what it is that is operated upon. In the definition of resource that I take from the HTTP specs, which are definitional, the resource operated upon is the same independent of the query that is requested of it. That is why Range Requests and SEARCH are RESTful. Your definition makes distinctions at the level of the representation rather than at the level of the resource. It works at the level of the resource only with template URLs, which name each one of the triples, i.e. each one of the representations. That is, you could name each of your triples with a template URL.
But then, according to your own definition, each of the template URLs refers to a different resource: namely a different triple. I believe that you are inclined to map the response representations returned by the queries onto distinct resources. Another way of understanding this is via the motto:
Except it is very different from SOAP, since in SOAP the message contained information about which methods were to be run on which resource. That is not the case here, the proof being RFC 7233. Are you saying that RFC 7233 is a SOAP-like protocol? Do you think it would have passed muster with Roy Fielding, who was one of its editors? Come on, please take some time to reconsider.
That is not true in my case. I am writing client apps that sometimes need small portions of a graph to do the rendering quickly. Perhaps one just needs the name and email of a person in a foaf profile. The client does not want a different resource, just a part of the resource it would have gotten in full. Sorry, the topic of this issue is SEARCH and GET; if @elf-pavlik wants to open a different topic, please do so in a different issue. You have fewer options in the template system; that's why a somewhat smaller number of results can come back. But that's a different issue. I brought up RFC 7233 because it fundamentally contradicts your argument about the RESTfulness of the proposed protocol. In short, if you wish to argue against SEARCH on those grounds, you also have to argue against Range Requests.
Very interesting discussion @RubenVerborgh & @bblfish, thanks for taking the time to understand each other better and clarify various subtle differences. I guess the difference comes from the types of data sets you two tend to work with. For public, open-knowledge data sets like dbpedia, wikidata etc., URI Templates seem to make a lot of sense. In the case of social networking and resources with access control, SEARCH or GET+Query might provide simpler ways of setting ACLs, especially if resources stay persisted as separate files in a file system rather than in a triple/quad store. I am not sure if a system which uses a triple/quad store could check the ACL of queries as easily as a system which stores data fragmented on the file system (tree).
My argument was not at the level of whether information is stored in a graph store or as files. HTTP is defined in terms of resources, which are identified by URIs; how the information is stored is immaterial to HTTP, as it is a communication protocol. Template URLs are not excluded by SEARCH or GET+Query. RFC 7233 allows one to page through a resource without creating new URIs for each part, and it was done for exactly the same reason as the proposed SEARCH method.
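As a sketch of the RFC 7233 analogy (the resource and byte offsets are hypothetical), a range request addresses one part of a resource while keeping its single URL:

```
GET /card HTTP/1.1
Host: example.org
Range: bytes=0-499
```

```
HTTP/1.1 206 Partial Content
Content-Range: bytes 0-499/10240
Content-Type: text/turtle
```

The cache can tie the partial response back to `/card`; no new URI is minted for the fragment.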
This does not mean that template URLs don't have their place. But for the use cases we are implementing in SoLiD the disadvantages outweigh the advantages. |
Let's take the example of this container, https://twitter.com/timberners_lee/followers, which contains 206K resources. How would we page through it with each approach?
Does it make sense to cache pages if the list of followers grows often? In the case of IRC or mailing list archives it makes a lot of sense to cache if we page them by day/month.
But if each new addition (e.g. to the followers list) changes the paging, then I don't see such a big gain in caching it, at least not for longer than some minutes or hours. seeAlso:
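A small sketch of this stability point (hypothetical data and page size): index-based pages shift whenever a new item is prepended, while date-bucketed pages stay stable and therefore cacheable.

```python
followers = ["alice", "bob", "carol", "dan"]  # newest first

def page_by_index(items, page, size=2):
    """Index paging: page contents shift when new items arrive."""
    return items[page * size:(page + 1) * size]

before = page_by_index(followers, 0)   # ['alice', 'bob']
followers.insert(0, "eve")             # a new follower arrives
after = page_by_index(followers, 0)    # ['eve', 'alice'] -- page 0 changed

# Date-bucketed paging: old buckets never change once the day is over,
# so their cached copies stay valid indefinitely.
archive = {"2015-06-01": ["alice", "bob"], "2015-06-02": ["carol", "dan"]}
archive.setdefault("2015-06-03", []).append("eve")
# archive["2015-06-01"] is untouched by the new arrival.
```

This is why mailing-list archives paged by month cache well, while a fast-growing followers list paged by index does not.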
In the SEARCH/GET+Query case, the client can express the full query directly. The same could be done with URI templates if they allowed full-blown query languages to be placed in the URL (which in any case is ugly). If they don't allow full-blown query languages, then the client is limited to the queries permitted by the templating language, and as I pointed out in my previous response, finding out what those limitations are comes at the price of a number of HTTP requests, which is costly. The templating answer requires the server to specify its query capabilities out of band. In short: the HTTP query method keeps the description of what is possible in the protocol itself.
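To make the MIME-type point concrete, here is a minimal sketch (the handler names and the second media type are hypothetical) of how a server could dispatch a SEARCH/GET+Query body purely on its Content-Type, with no out-of-band template discovery:

```python
def handle_sparql(body):
    # Hypothetical: parse and evaluate a SPARQL query against the resource.
    return f"sparql result for: {body}"

def handle_ldpath(body):
    # Hypothetical second query language, negotiated the same way.
    return f"ldpath result for: {body}"

# The Content-Type header names the query language, so the server
# knows how to interpret the body without any extra requests.
QUERY_HANDLERS = {
    "application/sparql-query": handle_sparql,
    "application/ldpath": handle_ldpath,   # assumed media type, for illustration
}

def search(content_type, body):
    handler = QUERY_HANDLERS.get(content_type)
    if handler is None:
        return (415, "Unsupported Media Type")  # unknown query language
    return (206, handler(body))                 # partial representation

status, result = search("application/sparql-query", "DESCRIBE <#me>")
```

An unknown media type yields 415 rather than a misinterpreted query, which is the self-describing behaviour the comment argues for.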
I noticed that issue with the previous solution too. It certainly makes sense for LDPRs that are not LDPCs.
In the section "Reading data using SPARQL" I suggest instead using the SEARCH method (see the recent draft-snell-search-method-00, which is currently being discussed on the HTTP mailing list and is gaining momentum).
I have implemented that already in rww-play, as described in the curl interaction page.
Given that most other WebDAV methods are implemented (see issue solid/solid-spec#3), this should be an easy addition, and seems less ad hoc than what is currently being suggested, namely: