live joins? #3
Comments
These are two questions: 1) live stream joins and 2) open ended joins.
|
I have a module for doing joins, which I use here: https://github.com/dominictarr/level-inverted-index/blob/map-reduce/index.js#L99-L106 Yes, I have some levelup stuff for working with npm (which is a great dataset) over here: https://github.com/dominictarr/npmd I will try doing this in levelgraph and see what happens. |
Your module for joins seems more efficient than the current approach, which is quite naive (but it works). Would you like to try implementing the join using your module? I believe it may be possible to get better throughput at the expense of RAM (the big Ideally these two approaches might be complementary, as a proper database can use multiple join strategies. I bet a pluggable system would be nice. It would also be cool to get some numbers with a real dataset, so we can know which one works better (the NPM one would be super). |
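To make the throughput-versus-RAM trade-off concrete, here is a minimal sketch of a generic in-memory hash join. It is only an illustration: `hashJoin` and its arguments are hypothetical names, and nothing here is taken from levelgraph or from the module linked above. One side is indexed in memory by the shared variable, so the other side can be matched with a single lookup per row instead of a scan.

```javascript
// Generic in-memory hash join (hypothetical helper, not levelgraph's API).
// `left` and `right` are arrays of solutions (plain objects of bindings),
// `key` is the name of the variable both sides share.
function hashJoin(left, right, key) {
  // Build phase: index every left solution by its value for `key`.
  // This is where the extra RAM goes.
  const index = new Map();
  for (const row of left) {
    const k = row[key];
    if (!index.has(k)) index.set(k, []);
    index.get(k).push(row);
  }

  // Probe phase: one lookup per right solution, no rescan of `left`.
  const out = [];
  for (const row of right) {
    const matches = index.get(row[key]) || [];
    for (const match of matches) {
      out.push(Object.assign({}, match, row));
    }
  }
  return out;
}
```

Note that the build phase needs the values of `key` up front, so this only applies once one side of the join has already been resolved.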
Hmm, so I think using a hash-join you will need to have multiple joins. If the second condition refuses the result, Agree, we need more generic join modules. Also, we can use Another thing - leveldb is probably pretty smart about caching - so maybe reading from the database is not so bad. |
The problem with hashing is that you can't hash what you do not know. In your query, the first condition is about variables
I'm not getting this question, but I'll try answering.
It seems so, I think that having a real use case that performs badly can drive the optimization. |
Agree about the "real use case that performs badly". I'm gonna change to 0.10, and try sticking npm into levelgraph and running this query. Unfortunately, I have more pressing "real work" but I'll get to this! |
No hurry! You might also want to use #4, I believe it's stable enough. |
It's time to move to 10 anyway. I would have done this on Sunday, but I encountered a few bugs in my stuff and fixed them instead. |
I have been thinking about this live joining. It is doable, albeit not fast and/or optimizable on the first run. How about something:
It should be easy to implement. |
Another possible syntax:
|
The only catch with this solution is that the live stream will only be matched against the first condition, while the rest will be matched against the data already in the graph. Another solution might be:
This is actually what I like most, it will need to depend on sublevel, but it is ok. The real question is: should this be forwarding all "internal" matches first, or just the "live" results? |
By "internal", you mean it should match the current values in the database? Most definitely - that is extremely useful - I think the most useful is to be able to match the history, the live changes, However, since you are using streams for this, it is pretty straightforward to adapt the idea to both. |
By 'internal' I mean the current values of the database. I am not sure how to support the history (if it is not what is currently in the db). The joins are implemented as a pipeline: it starts from an empty solution, which is then rejected or augmented with variable bindings at each step/condition. Implementing a live join would mean starting from a live stream of triples matching the first condition. How does that sound? What about the interface? |
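A rough sketch of the pipeline described above, over an in-memory array of triples rather than a real database. All names here (`isVariable`, `bind`, `join`, `liveJoin`, and the `{ variable: 'x' }` notation) are assumptions made for illustration and are not levelgraph's actual code or interface.

```javascript
// A term counts as a variable when it looks like { variable: 'x' }
// (hypothetical notation for this sketch).
function isVariable(term) {
  return typeof term === 'object' && term !== null &&
    typeof term.variable === 'string';
}

// Try to make `condition` match `triple` under `solution`.
// Returns the augmented solution, or null when the solution is rejected.
function bind(solution, condition, triple) {
  const next = Object.assign({}, solution);
  for (const key of ['subject', 'predicate', 'object']) {
    const term = condition[key];
    if (isVariable(term)) {
      const name = term.variable;
      const bound = Object.prototype.hasOwnProperty.call(next, name);
      if (bound && next[name] !== triple[key]) return null;
      next[name] = triple[key];
    } else if (term !== triple[key]) {
      return null;
    }
  }
  return next;
}

// Pipeline: start from an empty solution and run it through each condition.
function join(triples, conditions, start) {
  return conditions.reduce(function (solutions, condition) {
    const out = [];
    for (const solution of solutions) {
      for (const triple of triples) {
        const next = bind(solution, condition, triple);
        if (next !== null) out.push(next);
      }
    }
    return out;
  }, start || [{}]);
}

// Live variant: an incoming triple is matched against the first condition
// only; the remaining conditions run against the stored triples.
function liveJoin(stored, conditions, incoming) {
  const first = bind({}, conditions[0], incoming);
  if (first === null) return [];
  return join(stored, conditions.slice(1), [first]);
}
```

In a real implementation `liveJoin` would be called once per triple emitted by the live stream, and the nested loop in `join` would be replaced by a database lookup for each condition.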
Closing in favor of #24 |
Would it be possible to use level-live-stream on the joins
so that you can get a changes feed of the relations as they are added?
also, is it possible to do open ended joins, perhaps:
which might return:
Is that correct?