Implementability of joins and lazy query expressions #473

Closed
grainier opened this issue Apr 2, 2020 · 4 comments
Labels: Area/Lang (Relates to the Ballerina language specification), design/implementability (Difficulty implementing the design)

Comments

@grainier

grainier commented Apr 2, 2020

Consider the following examples of a join and a lazy query-expr:

join sample (ref[1])

stream<DeptPerson> outputStream = 
    from var person in personStream 
    join var dept in departmentStream
    on person.id == dept.id
    select {
        firstName: person.firstName,
        lastName: person.lastName,
        dept : dept.name
    };

lazy query-expr sample (ref[2])

stream<DeptPerson> outputStream = 
    stream from var person in personStream 
    from var dept in deptList   
    where person.id == dept.id       
    select {
        firstName: person.firstName,
        lastName: person.lastName,
        dept : dept.name
    };

In such examples, to evaluate the on and where conditions and proceed to select, we have to compare every member of the first stream (i.e. personStream) with all the members of the second stream or list (i.e. departmentStream, deptList).

[Image: Screen Shot 2020-04-02 at 3 06 33 PM]

This leads to two issues:

  1. When we evaluate the first event of StreamA, it is consumed, so it is no longer available on a second pass to be evaluated against the second event of StreamB.
  2. Even if we keep the first event of StreamA in state to address the above, by the time we move on to the second event of StreamA, StreamB will already have been fully consumed.
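
The second issue can be reproduced with a minimal Python sketch (Python rather than Ballerina, purely for illustration). It assumes both sources are single-pass iterators; the function names and sample records are hypothetical and are not taken from the Ballerina implementation.

```python
def person_stream():
    # Hypothetical single-pass source: each event can be read only once.
    yield {"id": 1, "firstName": "Ann", "lastName": "Lee"}
    yield {"id": 2, "firstName": "Bob", "lastName": "Ray"}


def department_stream():
    # Hypothetical single-pass source for the joined side.
    yield {"id": 1, "name": "HR"}
    yield {"id": 2, "name": "Eng"}


def naive_join(persons, depts):
    """Nested-loop join over two single-pass iterators."""
    out = []
    for person in persons:
        # depts is exhausted after the first outer iteration,
        # so later persons are compared against nothing.
        for dept in depts:
            if person["id"] == dept["id"]:
                out.append({
                    "firstName": person["firstName"],
                    "lastName": person["lastName"],
                    "dept": dept["name"],
                })
    return out


result = naive_join(person_stream(), department_stream())
print(result)
# prints [{'firstName': 'Ann', 'lastName': 'Lee', 'dept': 'HR'}]
```

Bob's match with Eng is lost because the department iterator was drained while the first person was being evaluated.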

As I understand it, this will be a problem when we proceed with the implementation. What would be the best approach to overcome it? Since [1] and [2] are still open, it would be great if you could consider this case for them as well.

Related Issues:
[1] join: #435
[2] query-expr: #443 / #436

grainier added the Area/Lang and design/implementability labels Apr 2, 2020
grainier added this to the 2020R2 milestone Apr 2, 2020
@jclark
Collaborator

jclark commented Apr 2, 2020

Unbounded streams and streaming joins are part of #440.

At this stage we are not doing unbounded streams. So the implementation for [1] would read the entire departmentStream once and build a hash table, exactly as with deptList [2].

This is specified in #435.
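
As a rough illustration of the approach described above (read the joined side once, build a hash table, then probe it), here is a minimal Python sketch. It is not the Ballerina runtime's code; `hash_join` and the sample records are hypothetical, and both inputs are assumed to be bounded.

```python
from collections import defaultdict


def hash_join(persons, depts):
    """Inner join on id. Each input is read exactly once."""
    # Build phase: drain the joined side into a hash table keyed by id.
    by_id = defaultdict(list)
    for dept in depts:
        by_id[dept["id"]].append(dept)

    # Probe phase: each person is consumed once and looked up in the table.
    for person in persons:
        for dept in by_id.get(person["id"], []):
            yield {
                "firstName": person["firstName"],
                "lastName": person["lastName"],
                "dept": dept["name"],
            }


# Single-pass iterators stand in for personStream and departmentStream.
persons = iter([
    {"id": 1, "firstName": "Ann", "lastName": "Lee"},
    {"id": 2, "firstName": "Bob", "lastName": "Ray"},
    {"id": 3, "firstName": "Cy", "lastName": "Fox"},  # no matching department
])
depts = iter([
    {"id": 1, "name": "HR"},
    {"id": 2, "name": "Eng"},
])

joined = list(hash_join(persons, depts))
print(joined)
```

Because the probe only consults the table, it does not matter that the department iterator is exhausted after the build phase. The table holds the whole joined side in memory, which is why this sketch only works for bounded input.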

@jclark
Collaborator

jclark commented Apr 2, 2020

Note the syntax uses equals (like LINQ), not ==.

@sanjiva
Contributor

sanjiva commented Apr 4, 2020

James, wouldn't "where x = y" be very confusing for mort?

We have let x = y too, so I think we should stick to where x == y.

@jclark
Collaborator

jclark commented Apr 5, 2020

The syntax is not where x = y. It is on L equals R, where L and R are separate expressions which are evaluated at different times with different things in scope. See #435. Using == instead of equals would be bad because it has quite different semantics from a == expression (which is why LINQ uses equals not ==).
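
One way to picture "evaluated at different times with different things in scope" is a Python sketch in which L and R are two separate key functions. This assumes a hash-based implementation that evaluates R for every joined item first; `join_on_equals`, the trace, and the sample data are hypothetical and only meant to show the evaluation order.

```python
def join_on_equals(outer, inner, left_key, right_key, select):
    """Sketch of `on L equals R`: L and R are separate functions."""
    trace = []  # records when each key expression is evaluated
    table = {}

    # R sees only the joined item and is evaluated while building the table.
    for r in inner:
        key = right_key(r)
        trace.append(("R", key))
        table.setdefault(key, []).append(r)

    # L sees only the outer item and is evaluated later, while probing.
    results = []
    for l in outer:
        key = left_key(l)
        trace.append(("L", key))
        for r in table.get(key, []):
            results.append(select(l, r))
    return results, trace


people = iter([{"id": 1, "firstName": "Ann"}, {"id": 2, "firstName": "Bob"}])
departments = iter([{"id": 2, "name": "Eng"}, {"id": 1, "name": "HR"}])

rows, trace = join_on_equals(
    people,
    departments,
    left_key=lambda person: person["id"],   # L: person in scope, dept is not
    right_key=lambda dept: dept["id"],      # R: dept in scope, person is not
    select=lambda person, dept: (person["firstName"], dept["name"]),
)
print(rows)   # prints [('Ann', 'HR'), ('Bob', 'Eng')]
print(trace)  # prints [('R', 2), ('R', 1), ('L', 1), ('L', 2)]
```

A single boolean expression such as person.id == dept.id would need both variables in scope at once, which is what the two-function shape avoids.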

jclark closed this as completed Apr 8, 2020
jclark removed this from the Swan Lake preview milestone Aug 6, 2020