Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: all you want for queries #1070

Closed
pvojtechovsky opened this issue Dec 22, 2016 · 75 comments
Closed

feature: all you want for queries #1070

pvojtechovsky opened this issue Dec 22, 2016 · 75 comments

Comments

@pvojtechovsky
Copy link
Collaborator

pvojtechovsky commented Dec 22, 2016

Requirements/goals for new features of queries:

  1. support of lazy foreach (better performance for large queries because no creation of useless intermediate lists)
    • CtQuery extends Iterator would an elegant solution but not possible because a Scanner is not iterable
  2. support for chaining of foreach
    • what should be the elements given for the next steps?
  3. support for lazy evaluation in map for functions that returns an iterator (already works for functions returning Iterable).
  4. support for queries reusable over multiple inputs (CtQuery query = new CtBaseQueryImpl().map(...).map(...) )
  5. support for sending of results of one step directly to next step, without necessity to build helper lists.
interface CtQuery { // already exists
  /** Performs an action on each element of the query results.
    * Implemented efficiently for CtScanner based queries */
  CtQuery forEach(CtQueryStep);
}

interface CtQueryStep {
  /** 
   * evaluates query step and applies mapping of one input element to many output elements.
   * The output elements are `returned` by call of `outputConsumer.accept(outputElement)`, 
   * which is called once for each outputElement
   */
   void evaluate(Object input, CtConsumer<Object> outputConsumer);
}

/**This interface is equivavlent to java 8 Consumer. 
We can remove it after we upgrade to java 8 */
interface CtConsumer<T> {
 void accept(T item);
}

The method CtQuery#map(CtQueryStep) is semantically equal to the existing method CtQuery#map(CtFunction). The only difference is the way how result of mapping is delivered to the next query step.

  • Implementation of CtFunction - returns result by return objectBooleanOrList;. This is optimal for mapping functions which returns 1 or few elements.
  • Implementation of CtQueryStep- returns result by outputConsumer.apply(item), which is called for each spoon model element, which has to be returned by this step. This is optimal for mapping functions, which returns many elements.

I need a spoon query, which correctly returns all the references to an private or protected class field. It means that query must search for references in correct scope(s).

There was an idea how to implement it during development of PR #1018, but then we removed that code for simplicity reasons.

@monperrus does it makes sense to create a new PR, which will reuse the same ideas again or do you know some better way?

@monperrus
Copy link
Collaborator

monperrus commented Dec 22, 2016 via email

@pvojtechovsky
Copy link
Collaborator Author

The main requirement which is not supported by #1088 is possibility to create the query which has not initialized input element and then this query can be used as a part of another query. Then We can build and use queries/model analyzing/refactoring chains

@monperrus
Copy link
Collaborator

monperrus commented Dec 22, 2016 via email

@monperrus monperrus changed the title support for scope-based Queries feature: add support for scope-based queries Dec 22, 2016
@pvojtechovsky
Copy link
Collaborator Author

it will come from the previous step of the query chain

@pvojtechovsky
Copy link
Collaborator Author

I guess it helps, if I explain the big picture of the problem, I would like to solve using spoon queries. We have an old java project, where nearly each method is polluted with one project specific bad design pattern. I want to get rid of that design pattern. It is easy to localize it, but it is harder to replace it and even more harder to clean all the useless code, which was written just to fulfill needs of that bad pattern.

So I need an algorithm which does something like this:

  • starts on one method of one class. It is the input point for the example analyze, which I need to run first
  • then searches for all callees
  • then searches for all callees again
  • then focuses on parameters which are used to call these method and stores them in list1
  • then searches for all declarations of these parameters and stores them in list2
  • then searches for all other uses of these declarations
  • then filter these uses by a templateMatcher
  • collect the remaining uses in list3
    after I analyze content of list3 more or less manually and configure templateMacher, and list 3 is finally empty.
    I will run that chain again but at places where list1, list2 are collected, there will be called recursive refactoring, which will remove these parameters and their declarations, from the local scope of functions, from the parameters of methods (and all overridden/overriding methods), field variables of types etc. recursively.

I know that it is a big challenge and that the algorithm will be more complicated then this raw sketch, but may be you now understand why I need interface like

interface CtQueryStep {
void evaluate(Object input, Consumer<Object> outputConsumer);
}

This interface and function, let's me build all the queries/filters/refactoring, which I need in effective way.
Current CtQuery, without CtQueryStep and public Consumer interface is like calling spoon Scanner on whole model and to collect all the visited elements in the big List (without Filtering them) and then later filter that list and collect another list etc... bad approach. Yes it is nice and simple for small queries. But not for complex cases with many thousands intermediate results.

May be I will fail and the project stays dirty as it is, but I like that game :-) ... with exception when Martin deletes my code ;-)))
No, problem. You do it well Martin 👍

@monperrus
Copy link
Collaborator

monperrus commented Dec 22, 2016 via email

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 23, 2016

pipeline

no, we already have a pipeline in CtQueryImpl

performance problem?

yes. But not only performance. See this example. In spoon there are many places where model Scanner is used. In all places the code inherits from base Scanner and hooks on enter method. If somebody would implement it different and would return all the model elements in one long list and then would process that long list .... how would you call it? Just performance problem? Bad design?
And it is the same case like current Query implementation. CtFunction is good for small amount of results. Pipeline can handle much more and it is better design for such cases. I like both approaches.

@monperrus
Copy link
Collaborator

monperrus commented Dec 23, 2016 via email

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 23, 2016

Yes we can handle them separately. But in this order
1 lazy result sending
2 scope based queries

After lazy result sending is finished. There is nothing more needed, because scope based queries and many other kinds of model analysis and refactoring algorithms can be easily implemented without need for any extra interaces or changes in the spoon core.

@monperrus monperrus changed the title feature: add support for scope-based queries feature: add support for pipeline-based queries Dec 23, 2016
@monperrus
Copy link
Collaborator

But in this order
1 pipeline
2 scope based queries

OK. Renamed this issue accordingly. Now we focus the discussion on pipelines.

Does you current step design in QueryImpl support pipelines? Is it only a matter of having CtConsumer public?

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 23, 2016

only CtConsumer public

no it is not enough. I need
1 CtConsumer
2 CtQueryStep (see above)
3 a new map function in CtQuery, which accepts CtQueryStep as the only parameter.

Then for example scope based query can be then implemented like this

query.map(new OverridenMethodQuery())...

class OverridenMethodQuery implements CtQueryStep {
void evaluate(CtElement input, Consumer out) {
input.factory.getRootPackage().filterChildren(new OverridenMethodFilter(input)).forEach(out);
}
}

by the way, I have just learned design pattern Pipeline and Chain of responsibility and I see that we shouldn't call this PR pipeline. I have not found the name of design pattern I need.
Current implementation of CtQueryImpl is already implemented as a kind of pipeline.

@pvojtechovsky
Copy link
Collaborator Author

and current CtQuery does not supports forEach(Consumer) so instead of

.forEach(out)

in the eaxample above, the client has to actually write

.map((ele)->{out.accept(ele);return false;}).list()

@pvojtechovsky
Copy link
Collaborator Author

And this is the final solution with optimized performance

query.map(new OverridenMethodQuery())...

class OverridenMethodQuery implements CtQueryStep {
  OverridenMethodFilter filter = new OverridenMethodFilter();
  CtQuery query = new CtQueryImpl().filterChildren(filter);
  void evaluate(CtElement input, Consumer out) {
    filter.setMethod(input);
    query.evaluate(input.factory.getRootPackage(),out);
  }
}

@monperrus
Copy link
Collaborator

To me, it seems that lazy evaluation gives you pipelining for free, with no change in the API:

# no pipeline, current implem
forEachSequential(steps, inputs):
for s : steps
   results = s.apply(inputs)
   inputs = results

instead we can have something like:

# simple version where each step yield no or one result
forEachPipeline(steps, inputs):
for i : inputs
   # applying the pipeline
   for s: steps
      j = i
      if j != null:
          j = s.apply(j)
      if laststep: results.add(j)

WDYT?

@pvojtechovsky
Copy link
Collaborator Author

I think that we speak all the time about different things.
I am trying to explain that returning 10 000 items by return longList; is not good idea and You are all the time offering solution like return shortList;
I guess we need a break :-)
Merry Christmas!

@pvojtechovsky pvojtechovsky changed the title feature: add support for pipeline-based queries feature: add support for lazy result sending in queries Dec 25, 2016
@pvojtechovsky
Copy link
Collaborator Author

please forget about pipelines. It was a pure misunderstanding. We already have a pipeline in CtQueryImpl. I have updated my answers above, and I have changed the name of this issues.

I will try now to explain the problem. There is really no feature missing in CtQueryImpl. What is missing, is support for processing of results in a lazy way (I do not know the correct design pattern name for that, so do not focus on the name "lazy result", but have a look at this 2 examples:

  1. Code which works now
query.map(new OverridenMethodQuery())...

class OverridenMethodQuery implements CtFunction<CtMethod, CtMethod> {
  CtMethod evaluate(CtMethod input) {
    return input.factory.getRootPackage().filterChildren(new OverridenMethodFilter(input)).list();
  }
}
  1. Code which can be written after lazy result sending is supported
query.map(new OverridenMethodQuery())...

class OverridenMethodQuery implements CtQueryStep {
   void evaluate(CtElement input, Consumer out) {
      input.factory.getRootPackage().filterChildren(new OverridenMethodFilter(input)).forEach(out);
   }
}

Both examples gives exactly same results, but the second one will perform faster with lower memory consumption in cases if OverridenMethodQuery would produce 1 000 000 results.
Yes OverridenMethodQuery will never return 1M results, but as you can see is in the big picture above, I need to write filters, which will do that. I used OverridenMethodQuery here as example, because it exists in current code, so it is easily understandable.

@monperrus
Copy link
Collaborator

Do you mean that if the final result of list() contains 1 000 000 elements, we actually build that list? that's the performance problem?

@pvojtechovsky
Copy link
Collaborator Author

yes it is

  1. performance problem
    and
  2. it is not needed because results of one step can be sent as input of next step without collecting them in any list.
    Actualy all 1M result items has to be collected in the list and then the code of CtQueryImpl will iterate this list and call nextStep.accept(item) 1M times.
    I suggest to optionnaly!support another way: To let current step call the nextStep.accept(item) 1M times directly - without any useless allocation of a helper list

@pvojtechovsky
Copy link
Collaborator Author

It is not performance problem, if you run it once. But I see the Query as a core feture of analyses, refactorings and TemplateMatcher and I think it we be evaluated many many times ... and then allocation of useless lists is a bad idea.

@monperrus
Copy link
Collaborator

One solution is that CtQuery implements Iterator and we thus add a method iterator() and then on can write:

 for(CtStatement s : e.filterChildren(x)) {
    doSomething(s)
 }

with no useless list involved.

@pvojtechovsky
Copy link
Collaborator Author

I think, it is very difficult task to implement

Iterator iter = e.filterChildren(x).iterator()

without helper list, because the way how CtScanner.enter method is called. Show me that if you think opposite!
What is bad on interface method like this:

interface CtQueryStep {
   void evaluate(Object input, Consumer<Object> outputConsumer);
}

??? Why do you try to invent different solution???

Note: I already had solution for this issue in #1018 and I still think it is the most efficient solution. It needs only little code in spoon core (CtQueryStep#evaluate). It needs only little code in client's Query,

void evaluate(CtElement input, Consumer out) {
      input.factory.getRootPackage().filterChildren(new OverridenMethodFilter(input)).forEach(out);
   }

it is fastest and it has minimal memory overhead. Why do you not like it? Why do you try to invent something else? Is it another misunderstanding? Did you understood my solution?

Yes, I see some problems and possible improvements in original #1018 (before you deleted support of lazy result sending) too. But the core idea is OK. I suggest to discuss these problems. I am sure we can found an elegant solutions for them.

at the beginning you wrote:

We will first extensively discuss the design and contracts

Who is we?
A) Pavel and Martin
B) Inria Spoon team

May be I am just too active? And you just need time to discuss it? Is this discussion helpful for you?

@monperrus
Copy link
Collaborator

This discussion is very helpful to understand the interesting problems you raise and find a good solution. Thanks a lot for it, Pavel!

??? Why do you try to invent different solution???

Oh no, we're just trying to identify the simplest and most intuitive solution. And a standard Java iterator is such a simple solution

I think, it is very difficult task to implement Iterator iter = e.filterChildren(x).iterator()

OK, but conceptually is this what we need? Is this what we would like to write? (I'd say yes)

without helper list?

Could you elaborate on what you mean by helper list in this context?

@pvojtechovsky
Copy link
Collaborator Author

Thanks for confirming that we are on good way ;-)

I think, it is very difficult task to implement Iterator iter = e.filterChildren(x).iterator()

OK, but conceptually is this what we need? Is this what we would like to write? (I'd say yes)

I am sure No, because such implementation is really very difficult. Just try it, it is better then any further discussion. I can actually only imagine the one solution - running a helper Thread and some thread synchronization. Such solution is complicated, slow and resource consuming. May be you know better solution. Then show me that and we can discuss if it is better then CtQueryStep#evaluate(Object, Consumer)

Could you elaborate on what you mean by helper list in this context?

I mean this:

e.filterChildren(x).iterator()
can be written as
e.filterChildren(x).list().iterator()

which is the solution, which creates helper list with 1M items.

@monperrus
Copy link
Collaborator

e.filterChildren(x).list().iterator()is NOT a solution, which creates helper list with 1M items.

We all agree on this.

OK, but conceptually is this what we need? Is this what we would like to write? (I'd say yes)

I am sure No, because such implementation is really very difficult.

I mean from a pure API viewpoint. Assuming this would be easy to implement, would it be what we want?

 for(CtStatement s : e.filterChildren(x)) {
    doSomething(s)
 }

?

So having CtQuery#map(CtQueryStep) in the only feasible solution to you (as opposed to iterator())? I prefer to call it foreach(CtQueryStep) (because it is equivalent to for(CtStatement s : e.filterChildren(x)) which is called a foreach). OK?

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 25, 2016

So having CtQuery#map(CtQueryStep) in the only feasible solution to you (as opposed to iterator())?

yes. Because Spoon CtScanner, which will be the core of the most of the future queries, works this way. It is not possible to iterate over children element visited by CtScanner and it is not efficient to collect children elements in a long list. The most compatible with CtScanner#enter method is the possibility to send the actually visited child element into next query step using call of an function - I suggest to call void CtConsumer<T>#accept(T).... or anything similar, but it must be a one function call for each visited element.

I prefer to call it foreach(CtQueryStep)

Do you mean to implement CtQuery#map(CtQueryStep), but to change the method name map to foreach?
So clients will then write code like

List result = e.map(..some CtFunction lambda ...).foreach(... some CtQueryStep based query ...).foreach(... some other CtQueryStep based query ...).list();
for(Object item : list) {
   ...  a code which consumes result of query
}

?
I would prefer to not use foreach here because of standard java 8 List#forEach(Consumer) method, which I would like to see at the end of query instead of list. Like this:

e.map(..some CtFunction lambda ...).map(... some CtQueryStep based query ...).map(... some other CtQueryStep based query ...).forEach(... a code which consumes result of query...)

note that second example has exactly same behavior.

@pvojtechovsky
Copy link
Collaborator Author

Note: I do not persist on any names. I am open to rename map to anything else, if you think it helps clients to understand. But I think that forEach would be really confusing at that place.

@monperrus
Copy link
Collaborator

Added info about our convergence points at the beginning of this thread in the section requirements.

  1. about reusing CtScanner: what about method scan() which is used differently from enter?

  2. what will method forEach return?

  3. about the link with Java8, what about renaming QueryStep into CtConsumer?

@monperrus
Copy link
Collaborator

evaluate(CtElement input) is a step in wrong direction

OK, there was a misunderstanding: according to your previous message, I thought we had reached an agreement on those two points. It's not perfect, it's not final, but it's a valuable intermediate step. Keep in mind that what matters is the API, not the implementation. In Spoon we often go for a more complex implementation in order to have a more intuitive and usable API (the ideal API being when you understand it just with the names and types, without reading the Javadoc).

Let's come back to your house metaphor. What I'm doing is to evaluate whether what we put in Spoon is valuable and usable only for your house (bad) or valuable for other houses as well (good): lazy foreach meets this criterion (RQ1), reusable queries as well (RQ4). That's already a huge step forward and I thank you for this.

@pvojtechovsky
Copy link
Collaborator Author

Thanks for explanation. I like the idea to finish API first too. The problem is that till now you suggested API (some parts), which was not acceptable for me, because the suggested API would cause extra complexity in client's code too and would be a bottleneck for some future powerful query features.

Let's discuss API

RQ1

If you understand CtQuery#forEach(CtConsumer outputConsumer) implementation of #1076 as solution for RQ1, then I agree. This function is really valuable for everybody. I agree with this API.

reusable queries as well (RQ4)

If you like reusable queries as agreed feature for next PR, then I agree too. But note that:

  • reusable query is implemented by CtBaseQueryImpl (which is nothing else then some code moved from CtQueryImpl)
  • the API to evaluate reusable query needs two input parameters:
  1. Object input - input element - which represents the starting point of the query
  2. CtConsumer outputConsumer - the consumer, which will process the results of the query.
    Note that this approach is 100% compatible with CtQuery#forEach(CtConsumer outputConsumer), which you already like as good API of query with bound input (represented by CtQuery).

So I suggest to have following function as evaluation API of reusable queries:

void CtBaseQuery#apply(Object input, CtConsumer outputConsumer).

I am open to rename this method if apply is not intuitive enough. But note that this method is a method inherited from CtLazyFunction! I need this inheritance, because then we can use one query as mapping step of another query.
I see other places where CtLazyFunction should be used in future. One of them is CtScanner, which should implement it in near future. So I am looking for a method name, which is generic enough.

@monperrus
Copy link
Collaborator

the API to evaluate reusable query needs two input parameters:

Yes, I agree, an input and a query, hence CtQuery.evaluate(CtElement input) (this is the first argument, input the second) or CtElement.evaluate(CtQuery query).

My main problems with apply(Object input, CtConsumer outputConsumer) is that it uses a CtConsumer which is not the abstraction that captures the query. It exposes in the API an implementation concern.

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 29, 2016

I do not understad how that will be used. Please provide a client´s code which uses your API.
We probably again understand different thing behind the term "reusable query". So show me the use case and I understand what is your API for.

@monperrus
Copy link
Collaborator


// reusable query
CtQuery q =  new OverridenMethodQuery();

List l1 = q.evaluate(aMethod).list();
List l2 = q.evaluate(anotherMethod).list();

// listless foreach
q.evaluate(aMethod).forEach( new CtConsumer() {
   void accept(CtMethod m) {
      System.out.println(m);
   }
}); 

@pvojtechovsky
Copy link
Collaborator Author

Thanks for nice example. I understand it. It is super that we spooke about same thing :)))

What is good?
It is readable if client knows what to do.
It needs only minimum changes to existing API.

I see these problems:

  1. the method evaluate does not evaluates the query, but only binds the input element to the query. It is the reason, why I did not understood how to use it. Solution: rename it?
  2. This API works with single thread only. So the reusability of such API is limited to one thread. May be it is not problem now. But it is a design issue if we speak about reusable queries
  3. CtQuery is interface for both use cases A) element based query B) reusable query.
    3a) Therefore one can write bad code like
  • element.map(...).evaluate(element2).list(). What it does?
  • q.list(); without binding input element.
    3b) Therefore CtQuery API is more complicated then needed.
  1. once we release that we have to support that, even it will be not needed, because we will may be create better API later. And the spoon API will stay less understandable then it might be.

My main problems with apply(Object input, CtConsumer outputConsumer) is that it uses a CtConsumer which is not the abstraction that captures the query. It exposes in the API an implementation concern.

Why it should capture query? I do not understand You. Please explain it. It is probably because you (same like me) expected different usage of that API.

Here is the client code which uses apply method based API

// reusable query
CtLazyFunction q =  new OverridenMethodQuery()
//or optionally this will work too
CtBaseQuery q = new CtBaseQueryImpl().map(...).filterChildren(...);

// listless foreach
q.apply(aMethod, new CtConsumer() {
   void accept(CtMethod m) {
      System.out.println(m);
   }
}); 

Note:

  • I am not using CtQuery for reusable queries, but CtBaseQuery, which can be used to build complex query using map and filterChildren methods and finally can be executed several times using apply method of CtLazyFunction.
  • Each query use case is covered by one API A) CtQuery, B) CtBaseQuery. Each API is as simple as possible and avoids missuse.
  • CtLazyFunction/CtBaseQuery apply method can be used by multiple threads ... after internal implementation allows that too.

I understand evaluate based API. Please confirm that You understand apply based API or ask me.
Then we can discuss next steps.

@monperrus
Copy link
Collaborator

Excellent. I'll answer to all your points but first:

q.apply(aMethod, new CtConsumer() {
   void accept(CtMethod m) {
      System.out.println(m);
   }
}); 

Is it a shortcut for the following? How is this different from:

q.evaluate(aMethod).forEach(new CtConsumer() {
   void accept(CtMethod m) {
      System.out.println(m);
   }
}); 

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 29, 2016

Yes, It is the same.

How is this different from:

See the list of problems above. It is the list of differences.
And note that CtBaseQuery has no method like list or forEach. There is only apply method, which can be used to evaluate query. So the client cannot use it wrong and it is clear for him/her what is the correct method to call. API is simpler then evaluate based. But there are 2 APIs in query engine. One for element based query - CtQuery and second for reusable query CtBaseQuery.

@monperrus
Copy link
Collaborator

monperrus commented Dec 29, 2016

But there are 2 APIs in query engine. One for element based query - CtQuery and second for reusable query CtBaseQuery.

I'd prefer to avoid this and have a single API.

the method evaluate does not evaluates the query, but only binds the input element to the query. It is the reason, why I did not understood how to use it. Solution: rename it?

OK for renaming. bindTo? prepareFor?`

This API works with single thread only. So the reusability of such API is limited to one thread. May be it is not problem now.

I would say it's not a problem for now. We can document it and make it ready for concurrency afterwards. Baby steps first :-)

element.map(...).evaluate(element2).list(). What it does?

Runtime exception?

q.list(); without binding input element.

Runtime exception?

@pvojtechovsky
Copy link
Collaborator Author

I have two questions:

But there are 2 APIs in query engine. One for element based query - CtQuery and second for reusable query CtBaseQuery.

I'd prefer to avoid this and have a single API.

I guess you prefer one API, because it is simpler for clients right? Or is there any other reason?

Here is an example of Query which is based on apply API

public class CtFieldReferenceQuery implements CtLazyFunction<CtField<?>, CtFieldReference<?>> {

	@Override
	public void apply(CtField<?> field, CtConsumer<CtFieldReference<?>> outputConsumer) {
		if (field.hasModifier(ModifierKind.PRIVATE)) {
			searchForPrivateField(field, outputConsumer);
		} else if (field.hasModifier(ModifierKind.PUBLIC)) {
			searchForPublicField(field, outputConsumer);
		} else if (field.hasModifier(ModifierKind.PROTECTED)) {
			searchForProtectedField(field, outputConsumer);
		} else {
			searchForPackageProtectedField(field, outputConsumer);
		}
	}
	private void searchForPrivateField(CtField<?> field, CtConsumer<CtFieldReference<?>> outputConsumer) {
		field.getTopLevelType().filterChildren(new DirectReferenceFilter(field.getReference())).forEach(outputConsumer);
	}
	private void searchForProtectedField(CtField<?> field, CtConsumer<CtFieldReference<?>> outputConsumer) {
		field.getFactory().getModel().getRootPackage()
			.filterChildren(new SubtypeFilter(field.getDeclaringType().getReference()))
			.filterChildren(new DirectReferenceFilter(field.getReference()))
			.forEach(outputConsumer);
	}
	private void searchForPublicField(CtField<?> field, CtConsumer<CtFieldReference<?>> outputConsumer) {
		field.getFactory().getModel().getRootPackage().filterChildren(new DirectReferenceFilter(field.getReference())).forEach(outputConsumer);
	}
	private void searchForPackageProtectedField(CtField<?> field, CtConsumer<CtFieldReference<?>> outputConsumer) {
		field.getTopLevelType().getPackage().filterChildren(new DirectReferenceFilter(field.getReference())).forEach(outputConsumer);
	}
}

It is already working with PR #1076.

What interface class it should extend if we use evaluate based API?

@monperrus
Copy link
Collaborator

I guess you prefer one API, because it is simpler for clients right?

Yes

What interface class it should extend if we use evaluate based API?

What about make it extend class QueryImpl directly (instead of implementing an interface) (do you remember our discussion on a possible class OverridenMethodQuery extends QueryImpl?)

@pvojtechovsky
Copy link
Collaborator Author

But there are 2 APIs in query engine. One for element based query - CtQuery and second for reusable query CtBaseQuery.

I'd prefer to avoid this and have a single API.

I guess you prefer one API, because it is simpler for clients right?

Yes

I am glad that we both want simple API for spoon clients. It is probably unbelievable at first sight, but I think that these 2 simple APIs are simpler/better for clients, then 1 more complex API.

Why 2 is simpler then 1?
Because the only reason, why I designed these 2 APIs is to make it simple for clients. There is no other internal reason for 2 APIs.

You probably think: If there are 2 APIs then client must choose, which one to use, so it is more complex then 1 API, where this choice is not needed.
But it is not true in this case, because this decision is done automatically by the context where the query is used. There are two contexts:

  1. the query, which starts on CtElement
anElement.map(...).forEach(...);
//or 
List aList = anElement.map(...).list();

Note: that in this context client automatically gets CtQuery API. In this context there is no reason to confuse client with methods like evaluate, bindTo or prepareFor. Client does not need these methods here at all, therefore it is simpler for him/her when these methods are not there.

  1. the reusable query, which is created without initial CtElement
CtBaseQuery query = factory.Query().map(...);

//... loop for each searched element ...
query.apply(element, new CtConsumer() {
   void accept(CtElement e) {
      System.out.println(e);
   }
}); 

Note: that in this context client automatically gets CtBaseQuery API. In this context there is no reason to confuse client with methods like evaluate, bindTo or prepareFor followed by forEach or list. Client has only one simple method apply. I have no problem to rename it, if the name is not good. But again one method is simpler then 2 methods, which has to be called in correct order, which has to be learned in documentation.

Baby steps in wrong direction are steps in wrong direction, even when parents love their baby. ;-)

@monperrus
Copy link
Collaborator

What if one simply wants the list of elements?

CtBaseQuery query = factory.Query().map(...);
List l1 = query.evaluate(aMethod).list();

What if one wants to refine a reusable query as follows?

CtBaseQuery query = factory.Query().map(...);
...
// refining the query
query.map(...).filterChildren(...)

For those two cases, one needs either one single interface or CtBaseQuery extends CtQuery. I'm fine with both.

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 30, 2016

What if one simply wants the list of elements?

Good question. There are several solutions:
A) Nobody needs it in case of reusable queries.
B) If somebody will ak for it, then we can make a baby step and add support for that.
C) client can already write code like this

CtBaseQuery query = factory.Query().map(...);
List l1 = new ArrayList();
//here can be optional loop, which collects results of all the query evaluations in one list, 
//without building intermediate useless lists
//{
query.apply(element, (item)->l1.add(item));
//}end of loop

What if one wants to refine a reusable query as follows?

CtBaseQuery query = factory.Query().map(...);
...
// refining the query
query.map(...).filterChildren(...)

No problem. This code already works. See CtBaseQuery API in PR #1076.

CtBaseQuery extends CtQuery

no, it would kill the idea of that design. Please read the changes in #1076 and you will see what I am speaking about.

@monperrus
Copy link
Collaborator

CtBaseQuery extends CtQuery

no, it would kill the idea of that design

Why? Currently, there is a huge copy'n'paste between CtQuery and CtBaseQuery (nearly all methods of CtQuery are duplicated in CtBaseQuery).

@pvojtechovsky
Copy link
Collaborator Author

The reason is fluent API - The type of returned value, which is different. So it is only semantic copy. But implementation is different.
In one case we have to return CtQuery.
In second case we have to return CtBaseQuery.

I tried to used generic type for returned value, but I failed, because the return type MUST depend on the arguments of the map method. If anybody founds a way how to avoid copy past of declaration of these methods then I will warmly welcome that. I hate copying of code and I always do my best to avoid it.

@monperrus
Copy link
Collaborator

I understand.

Sometimes, flexible dynamic checks are better than rigid static typing. To me it's the case here.

This clearly calls for a unique interface CtQuery: we'll have all appropriate methods accessible on the returned object, and no duplicate methods.

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Dec 30, 2016

It is not design mistake that these methods seems to be copied insecond API. It is design pattern facade.

@monperrus
Copy link
Collaborator

we'll have all appropriate methods accessible on the returned object

including method apply(Object input, CtConsumer outputConsumer)?

Yes, especially if it's absolutely required for the implementation.

I'm not saying it's a mistake, I'm saying it's simpler and better to have forEach and apply in CtQuery (and evaluate if possible).

Pavel, we've made awesome progress on this topic!

@pvojtechovsky
Copy link
Collaborator Author

It is my first open source project. I am developing commercial SW for 21 years (16 java). I like the idea of open source, but I did not know that progress is 10 times slower :( then In my daily work.
Yes we did a big step in understanding of each other. Now you understood and accepted the solution, which I had implemented 3 weeks ago... and I have seen the problems of this solution, so I solved them in PR #1076. Now I am again 1 week further and I am disapointed with idea, that I will need next many weeks of discussions to get acceptance for that.
OK, I see, it is not Your mistake. It is principle of open source - at least two people has to have same problem and then they can solve it. You do not need/develop using queries so it is very hard for me to explain it to you. I will not give up like yesterday evening and will continue ... may be we found a way how to progress faster ;-)

@monperrus
Copy link
Collaborator

Yes, I know it's slow, essentially because of asynchronous communication. With one or two 15 minute Skype meetings, we would have probably converged directly. Maybe we can do this next time.

But it's really important that we're along the same page, so that we could maintain your code, and so that we can write together high-quality documentation of this new super feature.

@pvojtechovsky
Copy link
Collaborator Author

But it's really important that we're along the same page, so that we could maintain your code, and so that we can write together high-quality documentation of this new super feature.

I fully agree! And with skype it will probably not consume so much time :-)

@pvojtechovsky
Copy link
Collaborator Author

pvojtechovsky commented Jan 1, 2017

I mailed you my skype name and I have added the conceptual model and some explanations into PR #1076 . I suggest to discuss and then update this conceptual model description so it is always up to date and we can base future discussion on the agreed terms. I want to minimize misunderstandings.

I think I have finally understood your ideas how to implement the queries. These ideas were already implemented in my PR #1018 before you simplified it.
The forEach(CtConsumer) method was there
The evaluate, bindTo or prepareFor function is there too. See public CtQueryImpl#setInput(Object). It is just not part of public interface API, because I think that it makes the client API more complicated then needed. But let's discuss it later. First please focus on the description of concepts in PR #1076.

I do not need fast answer. Take your time

@pvojtechovsky
Copy link
Collaborator Author

Closed together with #1090

@monperrus
Copy link
Collaborator

monperrus commented Jan 12, 2017 via email

@pvojtechovsky
Copy link
Collaborator Author

You set the record

I am glad that it is not normal. :-) It was full of misunderstandings... Therefore it needed so long.
I believe other PRs will proceed much faster. Thanks for your patience too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants