Add :root and :in selectors, fix variable bug
This commit adds support for the `:in` and `:root` functions. The
`:in` function is a simpler way to check if a shape is in the result of
an expression, typically used to test if a variable or a root
expression contains a shape. `:root` is used to create rooted common
subexpressions that are evaluated once against every shape in the
model. `:root` expressions are evaluated in an isolated context, so
any variables used or stored by them are not accessible outside the
root selector. `:root` selectors allow selection to be broken into
multiple steps and evaluated globally.

Let's say you want all number shapes that are used in operation
inputs, but not used in operation outputs. This can be done today
using the following expression:

```
service
$outputs(~> operation -[output]-> ~> number)
~> operation -[input]-> ~> number
:not([@: @{id} = @{var|outputs|id}])
```

With the addition of the `:in` selector, this gets easier because we
can avoid using a scoped attribute selector:

```
service
$outputs(~> operation -[output]-> ~> number)
~> operation -[input]-> ~> number
:not(:in(${outputs}))
```

(Note: to make this work, I had to uncover and fix a bug in the
implementation of how we store variables. We previously used
`Collection#add` as a `Receiver`, but that method will return false if
it's already seen a shape, which is wrong.)
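
To illustrate the variable bug, here is a simplified sketch (not the actual
Smithy code). It assumes a receiver returns `false` to signal that no more
shapes should be sent; `ReceiverSketch`, `pushAll`, and the string shape
names are hypothetical stand-ins for the internal `Receiver` plumbing.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch: a Predicate<String> plays the role of a Receiver.
final class ReceiverSketch {

    // Sends each shape to the receiver and stops as soon as the receiver
    // returns false (assumed convention: false means "stop sending shapes").
    static void pushAll(List<String> shapes, Predicate<String> receiver) {
        for (String shape : shapes) {
            if (!receiver.test(shape)) {
                return;
            }
        }
    }

    // Buggy variant: Set#add returns false for a duplicate, so the first
    // repeated shape is misread as a request to stop.
    static Set<String> collectWithSetAdd(List<String> shapes) {
        Set<String> seen = new LinkedHashSet<>();
        pushAll(shapes, seen::add);
        return seen;
    }

    // Fixed variant: store the shape and always ask for more shapes.
    static Set<String> collectWithAlwaysContinue(List<String> shapes) {
        Set<String> seen = new LinkedHashSet<>();
        pushAll(shapes, shape -> {
            seen.add(shape);
            return true;
        });
        return seen;
    }

    public static void main(String[] args) {
        List<String> shapes = List.of("A", "B", "A", "C");
        System.out.println(collectWithSetAdd(shapes));         // prints [A, B]
        System.out.println(collectWithAlwaysContinue(shapes)); // prints [A, B, C]
    }
}
```

In the buggy variant the duplicate `A` ends collection early, so `C` never
reaches the variable.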

With the addition of `:root`, you can use a much simpler expression:

```
number
:in(:root(service ~> operation -[input]-> ~> number))
:not(:in(:root(service ~> operation -[output]-> ~> number)))
```

(Note: root expressions are run once and their results are cached, so
there is no need to store them in a variable.)
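
The "run once and cached" behavior can be pictured as a memoized supplier.
This is only a sketch under assumptions: `RootCacheSketch`, `Memoized`, and
the shape IDs are hypothetical, and the real implementation may cache
differently (for example, eagerly).

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class RootCacheSketch {

    // Wraps a computation so that it runs at most once, on first use.
    static final class Memoized<T> implements Supplier<T> {
        private final Supplier<T> delegate;
        private T value;
        private boolean evaluated;

        Memoized(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T get() {
            if (!evaluated) {
                value = delegate.get();
                evaluated = true;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        AtomicInteger evaluations = new AtomicInteger();
        // Stand-in for evaluating a :root subexpression against the whole model.
        Supplier<Set<String>> rootResult = new Memoized<>(() -> {
            evaluations.incrementAndGet();
            return Set.of("smithy.example#Count");
        });

        // The result is consulted once per candidate shape but computed once.
        String[] candidates = {"smithy.example#Count", "smithy.example#Other"};
        for (String candidate : candidates) {
            System.out.println(candidate + " matched: " + rootResult.get().contains(candidate));
        }
        System.out.println("evaluations: " + evaluations.get()); // prints evaluations: 1
    }
}
```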

The three selectors above _seem_ to be exactly the same; however, the
`:root` expression gives a different result when working with models
that contain multiple services. In the first two expressions, if any
service uses shape X in input and not output, then X is a result.
In the `:root` expression, X is only part of the result if no service
uses it in its output shape closure.
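
The difference can be modeled with plain sets. This is a sketch of the
semantics described above, not of the selector engine: the class, method
names, service names, and shapes `X` and `Y` are hypothetical, and each map
holds the numbers reachable from a service's operation inputs or outputs.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RootSemanticsSketch {

    // Variable-based selectors: each service is evaluated on its own and
    // the per-service matches are aggregated.
    static Set<String> perService(Map<String, Set<String>> inputs, Map<String, Set<String>> outputs) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : inputs.entrySet()) {
            Set<String> candidates = new TreeSet<>(entry.getValue());
            candidates.removeAll(outputs.getOrDefault(entry.getKey(), Set.<String>of()));
            result.addAll(candidates);
        }
        return result;
    }

    // :root-based selector: inputs and outputs are each collected across the
    // whole model before they are compared.
    static Set<String> global(Map<String, Set<String>> inputs, Map<String, Set<String>> outputs) {
        Set<String> allOutputs = new TreeSet<>();
        outputs.values().forEach(allOutputs::addAll);
        Set<String> result = new TreeSet<>();
        inputs.values().forEach(result::addAll);
        result.removeAll(allOutputs);
        return result;
    }

    public static void main(String[] args) {
        // ServiceA uses X and Y in input only; ServiceB uses X in input and output.
        Map<String, Set<String>> inputs = Map.of(
                "ServiceA", Set.of("X", "Y"),
                "ServiceB", Set.of("X"));
        Map<String, Set<String>> outputs = Map.of(
                "ServiceA", Set.<String>of(),
                "ServiceB", Set.of("X"));

        System.out.println(perService(inputs, outputs)); // prints [X, Y]
        System.out.println(global(inputs, outputs));     // prints [Y]
    }
}
```

`X` survives the per-service evaluation because ServiceA never uses it in
output, but the global evaluation drops it because ServiceB does.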
mtdowling committed Mar 23, 2023
1 parent 1eca2d0 commit 2b4f78e
Showing 20 changed files with 660 additions and 182 deletions.
82 changes: 82 additions & 0 deletions docs/source-2.0/spec/selectors.rst
@@ -1401,6 +1401,88 @@ trait applied to it:
    service :not(-[trait]-> [trait|protocolDefinition])


.. _selector-in-function:

``:in``
-------

The ``:in`` function is used to test if a shape is contained within the
result of an expression. This function is most useful when testing if a
:ref:`variable <selector-variables>` or the result of a
:ref:`root <selector-root-function>` function contains a shape. The ``:in``
function requires exactly one selector. If a shape is contained in the
result of evaluating the selector, the shape is yielded from the function.

The following example finds all numbers that are used in service operation
inputs and not used in service operation outputs:

.. code-block:: none
    :caption: :in example using variables
    :name: in-variable-input-output-example

    service
    $outputs(~> operation -[output]-> ~> number)
    ~> operation -[input]-> ~> number
    :not(:in(${outputs}))

.. note::

    The above example returns the aggregate results of applying the selector
    to every shape: if a model contains multiple services, and one of the
    services uses a number 'X' in input and not output, but another service
    uses 'X' in both input and output, 'X' is part of the matched shapes.
    Use the :ref:`:root function <selector-root-function>` to match shapes
    globally.


.. _selector-root-function:

``:root``
---------

The ``:root`` function evaluates a subexpression against *all* shapes in the
model and yields all matches. The ``:root`` function is useful for breaking
a selector down into smaller operations, and it works best when used with
:ref:`variables <selector-variables>` or the :ref:`:in function <selector-in-function>`.
The ``:root`` function requires exactly one selector.

The following example finds all numbers that are used in any operation inputs
and not used in any operation outputs:

.. code-block:: none

    number
    :in(:root(service ~> operation -[input]-> ~> number))
    :not(:in(:root(service ~> operation -[output]-> ~> number)))

.. note::

    The above example is similar to ":ref:`in-variable-input-output-example`"
    but works independent of services. That is, if a model contains multiple
    services, and one of the services uses a number 'X' in input and not
    output, but another service uses 'X' in both input and output, 'X'
    *is not* part of the matched shapes.


:root functions are isolated subexpressions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The selector passed to a ``:root`` function is evaluated in a context that
is isolated from the rest of the expression. It cannot access variables
defined outside the function, and variables defined inside it do not persist
outside the function.


:root functions are evaluated at most once
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There is no need to store the result of a ``:root`` function in a variable
because ``:root`` selector functions are considered global common
subexpressions and are evaluated at most once during the selection process.
Implementations MAY choose to evaluate ``:root`` expressions eagerly or
lazily, though they MUST evaluate ``:root`` expressions no more than once.


``:topdown``
------------

@@ -21,6 +21,7 @@
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.logging.Logger;
import java.util.stream.Stream;
import software.amazon.smithy.build.model.SmithyBuildConfig;
import software.amazon.smithy.cli.ArgumentReceiver;
@@ -41,6 +42,8 @@

final class SelectCommand extends ClasspathCommand {

private static final Logger LOGGER = Logger.getLogger(SelectCommand.class.getName());

SelectCommand(String parentCommandName, DependencyResolver.Factory dependencyResolverFactory) {
super(parentCommandName, dependencyResolverFactory);
}
@@ -117,6 +120,7 @@ int runWithClassLoader(SmithyBuildConfig config, Arguments arguments, Env env, L
Model model = CommandUtils.buildModel(arguments, models, env, env.stderr(), true, config);
Selector selector = options.selector();

long startTime = System.nanoTime();
if (!options.vars()) {
sortShapeIds(selector.select(model)).forEach(stdout::println);
} else {
@@ -130,6 +134,8 @@ int runWithClassLoader(SmithyBuildConfig config, Arguments arguments, Env env, L
});
stdout.println(Node.prettyPrintJson(new ArrayNode(result, SourceLocation.NONE)));
}
long endTime = System.nanoTime();
LOGGER.fine(() -> "Select time: " + ((endTime - startTime) / 1000000) + "ms");

return 0;
}
10 changes: 10 additions & 0 deletions smithy-model/src/main/java/software/amazon/smithy/model/Model.java
@@ -802,6 +802,16 @@ public boolean contains(Object o) {
public Iterator<Shape> iterator() {
return shapeMap.values().iterator();
}

@Override
public Stream<Shape> stream() {
return shapes();
}

@Override
public Stream<Shape> parallelStream() {
return shapes().parallel();
}
};
}

@@ -15,7 +15,9 @@

package software.amazon.smithy.model.selector;

import java.util.Collection;
import java.util.List;
import software.amazon.smithy.model.Model;
import software.amazon.smithy.model.shapes.Shape;

/**
@@ -33,122 +35,38 @@ private AndSelector() {}
static InternalSelector of(List<InternalSelector> selectors) {
switch (selectors.size()) {
case 0:
// This happens when selectors are optimized (i.e., the first internal
// selector is a shape type and it gets applied in Model.shape() before
// pushing shapes through the selector.
return InternalSelector.IDENTITY;
case 1:
// If there's only a single selector, then no need to wrap.
return selectors.get(0);
case 2:
// Cases 2-7 are optimizations that make selectors about
// 40% faster based on JMH benchmarks (at least on my machine,
// JDK 11.0.5, Java HotSpot(TM) 64-Bit Server VM, 11.0.5+10-LTS).
// I stopped at 7 because, it needs to stop somewhere, and it's lucky.
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, n);
});
};
case 3:
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, (c3, s3) -> {
return selectors.get(2).push(c3, s3, n);
});
});
};
case 4:
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, (c3, s3) -> {
return selectors.get(2).push(c3, s3, (c4, s4) -> {
return selectors.get(3).push(c4, s4, n);
});
});
});
};
case 5:
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, (c3, s3) -> {
return selectors.get(2).push(c3, s3, (c4, s4) -> {
return selectors.get(3).push(c4, s4, (c5, s5) -> {
return selectors.get(4).push(c5, s5, n);
});
});
});
});
};
case 6:
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, (c3, s3) -> {
return selectors.get(2).push(c3, s3, (c4, s4) -> {
return selectors.get(3).push(c4, s4, (c5, s5) -> {
return selectors.get(4).push(c5, s5, (c6, s6) -> {
return selectors.get(5).push(c6, s6, n);
});
});
});
});
});
};
case 7:
return (c, s, n) -> {
return selectors.get(0).push(c, s, (c2, s2) -> {
return selectors.get(1).push(c2, s2, (c3, s3) -> {
return selectors.get(2).push(c3, s3, (c4, s4) -> {
return selectors.get(3).push(c4, s4, (c5, s5) -> {
return selectors.get(4).push(c5, s5, (c6, s6) -> {
return selectors.get(5).push(c6, s6, (c7, s7) -> {
return selectors.get(6).push(c7, s7, n);
});
});
});
});
});
});
};
return new IntermediateAndSelector(selectors.get(0), selectors.get(1));
default:
return new RecursiveAndSelector(selectors);
InternalSelector result = selectors.get(selectors.size() - 1);
for (int i = selectors.size() - 2; i >= 0; i--) {
result = new IntermediateAndSelector(selectors.get(i), result);
}
return result;
}
}

static final class RecursiveAndSelector implements InternalSelector {

private final List<InternalSelector> selectors;
private final int terminalSelectorIndex;
static final class IntermediateAndSelector implements InternalSelector {
private final InternalSelector leftSelector;
private final InternalSelector rightSelector;

private RecursiveAndSelector(List<InternalSelector> selectors) {
this.selectors = selectors;
this.terminalSelectorIndex = this.selectors.size() - 1;
IntermediateAndSelector(InternalSelector leftSelector, InternalSelector rightSelector) {
this.leftSelector = leftSelector;
this.rightSelector = rightSelector;
}

@Override
public boolean push(Context context, Shape shape, Receiver next) {
// This is safe since the number of selectors is always >= 2.
return selectors.get(0).push(context, shape, new State(1, next));
public boolean push(Context ctx, Shape shape, Receiver next) {
return leftSelector.push(ctx, shape, (c, s) -> rightSelector.push(c, s, next));
}

private final class State implements Receiver {

private final int position;
private final Receiver downstream;

private State(int position, Receiver downstream) {
this.position = position;
this.downstream = downstream;
}

@Override
public boolean apply(Context context, Shape shape) {
if (position == terminalSelectorIndex) {
return selectors.get(position).push(context, shape, downstream);
} else {
return selectors.get(position).push(context, shape, new State(position + 1, downstream));
}
}
@Override
public Collection<? extends Shape> getStartingShapes(Model model) {
return leftSelector.getStartingShapes(model);
}
}
}
@@ -34,6 +34,7 @@ final class AttributeSelector implements InternalSelector {
private final List<AttributeValue> expected;
private final AttributeComparator comparator;
private final boolean caseInsensitive;
private final Function<Model, Collection<? extends Shape>> optimizer;

AttributeSelector(
List<String> path,
@@ -54,32 +55,34 @@ final class AttributeSelector implements InternalSelector {
this.expected.add(AttributeValue.literal(validValue));
}
}
}

static AttributeSelector existence(List<String> path) {
return new AttributeSelector(path, null, null, false);
}

@Override
public Function<Model, Collection<? extends Shape>> optimize() {
// Optimization for loading shapes with a specific trait.
// This optimization can only be applied when there's no comparator,
// and it doesn't matter how deep into the trait the selector descends.
if (comparator == null
&& path.size() >= 2
&& path.get(0).equals("trait") // only match on traits
&& !path.get(1).startsWith("(")) { // don't match projections
return model -> {
optimizer = model -> {
// The trait name might be relative to the prelude, so ensure it's absolute.
String absoluteShapeId = Trait.makeAbsoluteName(path.get(1));
ShapeId trait = ShapeId.from(absoluteShapeId);
return model.getShapesWithTrait(trait);
};
} else {
return null;
optimizer = Model::toSet;
}
}

static AttributeSelector existence(List<String> path) {
return new AttributeSelector(path, null, null, false);
}

@Override
public Collection<? extends Shape> getStartingShapes(Model model) {
return optimizer.apply(model);
}

@Override
public boolean push(Context context, Shape shape, Receiver next) {
if (matchesAttribute(shape, context)) {