Add batch query support for drop step [tp-tests]
- Reuse multi-query  optimization for TinkerPop's
- Change restriction on eligible multi-query traversals and allow multi-query optimizations to be used for queries with  steps
- Add release template for JanusGraph 1.1.0

Signed-off-by: Oleksandr Porunov <[email protected]>
porunov committed Jun 1, 2024
1 parent d9f89ed commit 72db163
Showing 26 changed files with 754 additions and 87 deletions.
77 changes: 77 additions & 0 deletions docs/changelog.md
@@ -28,6 +28,7 @@ All currently supported versions of JanusGraph are listed below.
| JanusGraph | Storage Version | Cassandra | HBase | Bigtable | ScyllaDB | Elasticsearch | Solr | TinkerPop | Spark | Scala |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| 1.0.z | 2 | 3.11.z, 4.0.z | 2.5.z | 1.3.0, 1.4.0, 1.5.z, 1.6.z, 1.7.z, 1.8.z, 1.9.z, 1.10.z, 1.11.z, 1.14.z | 5.y | 6.y, 7.y, 8.y | 8.y | 3.7.z | 3.2.z | 2.12.z |
| 1.1.z | 2 | 3.11.z, 4.0.z | 2.5.z | 1.3.0, 1.4.0, 1.5.z, 1.6.z, 1.7.z, 1.8.z, 1.9.z, 1.10.z, 1.11.z, 1.14.z | 5.y | 6.y, 7.y, 8.y | 8.y | 3.7.z | 3.2.z | 2.12.z |

!!! info
Even though ScyllaDB is marked as `N/A` prior to version 1.0.0, it was actually supported via the `cql` storage option.
@@ -49,6 +50,82 @@ The versions of JanusGraph listed below are outdated and will no longer receive

## Release Notes

### Version 1.1.0 (Release Date: ???)

/// tab | Maven
```xml
<dependency>
<groupId>org.janusgraph</groupId>
<artifactId>janusgraph-core</artifactId>
<version>1.1.0</version>
</dependency>
```
///

/// tab | Gradle
```groovy
compile "org.janusgraph:janusgraph-core:1.1.0"
```
///

**Tested Compatibility:**

* Apache Cassandra 3.11.10, 4.0.6
* Apache HBase 2.5.0
* Oracle BerkeleyJE 7.5.11
* ScyllaDB 5.1.4
* Elasticsearch 6.0.1, 6.6.0, 7.17.8, 8.10.4
* Apache Lucene 8.11.1
* Apache Solr 8.11.1
* Apache TinkerPop 3.7.3
* Java 8, 11

**Installed versions in the Pre-Packaged Distribution:**

* Cassandra 4.0.6
* Elasticsearch 7.14.0

#### Changes

For more information on features and bug fixes in 1.1.0, see the GitHub milestone:

- <https://github.com/JanusGraph/janusgraph/milestone/27?closed=1>

#### Assets

* [JavaDoc](https://javadoc.io/doc/org.janusgraph/janusgraph-core/1.1.0)
* [GitHub Release](https://github.com/JanusGraph/janusgraph/releases/tag/v1.1.0)
* [JanusGraph zip](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-1.1.0.zip)
* [JanusGraph zip with embedded Cassandra and ElasticSearch](https://github.com/JanusGraph/janusgraph/releases/download/v1.1.0/janusgraph-full-1.1.0.zip)

#### Upgrade Instructions

##### Batched Queries Enhancement: Introduction of `JanusGraphNoOpBarrierVertexOnlyStep`

In previous versions, when a query that could benefit from batch-query optimization (multi-query) was executed without
a user-defined barrier step, JanusGraph would inject a `NoOpBarrierStep` by default. This approach allowed batching
for edges and properties, which do not benefit from multi-query optimization.

Starting with JanusGraph 1.1.0, this behavior has been improved. The system now injects a
`JanusGraphNoOpBarrierVertexOnlyStep` instead of the standard `NoOpBarrierStep` when no barrier steps are detected.
This change ensures that batching is applied exclusively to vertices, which do benefit from batch queries,
while excluding edges and properties from the batching process.

If a user explicitly defines a `.barrier()` step in the query, the system will continue to use the `NoOpBarrierStep` as expected.

##### Batch Query Optimizations Now Support Traversals Containing the `drop()` Step

Starting with JanusGraph 1.1.0, batch optimizations for vertex removal have been introduced in the `drop()` step and
are enabled by default. Previously, any batch optimization would be skipped for queries containing at least one
`drop()` step. However, with this update, such queries are now eligible for batch query optimization (multi-query).

Please note that the `LazyBarrierStrategy` (a TinkerPop strategy) is disabled for any query that includes at least one `drop()` step.

To disable the `drop()` step optimization and maintain the previous behavior, users can set the following configuration:
```
query.batch.drop-step-mode=none
```
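
The difference between the two modes can be pictured with a toy model. The sketch below is not JanusGraph code: `CountingBackend`, `dropOneByOne`, `dropBatched`, and the batch size of 20 are hypothetical names and values chosen for illustration only. It simply counts how many requests a stand-in backend receives when vertices are removed one at a time (loosely analogous to `none`) versus in groups (loosely analogous to `all`).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model only: none of these types are JanusGraph classes. */
final class DropBatchingSketch {

    /** Hypothetical backend stub that counts the requests it receives. */
    static final class CountingBackend {
        int requests = 0;
        int removed = 0;

        void remove(List<Integer> vertexIds) {
            requests++;                  // one simulated round trip per call
            removed += vertexIds.size();
        }
    }

    /** Loosely models drop-step-mode=none: one request per vertex. */
    static void dropOneByOne(CountingBackend backend, List<Integer> vertexIds) {
        for (Integer id : vertexIds) {
            backend.remove(Collections.singletonList(id));
        }
    }

    /** Loosely models drop-step-mode=all: vertices are grouped into batches. */
    static void dropBatched(CountingBackend backend, List<Integer> vertexIds, int batchSize) {
        for (int start = 0; start < vertexIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, vertexIds.size());
            backend.remove(vertexIds.subList(start, end));
        }
    }

    /** Builds the ids 0..amount-1 as stand-ins for vertices. */
    static List<Integer> vertexIds(int amount) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < amount; i++) {
            ids.add(i);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Integer> ids = vertexIds(42);

        CountingBackend perVertex = new CountingBackend();
        dropOneByOne(perVertex, ids);

        CountingBackend batched = new CountingBackend();
        dropBatched(batched, ids, 20);   // batch size is an arbitrary assumption

        System.out.println("per-vertex requests: " + perVertex.requests);
        System.out.println("batched requests: " + batched.requests);
    }
}
```

With 42 stand-in vertices and a batch size of 20, the per-vertex path issues 42 requests and the batched path issues 3. The real saving depends on backend latency and on how JanusGraph sizes its batches, which this model does not attempt to reproduce.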

### Version 1.0.0 (Release Date: October 21, 2023)

/// tab | Maven
1 change: 1 addition & 0 deletions docs/configs/janusgraph-cfg.md
@@ -365,6 +365,7 @@ Configuration options to configure batch queries optimization behavior

| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch.drop-step-mode | Batching mode for `drop()` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all` - Drops all vertices in a batch.<br>- `none` - Skips drop batching optimization.<br> | String | all | MASKABLE |
| query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE |
| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)<br>- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)<br>- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch<br>properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.<br>In case the next step is not one of those properties access steps then this mode behaves same as `required_properties_only`.<br>In case the next step is one of the properties access steps with limited scope of properties, those properties will be<br>pre-fetched together in the same multi-query.<br>In case the next step is one of the properties access steps with unspecified scope of property keys then this mode<br>behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not<br>`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` - Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.label-step-mode | Labels pre-fetching mode for `label()` step. Used only when `query.batch.enabled` is `true`.<br>Supported modes:<br>- `all` - Pre-fetch labels for all vertices in a batch.<br>- `none` - Skips vertex labels pre-fetching optimization.<br> | String | all | MASKABLE |
3 changes: 2 additions & 1 deletion docs/operations/batch-processing.md
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,7 @@ Batched query processing takes into account two types of steps:

1. Batch compatible step. This is a step that executes batch requests. Currently, these steps
are: `out()`, `in()`, `both()`, `inE()`, `outE()`, `bothE()`, `has()`, `values()`, `properties()`, `valueMap()`,
- `propertyMap()`, `elementMap()`, `label()`.
+ `propertyMap()`, `elementMap()`, `label()`, `drop()`.
2. Parent step. This is a step that has local traversals with the same start. Such parent steps also implement the
interface `TraversalParent`. There are many such steps; examples include `and(...)`, `or(...)`,
`not(...)`, `order().by(...)`, `project("valueA", "valueB", "valueC").by(...).by(...).by(...)`, `union(..., ..., ...)`,
@@ -331,3 +331,4 @@ See configuration option `query.batch.has-step-mode` to control properties pre-f
See configuration option `query.batch.properties-mode` to control properties pre-fetching behaviour for `values`,
`properties`, `valueMap`, `propertyMap`, and `elementMap` steps.
See configuration option `query.batch.label-step-mode` to control labels pre-fetching behaviour for `label` step.
See configuration option `query.batch.drop-step-mode` to control drop batching behaviour for `drop` step.
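
As a rough sketch, several of the options referenced above could be set together in a JanusGraph properties file. The values shown are the defaults listed in the configuration reference touched by this commit. Placing them in a single file is an assumption made for illustration.

```properties
# Master switch; when false, all other query.batch.* options are ignored.
query.batch.enabled=true
# Batch the drop() step (default: all). Use `none` to keep the pre-1.1.0 behaviour.
query.batch.drop-step-mode=all
# Properties pre-fetching for has() steps (default shown).
query.batch.has-step-mode=required_and_next_properties
# Label pre-fetching for label() steps (default shown).
query.batch.label-step-mode=all
```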
@@ -30,6 +30,7 @@
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;
import org.apache.tinkerpop.gremlin.process.traversal.step.filter.DropStep;
import org.apache.tinkerpop.gremlin.process.traversal.step.filter.HasStep;
import org.apache.tinkerpop.gremlin.process.traversal.step.util.WithOptions;
import org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy;
@@ -59,6 +60,7 @@
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.RelationType;
import org.janusgraph.core.SchemaViolationException;
import org.janusgraph.core.Transaction;
import org.janusgraph.core.VertexLabel;
import org.janusgraph.core.VertexList;
import org.janusgraph.core.attribute.Cmp;
@@ -135,10 +137,12 @@
import org.janusgraph.graphdb.relations.StandardVertexProperty;
import org.janusgraph.graphdb.serializer.SpecialInt;
import org.janusgraph.graphdb.serializer.SpecialIntSerializer;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphDropStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphElementMapStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphHasStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertiesStep;
import org.janusgraph.graphdb.tinkerpop.optimize.step.JanusGraphPropertyMapStep;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryDropStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryHasStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryLabelStepStrategyMode;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryPropertiesStrategyMode;
@@ -214,6 +218,7 @@
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE_CLEAN_WAIT;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DB_CACHE_TIME;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.DROP_STEP_BATCH_MODE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.FORCE_INDEX_USAGE;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.HARD_MAX_LIMIT;
import static org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration.HAS_STEP_BATCH_MODE;
@@ -10023,11 +10028,7 @@ public void testMultiQueryDropsVertices() {

int verticesAmount = 42;

- for (int i = 0; i < verticesAmount; i++) {
- Vertex vertex = tx.addVertex("id", i);
- vertex.property("name", "name_test");
- vertex.property("details", "details_" + i);
- }
+ addVerticesForDropTest(verticesAmount, tx);

clopen();

@@ -10039,20 +10040,161 @@ public void testMultiQueryDropsVertices() {
.map(v -> (JanusGraphVertex) v)
.collect(Collectors.toList());

- int actualCount = tx.multiQuery(vertices).drop();
+ int actualCount = tx.multiQuery(vertices).drop().size();
clopen();

assertEquals(verticesAmount, actualCount);

- int afterDropCount = tx.traversal()
- .V()
- .has("name", "name_test")
- .toList()
- .size();
+ long afterDropCount = getVerticesForDropTestCount(tx.traversal());

assertEquals(0, afterDropCount);
}

@Test
public void testMultiQueryDropsStrategyModes() {

mgmt.makePropertyKey("id").dataType(Integer.class).cardinality(Cardinality.SINGLE).make();
PropertyKey nameProp = mgmt.makePropertyKey("name").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.makePropertyKey("details").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.buildIndex("nameIndex", Vertex.class).addKey(nameProp).buildCompositeIndex();

finishSchema();

long verticesAmount = 42;

// Mode: NONE

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.NONE.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
TraversalMetrics profileT = graph.traversal().V().drop().profile().next();
assertTrue(profileT.getMetrics().stream().anyMatch(metrics -> metrics.getName().equals(DropStep.class.getSimpleName())));
graph.tx().commit();
assertEquals(0, getVerticesForDropTestCount());

// Mode: ALL

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
profileT = graph.traversal().V().drop().profile().next();
assertEquals("true", profileT.getMetrics().stream().filter(metrics -> metrics.getName().equals(JanusGraphDropStep.class.getSimpleName())).findAny().get().getAnnotation("multi"));
graph.tx().commit();
assertEquals(0, getVerticesForDropTestCount());

// `limit` with `drop` step.

addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.NONE.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
int limitSize = 2;
profileT = graph.traversal().V().limit(limitSize).drop().profile().next();
assertTrue(profileT.getMetrics().stream().anyMatch(metrics -> metrics.getName().equals(DropStep.class.getSimpleName())));
graph.tx().commit();
long afterDropCount = getVerticesForDropTestCount();
assertEquals(verticesAmount-limitSize, afterDropCount);
graph.traversal().V().drop().iterate();
graph.tx().commit();
addVerticesForDropTest(verticesAmount);
graph.tx().commit();
clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());
assertEquals(verticesAmount, getVerticesForDropTestCount());
profileT = graph.traversal().V().limit(limitSize).drop().profile().next();
assertEquals("true", profileT.getMetrics().stream().filter(metrics -> metrics.getName().equals(JanusGraphDropStep.class.getSimpleName())).findAny().get().getAnnotation("multi"));
graph.tx().commit();
afterDropCount = getVerticesForDropTestCount();
assertEquals(verticesAmount-limitSize, afterDropCount);
}

@Test
public void testMetaPropertiesDrop(){
mgmt.makePropertyKey("id").dataType(Integer.class).cardinality(Cardinality.SINGLE).make();
PropertyKey nameProp = mgmt.makePropertyKey("name").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.makePropertyKey("details").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.buildIndex("nameIndex", Vertex.class).addKey(nameProp).buildCompositeIndex();

finishSchema();

long verticesAmount = 42;

for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = tx.addVertex("id", i);
VertexProperty<String> property = vertex.property("name", "name_test");
property.property("details", "details_" + i);
}
graph.tx().commit();

clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());

assertEquals(verticesAmount, graph.traversal().V().properties("name").properties("details").count().next());

graph.traversal().V().properties("name").properties("details").drop().hasNext();

assertEquals(0, graph.traversal().V().properties("name").properties("details").count().next());

graph.tx().commit();

assertEquals(0, graph.traversal().V().properties("name").properties("details").count().next());
assertEquals(verticesAmount, graph.traversal().V().has("name").count().next());
}

@Test
public void testEdgePropertiesDrop(){
mgmt.makePropertyKey("id").dataType(Integer.class).cardinality(Cardinality.SINGLE).make();
mgmt.makePropertyKey("name").dataType(String.class).cardinality(Cardinality.SINGLE).make();
mgmt.makeEdgeLabel("relate").make();

finishSchema();

long verticesAmount = 42;

for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = tx.addVertex("id", i);
Vertex vertex2 = tx.addVertex("id", i+verticesAmount);
vertex.addEdge("relate", vertex2).property("name", "name_"+i);
}

graph.tx().commit();

clopen(option(DROP_STEP_BATCH_MODE), MultiQueryDropStepStrategyMode.ALL.getConfigName());

assertEquals(verticesAmount, graph.traversal().E().properties("name").count().next());

graph.traversal().E().properties("name").drop().hasNext();

assertEquals(0, graph.traversal().E().properties("name").count().next());

graph.tx().commit();

assertEquals(0, graph.traversal().E().properties("name").count().next());
assertEquals(verticesAmount, graph.traversal().E().count().next());
}

private void addVerticesForDropTest(long verticesAmount){
addVerticesForDropTest(verticesAmount, graph);
}

private long getVerticesForDropTestCount(){
return getVerticesForDropTestCount(graph.traversal());
}

private void addVerticesForDropTest(long verticesAmount, Transaction tx){
for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = tx.addVertex("id", i);
vertex.property("name", "name_test");
vertex.property("details", "details_" + i);
}
}

private long getVerticesForDropTestCount(GraphTraversalSource g){
return g.V()
.has("name", "name_test")
.count().next();
}

@ParameterizedTest
@ValueSource(booleans = {true, false})
public void testParallelBackendOps(boolean parallelBackendOpsEnabled) {
@@ -27,6 +27,7 @@
import org.janusgraph.diskstorage.configuration.WriteConfiguration;
import org.janusgraph.diskstorage.cql.CQLConfigOptions;
import org.janusgraph.graphdb.configuration.GraphDatabaseConfiguration;
import org.janusgraph.graphdb.tinkerpop.optimize.strategy.MultiQueryDropStepStrategyMode;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
@@ -65,6 +66,7 @@ public WriteConfiguration getConfiguration() {
config.set(GraphDatabaseConfiguration.STORAGE_BACKEND,"cql");
config.set(CQLConfigOptions.LOCAL_DATACENTER, "dc1");
config.set(GraphDatabaseConfiguration.USE_MULTIQUERY, true);
config.set(GraphDatabaseConfiguration.DROP_STEP_BATCH_MODE, MultiQueryDropStepStrategyMode.NONE.getConfigName());
return config.getConfiguration();
}

@@ -103,7 +105,7 @@ public Integer dropVertices() {
.map(v -> (JanusGraphVertex) v)
.collect(Collectors.toList());

- dropCount = tx.multiQuery(vertices).drop();
+ dropCount = tx.multiQuery(vertices).drop().size();
} else {
dropCount = tx.traversal()
.V()
@@ -117,6 +119,27 @@ public Integer dropVertices() {
return dropCount;
}

@Benchmark
public Integer dropVerticesGremlinQuery() {

JanusGraphTransaction tx;
if (isMultiDrop) {
tx = graph.buildTransaction().setDropStepStrategyMode(MultiQueryDropStepStrategyMode.ALL).start();
} else {
tx = graph.buildTransaction().setDropStepStrategyMode(MultiQueryDropStepStrategyMode.NONE).start();
}

Integer dropCount = tx.traversal()
.V()
.has("name", "name_test")
.drop()
.toList()
.size();

tx.rollback();
return dropCount;
}

private void addVertices() {
for (int i = 0; i < verticesAmount; i++) {
Vertex vertex = graph.addVertex("id", i);