Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a VALUES clause to the query of a SERVICE clause to simplify the execution #1341

Merged
merged 13 commits into from
Jun 5, 2024
22 changes: 14 additions & 8 deletions src/engine/QueryPlanner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1641,6 +1641,13 @@
return {makeSubtreePlan<OptionalJoin>(_qec, a._qet, b._qet)};
}

// Check if one of the two Operations is a SERVICE. If so, we can try
// to simplify the Service Query using the result of the other operation.
if (auto opt = createJoinWithService(a, b, jcs)) {
candidates.push_back(std::move(opt.value()));
return candidates;
}

Check warning on line 1649 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1647-L1649

Added lines #L1647 - L1649 were not covered by tests

if (jcs.size() >= 2) {
// If there are two or more join columns and we are not using the
// TwoColumnJoin (the if part before this comment), use a multiColumnJoin.
Expand Down Expand Up @@ -1669,11 +1676,6 @@
candidates.push_back(std::move(opt.value()));
}

if (auto opt = createJoinWithService(a, b, jcs)) {
candidates.push_back(std::move(opt.value()));
return candidates;
}

// Test if one of `a` or `b` is a transitive path to which we can bind the
// other one.
if (auto opt = createJoinWithTransitivePath(a, b, jcs)) {
Expand Down Expand Up @@ -1783,20 +1785,24 @@
auto bRootOp = std::dynamic_pointer_cast<Service>(b._qet->getRootOperation());

// Exactly one of the two Operations can be a service.
if (aRootOp == bRootOp) {
if (static_cast<bool>(aRootOp) == static_cast<bool>(bRootOp)) {
return std::nullopt;
}

auto service = aRootOp ? aRootOp : bRootOp;
auto sibling = bRootOp ? a : b;

service->setSiblingTree(sibling._qet);

Check warning on line 1795 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1795

Added line #L1795 was not covered by tests

SubtreePlan plan = makeSubtreePlan<Join>(
service->getExecutionContext(), a._qet, b._qet, jcs[0][0], jcs[0][1]);
const auto& qec = service->getExecutionContext();

Check warning on line 1797 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1797

Added line #L1797 was not covered by tests

SubtreePlan plan =

Check warning on line 1799 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1799

Added line #L1799 was not covered by tests
jcs.size() == 1
? makeSubtreePlan<Join>(qec, a._qet, b._qet, jcs[0][0], jcs[0][1])
: makeSubtreePlan<MultiColumnJoin>(qec, a._qet, b._qet);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add unit tests for this planning (in the QueryPlannerTest.cpp and QueryPlannerTestHelpers.h you can add a matcher for a service clause, have a look at that code, and let me know if there is trouble)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have implemented a simple test for this in this commit, let me know if that is sufficient or provide feedback for further improvements.

mergeSubtreePlanIds(plan, a, b);

Check warning on line 1803 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1801-L1803

Added lines #L1801 - L1803 were not covered by tests
joka921 marked this conversation as resolved.
Show resolved Hide resolved

return plan;

Check warning on line 1805 in src/engine/QueryPlanner.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/QueryPlanner.cpp#L1805

Added line #L1805 was not covered by tests
}

// _____________________________________________________________________
Expand Down
32 changes: 17 additions & 15 deletions src/engine/Service.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -97,11 +97,12 @@
serviceIriString.remove_suffix(1);
ad_utility::httpUtils::Url serviceUrl{serviceIriString};

// Try to optimize the Service Clause using it's sibling Operation.
// Try to simplify the Service Query using it's sibling Operation.
if (auto valuesClause = getSiblingValuesClause(); valuesClause.has_value()) {
auto openBracketPos = parsedServiceClause_.graphPatternAsString_.find('{');
parsedServiceClause_.graphPatternAsString_ =
"{ " + valuesClause.value() +
parsedServiceClause_.graphPatternAsString_.substr(2);
"{\n" + valuesClause.value() + '\n' +
parsedServiceClause_.graphPatternAsString_.substr(openBracketPos + 1);
}

// Construct the query to be sent to the SPARQL endpoint.
Expand Down Expand Up @@ -171,6 +172,7 @@
return {std::move(idTable), resultSortedOn(), std::move(localVocab)};
}

// ____________________________________________________________________________
std::optional<std::string> Service::getSiblingValuesClause() const {
joka921 marked this conversation as resolved.
Show resolved Hide resolved
if (siblingTree_ == nullptr) {
return std::nullopt;
Expand All @@ -179,44 +181,44 @@
const auto& siblingResult = siblingTree_->getResult();
if (siblingResult->size() >
RuntimeParameters().get<"service-max-value-rows">()) {
return std::nullopt;
}

Check warning on line 185 in src/engine/Service.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/Service.cpp#L184-L185

Added lines #L184 - L185 were not covered by tests

std::vector<ColumnIndex> commonColumnIndices;
const auto& siblingVars = siblingTree_->getVariableColumns();
std::string vars = "";
std::string vars = "(";
for (const auto& localVar : parsedServiceClause_.visibleVariables_) {
auto it = siblingVars.find(localVar);
if (it == siblingVars.end()) {
continue;
}

Check warning on line 194 in src/engine/Service.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/Service.cpp#L193-L194

Added lines #L193 - L194 were not covered by tests
vars += it->first.name() + " ";
commonColumnIndices.push_back(it->second.columnIndex_);
}
vars.pop_back();
if (commonColumnIndices.size() > 1) {
vars = "(" + vars + ")";
}
vars.back() = ')';

ad_utility::HashSet<std::string> rowSet;

std::string values = " { ";
for (size_t rowIndex = 0; rowIndex < siblingResult->size(); ++rowIndex) {
std::string row;
std::string row = "(";
for (size_t i = 0; i < commonColumnIndices.size(); ++i) {
const auto& optionalString = ExportQueryExecutionTrees::idToStringAndType(
siblingTree_->getRootOperation()->getIndex(),
siblingResult->idTable()(rowIndex, commonColumnIndices[i]),
siblingResult->localVocab());

if (optionalString.has_value()) {
row += optionalString.value().first;
if (i < commonColumnIndices.size() - 1) {
row += " ";
}
row += optionalString.value().first + ' ';
}
}
if (commonColumnIndices.size() > 1) {
row = "(" + row + ")";
row.back() = ')';

if (rowSet.contains(row)) {
continue;
}

Check warning on line 219 in src/engine/Service.cpp

View check run for this annotation

Codecov / codecov/patch

src/engine/Service.cpp#L218-L219

Added lines #L218 - L219 were not covered by tests

rowSet.insert(row);
values += row + " ";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all your someString += something + something other
use absl::StrAppend. It is more efficient and more idiomatic. (three places in this file)

}

Expand Down
5 changes: 4 additions & 1 deletion src/engine/Service.h
Original file line number Diff line number Diff line change
Expand Up @@ -60,9 +60,11 @@
GetTsvFunction getTsvFunction = sendHttpOrHttpsRequest,
std::shared_ptr<QueryExecutionTree> siblingTree = nullptr);

inline void setSiblingTree(std::shared_ptr<QueryExecutionTree> siblingTree) {
// Set the siblingTree (subTree that will later be joined with the Result of
// the Service Operation), used to reduce the Service Queries Complexity.
void setSiblingTree(std::shared_ptr<QueryExecutionTree> siblingTree) {
siblingTree_ = siblingTree;
}

Check warning on line 67 in src/engine/Service.h

View check run for this annotation

Codecov / codecov/patch

src/engine/Service.h#L65-L67

Added lines #L65 - L67 were not covered by tests

// Methods inherited from base class `Operation`.
std::string getDescriptor() const override;
Expand Down Expand Up @@ -90,6 +92,7 @@
// Compute the result using `getTsvFunction_`.
ResultTable computeResult() override;

// Get a VALUES clause that contains the values of the siblingTree's result.
std::optional<std::string> getSiblingValuesClause() const;
joka921 marked this conversation as resolved.
Show resolved Hide resolved

// Write the given TSV result to the given result object. The `I` is the width
Expand Down
Loading