Zhanfu/csr Framework for the CSR structure #4688

Merged
merged 9 commits into from
Oct 27, 2022
Changes from 3 commits
110 changes: 47 additions & 63 deletions src/graph/executor/algo/IsomorExecutor.cpp
@@ -12,77 +12,61 @@ namespace graph {
folly::Future<Status> IsomorExecutor::execute() {
// TODO: Replace the following codes with subgraph matching. Return type.
SCOPED_TIMER(&execTime_);
auto* subgraph = asNode<Subgraph>(node());
auto* isomor = asNode<Isomor>(node());
DataSet ds;
ds.colNames = subgraph->colNames();
ds.colNames = isomor->colNames();

uint32_t steps = subgraph->steps();
const auto& currentStepVal = ectx_->getValue(subgraph->currentStepVar());
DCHECK(currentStepVal.isInt());
auto currentStep = currentStepVal.getInt();
auto resultVar = subgraph->resultVar();
auto iterDV = ectx_->getResult(isomor->getdScanVOut()).iter();
auto iterQV = ectx_->getResult(isomor->getqScanVOut()).iter();
auto iterDE = ectx_->getResult(isomor->getdScanEOut()).iter();
auto iterQE = ectx_->getResult(isomor->getqScanEOut()).iter();

auto iter = ectx_->getResult(subgraph->inputVar()).iter();
auto gnSize = iter->size();
unsigned int v_count = iterDV->size();
unsigned int l_count = iterDV->size();
unsigned int e_count = iterDE->size();

ResultBuilder builder;
builder.value(iter->valuePtr());
unsigned int *offset = new unsigned int[v_count + 2];
Contributor

(1) Please explain via comments why the size is +2
(2) Please use vector. Don't use new.

Author

(1) offset[i] records the starting position of node i's neighbors, so offset[0] should be 0, and offset[1] is both the end position of node 0 and the start position of node 1. The offset array therefore needs at least v_count + 1 entries.
(2) Ok
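
To illustrate the reply above, here is a minimal sketch of how a CSR offset array with v_count + 1 entries is read, using std::vector instead of new as requested. The names CsrGraph, degree and neighborsOf are hypothetical and are not taken from the PR; vertices are assumed to be numbered 0..v_count-1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CSR container (illustration only, not the PR's data structure).
struct CsrGraph {
  // offsets has vertexCount + 1 entries: offsets[v] is where v's neighbors
  // start in `neighbors`, and offsets[v + 1] is where they end. The extra
  // last entry closes the range of the last vertex.
  std::vector<uint32_t> offsets;
  std::vector<uint32_t> neighbors;

  uint32_t vertexCount() const {
    return static_cast<uint32_t>(offsets.size()) - 1;
  }

  // Number of outgoing neighbors of v.
  uint32_t degree(uint32_t v) const {
    assert(v + 1 < offsets.size());
    return offsets[v + 1] - offsets[v];
  }

  // Copy of v's neighbor range, in stored order.
  std::vector<uint32_t> neighborsOf(uint32_t v) const {
    assert(v + 1 < offsets.size());
    return std::vector<uint32_t>(neighbors.begin() + offsets[v],
                                 neighbors.begin() + offsets[v + 1]);
  }
};
```

With vectors there is nothing to delete at the end of execute(), which also removes the risk of mismatched new[]/delete pairs.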

unsigned int *neighbors = new unsigned int[e_count * 2];
Contributor
@wenhaocs Sep 30, 2022

Remember that for every edge A-B, we actually store 2 copies, A->B and B->A. You need to take this into consideration. The two copies can be distinguished by the sign of the edge type: positive for one direction and negative for the other. Please think about it.

Author

Ok
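
One possible way to account for the two stored copies, sketched under the assumption that each scanned edge row yields a source, a destination and a signed edge type, and that the matching treats edges as directed. EdgeRecord and keepForwardEdges are hypothetical names used only for this illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical flattened edge row (illustration only).
struct EdgeRecord {
  int64_t src;
  int64_t dst;
  int64_t type;  // assumed: > 0 for the forward copy, < 0 for the reverse copy
};

// Keeps one record per logical edge by dropping the reverse copies.
std::vector<EdgeRecord> keepForwardEdges(const std::vector<EdgeRecord>& scanned) {
  std::vector<EdgeRecord> forward;
  forward.reserve(scanned.size() / 2 + 1);
  for (const auto& e : scanned) {
    if (e.type > 0) {
      forward.push_back(e);
    }
  }
  return forward;
}
```

If the scan does return both copies, the neighbors array could then be sized from the filtered count rather than from e_count * 2.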

unsigned int *labels = new unsigned int[l_count];
// load data vertices id and tags
while (iterDV->valid()) {
const auto v = iterDV->getColumn(nebula::kVid); // check if v is a vertex
Contributor
@wenhaocs Sep 30, 2022

What are v and v2? Please use meaningful names.

Author

I changed this in the new version of the code. v holds the vertex ID and v2 holds the label ID.

auto v_id = v.getInt();
const auto v2 = iterDV->getColumn(1); // get label
auto l_id = v2.getInt();
// unsigned int v_id = (unsigned int)v.getInt(0);
labels[v_id] = l_id; // Tag Id
Contributor

Can vID be used as the index of the labels array? There is no restriction on what a vID looks like; for example, vIDs may range from 1000 to 100000.

Contributor

Therefore, it's better to use a map here.
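
A minimal sketch of the map-based approach suggested above, assuming integer vIDs: each vID is assigned a dense index 0, 1, 2, ... on first sight, and labels are stored by that dense index. DenseVertexIndex and its members are hypothetical names, not code from the PR.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper (illustration only): maps arbitrary vIDs to dense
// array indices so that label and CSR arrays can stay compact.
struct DenseVertexIndex {
  std::unordered_map<int64_t, uint32_t> toDense;  // vID -> dense index
  std::vector<int64_t> labels;                    // labels[dense index] = tag id

  // Registers a vertex and returns its dense index. A vID that was already
  // registered keeps its first index and label.
  uint32_t add(int64_t vid, int64_t label) {
    auto result = toDense.emplace(vid, static_cast<uint32_t>(labels.size()));
    if (result.second) {
      labels.push_back(label);
    }
    return result.first->second;
  }
};
```

With this, a vID such as 100000 occupies the next free slot instead of forcing the labels array to have 100001 entries.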

iterDV->next();
}

std::unordered_map<Value, int64_t> currentVids;
currentVids.reserve(gnSize);
historyVids_.reserve(historyVids_.size() + gnSize);
if (currentStep == 1) {
for (; iter->valid(); iter->next()) {
const auto& src = iter->getColumn(nebula::kVid);
currentVids.emplace(src, 0);
}
iter->reset();
// load edges degree
while (iterDE->valid()) {
auto s = iterDE->getEdgeProp("*", kSrc);
unsigned int src = s.getInt();
offset[src]++;
iterDE->next();
}
auto& biDirectEdgeTypes = subgraph->biDirectEdgeTypes();
while (iter->valid()) {
const auto& dst = iter->getEdgeProp("*", nebula::kDst);
auto findIter = historyVids_.find(dst);
if (findIter != historyVids_.end()) {
if (biDirectEdgeTypes.empty()) {
iter->next();
} else {
const auto& typeVal = iter->getEdgeProp("*", nebula::kType);
if (UNLIKELY(!typeVal.isInt())) {
iter->erase();
continue;
}
auto type = typeVal.getInt();
if (biDirectEdgeTypes.find(type) != biDirectEdgeTypes.end()) {
if (type < 0 || findIter->second + 2 == currentStep) {
iter->erase();
} else {
iter->next();
}
} else {
iter->next();
}
}
} else {
if (currentStep == steps) {
iter->erase();
continue;
}
if (currentVids.emplace(dst, currentStep).second) {
Row row;
row.values.emplace_back(std::move(dst));
ds.rows.emplace_back(std::move(row));
}
iter->next();
}

// load data edges
offset[0] = 0;
iterDE = ectx_->getResult(isomor->getdScanEOut()).iter();
while (iterDE->valid()) {
unsigned int src = iterDE->getEdgeProp("*", kSrc).getInt();
unsigned int dst = iterDE->getEdgeProp("*", kDst).getInt();

neighbors[offset[src + 1]] = dst;
Contributor

I do not understand this line. Please explain it in a code comment.

Author

(1) src and dst are the source and destination vertex IDs of the edge.
(2) offset[src + 1] is the position in the neighbors array where dst is written.

Author

I can change it and add a comment in the new version.
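
For readers following this thread, here is a sketch of one common two-pass CSR construction in which a write of the form neighbors[offset[src + 1]] = dst is correct. It is not the PR's code: it assumes vertices are numbered 0..vertexCount-1, and it adds a prefix-sum step between the counting pass and the fill pass. The names Csr and buildCsr are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical result type (illustration only).
struct Csr {
  std::vector<uint32_t> offsets;    // vertexCount + 1 entries when finished
  std::vector<uint32_t> neighbors;  // one entry per edge
};

Csr buildCsr(uint32_t vertexCount,
             const std::vector<std::pair<uint32_t, uint32_t>>& edges) {
  // Two extra slots while building: degrees are counted at src + 2 so that,
  // after the prefix sum, offsets[src + 1] holds the START of src's range.
  std::vector<uint32_t> offsets(vertexCount + 2, 0);

  // Pass 1: count the out-degree of every source vertex.
  for (const auto& e : edges) {
    assert(e.first < vertexCount && e.second < vertexCount);
    ++offsets[e.first + 2];
  }

  // Prefix sum: turn per-vertex counts into starting positions.
  for (std::size_t i = 1; i < offsets.size(); ++i) {
    offsets[i] += offsets[i - 1];
  }

  // Pass 2: offsets[src + 1] is the write cursor for src. Each write advances
  // it, so when the pass ends it equals the END of src's range, which is the
  // start of the next vertex's range.
  std::vector<uint32_t> neighbors(edges.size());
  for (const auto& e : edges) {
    neighbors[offsets[e.first + 1]++] = e.second;
  }

  // The last slot was only scratch space; drop it to get vertexCount + 1.
  offsets.pop_back();
  return Csr{std::move(offsets), std::move(neighbors)};
}
```

The temporary second extra slot is one possible reason for allocating v_count + 2 entries; the finished array only needs v_count + 1, as noted earlier in the review.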

offset[src + 1]++;
iterDE->next();
}
iter->reset();
builder.iter(std::move(iter));
ectx_->setResult(resultVar, builder.build());
// update historyVids
historyVids_.insert(std::make_move_iterator(currentVids.begin()),
std::make_move_iterator(currentVids.end()));

delete[] offset;
delete[] neighbors;
delete[] labels;

ResultBuilder builder;

// Set result in the ds and set the new column name for the (isomor matching 's) result.
return finish(ResultBuilder().value(Value(std::move(ds))).build());
}

} // namespace graph
} // namespace nebula
12 changes: 12 additions & 0 deletions src/graph/planner/plan/Algo.h
@@ -339,6 +339,18 @@ class Isomor final : public SingleInputNode {
return qctx->objPool()->makeAndAdd<Isomor>(
qctx, input, dScanVOut, qScanVOut, dScanEOut, qScanEOut);
}
const std::string& getdScanVOut() const {
return dScanVOut_;
}
const std::string& getqScanVOut() const {
return qScanVOut_;
}
const std::string& getdScanEOut() const {
return dScanEOut_;
}
const std::string& getqScanEOut() const {
return qScanEOut_;
}

private:
friend ObjectPool;