Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DI-based TypeHierarchy #18

Closed
wants to merge 51 commits into from
Closed

Conversation

fabianbs96
Copy link
Owner

@fabianbs96 fabianbs96 commented May 29, 2023

CMakeLists.txt Outdated Show resolved Hide resolved
lib/PhasarLLVM/HelperAnalyses.cpp Outdated Show resolved Hide resolved
lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved

// Initialize the transitive closure matrix with all as false
llvm::BitVector InitVector(VertexTypes.size(), false);

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TransitiveClosure.reserve(VertexTypes.size());

lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
Comment on lines 140 to 142
for (size_t I = 0; I < VTableSize; I++) {
IndexToFunctions.push_back(Init);
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think, the size of IndexToFunctions is tied to the max virtual index; rather it is the number of types. The sizes of the inner vectors depend on that, but on a per-type basis

lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
Comment on lines 162 to 166
IndexFunctions.push_back(FunctionToAdd);
// concatenate vectors of functions of this index
IndexToFunctions[VirtualIndex].insert(
IndexToFunctions.at(VirtualIndex).begin(), IndexFunctions.begin(),
IndexFunctions.end());
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather go with sth like

if (IndexToFunctions[TypeIndex].size() <=VirtualIndex){
  IndexToFunctions[TypeIndex].resize(VirtualIndex + 1);
}
IndexToFunctions[TypeIndex][VirtualIndex] = FunctionToAdd;

Comment on lines 258 to 260
for (const auto &Function : VTable.getAllFunctions()) {
OS << Function->getName() << "\n";
OS << Function->getName() << ", ";
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to take a look at llvm::interleaveComma

lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
}
EXPECT_TRUE(DBTH.hasType(DBTH.getType("Base")));
EXPECT_TRUE(DBTH.hasType(DBTH.getType("Child")));
EXPECT_TRUE(BaseSubTypes.find(DBTH.getType("Child")) != BaseSubTypes.end());
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
EXPECT_TRUE(BaseSubTypes.find(DBTH.getType("Child")) != BaseSubTypes.end());
EXPECT_TRUE(BaseSubTypes.count(DBTH.getType("Child")));

lib/PhasarLLVM/TypeHierarchy/DIBasedTypeHierarchy.cpp Outdated Show resolved Hide resolved
Comment on lines 150 to 154
if (VTableForBase->empty()) {
EXPECT_TRUE(false);
} else {
EXPECT_TRUE(VTableForBase->getFunction(0)->getName() == "_ZN4Base3barEv");
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (VTableForBase->empty()) {
EXPECT_TRUE(false);
} else {
EXPECT_TRUE(VTableForBase->getFunction(0)->getName() == "_ZN4Base3barEv");
}
ASSERT_FALSE(VTableForBase->empty());
EXPECT_EQ(VTableForBase->getFunction(0)->getName(), "_ZN4Base3barEv");

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same below

Comment on lines 106 to 107
llvm::outs() << "Current arg: " << Arg << "\n";
llvm::outs().flush();
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May use PhASAR's logger here instead

Comment on lines 162 to 164
if (IndexToFunctions[TypeIndex->getSecond()].size() <= VirtualIndex) {
IndexToFunctions[TypeIndex->getSecond()].resize(VirtualIndex);
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be redundant

Comment on lines 316 to 318
llvm::outs() << "TC.size(): " << TransitiveClosure.size()
<< " VT.size(): " << VertexTypes.size();
llvm::outs().flush();
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should print such messages to stderr (llvm::errs()) instead.

Comment on lines 24 to 28
EXPECT_TRUE(BaseType);
if (BaseType) {
EXPECT_TRUE(DBTH.hasType(BaseType));
EXPECT_TRUE(DBTH.hasVFTable(BaseType));
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
EXPECT_TRUE(BaseType);
if (BaseType) {
EXPECT_TRUE(DBTH.hasType(BaseType));
EXPECT_TRUE(DBTH.hasVFTable(BaseType));
}
ASSERT_NE(nullptr, BaseType);
EXPECT_TRUE(DBTH.hasType(BaseType));
EXPECT_TRUE(DBTH.hasVFTable(BaseType));

TypeToVertex.end());
size_t BaseTypeVertex = TypeToVertex[DerivedType->getBaseType()];

llvm::outs().flush();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why?

size_t BaseTypeVertex = TypeToVertex[DerivedType->getBaseType()];

llvm::outs().flush();
assert(TransitiveClosure.size() >= BaseTypeVertex);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>


llvm::outs().flush();
assert(TransitiveClosure.size() >= BaseTypeVertex);
assert(TransitiveClosure.size() >= ActualDerivedType);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In line 75, I want to access TransitiveClosure at the indices BaseTypeVertex and ActualDerivedType.
Line 75: "TransitiveClosure[BaseTypeVertex][ActualDerivedType] = true;"
Shouldn't I check beforehand if these indices are valid, by checking that the TransitiveClosure.size() is bigger than the indices? That's why I put the asserts in line 73 and line 74 there. Is there something I'm missing?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should, but in the equality case you are still accessing out of bounds, as the first element has index 0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thank you. I interpreted the > as "see above", as in "why?"

Comment on lines 138 to 139
std::vector<std::vector<const llvm::Function *>> IndexToFunctions;
IndexToFunctions.resize(VertexTypes.size());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a constructor that allows you to initialize a std::vector with a particular size

Comment on lines 128 to 136
size_t VTableSize = 0;
for (const auto &Subprogram : Finder.subprograms()) {
if (Subprogram->getVirtualIndex() > VTableSize) {
VTableSize = Subprogram->getVirtualIndex();
}
}
// if the biggest virtual index is 2 for example, the vector needs to have a
// size of 3 (Indices: 0, 1, 2)
VTableSize++;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant?

continue;
}

const auto *const FunctionToAdd =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a lot of const ;-)

Comment on lines 156 to 158
if (TypeIndex->getSecond() >= IndexToFunctions.size()) {
continue;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is a weird handling of this case.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure what I should do to improve this case. Is an assert better here, or should I use the phasar logger with log level "WARNING"?

Copy link

@MMory MMory Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think an assertion should be the better choice. This should never happen and would be a reason to fail hard.

Comment on lines +289 to +290
void DIBasedTypeHierarchy::printAsDot(llvm::raw_ostream &OS) const {
OS << "digraph TypeHierarchy{\n";
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would benefit from labels, so that you see them in the graph visualization.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What kind of labels? Do you mean like a legend that tells you how the graph is structured?

Copy link

@MMory MMory Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dot file format allows attributes for nodes and edges, one of which is label. There are also more that allow you to set colors, borders etc. when you plot it to a image or just in xdot.

By doing so you could introduce nodes by their id and then reference by id, for example:

0[label="Base"]
1[label="Child"]
1->0[label="public"]

The label for the edge is just an example. In fact, it might be worth thinking about what would be interesting properties of the edges.

This dump would be helpful for debugging, as you see the mapping between ids and names.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will implement this, thank you

Comment on lines 41 to 43
TEST(DBTHTest, BasicTHReconstruction_2) {
LLVMProjectIRDB IRDB({unittest::PathToLLTestFiles +
"type_hierarchies/type_hierarchy_17_cpp_dbg.ll"});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the indices are inconsistent, that's very surprising and confusing. Is there a reason you omit the other cases? What about case 16 (ll index) ?

Comment on lines 68 to 71
// Since Child2 hasn't been created, it shouldn't exist and also not be found
// via DBTH.getType("Child2")
const auto &Child2Type = DBTH.getType("Child2");
EXPECT_FALSE(Child2Type);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that intentional? here you are relying on compiler behavior.

@fabianbs96 fabianbs96 marked this pull request as ready for review September 21, 2023 17:13
size_t CurrentRowIndex = 0;
for (const auto &Row : TransitiveClosure) {
for (const auto &Cell : Row.set_bits()) {
if (Row[Cell]) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this is always true. set_bits() lets you iterate over all the positions where 1 is set.

@fabianbs96
Copy link
Owner Author

Closed in favor of secure-software-engineering#702

@fabianbs96 fabianbs96 closed this Feb 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants