Fact generated by Gen not taken into account #12

Closed
MarcelHB opened this issue Sep 10, 2018 · 2 comments

@MarcelHB
Contributor

Hello,

I'm evaluating PhASAR (HEAD revision as of today) for a use case. I took the IFDS taint analysis tutorial [0] as a starting point (i.e. the problem declaration file and the Store/Load/Call flow functions (FFs) are taken from there). I now have a minimal IR sample without any branches, in which the called functions are declaration-only. Snippet:

  %14 = call i32 @MyAPI_create(i32 1140850688, i32 %12, i32 %13, i32* %7)
  %15 = load i32, i32* %9, align 4
...
  %18 = call i32 @MyAPI_free(i32* %7)

I'd like to put %7 from the first call into my fact set and kill it at the second API call.
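For reference, the intended gen/kill behaviour can be sketched without PhASAR at all. The block below is a minimal model, not PhASAR's implementation: `Fact`, `applyGen`, `applyKill` and `applyToAll` are hypothetical names, and facts are plain strings instead of `const llvm::Value *`. It assumes the usual IFDS convention that a new fact is generated only from the zero fact.

```cpp
#include <set>
#include <string>

// Stand-in for a data-flow fact (assumption: PhASAR uses const llvm::Value *).
using Fact = std::string;

// Models a Gen-style flow function: every incoming fact survives, and the
// new fact is generated only when the incoming fact is the zero fact.
std::set<Fact> applyGen(const Fact &Source, const Fact &GenValue,
                        const Fact &Zero) {
  if (Source == Zero) {
    return std::set<Fact>{Source, GenValue};
  }
  return std::set<Fact>{Source};
}

// Models a Kill-style flow function: the killed fact is dropped, every other
// fact passes through unchanged.
std::set<Fact> applyKill(const Fact &Source, const Fact &KillValue) {
  if (Source == KillValue) {
    return std::set<Fact>{};
  }
  return std::set<Fact>{Source};
}

// Applies a per-fact flow function to a whole fact set and unions the targets.
template <typename FlowFn>
std::set<Fact> applyToAll(const std::set<Fact> &In, FlowFn Flow) {
  std::set<Fact> Out;
  for (const Fact &F : In) {
    std::set<Fact> Targets = Flow(F);
    Out.insert(Targets.begin(), Targets.end());
  }
  return Out;
}
```

Under this model, starting from only the zero fact, the set after `MyAPI_create` should contain both the zero fact and `%7`, and the set after `MyAPI_free` should contain only the zero fact again.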

Code approach:

std::shared_ptr<psr::FlowFunction<const llvm::Value *>>
MyAPIUseProblem::getCallFlowFunction (const llvm::Instruction *callInstruction, const llvm::Function *targetFunction) {
  if (llvm::isa<llvm::CallInst>(callInstruction) || llvm::isa<llvm::InvokeInst>(callInstruction)) {
    llvm::ImmutableCallSite callSite(callInstruction);

    if (targetFunction->getName().equals("MyAPI_create")) {
      std::cout << "* Gen arg: " << callSite.getArgument(3) << std::endl;
      return std::make_shared<psr::Gen<const llvm::Value*>>(callSite.getArgument(3), zeroValue());
    } else if (targetFunction->getName().equals("MyAPI_free")) {
      std::cout << "* Kill arg: " << callSite.getArgument(0) << std::endl;
      return std::make_shared<psr::Kill<const llvm::Value*>>(callSite.getArgument(0));
    }
    ... 

getCallToRetFlowFunction returns the identity.

Unfortunately, the argument at index 3 appears nowhere in my facts; only the zero value does:

--- IFDS START RESULT RECORD ---
N: %14 = call i32 @MyAPI_create(i32 1140850688, i32 %12, i32 %13, i32* %7), !phasar.instruction.id !18, ID: 16 in function: main
D:	@zero_value = constant i2 0, align 4, ID: -1 	V:  BOTTOM

My debug prints show that the code is reached, but the load FF immediately afterwards doesn't see the fact either:

* Gen arg: 0x3c66848
* Load fact: 0x3c68208  // <- zero
...
* Kill arg: 0x3c66848

Do you have any idea what could be going wrong? Are the tutorial sources still working on the current revision?

Thanks for your help!

[0] http://phasar.org/wp-content/uploads/2018/06/taint_analysis_plugin.zip

@MarcelHB
Contributor Author

Well, I figured it out by reading the internal code a bit: my scan for the API calls actually has to go into getSummaryFlowFunction.

When the facts for declaration-only functions such as these are handled in getCallFlowFunction, this set:

std::set<N> startPointsOf = icfg.getStartPointsOf(sCalledProcN);

of course remains empty, so the facts are never merged even though they were collected.
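The effect can be reproduced in a small stand-alone model. This is a heavily simplified sketch of the dispatch described in this comment, not the solver's actual code: `CallSiteModel`, `CallResult` and `processCall` are hypothetical names, facts are strings, and it is an assumption of the sketch that the call-to-return flow is applied in both branches.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <vector>

using Fact = std::string;
using FlowFn = std::function<std::set<Fact>(const Fact &)>;

// Hypothetical model of one call site and the flow functions attached to it.
struct CallSiteModel {
  std::vector<std::string> CalleeStartPoints; // empty for declaration-only callees
  FlowFn SummaryFlow;   // models getSummaryFlowFunction; empty means nullptr
  FlowFn CallFlow;      // models getCallFlowFunction
  FlowFn CallToRetFlow; // models getCallToRetFlowFunction
};

struct CallResult {
  std::set<Fact> AtReturnSite;
  std::size_t PropagationsIntoCallee = 0;
};

CallResult processCall(const CallSiteModel &CS, const Fact &Source) {
  CallResult Result;
  if (CS.SummaryFlow) {
    // A summary replaces the call: its targets go straight to the return site.
    Result.AtReturnSite = CS.SummaryFlow(Source);
  } else {
    // The call flow's targets are only handed to the callee's start points.
    // With no start points, they are computed and then dropped.
    const std::set<Fact> Entering = CS.CallFlow(Source);
    for (const std::string &Start : CS.CalleeStartPoints) {
      (void)Start; // the model only counts propagations
      Result.PropagationsIntoCallee += Entering.size();
    }
  }
  // Assumption: facts bypassing the callee always reach the return site.
  const std::set<Fact> Bypassing = CS.CallToRetFlow(Source);
  Result.AtReturnSite.insert(Bypassing.begin(), Bypassing.end());
  return Result;
}
```

With a Gen-style function installed as `CallFlow`, an identity `CallToRetFlow` and no start points, only the zero fact reaches the return site in this model. Installing the same function as `SummaryFlow` makes the generated fact show up there.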

I can live with this, it just caused a tiny bit of confusion to me at first. 🙂

I haven't looked up the algorithm's reference description yet, but let's say I don't handle something in getSummaryFlowFunction and simply return nullptr. Then the code evidently calls getCallFlowFunction instead, and the resulting facts are not taken into account because return points are missing ... could we check this earlier to save some cycles (res has no further use then)?

pdschubert self-assigned this Sep 19, 2018
@pdschubert
Member

Hi Marcel,

Thanks for looking into PhASAR - I hope that the tool can be useful to your application scenario.
Yes, you are right: when PhASAR encounters a function (e.g. from libc) or an LLVM intrinsic for which no definition is available, it does not use getCallFlowFunction, which would usually perform the parameter mapping. Instead, it tries to obtain a flow function by calling getSummaryFlowFunction.

A note on plug-in summaries: if you ever need to model the effects of a function for which a definition is available (but which you do not wish to analyze), you can return the KillAll flow function in getCallFlowFunction and then generate the desired data-flow facts in the getCallToRetFlowFunction factory. That way, the call is not followed.
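The pattern from this note can be sketched in isolation as follows. It is a toy model under stated assumptions, not PhASAR code: `killAll`, `genOnCallToRet`, `SkippedCall` and `modelSkippedCall` are hypothetical names, and facts are strings rather than `const llvm::Value *`.

```cpp
#include <cstddef>
#include <set>
#include <string>

using Fact = std::string;

// Models KillAll as the call flow: no fact is mapped into the callee.
std::set<Fact> killAll(const Fact &) { return std::set<Fact>{}; }

// Models a Gen-style call-to-return flow: existing facts bypass the callee,
// and the modelled effect of the call is generated from the zero fact.
std::set<Fact> genOnCallToRet(const Fact &Source, const Fact &GenValue,
                              const Fact &Zero) {
  if (Source == Zero) {
    return std::set<Fact>{Source, GenValue};
  }
  return std::set<Fact>{Source};
}

struct SkippedCall {
  std::size_t FactsEnteringCallee = 0;
  std::set<Fact> AtReturnSite;
};

// Applies both flows to every fact that holds at the call site.
SkippedCall modelSkippedCall(const std::set<Fact> &AtCall,
                             const Fact &GenValue, const Fact &Zero) {
  SkippedCall Result;
  for (const Fact &Source : AtCall) {
    Result.FactsEnteringCallee += killAll(Source).size();
    std::set<Fact> Targets = genOnCallToRet(Source, GenValue, Zero);
    Result.AtReturnSite.insert(Targets.begin(), Targets.end());
  }
  return Result;
}
```

Because nothing enters the callee, the model never has anything to propagate through its body, while the generated fact still arrives at the return site together with the facts that were already present.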

Sure, I will see that we can perform the check earlier ;-)
