In the call graph, the call edges related to dynamic proxy are missing. #123

Open
YunFy26 opened this issue Oct 14, 2024 · 6 comments

YunFy26 commented Oct 14, 2024

📝 Overall Description

For the following demo:

Service.java

public interface Service {
    void doSomething();
}

ServiceImpl.java

public class ServiceImpl implements Service {
    @Override
    public void doSomething() {
        System.out.println("Performing task in ServiceImpl...");
    }
}

MyInvocationHandler.java

public class MyInvocationHandler implements InvocationHandler {

    private final Object target;

    public MyInvocationHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        System.out.println("before method call...");
        // method invoke
        Object result = method.invoke(target, args);
        System.out.println("after method call...");
        return result;
    }

    public static Object getProxy(Object target) {
        return Proxy.newProxyInstance(
                target.getClass().getClassLoader(),
                target.getClass().getInterfaces(),
                new MyInvocationHandler(target)
        );
    }

}

Main.java

public class Main {

    public static void main(String[] args) {
        ServiceImpl service = new ServiceImpl();
        Service proxy = (Service) MyInvocationHandler.getProxy(service);
        proxy.doSomething();
    }
}

IR of Main.java

public static void main(java.lang.String[] r3) {
        org.example.proxy.ServiceImpl $r0;
        java.lang.Object $r1;
        org.example.proxy.Service r2;
        [0@L10] $r0 = new org.example.proxy.ServiceImpl;
        [1@L10] invokespecial $r0.<org.example.proxy.ServiceImpl: void <init>()>();
        [2@L11] $r1 = invokestatic <org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>($r0);
        [3@L11] r2 = (org.example.proxy.Service) $r1;
        [4@L12] invokeinterface r2.<org.example.proxy.Service: void doSomething()>();
        [5@L13] return;
    }

The call graph is as follows:

digraph G {
  node [color=".3 .2 1.0",shape=box,style=filled];
  edge [];
  "0" [label="<java.lang.Class: java.lang.Class[] getInterfaces()>",];
  "1" [label="<java.lang.Class: java.lang.ClassLoader getClassLoader()>",];
  "2" [label="<org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>",];
  "3" [label="<java.lang.Object: java.lang.Class getClass()>",];
  "4" [label="<org.example.proxy.MyInvocationHandler: void <init>(java.lang.Object)>",];
  "5" [label="<org.example.proxy.ServiceImpl: void <init>()>",];
  "6" [label="<java.lang.Object: void <init>()>",];
  "7" [label="<org.example.Main: void main(java.lang.String[])>",];
  "8" [label="<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>",];
  "2" -> "8" [label="[6@L27] $r6 = invokestatic <java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>($r2, $r4, $r5);",];
  "2" -> "4" [label="[5@L29] invokespecial $r5.<org.example.proxy.MyInvocationHandler: void <init>(java.lang.Object)>(r0);",];
  "2" -> "3" [label="[0@L28] $r1 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "2" -> "1" [label="[1@L28] $r2 = invokevirtual $r1.<java.lang.Class: java.lang.ClassLoader getClassLoader()>();",];
  "2" -> "0" [label="[3@L29] $r4 = invokevirtual $r3.<java.lang.Class: java.lang.Class[] getInterfaces()>();",];
  "2" -> "3" [label="[2@L29] $r3 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "4" -> "6" [label="[0@L12] invokespecial %this.<java.lang.Object: void <init>()>();",];
  "5" -> "6" [label="[0@L3] invokespecial %this.<java.lang.Object: void <init>()>();",];
  "7" -> "2" [label="[2@L11] $r1 = invokestatic <org.example.proxy.MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>($r0);",];
  "7" -> "5" [label="[1@L10] invokespecial $r0.<org.example.proxy.ServiceImpl: void <init>()>();",];
}

The call edge main → doSomething is missing.

In the actual runtime call sequence, before doSomething is called, the method invoke of MyInvocationHandler will be called, and then doSomething is called through reflection within the invoke method.
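The runtime order described above can be checked with a small stand-alone program. This is only a sketch: the names TraceDemo, TracedService, TracedServiceImpl and TracingHandler are hypothetical stand-ins for the demo classes, and the program records the order in which the methods are entered instead of printing.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Service.
interface TracedService {
    void doSomething();
}

// Hypothetical stand-in for ServiceImpl; records when it is entered.
class TracedServiceImpl implements TracedService {
    @Override
    public void doSomething() {
        TraceDemo.TRACE.add("ServiceImpl.doSomething");
    }
}

// Hypothetical stand-in for MyInvocationHandler.
class TracingHandler implements InvocationHandler {
    private final Object target;

    TracingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        TraceDemo.TRACE.add("InvocationHandler.invoke(" + method.getName() + ")");
        // Reflective call that finally reaches the real implementation.
        return method.invoke(target, args);
    }
}

class TraceDemo {
    static final List<String> TRACE = new ArrayList<>();

    // Builds the proxy, calls doSomething() on it and returns the recorded order.
    static List<String> run() {
        TRACE.clear();
        TracedService target = new TracedServiceImpl();
        TracedService proxy = (TracedService) Proxy.newProxyInstance(
                TracedService.class.getClassLoader(),
                new Class<?>[] {TracedService.class},
                new TracingHandler(target));
        TRACE.add("main");
        proxy.doSomething();
        return new ArrayList<>(TRACE);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The recorded order is main, then the handler's invoke, then the real doSomething, which is the chain a call graph would ideally contain.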

After completing the pointer analysis, I reviewed the results of the analysis.

solver.csManager.callSites includes:

<org.example.Main: void main(java.lang.String[])>[4@L12] invokeinterface r2.doSomething()

solver.csManager.ptrManager.vars.map includes the var r2, but the pointsToSet of r2 is null, as shown in Figure-1.

At runtime, the type of r2 is jdk.proxy1.$Proxy0

public static void main(String[] args) {
        ServiceImpl service = new ServiceImpl();
        Service proxy = (Service) MyInvocationHandler.getProxy(service);
        System.out.println(proxy.getClass());   //class jdk.proxy1.$Proxy0
        proxy.doSomething();
    }

Since $Proxy0 is generated at runtime, Tai-e is unable to identify an allocation site for this object. As a result, no object is mocked for r2, which leads to the missing call edge. Is my understanding correct?

According to #114

Regarding mocking IR, Tai-e currently supports mocking IR within method at the statement level but does not support mocking an entire class. We will take this into consideration in the future.

Does this imply that Tai-e does not yet natively support method calls in dynamic proxy? If Tai-e supports handling method calls within proxy classes, what configurations should I modify?

Moreover, I have observed that solver.csManager.objManager.objMap contains the following entry (as shown in Figure-2):

{ConstantObj@5877} "ConstantObj{java.lang.Class: org.example.proxy.ServiceImpl.class}" -> {HybridHashMap@5878}  size = 1

Why is org.example.proxy.ServiceImpl.class considered a ConstantObj?




Additionally, in tai-e-analyses.yml, I set the value of handle-invokedynamic to true. Tai-e then output the IR of $Proxy0:

public final class jdk.proxy1.$Proxy0 extends java.lang.reflect.Proxy implements org.example.proxy.Service {

    ...

    public final void doSomething() {
        java.lang.reflect.InvocationHandler $r2;
        java.lang.reflect.Method $r1;
        null-type %nullconst;
        java.lang.Throwable $r5, $r3;
        java.lang.reflect.UndeclaredThrowableException $r4;
        [0@L-1] $r2 = %this.<java.lang.reflect.Proxy: java.lang.reflect.InvocationHandler h>;
        [1@L-1] $r1 = <jdk.proxy1.$Proxy0: java.lang.reflect.Method m3>;
        [2@L-1] invokeinterface $r2.<java.lang.reflect.InvocationHandler: java.lang.Object invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[])>(%this, $r1, %nullconst);
        [3@L-1] return;
        [4@L-1] catch $r5;
        [5@L-1] throw $r5;
        [6@L-1] catch $r3;
        [7@L-1] $r4 = new java.lang.reflect.UndeclaredThrowableException;
        [8@L-1] invokespecial $r4.<java.lang.reflect.UndeclaredThrowableException: void <init>(java.lang.Throwable)>($r3);
        [9@L-1] throw $r4;

        try [0, 4), catch java.lang.Error at 4
        try [0, 4), catch java.lang.RuntimeException at 4
        try [0, 4), catch java.lang.Throwable at 6
    }

    ...

}

I have a few questions regarding this IR. Could you explain why the line number is shown as -1?

🎯 Expected Behavior

None

🐛 Current Behavior

None

🔄 Reproducible Example

No response

⚙️ Tai-e Arguments

🔍 Tai-e Options
optionsFile: null
printHelp: false
classPath:
- ../Tai-e_Test/build/classes/java/main
appClassPath:
- ../Tai-e_Test/build/classes/java/main
mainClass: org.example.Main
inputClasses: []
javaVersion: 17
prependJVM: true
allowPhantom: true
worldBuilderClass: pascal.taie.frontend.soot.SootWorldBuilder
outputDir: output
preBuildIR: false
worldCacheMode: false
scope: APP
nativeModel: true
planFile: null
analyses:
  ir-dumper: ""
  cg: ""
  cfg: ""
  pta: "plugins:[pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin]"
onlyGenPlan: false
keepResult:
- $KEEP-ALL

🔍 Tai-e Analysis Plan
- id: ir-dumper
  options: {}
- id: pta
  options:
    cs: 1-obj
    only-app: true
    implicit-entries: false
    distinguish-string-constants: reflection
    merge-string-objects: true
    merge-string-builders: true
    merge-exception-objects: true
    handle-invokedynamic: true
    propagate-types:
    - reference
    advanced: null
    dump: false
    dump-ci: false
    dump-yaml: false
    expected-file: null
    reflection-inference: string-constant
    reflection-log: null
    taint-config: null
    taint-config-providers: []
    taint-interactive-mode: false
    plugins:
    - pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin
    time-limit: -1
- id: cg
  options:
    algorithm: pta
    dump: true
    dump-methods: true
    dump-call-edges: true
- id: throw
  options:
    exception: explicit
    algorithm: intra
- id: cfg
  options:
    exception: explicit
    dump: true

📜 Tai-e Log

🔍 Tai-e Log
Writing log to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e.log
java.version: 17.0.11
java.version.date: 2024-04-16
java.runtime.version: 17.0.11+7-LTS-207
java.vendor: Oracle Corporation
java.vendor.version: null
os.name: Mac OS X
os.version: 15.0.1
os.arch: aarch64
Tai-e Version: 0.5.1-SNAPSHOT
Tai-e Commit: 46448829b6c19ae414caea7b43bd7fb8792ac0a5
Writing analysis plan to /Users/yuntsy/My/Projects/Java/Tai-e/output/tai-e-plan.yml
WorldBuilder starts ...
10085 classes with 99482 methods in the world
WorldBuilder finishes, elapsed time: 1.62s
ir-dumper starts ...
Dumping IR in /Users/yuntsy/My/Projects/Java/Tai-e/output/tir
5 classes in scope (APP) of class analyses
ir-dumper finishes, elapsed time: 0.03s
pta starts ...
[Pointer analysis] elapsed time: 0.01s
-------------- Pointer analysis statistics: --------------
#var pointers:                12 (insens) / 12 (sens)
#objects:                     5 (insens) / 5 (sens)
#var points-to:               9 (insens) / 9 (sens)
#static field points-to:      0 (sens)
#instance field points-to:    1 (sens)
#array points-to:             1 (sens)
#reachable methods:           9 (insens) / 10 (sens)
#call graph edges:            10 (insens) / 10 (sens)
----------------------------------------
pta finishes, elapsed time: 0.11s
cg starts ...
Call graph has 9 reachable methods and 10 edges
Dumping call graph to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-graph.dot
Dumping reachable methods to /Users/yuntsy/My/Projects/Java/Tai-e/output/reachable-methods.txt
Dumping call edges to /Users/yuntsy/My/Projects/Java/Tai-e/output/call-edges.txt
cg finishes, elapsed time: 0.01s
throw starts ...
14 methods in scope (APP) of method analyses
throw finishes, elapsed time: 0.00s
cfg starts ...
Dumping CFGs in /Users/yuntsy/My/Projects/Java/Tai-e/output/cfg
cfg finishes, elapsed time: 0.01s
Tai-e finishes, elapsed time: 1.88s

ℹ️ Additional Information

No response

@YunFy26 YunFy26 changed the title In the call graph, the call edges related to dynamic proxies are missing. In the call graph, the call edges related to dynamic proxy are missing. Oct 14, 2024
@zhangt2333
Member

zhangt2333 commented Oct 15, 2024

Thank you for taking the time to provide such detailed information. This seems to be a rather important issue, and we'll look into it as soon as we have time.

Before we investigate this issue further, we would like to conduct a user study to understand your experience with our GitHub Issue Template. Specifically, we want to determine if there are any organizational, descriptive or structural aspects of the template that make it difficult/undesirable for you to follow when submitting an issue.

@YunFy26
Author

YunFy26 commented Oct 15, 2024

I apologize for not strictly adhering to the issue template format when submitting my issue. I’d like to explain the reason behind this.

When describing my example in the Overall Description, whether it’s for this issue or previous ones, I find it difficult to separate the Expected Behavior and Current Behavior from the Overall Description. When describing the issue, I always feel that placing Expected Behavior and Current Behavior as separate headings after the Overall Description creates a sense of “disconnection.” It feels like it disrupts the flow of the explanation.

Taking this submission as an example, I want to analyze the function calls related to dynamic proxies. I first provided a brief description in the title: “call edges related to dynamic proxy are missing.” Then, in the Overall Description, I started by offering a demo as a sample for analysis.

①Demo

Afterward, I presented the resulting call graph and explained the outcome of this analysis.

②The call edge main → doSomething is missing.

Next, I described the actual runtime call sequence:

③In the actual runtime call sequence, before doSomething is called, the method invoke of MyInvocationHandler will be called, and then doSomething is called through reflection within the invoke method.

In this process:

① is the Reproducible Example

② is the Current Behavior

③ is the Expected Behavior (Perhaps I didn’t describe it clearly enough. I should have included a call chain like: main -> invoke -> doSomething as Expected Behavior.)

If I strictly followed the template, the structure would probably look like this: I would first describe the issue in the Overall Description, then follow with either a ③②① or ②③① format.

Personally, I believe that describing the entire process directly in the Overall Description makes it easier to follow and understand. Therefore, I placed everything in the Description section. In this case, if I were to follow the template strictly, it would result in redundant content. That’s why I filled in “None” for both Expected Behavior and Current Behavior.

In fact, to ensure that others could understand more easily, I revised the content and format multiple times before submitting. (However, looking at it again now, it seems I should have used symbols like “·” or “>” to better organize the structure.)

Regarding the issue template, I personally believe that Expected Behavior and Current Behavior could be subheadings under the Overall Description, but this is just my personal opinion. You may want to gather feedback from other users to make a more informed decision.

@BryanHeBY

Hi YunFy26, I set the value of handle-invokedynamic to true, but I still can't find the IR for $Proxy. Could you please provide me with an environment where this IR output can be reproduced, including the JDK environment, tai-e configuration options, etc.? I noticed that you enabled a custom plugin, pascal.taie.analysis.pta.plugin.CustomEntryPointPlugin. Would this plugin affect the result?

As for the question "Why is org.example.proxy.ServiceImpl.class considered a ConstantObj?": it is a class literal, that is, an object of type java.lang.Class that represents the class, not the class itself.
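At the Java level, the distinction between a class and its class literal can be seen in a few lines. This is a minimal sketch with hypothetical names (ClassLiteralDemo and its nested ServiceImpl); it only illustrates the language semantics and says nothing about how Tai-e represents the object internally.

```java
class ClassLiteralDemo {
    // Hypothetical stand-in for org.example.proxy.ServiceImpl.
    static class ServiceImpl {}

    // The literal ServiceImpl.class is an ordinary object whose type is java.lang.Class.
    static boolean literalIsClassObject() {
        Object literal = ServiceImpl.class;
        return literal instanceof Class<?>;
    }

    // getClass() on an instance yields the very same Class object as the literal.
    static boolean sameAsGetClass() {
        return ServiceImpl.class == new ServiceImpl().getClass();
    }

    public static void main(String[] args) {
        System.out.println(literalIsClassObject());
        System.out.println(sameAsGetClass());
    }
}
```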

@YunFy26
Author

YunFy26 commented Oct 25, 2024

@BryanHeBY Apologies for mistakenly assuming that the value of handle-invokedynamic affected the IR output of $Proxy0.

In this repo, after running ./gradlew build, I navigated to build/classes/java/main and executed:

java -Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true -cp . org.example.Main

This caused the bytecode file of the dynamic proxy class to be saved in build/classes/java/main/jdk/proxy1/$Proxy0.class, leading it to be recognized as an application class and subsequently loaded into Tai-e World. As a result, when executing ir-dumper, the IR for $Proxy0 is output as well.

This is unrelated to the missing call edges in method invocations within dynamic proxy classes.

I apologize for my limited expertise, which may have caused inconvenience to the Tai-e team members. I also sincerely appreciate the Tai-e team for addressing my questions.

@orlies

orlies commented Oct 30, 2024

Tai-e currently does not support handling dynamic proxies. The dynamic proxy mechanism generates the bytecode of the proxy class as a byte[] upon first use, then loads the proxy class from that bytecode, and finally accesses it via reflection. The semantics of generating the proxy class bytecode are relatively complex, and a byte[] is difficult to handle with static analysis. Additionally, Tai-e does not currently support dynamic class loading. In summary, Tai-e does not support such static analysis at this time.

However, the behavior of a dynamic proxy is not complex. A proxy class is generated based on the input interfaces. It holds an InvocationHandler, and all the interfaces' methods are delegated to that InvocationHandler (by the way, this does not involve invokedynamic). We can use Tai-e's plugin system to easily handle the behavior of dynamic proxies in the pointer analysis.
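To make the delegation concrete, here is a hand-written approximation of what a generated proxy class does, loosely following the $Proxy0 IR shown earlier in this thread. It is a sketch, not the code the JDK generates: Greeter, HandWrittenProxy and HandWrittenProxyDemo are hypothetical names, and only one interface method is covered.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.UndeclaredThrowableException;

// Hypothetical stand-in for the Service interface.
interface Greeter {
    void doSomething();
}

// Hand-written approximation of a generated proxy class.
final class HandWrittenProxy implements Greeter {
    // Plays the role of the static Method field (m3) in the generated class.
    private static final Method M_DO_SOMETHING;

    static {
        try {
            M_DO_SOMETHING = Greeter.class.getMethod("doSomething");
        } catch (NoSuchMethodException e) {
            throw new NoSuchMethodError(e.getMessage());
        }
    }

    // Plays the role of the field h inherited from java.lang.reflect.Proxy.
    private final InvocationHandler h;

    HandWrittenProxy(InvocationHandler h) {
        this.h = h;
    }

    @Override
    public void doSomething() {
        try {
            // Every interface method just forwards to the handler.
            h.invoke(this, M_DO_SOMETHING, null);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new UndeclaredThrowableException(t);
        }
    }
}

class HandWrittenProxyDemo {
    // Returns the order of events as a string.
    static String run() {
        StringBuilder log = new StringBuilder();
        Greeter real = () -> log.append("real;");
        InvocationHandler handler = (proxy, method, args) -> {
            log.append("before;");
            Object result = method.invoke(real, args);
            log.append("after;");
            return result;
        };
        new HandWrittenProxy(handler).doSomething();
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A static analysis has no trouble with a class written this way, which is why both methods below try to give the analysis something equivalent to it.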

Here, I can provide you with two methods for reference:

Method 1

After generating the proxy class files (.class files) at runtime (using -Djdk.proxy.ProxyGenerator.saveGeneratedFiles=true), let Tai-e bypass the code that generates the proxy class (skip the code in Proxy.newProxyInstance) and directly use the generated proxy class. The specific plugin code is as follows:

Plugin code

CustomModel manages the initialization of the concrete plugin

public class CustomModel extends CompositePlugin {

    @Override
    public void setSolver(Solver solver) {
        addPlugin(new IRProxyModel(solver));
    }
}

IRProxyModel models the semantics of Proxy's newProxyInstance method by generating IR

public class IRProxyModel extends IRModelPlugin {

  private final Set<JClass> proxyClasses;

  private final Type invocationHandlerType;

  IRProxyModel(Solver solver) {
      super(solver);
      proxyClasses = getProxyClasses();
      invocationHandlerType = typeSystem.getClassType("java.lang.reflect.InvocationHandler");
  }

  private Set<JClass> getProxyClasses() {
      JClass proxy = hierarchy.getClass("java.lang.reflect.Proxy");
      String regex = "jdk\\.proxy\\d+\\.\\$Proxy\\d+";
      Predicate<String> pattern = Pattern.compile(regex).asMatchPredicate();
      return hierarchy.getAllSubclassesOf(proxy).stream()
              .filter(c -> pattern.test(c.getName()))
              .collect(Collectors.toSet());
  }

  @InvokeHandler(signature = "<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>")
  public List<Stmt> newProxyInstance(Invoke invoke) {
      List<Stmt> stmts = new ArrayList<>();
      Var result = invoke.getResult();
      Var invocationHandler = invoke.getInvokeExp().getArg(2);
      if (result != null) {
          for (var proxy : proxyClasses) {
              for (var method : proxy.getDeclaredMethods()) {
                  if (!method.isConstructor()) {
                      continue;
                  }
                  if (method.getParamCount() == 1 && method.getParamType(0).equals(invocationHandlerType)) {
                      stmts.add(new New(invoke.getContainer(), result, new NewInstance(proxy.getType())));
                      stmts.add(new Invoke(invoke.getContainer(),
                              new InvokeSpecial(method.getRef(), result, List.of(invocationHandler))));
                  }
              }
          }
      }
      return stmts;
  }
}

You need to activate the plugin by declaring it in the configuration file. You also need to enable the reflection analysis (because dynamic proxies use reflection) and disable the only-app option (because you need to analyze the <init> method of Proxy). The call graph is as follows (only the part from main to doSomething):

Partial call graph
...

  "122" [label="<MyInvocationHandler: java.lang.Object invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[])>",];
  "13887" [label="<jdk.proxy1.$Proxy0: void doSomething()>",];
  "14293" [label="<Main: void main(java.lang.String[])>",];
  "14857" [label="<ServiceImpl: void doSomething()>",];

...

  "14293" -> "13887" [label="[4@L8] invokeinterface r2.<Service: void doSomething()>();",];
  "13887" -> "122" [label="[2@L-1] invokeinterface $r2.<java.lang.reflect.InvocationHandler: java.lang.Object invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[])>(%this, $r1, %nullconst);",];
  "122" -> "14857" [label="[4@L17] $r5 = invokevirtual method.<java.lang.reflect.Method: java.lang.Object invoke(java.lang.Object,java.lang.Object[])>($r4, args);",];

...

The plugin models Proxy.newProxyInstance (the API for creating dynamic proxy objects) with a piece of generated code (IR) that instantiates every proxy class found among the pre-generated proxy class files. After that, Tai-e directly uses these proxy objects for analysis. Note that creating objects of all proxy classes at every call to newProxyInstance may reduce precision, but Tai-e imposes some constraints during object propagation based on the type of the object (for example, an object that is actually of type A cannot be cast to type B). You can also achieve more precise modeling yourself by using the remaining parameters of the API.
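The class-name filter used in getProxyClasses() can be tried in isolation. The sketch below reuses the same regular expression; ProxyNameFilterDemo and the sample class names are made up for illustration. As far as I know, JDKs before 16 named these classes com.sun.proxy.$ProxyN, and proxies for non-public interfaces are placed in the interface's own package, so the pattern may need adjusting for such cases.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ProxyNameFilterDemo {
    // Same regular expression as in getProxyClasses(); it must match the whole name.
    static final Predicate<String> IS_GENERATED_PROXY =
            Pattern.compile("jdk\\.proxy\\d+\\.\\$Proxy\\d+").asMatchPredicate();

    // Keeps only the names that look like generated proxy classes.
    static List<String> filter(List<String> classNames) {
        return classNames.stream()
                .filter(IS_GENERATED_PROXY)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(filter(List.of(
                "jdk.proxy1.$Proxy0",
                "jdk.proxy2.$Proxy13",
                "com.sun.proxy.$Proxy0",
                "org.example.proxy.ServiceImpl")));
    }
}
```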

This method requires running the program in advance to generate all proxy classes, which is not that 'static'.

Method 2

The logic of the generated proxy code is not complicated: the proxy class simply delegates the actual work to the InvocationHandler, so we can model the semantics of that delegation directly. The specific plugin code is as follows:

Plugin code

CustomModel manages the initialization of the concrete plugin

public class CustomModel extends CompositePlugin {

    @Override
    public void setSolver(Solver solver) {
        addPlugin(new SemanticProxyModel(solver));
    }
}

SemanticProxyModel models the semantics of delegation

public class SemanticProxyModel extends AnalysisModelPlugin {

    private static final Logger logger = LogManager.getLogger(SemanticProxyModel.class);
    
    private static final Descriptor PROXY_DESC = () -> "ProxyObj";

    private static final Descriptor REFLECTION_DESC = () -> "ReflectionMetaObj";
    
    private static final Descriptor ARGS_ARRAY_DESC = () -> "Object[Args]Obj";
    
    private final JClass object;
    
    private final JClass proxy;
    
    private final Set<MethodRef> proxiedMethods;

    SemanticProxyModel(Solver solver) {
        super(solver);
        object = Objects.requireNonNull(hierarchy.getJREClass(ClassNames.OBJECT));
        proxy = Objects.requireNonNull(hierarchy.getJREClass("java.lang.reflect.Proxy"));
        proxiedMethods = Set.of(
                Objects.requireNonNull(object.getDeclaredMethod("hashCode")).getRef(),
                Objects.requireNonNull(object.getDeclaredMethod("equals")).getRef(),
                Objects.requireNonNull(object.getDeclaredMethod("toString")).getRef());
    }

    @Override
    public void onStart() {
        handlers.keySet().forEach(solver::addIgnoredMethod);
    }

    @InvokeHandler(signature = "<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>", argIndexes = {2})
    public void newProxyInstance(Context context, Invoke invoke, PointsToSet invocationHandler) {
        Var result = invoke.getResult();
        if (result != null) {
            invocationHandler.forEach(
                    csObj -> {
                        // generate special mock object for newProxyInstance 
                        Obj obj = heapModel.getMockObj(PROXY_DESC, csObj, NullType.NULL);
                        CSMethod csMethod = csManager.getCSMethod(context, invoke.getContainer());
                        Context heapContext = selector.selectHeapContext(csMethod, obj);
                        solver.addVarPointsTo(context, result, heapContext, obj);
                    }
            );
        }
    }

    @Override
    public void onUnresolvedCall(CSObj recv, Context context, Invoke invoke) {
        if (!CSObjs.hasDescriptor(recv, PROXY_DESC)) {
            return;
        }
        MethodRef method = invoke.getMethodRef();
        JClass clazz = method.getDeclaringClass();
        if (clazz.equals(proxy) || (clazz.equals(object) && !proxiedMethods.contains(method))) {
            // the method is directly called
            CSCallSite csCallSite = csManager.getCSCallSite(context, invoke);
            JMethod callee = method.resolve();
            Context calleeContext = selector.selectContext(
                    csCallSite, recv, callee);
            CSMethod csCallee = csManager.getCSMethod(calleeContext, callee);
            solver.addCallEdge(new Edge<>(CallGraphs.getCallKind(invoke),
                    csCallSite, csCallee));
            solver.addVarPointsTo(calleeContext, callee.getIR().getThis(), recv);
        } else {
            // the method is actually delegated to InvocationHandler.invoke method
            CSObj invocationHandler = (CSObj) recv.getObject().getAllocation();
            JMethod callee = ((ClassType) invocationHandler.getObject().getType())
                    .getJClass().getDeclaredMethod("invoke");
            if (callee == null) {
                logger.warn("No invoke method for " + invocationHandler.getObject().getType());
                return;
            }
            CSCallSite csCallSite = csManager.getCSCallSite(context, invoke);
            Context calleeContext = selector.selectContext(
                    csCallSite, invocationHandler, callee);
            CSMethod csCallee = csManager.getCSMethod(calleeContext, callee);
            solver.addCallEdge(new ProxyCallEdge(csCallSite, csCallee, recv));
            solver.addVarPointsTo(calleeContext, callee.getIR().getThis(),
                    invocationHandler);
        }
    }

    @Override
    public void onNewCallEdge(Edge<CSCallSite, CSMethod> edge) {
        if (edge instanceof ProxyCallEdge proxyEdge) {
            // create arguments for InvocationHandler.invoke
            CSMethod csCallee = edge.getCallee();
            Context callerCtx = edge.getCallSite().getContext();
            Invoke callSite = edge.getCallSite().getCallSite();
            Context calleeCtx = csCallee.getContext();
            JMethod callee = csCallee.getMethod();
            InvokeExp invokeExp = callSite.getInvokeExp();
            // pass the first argument, which is the proxy object itself
            solver.addVarPointsTo(callerCtx, callee.getIR().getParam(0), proxyEdge.getProxyObj());
            // pass the second argument, which is reflection method
            JMethod method = callSite.getMethodRef().resolve();
            Obj methodObj = heapModel.getMockObj(REFLECTION_DESC, method, typeSystem.getClassType(ClassNames.METHOD));
            Context mObjContext = selector.selectHeapContext(proxyEdge.getCallee(), methodObj);
            solver.addVarPointsTo(callerCtx, callee.getIR().getParam(1), mObjContext, methodObj);
            // pass the third argument, which is args in Object[]
            Type objs = typeSystem.getArrayType(typeSystem.getType(ClassNames.OBJECT), 1);
            Obj argsObj = heapModel.getMockObj(ARGS_ARRAY_DESC, callSite, objs);
            Context argsObjContext = selector.selectHeapContext(proxyEdge.getCallee(), argsObj);
            ArrayIndex arrayIdx = csManager.getArrayIndex(csManager.getCSObj(argsObjContext, argsObj));
            callSite.getInvokeExp().getArgs().forEach(
                    v -> {
                        CSVar csVar = csManager.getCSVar(callerCtx, v);
                        PointsToSet pts = solver.getPointsToSetOf(csVar);
                        solver.addPointsTo(arrayIdx, pts);
                    }
            );
            solver.addVarPointsTo(callerCtx, callee.getIR().getParam(2), argsObjContext, argsObj);
            // pass results to LHS variable
            Var lhs = callSite.getResult();
            if (lhs != null) {
                CSVar csLHS = csManager.getCSVar(callerCtx, lhs);
                for (Var ret : callee.getIR().getReturnVars()) {
                    CSVar csRet = csManager.getCSVar(calleeCtx, ret);
                    solver.addPFGEdge(csRet, csLHS, FlowKind.RETURN);
                }
            }
        }
    }
}

ProxyCallEdge is the special call edge handled by the SemanticProxyModel plugin

public class ProxyCallEdge extends OtherEdge<CSCallSite, CSMethod> {

    private final CSObj proxyObj;

    public ProxyCallEdge(CSCallSite callSite, CSMethod callee, CSObj proxyObj) {
        super(callSite, callee);
        this.proxyObj = proxyObj;
    }

    public CSObj getProxyObj() {
        return proxyObj;
    }
}

You need to activate the plugin by declaring it in the configuration file. You also need to enable the reflection analysis (because dynamic proxies use reflection). You can set the only-app option to true. The resulting call graph is as follows, with the chain Main.main -> MyInvocationHandler.invoke -> ServiceImpl.doSomething:

Call graph
digraph G {
  node [color=".3 .2 1.0",shape=box,style=filled];
  edge [];
  "0" [label="<MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>",];
  "1" [label="<java.lang.String: void <clinit>()>",];
  "2" [label="<java.lang.Object: void <init>()>",];
  "3" [label="<java.lang.Class: java.lang.ClassLoader getClassLoader()>",];
  "4" [label="<java.lang.reflect.Method: java.lang.Object invoke(java.lang.Object,java.lang.Object[])>",];
  "5" [label="<java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>",];
  "6" [label="<java.lang.Class: java.lang.Class[] getInterfaces()>",];
  "7" [label="<ServiceImpl: void doSomething()>",];
  "8" [label="<Main: void main(java.lang.String[])>",];
  "9" [label="<MyInvocationHandler: java.lang.Object invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[])>",];
  "10" [label="<ServiceImpl: void <init>()>",];
  "11" [label="<java.lang.System: void <clinit>()>",];
  "12" [label="<MyInvocationHandler: void <init>(java.lang.Object)>",];
  "13" [label="<java.lang.Object: java.lang.Class getClass()>",];
  "0" -> "13" [label="[0@L24] $r1 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "0" -> "3" [label="[1@L24] $r2 = invokevirtual $r1.<java.lang.Class: java.lang.ClassLoader getClassLoader()>();",];
  "0" -> "12" [label="[5@L25] invokespecial $r5.<MyInvocationHandler: void <init>(java.lang.Object)>(r0);",];
  "0" -> "6" [label="[3@L25] $r4 = invokevirtual $r3.<java.lang.Class: java.lang.Class[] getInterfaces()>();",];
  "0" -> "13" [label="[2@L25] $r3 = invokevirtual r0.<java.lang.Object: java.lang.Class getClass()>();",];
  "0" -> "5" [label="[6@L23] $r6 = invokestatic <java.lang.reflect.Proxy: java.lang.Object newProxyInstance(java.lang.ClassLoader,java.lang.Class[],java.lang.reflect.InvocationHandler)>($r2, $r4, $r5);",];
  "8" -> "0" [label="[2@L7] $r1 = invokestatic <MyInvocationHandler: java.lang.Object getProxy(java.lang.Object)>($r0);",];
  "8" -> "10" [label="[1@L6] invokespecial $r0.<ServiceImpl: void <init>()>();",];
  "8" -> "9" [label="[4@L8] invokeinterface r2.<Service: void doSomething()>();",];
  "9" -> "7" [label="[4@L17] $r5 = invokevirtual method.<java.lang.reflect.Method: java.lang.Object invoke(java.lang.Object,java.lang.Object[])>($r4, args);",];
  "9" -> "4" [label="[4@L17] $r5 = invokevirtual method.<java.lang.reflect.Method: java.lang.Object invoke(java.lang.Object,java.lang.Object[])>($r4, args);",];
  "10" -> "2" [label="[0@L1] invokespecial %this.<java.lang.Object: void <init>()>();",];
  "12" -> "2" [label="[0@L9] invokespecial %this.<java.lang.Object: void <init>()>();",];
}

This plugin models Proxy.newProxyInstance through its semantics. This plugin generates a special MockObj, and Tai-e will use this plugin to handle method calls when attempting to invoke methods on that object (and for convenience, the mock object is modeled as a null type for propagation). For methods that need to be proxied, the plugin will generate a special call edge and create parameters to invoke the InvocationHandler.invoke method.
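The split between proxied and directly called methods that onUnresolvedCall relies on can be observed at runtime. In this sketch (DelegatedMethodsDemo is a hypothetical name) the handler records every method it receives: interface methods plus hashCode, equals and toString reach the handler, while other Object methods such as getClass do not.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

class DelegatedMethodsDemo {
    // Returns the names of the methods that were routed to the handler.
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        InvocationHandler handler = (self, method, args) -> {
            seen.add(method.getName());
            switch (method.getName()) {
                case "hashCode": return 42;
                case "equals":   return self == args[0];
                case "toString": return "proxy";
                default:         return null;   // e.g. Runnable.run()
            }
        };
        Runnable proxy = (Runnable) Proxy.newProxyInstance(
                DelegatedMethodsDemo.class.getClassLoader(),
                new Class<?>[] {Runnable.class},
                handler);
        proxy.run();          // interface method: delegated
        proxy.hashCode();     // delegated
        proxy.equals(proxy);  // delegated
        proxy.toString();     // delegated
        proxy.getClass();     // final method of Object: not delegated
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```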

In reality, the proxy object is an instance of a class that implements the proxied interfaces. To further improve this plugin, you can specially handle the propagation of the object through the interfaces parameter of the Proxy.newProxyInstance call. However, Tai-e currently does not support interface-related reflection APIs, so you would need to implement that part of the plugin yourself. In addition, Tai-e does not offer convenient ways to customize object propagation, so this may require relatively complex modifications.

As for the question "Could you explain why the line number is shown as -1?": -1 means that the IR statement does not correspond to any line in the source code, which is the case here since the whole class $Proxy0 is automatically generated.

@YunFy26
Author

YunFy26 commented Nov 4, 2024

Thank you for providing such a detailed solution, and apologies for the delayed response. I’ll proceed with handling the situation based on your suggestions. Also, I must add, Tai-e is truly a powerful and user-friendly analysis framework!
