Handle "logical this adjustments" #187

Open
RolphWoggom opened this issue May 20, 2021 · 12 comments

@RolphWoggom

When analyzing the Simpsons: Hit & Run demo, JSON generation fails.

Executable, ApiDB, facts and results uploaded here, please let me know if you need any of the other files.

The error is: ERROR: Duplicate key: '0x5f5d60'. The full session follows:

root@74e326b16809:/workdir# ooanalyzer --help | grep RevID
RevID: 2becf22aa64577199a68741104f8e969554337df

root@74e326b16809:/workdir# partition --serialize=Simpsons.exe.ser --maximum-memory=128000 Simpsons.exe 
OPTI[INFO ]: Analyzing executable: Simpsons.exe
OPTI[INFO ]: ROSE stock partitioning took 207.678 seconds.
OPTI[INFO ]: Partitioned 467145 bytes, 152562 instructions, 37265 basic blocks, 57 data blocks and 2992 functions.
OPTI[INFO ]: Function partitioning took 1197.05 seconds.
OPTI[INFO ]: Writing serialized data to "Simpsons.exe.ser".
OPTI[INFO ]: Writing serialized data took 103.497 seconds.
OPTI[INFO ]: Partitioned 2040290 bytes, 608293 instructions, 148467 basic blocks, 14348 data blocks and 18426 functions.

root@74e326b16809:/workdir# ooanalyzer --serialize=Simpsons.exe.ser --maximum-memory 128000 --prolog-facts=Simpsons.exe.facts --threads=4 --per-function-timeout=6000 --apidb simpsons-api.json Simpsons.exe
OPTI[INFO ]: Analyzing executable: Simpsons.exe
OPTI[INFO ]: OOAnalyzer version 1.0.
OPTI[INFO ]: Reading serialized data from "Simpsons.exe.ser".
OPTI[INFO ]: Reading serialized data took 52.1446 seconds.
OPTI[INFO ]: Partitioned 2040290 bytes, 608293 instructions, 148467 basic blocks, 14348 data blocks and 18426 functions.
OOAN[WARN ]: Instruction 50139E: jmp       [5013B0+eax*4] had incomplete successors.
[INFO ]: Unable to find this-pointer for function at 0x005B2C70
[INFO ]: Unable to find this-pointer for function at 0x0050C7B0
[INFO ]: Unable to find this-pointer for function at 0x0043B9E0
OOAN[ERROR]: No new() methods were found.  Heap objects may not be detected.
OOAN[ERROR]: No delete() methods were found.  Object analysis may be impaired.
OPTI[WARN ]: OOAnalyzer did not perform C++ class analysis.
OPTI[INFO ]: OOAnalyzer analysis complete.

root@74e326b16809:/workdir# awk -F\( '{print $1}' Simpsons.exe.facts | sort | uniq -c
      1 % Object fact exporting complete.
      1 % Prolog facts autogenerated by OOAnalyzer.
  24827 callParameter
  42404 callReturn
  50728 callTarget
  39794 callingConvention
      1 fileInfo
   5555 funcOffset
  13463 funcParameter
   9868 funcReturn
  14243 initialMemory
  29314 methodMemberAccess
   6946 noCallsAfter
   6962 noCallsBefore
     44 possibleVBTableWrite
   1889 possibleVFTableWrite
  12112 possibleVirtualFunctionCall
   1478 rTTIBaseClassDescriptor
   1224 rTTIClassHierarchyDescriptor
   1596 rTTICompleteObjectLocator
   1320 rTTITypeDescriptor
   1512 returnsSelf
    198 thisPtrAllocation
   1464 thisPtrOffset
  17962 thisPtrUsage
    117 thunk
    139 uninitializedReads

root@74e326b16809:/workdir# ooprolog --facts Simpsons.exe.facts --results Simpsons.exe.results --json Simpsons.exe.json --log-level=6 >Simpsons.exe.log
ERROR: Duplicate key: '0x5f5d60'
ERROR: In:
ERROR:   [20] with_output_to(<stream>(0x563577a27940),exportJSON)
ERROR:   [19] setup_call_catcher_cleanup(user:open('Simpsons.exe.json',write,<stream>(0x563577a27940)),user:with_output_to(<stream>(0x563577a27940),exportJSON),_460,user:close(<stream>(0x563577a27940))) at /usr/local/lib/swipl/boot/init.pl:619
ERROR:   [16] catch(user:exportJSONTo('Simpsons.exe.json'),error(duplicate_key('0x5f5d60'),context(...,_558)),user:(...,...)) at /usr/local/lib/swipl/boot/init.pl:537
ERROR:   [15] catch_with_backtrace('<garbage_collected>','<garbage_collected>','<garbage_collected>') at /usr/local/lib/swipl/boot/init.pl:587
ERROR: 
ERROR: Note: some frames are missing due to last-call optimization.
ERROR: Re-run your program in debug mode (:- debug.) to get more detail.
sei-eschwartz self-assigned this May 20, 2021
@sei-eschwartz
Collaborator

[ '0x5f5d60':vftable{ ea:'0x5f5d60',
	   entries:entries{ '0':vftentry{ demangled_name:'',
					  ea:'0x548460',
					  import:false,
					  name:"virt_meth_0x548460",
					  offset:0,
					  type:meth
					}
			  },
	   length:1,
	   vftptr:'0x4'
	 },
  '0x5f5d60':vftable{ ea:'0x5f5d60',
	   entries:entries{ '0':vftentry{ demangled_name:'',
					  ea:'0x548460',
					  import:false,
					  name:"virt_meth_0x548460",
					  offset:0,
					  type:meth
					}
			  },
	   length:1,
	   vftptr:'0x0'
	 }
]

@sei-eschwartz
Collaborator

The vftable in question is installed by 0x548110 and 0x548200, at different offsets. I think the outstanding question at this point is: are these on the same class?
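As a rough illustration of the condition behind the duplicate key, the sketch below groups vftable installations by address and reports any address seen at more than one offset. The `VFTableInstall` record and the `conflictingVFTables` function are hypothetical names made up for this example; they are not part of OOAnalyzer or its Prolog rules.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical record: one observed vftable installation.
struct VFTableInstall {
    std::uint32_t vftable;  // address of the vftable, e.g. 0x5f5d60
    std::int32_t  offset;   // byte offset from the installing method's this-pointer
};

// Return every vftable address that was installed at more than one offset.
std::vector<std::uint32_t> conflictingVFTables(const std::vector<VFTableInstall>& installs) {
    std::map<std::uint32_t, std::set<std::int32_t>> offsetsByTable;
    for (const auto& install : installs) {
        offsetsByTable[install.vftable].insert(install.offset);
    }
    std::vector<std::uint32_t> conflicts;
    for (const auto& entry : offsetsByTable) {
        if (entry.second.size() > 1) {
            conflicts.push_back(entry.first);
        }
    }
    return conflicts;
}
```

Fed the two records from the dump above (0x5f5d60 at offsets 0x4 and 0x0), this would flag 0x5f5d60, which is the key the JSON exporter rejects.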

@RolphWoggom
Author

From comparing this to the PS2 version with symbols it seems that:

  • 0x548200 is radSoundHalListener::radSoundHalListener(int)
  • 0x548110 is radSoundHalListener::~radSoundHalListener(void)
  • (maybe) 0x5f5d60 is radSoundHalListener::radSoundObject::__virtual_table

@sei-eschwartz
Collaborator

So does it seem like 0x5f5d60 is legitimately installed at two different offsets in radSoundHalListener?

Can you get an object layout for radSoundHalListener from the PS2 version?

@sei-eschwartz
Collaborator

Here is the class hierarchy courtesy of RTTI:

.data:006433EC ; public struct radSoundHalListener /* mdisp:0 */ :
.data:006433EC ;   public struct IRadSoundHalListener /* mdisp:0 */ :
.data:006433EC ;     public struct IRefCount /* mdisp:0 */,
.data:006433EC ;   public struct radSoundObject /* mdisp:4 */ :
.data:006433EC ;     public class radRefCount /* mdisp:4 */ :
.data:006433EC ;       public class radObject /* mdisp:4 */ :
.data:006433EC ;         public class radBaseObject /* mdisp:4 */
.data:006433EC ; struct radSoundHalListener `RTTI Type Descriptor'

@sei-ccohen and I are a bit confused: because of the negative offset accessed in 0x548110, we believed there was a virtual base involved. But according to the above, there is no virtual base on radSoundHalListener.

The key to understanding what is going on is understanding why 0x548110 thinks it is ok to reference the object at a negative offset. Most likely whatever is causing the offset difference is also causing us to get confused about the vftable being installed in two different offsets.

@sei-eschwartz
Collaborator

We are still experimenting, but we were able to generate a negative offset in a virtual function. https://www.godbolt.org/z/feoKPMajx

Basically, this happens when a virtual function is only accessed from the second (or later?) base class, and the function accesses a member in the first base class. This needs a bit more thought, but it means that the object pointer for a virtual function in a derived class may not always be pointing at the start of the derived class!
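A minimal sketch of the shape described above, for readers who cannot open the Compiler Explorer link. It is not the code behind that link: the types `First`, `Second`, and `Derived` and both helper functions are invented for illustration. The portable part shown here is only that the second base subobject sits at a nonzero offset inside the derived object; that an MSVC-compiled `Derived::get` then reads `a` through a negative displacement is the behavior reported in this thread, not something this code checks.

```cpp
#include <cstddef>
#include <cstdint>

struct First {
    int a = 42;
    virtual ~First() = default;
};

struct Second {
    int b = 7;
    virtual int get() const = 0;  // introduced by the second base
    virtual ~Second() = default;
};

// get() is only reachable through Second's vftable, yet it reads First::a.
struct Derived : First, Second {
    int get() const override { return a; }
};

// Byte distance from the start of the Derived object to its Second subobject.
std::ptrdiff_t secondBaseOffset(const Derived& d) {
    const auto whole = reinterpret_cast<std::uintptr_t>(&d);
    const auto sub = reinterpret_cast<std::uintptr_t>(static_cast<const Second*>(&d));
    return static_cast<std::ptrdiff_t>(sub - whole);
}

// Virtual call dispatched through the Second subobject.
int callThroughSecond(const Second& s) { return s.get(); }
```

If the override's this-pointer addresses the `Second` subobject, as observed above, reaching `First::a` means subtracting roughly `secondBaseOffset(d)` from it.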

@RolphWoggom
Author

This needs a bit more thought, but it means that the object pointer for a virtual function in a derived class may not always be pointing at the start of the derived class!

Seems like this is happening here:

  • 0x548200 (constructor) installs:
    • this[0] = 0x5f5d6c (radSoundHalListener::__virtual_table)
    • this[1] = 0x5f5d60 (radSoundHalListener::radSoundObject::__virtual_table)
  • 0x548110 (destructor) installs:
    • this[-1] = 0x5f5d6c (radSoundHalListener::__virtual_table)
    • this[0] = 0x5f5d60 (radSoundHalListener::radSoundObject::__virtual_table)

Sounds like the problem encountered here was identified?
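One way to see that the two lists above describe the same object is to rebase each method's this-relative writes onto the start of the object. The sketch below does that under two assumptions that are not stated outright in the thread: pointers are 4 bytes (32-bit x86), and the destructor's this-pointer sits 4 bytes past the object start, matching the mdisp:4 shown for radSoundObject in the RTTI hierarchy. `VFTableWrite` and `objectLayout` are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <vector>

constexpr int kPointerSize = 4;  // assumption: 32-bit x86 target

// One observed store of a vftable pointer, indexed in pointer-sized words
// relative to the method's own this-pointer (this[-1], this[0], this[1], ...).
struct VFTableWrite {
    int wordIndex;
    std::uint32_t vftable;
};

// Rebase this-relative writes to byte offsets from the start of the object.
// thisAdjustBytes is how far the method's this-pointer sits past the object start.
std::map<int, std::uint32_t> objectLayout(const std::vector<VFTableWrite>& writes,
                                          int thisAdjustBytes) {
    std::map<int, std::uint32_t> layout;
    for (const auto& w : writes) {
        layout[w.wordIndex * kPointerSize + thisAdjustBytes] = w.vftable;
    }
    return layout;
}
```

With an adjustment of 0 for the constructor and 4 for the destructor, both sets of writes map to the same layout: 0x5f5d6c at byte offset 0 and 0x5f5d60 at byte offset 4.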

@sei-eschwartz
Collaborator

Yes, I think that is what is happening there. We are thinking about the best way to fix it.

By the way, in the meantime, you should be able to manually remove the problematic vftables from the .results file to get the export to work.

@RolphWoggom
Author

Great news! And thanks for the tip. I was able to export after removing these two facts from the .results file: finalInheritance(0x5f5d60, 0x5f5c04, 0x4, 0x5f5d60, false). and finalInheritance(0x5f7a80, 0x5f5bf4, 0x4, 0x5f7a80, false). The second one doesn't use a negative offset directly but is preceded by one (near 0x567614).
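For anyone repeating this workaround on a large .results file, a small filter along these lines might save some hand editing. It is only a sketch: `dropFacts` is a made-up helper, and it assumes each fact occupies its own line and is spelled exactly as shown above, which may not hold for every results file.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Copy `in` to `out`, skipping any line that exactly equals one of `dropped`.
// Returns the number of lines skipped.
std::size_t dropFacts(std::istream& in, std::ostream& out,
                      const std::set<std::string>& dropped) {
    std::size_t skipped = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (dropped.count(line) != 0) {
            ++skipped;
            continue;
        }
        out << line << '\n';
    }
    return skipped;
}
```

Opening the original file as a `std::ifstream`, writing to a new `std::ofstream`, and passing the two `finalInheritance(...)` lines as `dropped` would reproduce the manual edit; a return value other than 2 suggests the lines did not match exactly.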

Here is the class hierarchy courtesy of RTTI:

.data:006433EC ; public struct radSoundHalListener /* mdisp:0 */ :
.data:006433EC ;   public struct IRadSoundHalListener /* mdisp:0 */ :
.data:006433EC ;     public struct IRefCount /* mdisp:0 */,
.data:006433EC ;   public struct radSoundObject /* mdisp:4 */ :
.data:006433EC ;     public class radRefCount /* mdisp:4 */ :
.data:006433EC ;       public class radObject /* mdisp:4 */ :
.data:006433EC ;         public class radBaseObject /* mdisp:4 */
.data:006433EC ; struct radSoundHalListener `RTTI Type Descriptor'

What tool was used to create this? Is that an IDA thing?

@sei-eschwartz
Collaborator

Glad to hear you were able to get the JSON export to work.

Yes, the class hierarchy is an IDA feature I recently discovered by accident. It adds the hierarchy as a comment above the RTTI Type Descriptors.

I haven't used it, but I think https://github.com/astrelsky/Ghidra-Cpp-Class-Analyzer provides a similar capability for Ghidra.

@sei-eschwartz
Collaborator

I recently found that Jan Gray talked about this:

Consider next S::rvf(), which overrides R::rvf(). Most implementations note that S::rvf()
must have a hidden this parameter of type S*. Since R’s rvf vftable slot may be used when
this call occurs:

((R*)ps)->rvf(); // (*((R*)ps)->R::vfptr[1])((R*)ps)

Most implementations add another thunk to convert the R* passed to rvf into an S*. Some
also add an additional vftable entry to the end of S’s vftable to provide a way to call
ps->rvf() without first converting to an R*. MSC++ avoids this by intentionally compiling
S::rvf() so as to expect a this pointer which addresses not the S object but rather the R
embedded instance within the S. (We call this “giving overrides the same expected address
point as in the class that first introduced this virtual function”.) This is all done
transparently, by applying a “logical this adjustment” to all member fetches, conversions
from this, and so on, that occur within the member function. (Just as with multiple
inheritance member access, this adjustment is constant-folded into other member
displacement address arithmetic.)
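The two strategies in the quoted passage can be contrasted with plain structs and free functions standing in for compiler-generated code. This is a hand-written sketch: the layouts of `P`, `R`, and `S` and all function names are assumptions for illustration, not the definitions used in the quoted text.

```cpp
struct P { int p1 = 1; };
struct R { int r1 = 2; };
struct S : P, R { int s1 = 3; };  // R is embedded after P inside S

// Strategy 1 (thunk): the body expects an S*, so a separate thunk converts
// the incoming R* to an S* before jumping to the body.
int rvfBodyExpectingS(S* self) { return self->p1 + self->s1; }
int rvfThunk(R* embedded) { return rvfBodyExpectingS(static_cast<S*>(embedded)); }

// Strategy 2 (logical this adjustment): the body itself expects the address
// of the embedded R. At source level the conversion looks the same, but in
// compiled code it is folded into each member displacement, so members that
// precede R (such as p1) are reached through a negative offset.
int rvfBodyExpectingR(R* embedded) {
    S* self = static_cast<S*>(embedded);
    return self->p1 + self->s1;
}
```

Both paths compute the same result; the difference is whether the pointer conversion lives in a separate stub or is absorbed into the member accesses, which is what produces the negative offsets discussed earlier in this thread.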

@sei-eschwartz
Collaborator

One thing I noticed recently is that there are actually stubs for logical this adjustments that we should be able to detect.

Here's an example for experimentation: https://godbolt.org/z/qs55rzaj6
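As a starting point for detection, here is a byte-level matcher for one well-known stub shape: the 32-bit MSVC adjustor thunk, `sub ecx, imm8` followed by `jmp rel32`. Whether the stubs in the linked example have exactly this shape is an assumption; the struct and function names are hypothetical, and a real detector would also need to handle 32-bit immediates and other encodings.

```cpp
#include <cstddef>
#include <cstdint>

struct AdjustorThunk {
    std::uint8_t adjust;     // bytes subtracted from ecx (the this-pointer)
    std::int32_t jumpRel32;  // signed displacement of the trailing jmp
};

// Match `sub ecx, imm8` (83 E9 ib) immediately followed by `jmp rel32` (E9 cd).
// On success, fills `out` and returns true.
bool matchAdjustorThunk(const std::uint8_t* code, std::size_t size, AdjustorThunk& out) {
    if (size < 8) return false;
    if (code[0] != 0x83 || code[1] != 0xE9) return false;  // sub ecx, imm8
    if (code[3] != 0xE9) return false;                     // jmp rel32
    // Assemble the little-endian 32-bit displacement.
    const std::uint32_t rel = static_cast<std::uint32_t>(code[4])
                            | static_cast<std::uint32_t>(code[5]) << 8
                            | static_cast<std::uint32_t>(code[6]) << 16
                            | static_cast<std::uint32_t>(code[7]) << 24;
    out.adjust = code[2];
    out.jumpRel32 = static_cast<std::int32_t>(rel);
    return true;
}
```

The jump target is the address of the byte after the 8-byte stub plus `jumpRel32`; `adjust` would then be the candidate this adjustment for the target function.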

