Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#2297: AARCH64: Implement cbr instrumentation #7005

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

jiegec
Copy link

@jiegec jiegec commented Sep 25, 2024

Implement cbr instrumentation for AARCH64, supporting cbz/cbnz/tbz/tbnz/bcond opcodes.

Issue: #2297

Implement cbr instrumentation for AARCH64, supporting
cbz/cbnz/tbz/tbnz/bcond opcodes.

Issue: DynamoRIO#2297
#define INSTR_CREATE_eor(dc, d, s) \
INSTR_CREATE_eor_shift(dc, d, d, s, OPND_CREATE_INT8(DR_SHIFT_LSL), \
OPND_CREATE_INT8(0))
#define INSTR_CREATE_eor(dc, d, s_or_imm) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contributing.

Ideally, every new addition of an INSTR_CREATE_ or XINSTR_CREATE_ should include the Doxygen comment block describing the macro, e.g.

/**                                                                       
 * Creates an EOR instruction with one output and two inputs.             
 * \param dc   The void * dcontext used to allocate memory for the instr_t.
 * \param rd   The output register.                                       
 * \param rn   The first input register.                                  
 * \param rm_or_imm   The second input register or immediate.             
 */

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added

#define INSTR_CREATE_eor_shift(dc, rd, rn, rm, sht, sha) \
instr_create_1dst_4src(dc, OP_eor, rd, rn, \
opnd_create_reg_ex(opnd_get_reg(rm), 0, DR_OPND_SHIFTED), \
opnd_add_flags(sht, DR_OPND_IS_SHIFT), sha)
#define INSTR_CREATE_csinc(dc, rd, rn, rm, cond) \
instr_create_1dst_3src(dc, OP_csinc, rd, rn, rm, cond)
#define INSTR_CREATE_ubfm(dc, rd, rn, lsb, width) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace the instr_create_1dst_3src(dc, OP_ubfm... in XINST_CREATE_slr_s() with this INSTR_CREATE_ubfm().

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done

Document newly added instruction creation macros for EOR, CSINC and
UBFM.  Refactor XINST_CREATE_slr_s to use the new macro to create ubfm
instruction.

Fixes DynamoRIO#2297
Copy link
Contributor

@derekbruening derekbruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution. The main missing piece is testing: see the comments.

@@ -463,12 +463,12 @@
* they just need to know whether they need to preserve the app's flags, so maybe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please enable the samples that use this which are currently disabled:

  # dr_insert_cbr_instrument_ex is NYI
  add_sample_client(cbrtrace    "cbrtrace.c;utils.c"    "drmgr;drx")
  add_sample_client(hot_bbcount "hot_bbcount.c"         "drmgr;drreg;drbbdup;drx")

Those are run as tests, though w/o targeted correctness checks: just making sure they don't crash.

@@ -463,12 +463,12 @@
* they just need to know whether they need to preserve the app's flags, so maybe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please enable the count-ctis tests which use this, which will add regression tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since count-ctis test requires mbr instrumentation as well, I have added initial implementation for mbr instrumentation. But it sometimes return a very small number, possibly some index into the indirect branch cache? How to convert it back to actual address?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't realize it needed more: was just grepping for tests that use cbr. Makes sense to separate out the mbr. Is it easy to separate in the test? Or just separately locally in the test and confirm cbr works and state that in the PR description and say that the test will be enabled soon when mbr is added and then enable the test in a separate PR for mbr, so long as that comes in relatively soon (i.e., not months later with no cbr test in the meantime).

mbr is supposed to obtain a real address so that sounds like something is wrong there.

@@ -463,12 +463,12 @@
* they just need to know whether they need to preserve the app's flags, so maybe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the PR description to describe how you tested this, since there are no tests added/enabled by this PR (see prior comment which will add some).

instr_create_1dst_3src(dc, OP_csinc, rd, rn, rm, cond)

/**
* Creates an UBFM instruction with one output and three inputs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit: Do you pronounce this "uhb-fmmm"? My brain doesn't like the "an" because I say it "you-bee-eff-emm" and so I want "a" not "an" but if you start it with "uh" feel free to leave the "an".)

reg_id_t dir = DR_REG_NULL;
reg_id_t flags = DR_REG_NULL;
int opc;
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove.

"dr_insert_cbr_instrumentation must be applied to a cbr");
target = (ptr_uint_t)opnd_get_pc(instr_get_target(instr));

/* Compute branch direction */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: For all new comments, please capitalize and use end-punctuation: so end period here, and capitalize and add end periods to all the ones below: line 6352, line 6354, etc.

opnd_t reg_op = instr_get_src(instr, 1);
reg_id_t reg = opnd_get_reg(reg_op);
/* use dir register to compute direction */
dir = (reg == DR_REG_X0 || reg == DR_REG_W0) ? DR_REG_X1 : DR_REG_X0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO using reg_to_pointer_sized(reg) == DR_REG_X0 is better than checking both X0 and W0 since it will generalize to other architectures or SIMD registers which have more than 2 overlapping. Ditto below.

flags = (reg == DR_REG_X2 || reg == DR_REG_W2) ? DR_REG_X3 : DR_REG_X2;
/* save old value of flags register to SPILL_SLOT_2 */
dr_save_reg(dcontext, ilist, instr, flags, SPILL_SLOT_2);
/* save flags to flags register */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, instead of using CMP and having to save the flags is it better to branch and use the same CBZ/CBNZ to set dir_op to 0 or 1? Fewer loads+stores, but has a branch...my intuition says avoiding the store is worth having the branch. But probably should be profiled. OK to just put a XXX comment about it.

reg_id_t dir_same_width = DR_REG_NULL;

/* use dir register to compute direction */
if (DR_REG_START_64 <= reg && reg <= DR_REG_STOP_64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use reg_is_64bit()

} else {
/* 32-bit register */
dir = (reg == DR_REG_W0) ? DR_REG_X1 : DR_REG_X0;
dir_same_width = (reg == DR_REG_W0) ? DR_REG_W1 : DR_REG_W0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you can use API routines to make this simpler, replacing this if/else lines 6378-6386 with this:

dir = (reg_to_pointer_sized(reg) == DR_REG_X0) ? DR_REG_X1 : DR_REG_X0;
dir_same_width = reg_resize_to_opsz(dir, opnd_get_size(reg));

@jiegec
Copy link
Author

jiegec commented Oct 28, 2024

Help is wanted to fix mbr instrumentation: it sometimes receives small integers instead of actual target address, possibly some indices into the basic box cache?

@derekbruening
Copy link
Contributor

Help is wanted to fix mbr instrumentation: it sometimes receives small integers instead of actual target address, possibly some indices into the basic box cache?

I assume it has the wrong register or something. There are no such "indices" in in-fragment code. Use the logs https://dynamorio.org/page_logging.html to see where the app's register values goes and which one is passed to the clean call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants