i#2297: AARCH64: Implement cbr instrumentation #7005

Open
wants to merge 7 commits into
base: master
Changes from 2 commits
53 changes: 44 additions & 9 deletions core/ir/aarch64/instr_create_api.h
@@ -463,12 +463,12 @@
* they just need to know whether they need to preserve the app's flags, so maybe
Contributor

Please enable the samples that use this, which are currently disabled:

  # dr_insert_cbr_instrument_ex is NYI
  add_sample_client(cbrtrace    "cbrtrace.c;utils.c"    "drmgr;drx")
  add_sample_client(hot_bbcount "hot_bbcount.c"         "drmgr;drreg;drbbdup;drx")

Those are run as tests, though without targeted correctness checks: they only confirm that the samples don't crash.

Contributor

Please enable the count-ctis tests, which use this and will serve as regression tests.

Author

Since the count-ctis test requires mbr instrumentation as well, I have added an initial implementation of mbr instrumentation. But it sometimes returns a very small number, possibly an index into the indirect branch cache? How can I convert it back to the actual address?

Contributor

Oh, I didn't realize it needed more: I was just grepping for tests that use cbr. It makes sense to separate out the mbr work. Is it easy to separate in the test? If not, separate it locally in the test, confirm that cbr works, and state that in the PR description, noting that the test will be enabled once mbr is added. Then enable the test in a separate PR for mbr, so long as that comes in relatively soon (i.e., not months later with no cbr test in the meantime).

mbr is supposed to obtain a real address, so it sounds like something is wrong there.

Contributor

Please update the PR description to describe how you tested this, since no tests are added or enabled by this PR (see the prior comment, which would add some).

* we can just document that this may not write them.
*/
#define XINST_CREATE_slr_s(dc, d, rm_or_imm) \
(opnd_is_reg(rm_or_imm) \
? instr_create_1dst_2src(dc, OP_lsrv, d, d, rm_or_imm) \
: instr_create_1dst_3src(dc, OP_ubfm, d, d, rm_or_imm, \
reg_is_32bit(opnd_get_reg(d)) ? OPND_CREATE_INT(31) \
: OPND_CREATE_INT(63)))
#define XINST_CREATE_slr_s(dc, d, rm_or_imm) \
(opnd_is_reg(rm_or_imm) \
? instr_create_1dst_2src(dc, OP_lsrv, d, d, rm_or_imm) \
: INSTR_CREATE_ubfm(dc, d, d, rm_or_imm, \
reg_is_32bit(opnd_get_reg(d)) ? OPND_CREATE_INT(31) \
: OPND_CREATE_INT(63)))

/**
* This platform-independent macro creates an instr_t for a nop instruction.
@@ -658,14 +658,49 @@
instr_create_0dst_3src((dc), OP_tbnz, (pc), (reg), (imm))
#define INSTR_CREATE_cmp(dc, rn, rm_or_imm) \
INSTR_CREATE_subs(dc, OPND_CREATE_ZR(rn), rn, rm_or_imm)
#define INSTR_CREATE_eor(dc, d, s) \
INSTR_CREATE_eor_shift(dc, d, d, s, OPND_CREATE_INT8(DR_SHIFT_LSL), \
OPND_CREATE_INT8(0))

/**
* Creates an EOR instruction with one output and two inputs. For simplicity, the first
* input reuses the output register.
*
* \param dc The void * dcontext used to allocate memory for the instr_t.
* \param d The output register and the first input register.
* \param s_or_imm The second input register or immediate.
*/
#define INSTR_CREATE_eor(dc, d, s_or_imm) \
Contributor

Thank you for contributing.

Ideally, every new INSTR_CREATE_ or XINST_CREATE_ macro should include a Doxygen comment block describing it, e.g.

/**                                                                       
 * Creates an EOR instruction with one output and two inputs.             
 * \param dc   The void * dcontext used to allocate memory for the instr_t.
 * \param rd   The output register.                                       
 * \param rn   The first input register.                                  
 * \param rm_or_imm   The second input register or immediate.             
 */

Author

Thanks, added

opnd_is_immed(s_or_imm) \
? instr_create_1dst_2src(dc, OP_eor, d, d, s_or_imm) \
: INSTR_CREATE_eor_shift(dc, d, d, s_or_imm, OPND_CREATE_INT8(DR_SHIFT_LSL), \
OPND_CREATE_INT8(0))
#define INSTR_CREATE_eor_shift(dc, rd, rn, rm, sht, sha) \
instr_create_1dst_4src(dc, OP_eor, rd, rn, \
opnd_create_reg_ex(opnd_get_reg(rm), 0, DR_OPND_SHIFTED), \
opnd_add_flags(sht, DR_OPND_IS_SHIFT), sha)

/**
* Creates a CSINC instruction with one output and three inputs.
*
* \param dc The void * dcontext used to allocate memory for the instr_t.
* \param rd The output register.
* \param rn The first input register.
* \param rm The second input register.
* \param cond The third input condition code.
*/
#define INSTR_CREATE_csinc(dc, rd, rn, rm, cond) \
instr_create_1dst_3src(dc, OP_csinc, rd, rn, rm, cond)

/**
* Creates an UBFM instruction with one output and three inputs.
Contributor

(nit: Do you pronounce this "uhb-fmmm"? My brain doesn't like the "an" because I say it "you-bee-eff-emm" and so I want "a" not "an" but if you start it with "uh" feel free to leave the "an".)

*
* \param dc The void * dcontext used to allocate memory for the instr_t.
* \param rd The output register.
* \param rn The first input register.
* \param immr The second input immediate.
* \param imms The third input immediate.
*/
#define INSTR_CREATE_ubfm(dc, rd, rn, immr, imms) \
instr_create_1dst_3src(dc, OP_ubfm, rd, rn, immr, imms)

#define INSTR_CREATE_ldp(dc, rt1, rt2, mem) \
instr_create_2dst_1src(dc, OP_ldp, rt1, rt2, mem)
#define INSTR_CREATE_ldr(dc, Rd, mem) instr_create_1dst_1src((dc), OP_ldr, (Rd), (mem))
140 changes: 140 additions & 0 deletions core/lib/instrument.c
@@ -6328,6 +6328,146 @@ dr_insert_cbr_instrumentation_help(void *drcontext, instrlist_t *ilist, instr_t
#elif defined(RISCV64)
/* FIXME i#3544: Not implemented */
ASSERT_NOT_IMPLEMENTED(false);
#elif defined(AARCH64)
dcontext_t *dcontext = (dcontext_t *)drcontext;
ptr_uint_t address, target;
reg_id_t dir = DR_REG_NULL;
reg_id_t flags = DR_REG_NULL;
int opc;
;
Contributor

Please remove.

CLIENT_ASSERT(drcontext != NULL,
"dr_insert_cbr_instrumentation: drcontext cannot be NULL");
address = (ptr_uint_t)instr_get_translation(instr);
CLIENT_ASSERT(address != 0,
"dr_insert_cbr_instrumentation: can't determine app address");
CLIENT_ASSERT(instr_is_cbr(instr),
"dr_insert_cbr_instrumentation must be applied to a cbr");
target = (ptr_uint_t)opnd_get_pc(instr_get_target(instr));

/* Compute branch direction */
Contributor

style: For all new comments, please capitalize and use end-punctuation: so end period here, and capitalize and add end periods to all the ones below: line 6352, line 6354, etc.

opc = instr_get_opcode(instr);
if (opc == OP_cbnz || opc == OP_cbz) {
opnd_t reg_op = instr_get_src(instr, 1);
reg_id_t reg = opnd_get_reg(reg_op);
/* use dir register to compute direction */
dir = (reg == DR_REG_X0 || reg == DR_REG_W0) ? DR_REG_X1 : DR_REG_X0;
Contributor

IMHO using reg_to_pointer_sized(reg) == DR_REG_X0 is better than checking both X0 and W0, since it generalizes to other architectures and to SIMD registers, which have more than two overlapping views. Ditto below.

/* save old value of dir register to SPILL_SLOT_1 */
dr_save_reg(dcontext, ilist, instr, dir, SPILL_SLOT_1);
/* use flags register to save nzcv */
flags = (reg == DR_REG_X2 || reg == DR_REG_W2) ? DR_REG_X3 : DR_REG_X2;
/* save old value of flags register to SPILL_SLOT_2 */
dr_save_reg(dcontext, ilist, instr, flags, SPILL_SLOT_2);
/* save flags to flags register */
Contributor

Hmm, instead of using CMP and having to save the flags, would it be better to branch and use the same CBZ/CBNZ to set dir_op to 0 or 1? That means fewer loads and stores, but it adds a branch. My intuition says avoiding the store is worth having the branch, but it should probably be profiled. OK to just put an XXX comment about it.
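
To make the two options in this comment concrete, here is a minimal sketch in plain C that models them outside of DynamoRIO. All names (`model_opc_t`, `model_csinc`, `dir_via_flags`, `dir_via_branch`) are hypothetical stand-ins and not part of the DynamoRIO API. The sketch only shows that both strategies produce the same taken/not-taken value; it says nothing about which is faster, which would still need profiling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two compare-and-branch opcodes. */
typedef enum { MODEL_CBZ, MODEL_CBNZ } model_opc_t;

/* Models CSINC rd, rn, rm, cond: rd = cond ? rn : rm + 1. */
uint64_t
model_csinc(uint64_t rn, uint64_t rm, bool cond)
{
    return cond ? rn : rm + 1;
}

/* Flag-based strategy, mirroring the patch: CMP reg, #0 sets Z, then
 * CSINC dir, ZR, ZR, cond with cond = EQ for cbnz and NE for cbz.
 * This clobbers NZCV, which is why the patch saves and restores the flags.
 */
uint64_t
dir_via_flags(model_opc_t opc, uint64_t reg)
{
    bool z = (reg == 0); /* Z flag after CMP reg, #0. */
    bool cond = (opc == MODEL_CBNZ) ? z /* EQ */ : !z /* NE */;
    return model_csinc(0, 0, cond);
}

/* Branch-based strategy from the review comment: reuse the branch's own
 * zero test to pick 0 or 1, so NZCV is never written.
 */
uint64_t
dir_via_branch(model_opc_t opc, uint64_t reg)
{
    if (opc == MODEL_CBZ)
        return (reg == 0) ? 1 : 0;
    return (reg != 0) ? 1 : 0;
}
```

In both functions a result of 1 means the branch is taken, matching the direction argument that the patch passes to the clean call.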

dr_save_arith_flags_to_reg(dcontext, ilist, instr, flags);

/* compare reg against zero */
instr_t *cmp = INSTR_CREATE_cmp(dcontext, reg_op, OPND_CREATE_INT(0));
MINSERT(ilist, instr, cmp);
/* compute branch direction */
opnd_t dir_op = opnd_create_reg(dir);
instr_t *cset = INSTR_CREATE_csinc(
dcontext, dir_op, OPND_CREATE_ZR(dir_op), OPND_CREATE_ZR(dir_op),
opnd_create_cond(opc == OP_cbnz ? DR_PRED_EQ : DR_PRED_NE));
MINSERT(ilist, instr, cset);
} else if (opc == OP_tbnz || opc == OP_tbz) {
opnd_t reg_op = instr_get_src(instr, 1);
reg_id_t reg = opnd_get_reg(reg_op);
reg_id_t dir_same_width = DR_REG_NULL;

/* use dir register to compute direction */
if (DR_REG_START_64 <= reg && reg <= DR_REG_STOP_64) {
Contributor

Use reg_is_64bit()

/* 64-bit register */
dir = (reg == DR_REG_X0) ? DR_REG_X1 : DR_REG_X0;
dir_same_width = (reg == DR_REG_X0) ? DR_REG_X1 : DR_REG_X0;
} else {
/* 32-bit register */
dir = (reg == DR_REG_W0) ? DR_REG_X1 : DR_REG_X0;
dir_same_width = (reg == DR_REG_W0) ? DR_REG_W1 : DR_REG_W0;
Contributor

Looks like you can use API routines to simplify this, replacing the if/else on lines 6378-6386 with:

dir = (reg_to_pointer_sized(reg) == DR_REG_X0) ? DR_REG_X1 : DR_REG_X0;
dir_same_width = reg_resize_to_opsz(dir, reg_get_size(reg));
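
The idea behind this suggestion (normalize the tested register to its full-width view before comparing, then derive the same-width scratch register by resizing) can be sketched with a toy register model. Everything below is hypothetical: the `model_*` names and the four-register enum are illustrative assumptions and do not reflect DynamoRIO's actual register numbering or API.

```c
#include <stdbool.h>

/* Hypothetical register ids: a 64-bit bank followed by its 32-bit views. */
typedef enum {
    MODEL_X0, MODEL_X1, MODEL_X2, MODEL_X3,
    MODEL_W0, MODEL_W1, MODEL_W2, MODEL_W3
} model_reg_t;

bool
model_reg_is_64bit(model_reg_t reg)
{
    return reg <= MODEL_X3;
}

/* Widens a register to its 64-bit view (Wn -> Xn, Xn unchanged). */
model_reg_t
model_to_pointer_sized(model_reg_t reg)
{
    return model_reg_is_64bit(reg) ? reg
                                   : (model_reg_t)(reg - MODEL_W0 + MODEL_X0);
}

/* Resizes a 64-bit register to match the width of another register. */
model_reg_t
model_resize_like(model_reg_t reg64, model_reg_t like)
{
    return model_reg_is_64bit(like) ? reg64
                                    : (model_reg_t)(reg64 - MODEL_X0 + MODEL_W0);
}

/* Picks a 64-bit scratch register that does not alias the tested register.
 * A single comparison covers both the X0 and W0 cases.
 */
model_reg_t
model_pick_dir(model_reg_t tested)
{
    return (model_to_pointer_sized(tested) == MODEL_X0) ? MODEL_X1 : MODEL_X0;
}
```

With this shape there is one comparison instead of one per overlapping view, and the same-width register falls out of a resize rather than a second hand-written conditional.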

}
/* save old value of dir register to SPILL_SLOT_1 */
dr_save_reg(dcontext, ilist, instr, dir, SPILL_SLOT_1);

/* extract tst_bit from reg */
int tst_bit = opnd_get_immed_int(instr_get_src(instr, 2));
opnd_t dir_same_width_op = opnd_create_reg(dir_same_width);
instr_t *ubfm =
INSTR_CREATE_ubfm(dcontext, dir_same_width_op, reg_op,
OPND_CREATE_INT(tst_bit), OPND_CREATE_INT(tst_bit));
MINSERT(ilist, instr, ubfm);

/* invert result if tbz */
if (opc == OP_tbz) {
instr_t *eor =
INSTR_CREATE_eor(dcontext, dir_same_width_op, OPND_CREATE_INT(1));
MINSERT(ilist, instr, eor);
}
} else if (opc == OP_bcond) {
/* use dir register to compute direction */
dir = SCRATCH_REG0;
/* save old value of dir register to SPILL_SLOT_1 */
dr_save_reg(dcontext, ilist, instr, dir, SPILL_SLOT_1);
/* compute branch direction */
dr_pred_type_t pred = instr_get_predicate(instr);
opnd_t dir_op = opnd_create_reg(dir);
instr_t *cset = INSTR_CREATE_csinc(
dcontext, dir_op, OPND_CREATE_ZR(dir_op), OPND_CREATE_ZR(dir_op),
opnd_create_cond(instr_invert_predicate(pred)));
MINSERT(ilist, instr, cset);
} else {
CLIENT_ASSERT(false, "unknown conditional branch type");
return;
}

if (has_fallthrough) {
ptr_uint_t fallthrough = address + instr_length(drcontext, instr);
CLIENT_ASSERT(fallthrough > address, "wrong fallthrough address");
dr_insert_clean_call_ex(
drcontext, ilist, instr, callee,
/* Many users will ask for mcontexts; some will set; it doesn't seem worth
* asking the user to pass in a flag: if they're using this they are not
* super concerned about overhead.
*/
DR_CLEANCALL_READS_APP_CONTEXT | DR_CLEANCALL_WRITES_APP_CONTEXT, 5,
/* address of cbr is 1st parameter */
OPND_CREATE_INTPTR(address),
/* target is 2nd parameter */
OPND_CREATE_INTPTR(target),
/* fall-through is 3rd parameter */
OPND_CREATE_INTPTR(fallthrough),
/* branch direction is 4th parameter */
opnd_create_reg(dir),
/* user defined data is 5th parameter */
opnd_is_null(user_data) ? OPND_CREATE_INT32(0) : user_data);
} else {
dr_insert_clean_call_ex(
drcontext, ilist, instr, callee,
/* Many users will ask for mcontexts; some will set; it doesn't seem worth
* asking the user to pass in a flag: if they're using this they are not
* super concerned about overhead.
*/
DR_CLEANCALL_READS_APP_CONTEXT | DR_CLEANCALL_WRITES_APP_CONTEXT, 3,
/* address of cbr is 1st parameter */
OPND_CREATE_INTPTR(address),
/* target is 2nd parameter */
OPND_CREATE_INTPTR(target),
/* branch direction is 3rd parameter */
opnd_create_reg(dir));
}

/* Restore state */
if (opc == OP_cbnz || opc == OP_cbz) {
/* restore arith flags */
dr_restore_arith_flags_from_reg(dcontext, ilist, instr, flags);
/* restore old value of flags register */
dr_restore_reg(dcontext, ilist, instr, flags, SPILL_SLOT_2);
/* restore old value of dir register */
dr_restore_reg(dcontext, ilist, instr, dir, SPILL_SLOT_1);
} else if (opc == OP_bcond || opc == OP_tbnz || opc == OP_tbz) {
/* restore old value of dir register */
dr_restore_reg(dcontext, ilist, instr, dir, SPILL_SLOT_1);
} else {
CLIENT_ASSERT(false, "unknown conditional branch type");
}
#endif /* X86/ARM/RISCV64 */
}
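
For readers checking the tbz/tbnz path and the XINST_CREATE_slr_s change above, the following plain-C sketch models how UBFM is used in this PR. It is an illustration under stated assumptions, not DynamoRIO code: `model_ubfm`, `model_tb_direction`, and `model_lsr64` are hypothetical names, and the model covers only the `immr <= imms` form of UBFM (the bit-field-extract and logical-shift-right aliases), with `immr` and `imms` both below 64.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models UBFM rd, rn, #immr, #imms for immr <= imms only: copies bits
 * [imms:immr] of src into the low bits of the result and zero-extends.
 */
uint64_t
model_ubfm(uint64_t src, unsigned immr, unsigned imms)
{
    unsigned width = imms - immr + 1;
    uint64_t mask = (width >= 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
    return (src >> immr) & mask;
}

/* Direction for tbz/tbnz as computed in the patch: UBFM with
 * immr == imms == bit extracts the tested bit, and an EOR with 1 inverts
 * it for tbz. A result of 1 means the branch is taken.
 */
uint64_t
model_tb_direction(bool is_tbz, uint64_t reg, unsigned bit)
{
    uint64_t dir = model_ubfm(reg, bit, bit);
    if (is_tbz)
        dir ^= 1;
    return dir;
}

/* Logical shift right by an immediate on a 64-bit register, expressed as
 * UBFM with imms == 63, as in the immediate arm of XINST_CREATE_slr_s.
 */
uint64_t
model_lsr64(uint64_t src, unsigned shift)
{
    return model_ubfm(src, shift, 63);
}
```

The same model with `imms == 31` and a 32-bit source would correspond to the 32-bit arm of the macro.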
