i#4014 dr$sim phys: Use physaddr markers in simulators (#5585)
When -use_physical is set, the cache and TLB simulators read the new
virtual-to-physical translation markers and use them to simulate
physical addresses.

Changes drcachesim online mode to leave addresses virtual and insert markers
instead, just like offline.  Adds a compatibility change note and updates the docs.

Includes a fix for -cpu_scheduling where the cached last thread was not
reset on a cpu change with no thread change in between.
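
The -cpu_scheduling fix can be pictured with a minimal sketch. It assumes a simplified, hypothetical `core_cache_t` that merges the thread-to-core map and the cached last-thread lookup into one class; the real simulator splits this logic across `simulator_t` and `cache_simulator_t`, so the names and layout here are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified stand-in for the simulator's thread-to-core lookup.
// It is not the DynamoRIO class; it only illustrates the stale-cache problem.
class core_cache_t {
public:
    // Called when a CPU-id marker (re)assigns a thread to a core.
    void
    on_cpu_marker(std::int64_t tid, int core)
    {
        thread2core_[tid] = core;
        // The fix: invalidate the cached thread so the next lookup re-reads
        // the map even when the next memref comes from the same thread.
        last_thread_ = 0;
    }

    // Returns the core for a thread, using a one-entry cache for speed.
    int
    core_for(std::int64_t tid)
    {
        assert(tid != 0); // 0 is reserved here to mean "no cached thread".
        if (tid == last_thread_)
            return last_core_;
        auto it = thread2core_.find(tid);
        last_core_ = (it == thread2core_.end()) ? 0 : it->second;
        last_thread_ = tid;
        return last_core_;
    }

private:
    std::unordered_map<std::int64_t, int> thread2core_;
    std::int64_t last_thread_ = 0;
    int last_core_ = 0;
};
```

Without the reset in `on_cpu_marker`, a second marker for the same thread would leave `core_for` returning the old core until some other thread ran in between.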

--------------------------------------------------
Tested: Ran manually and looked at logs.
Open to suggestions for how to automate testing.

$ rm -rf drmemtrace.sim*.dir; ninja && sudo bin64/drrun -stderr_mask 0 -t drcachesim -use_physical -offline -- suite/tests/bin/simple_app && sudo chown -R $USER drmemtrace.sim*.dir && bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -verbose 3 > OUT 2>&1
$ less OUT
translating virtual 0x7fed14de0050 to 0xf52ace050
::3036256.3036256::  @0xf52ace050 instr x3
translating virtual 0x7fed14de0053 to 0xf52ace053
::3036256.3036256::  @0xf52ace053 instr x5
translating virtual 0x7ffca0af9068 to 0xb3f975068
translating virtual 0x7fed14de0053 to 0xf52ace053
::3036256.3036256::  @0xf52ace053 write 0xb3f975068 x8

$ bin64/drrun -t drcachesim -indir drmemtrace.sim*.dir -use_physical -simulator_type TLB -verbose 3 > OUT 2>&1
$ less OUT
translating virtual 0x7f6f263a3050 to 0xf52ace050
::3080711.3080711::  @0x7070615f656c706d instr 0xf52ace050 x3
translating virtual 0x7f6f263a3053 to 0xf52ace053
::3080711.3080711::  @0x7070615f656c706d direct_call 0xf52ace053 x5
translating virtual 0x7ffdeaab3798 to 0xed4c52798
translating virtual 0x7f6f263a3053 to 0xf52ace053
--------------------------------------------------
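
One quick way to sanity-check the logged translations above is to confirm that each virtual/physical pair shares the same offset within its page. The sketch below assumes 4 KiB pages; the real page size comes from the trace's page-size marker and may differ, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 4 KiB pages. The trace's page-size marker is the real source.
constexpr std::uint64_t kAssumedPageSize = 4096;

// Offset of an address within its page.
constexpr std::uint64_t
page_offset(std::uint64_t addr, std::uint64_t page_size = kAssumedPageSize)
{
    return addr & (page_size - 1);
}

// True if a logged virtual/physical pair keeps the same in-page offset,
// which any page-granular translation must preserve.
inline bool
same_page_offset(std::uint64_t virt, std::uint64_t phys,
                 std::uint64_t page_size = kAssumedPageSize)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return page_offset(virt, page_size) == page_offset(phys, page_size);
}
```

For example, `same_page_offset(0x7fed14de0050, 0xf52ace050)` holds for the first pair in the log, since both addresses end in offset 0x050.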

Issue: #4014
derekbruening authored and dolanzhao committed Aug 4, 2022
1 parent 405dde1 commit fe231aa
Showing 11 changed files with 280 additions and 139 deletions.
3 changes: 3 additions & 0 deletions api/docs/release.dox
@@ -128,6 +128,9 @@ The changes between version \DR_VERSION and 9.0.1 include the following compatib
changes:
- Eliminated the -skip_syscall option to drrun and drinject, which is now always
on by default.
- Changed the drcachesim -use_physical option to not modify the regular trace
entry virtual addresses but to instead insert metadata containing translation
information for converting virtual to physical addresses.

Further non-compatibility-affecting changes include:
- Added AArchXX support for attaching to a running process.
16 changes: 10 additions & 6 deletions clients/drcachesim/common/options.cpp
@@ -208,14 +208,18 @@ droption_t<bool> op_coherence(
"Writes to cache lines will invalidate other private caches that hold that line.");

droption_t<bool> op_use_physical(
DROPTION_SCOPE_CLIENT, "use_physical", false, "Use physical addresses if possible",
"If available, the default virtual addresses will be translated to physical. "
"This is not possible from user mode on all platforms. "
"For -offline, the regular trace entries remain virtual, with a pair of markers of "
DROPTION_SCOPE_ALL, "use_physical", false, "Use physical addresses if possible",
"If available, metadata with virtual-to-physical-address translation information "
"is added to the trace. This is not possible from user mode on all platforms. "
"The regular trace entries remain virtual, with a pair of markers of "
"types #TRACE_MARKER_TYPE_PHYSICAL_ADDRESS and #TRACE_MARKER_TYPE_VIRTUAL_ADDRESS "
"inserted at some prior point for each new or changed page mapping to show the "
"corresponding physical addresses. This option may incur significant overhead "
"both for the physical translation and as it requires disabling optimizations.");
"corresponding physical addresses. If translation fails, a "
"#TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE is inserted. "
"This option may incur significant overhead "
"both for the physical translation and as it requires disabling optimizations. "
"For -offline, this option must be passed to both the tracer (to insert the "
"markers) and the simulator (to use the markers).");

droption_t<unsigned int> op_virt2phys_freq(
DROPTION_SCOPE_CLIENT, "virt2phys_freq", 0, "Frequency of physical mapping refresh",
30 changes: 23 additions & 7 deletions clients/drcachesim/drcachesim.dox.in
@@ -132,7 +132,7 @@ trace in a metadata marker entry of type

Memory accesses (data loads and stores) are stored in #_memref_data_t.
The program counter of the instruction performing the memory access,
the virtual address (unless \ref sec_drcachesim_phys are enabled), and
the virtual address (convertible to physical: see \ref sec_drcachesim_phys), and
the size are provided.

\section sec_drcachesim_format_other Other Records
@@ -1266,12 +1266,28 @@ $ bin64/drrun -t drcachesim -simulator_type miss_analyzer -LL_miss_file rec.csv

The memory access tracing client gathers virtual addresses. On Linux, if
the kernel allows user-mode applications access to the \p
/proc/self/pagemap file, physical addresses may be used instead. This can
be requested via the \p -use_physical runtime option (see \ref
sec_drcachesim_ops). This works on current kernels but is expected to stop
working from user mode on future kernels due to recent security changes
(see
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ab676b7d6fbf4b294bf198fb27ade5b0e865c7ce).
/proc/self/pagemap file or the application can be run with root
privileges, information to translate virtual addresses to physical addresses may be included in the trace. This can be
requested via the \p -use_physical runtime option (see \ref
sec_drcachesim_ops). On older kernels the pagemap file was readable without
privileges:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=ab676b7d6fbf4b294bf198fb27ade5b0e865c7ce.

When \p -use_physical is enabled, the regular trace entries remain
virtual, with a pair of markers of types
#TRACE_MARKER_TYPE_PHYSICAL_ADDRESS and
#TRACE_MARKER_TYPE_VIRTUAL_ADDRESS inserted at some prior point for
each new page mapping to show the corresponding physical
addresses. If translation fails, a
#TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE is inserted.
Limited support for detecting changes in page mappings is provided via
the \p -virt2phys_freq option to periodically clear cached
translations.

Each analysis tool must decide whether to use this translation
information. The cache and TLB simulators provided are equipped to
read these markers and they use the marker data when \p -use_physical
is specified.
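
As a rough illustration of how an analysis tool might consume this translation information, the following sketch builds a page-granular map from the markers and applies it to virtual addresses. The `marker_kind_t` enum and `translation_table_t` class are hypothetical stand-ins and not the drmemtrace API; the default page size is an assumption, and the 28-bit synthetic fallback mirrors the approach taken in simulator.cpp in this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using addr_t = std::uint64_t; // Stand-in for the tracer's address type.

// Hypothetical stand-ins for the marker types named in the text above.
enum class marker_kind_t {
    PageSize,
    PhysicalAddress,
    VirtualAddress,
    PhysicalAddressNotAvailable
};

class translation_table_t {
public:
    // Feed every marker record here in trace order.
    void
    on_marker(marker_kind_t kind, addr_t value)
    {
        switch (kind) {
        case marker_kind_t::PageSize:
            assert(value != 0 && (value & (value - 1)) == 0);
            page_size_ = value;
            break;
        case marker_kind_t::PhysicalAddress:
            // The physical marker comes first; hold it until its pair arrives.
            pending_phys_ = value;
            break;
        case marker_kind_t::VirtualAddress:
            virt2phys_[page_start(value)] = page_start(pending_phys_);
            break;
        case marker_kind_t::PhysicalAddressNotAvailable:
            // No real translation: fall back to a synthetic physical page.
            virt2phys_[page_start(value)] = page_start(synthetic(value));
            break;
        }
    }

    // Translate a virtual address, keeping its offset within the page.
    addr_t
    translate(addr_t virt) const
    {
        auto it = virt2phys_.find(page_start(virt));
        addr_t phys_page =
            (it == virt2phys_.end()) ? page_start(synthetic(virt)) : it->second;
        return phys_page | (virt & (page_size_ - 1));
    }

private:
    addr_t page_start(addr_t addr) const { return addr & ~(page_size_ - 1); }
    // Keeps the bottom 28 bits, an arbitrary choice that may collide.
    static addr_t synthetic(addr_t virt) { return virt & 0xfffffffULL; }

    addr_t page_size_ = 4096; // Assumption until a page-size marker is seen.
    addr_t pending_phys_ = 0;
    std::unordered_map<addr_t, addr_t> virt2phys_;
};
```

A tool that ignores these markers continues to see only virtual addresses, which is why each tool has to opt in explicitly.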

****************************************************************************
\page sec_drcachesim_core Core Simulation Support
2 changes: 2 additions & 0 deletions clients/drcachesim/simulator/analyzer_interface.cpp
@@ -126,6 +126,7 @@ get_cache_simulator_knobs()
knobs->sim_refs = op_sim_refs.get_value();
knobs->verbose = op_verbose.get_value();
knobs->cpu_scheduling = op_cpu_scheduling.get_value();
knobs->use_physical = op_use_physical.get_value();
return knobs;
}

@@ -162,6 +163,7 @@ drmemtrace_analysis_tool_create()
knobs.sim_refs = op_sim_refs.get_value();
knobs.verbose = op_verbose.get_value();
knobs.cpu_scheduling = op_cpu_scheduling.get_value();
knobs.use_physical = op_use_physical.get_value();
return tlb_simulator_create(knobs);
} else if (op_simulator_type.get_value() == HISTOGRAM) {
return histogram_tool_create(op_line_size.get_value(), op_report_top.get_value(),
84 changes: 47 additions & 37 deletions clients/drcachesim/simulator/cache_simulator.cpp
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2015-2021 Google, Inc. All rights reserved.
* Copyright (c) 2015-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -74,7 +74,7 @@ cache_simulator_create(const std::string &config_file)
cache_simulator_t::cache_simulator_t(const cache_simulator_knobs_t &knobs)
: simulator_t(knobs.num_cores, knobs.skip_refs, knobs.warmup_refs,
knobs.warmup_fraction, knobs.sim_refs, knobs.cpu_scheduling,
knobs.verbose)
knobs.use_physical, knobs.verbose)
, knobs_(knobs)
, l1_icaches_(NULL)
, l1_dcaches_(NULL)
@@ -193,7 +193,7 @@ cache_simulator_t::cache_simulator_t(std::istream *config_file)

init_knobs(knobs_.num_cores, knobs_.skip_refs, knobs_.warmup_refs,
knobs_.warmup_fraction, knobs_.sim_refs, knobs_.cpu_scheduling,
knobs_.verbose);
knobs_.use_physical, knobs_.verbose);

if (knobs_.data_prefetcher != PREFETCH_POLICY_NEXTLINE &&
knobs_.data_prefetcher != PREFETCH_POLICY_NONE) {
@@ -440,9 +440,6 @@ cache_simulator_t::process_memref(const memref_t &memref)
return true;
}

// We use a static scheduling of threads to cores, as it is
// not practical to measure which core each thread actually
// ran on for each memref.
int core;
if (memref.data.tid == last_thread_)
core = last_core_;
@@ -452,52 +449,65 @@ cache_simulator_t::process_memref(const memref_t &memref)
last_core_ = core;
}

if (type_is_instr(memref.instr.type) ||
memref.instr.type == TRACE_TYPE_PREFETCH_INSTR) {
// To support swapping to physical addresses without modifying the passed-in
// memref (which is also passed to other tools run at the same time) we use
// indirection.
const memref_t *simref = &memref;
memref_t phys_memref;
if (knobs_.use_physical) {
phys_memref = memref2phys(memref);
simref = &phys_memref;
}

if (type_is_instr(simref->instr.type) ||
simref->instr.type == TRACE_TYPE_PREFETCH_INSTR) {
if (knobs_.verbose >= 3) {
std::cerr << "::" << memref.data.pid << "." << memref.data.tid << ":: "
<< " @" << (void *)memref.instr.addr << " instr x"
<< memref.instr.size << "\n";
std::cerr << "::" << simref->data.pid << "." << simref->data.tid << ":: "
<< " @" << (void *)simref->instr.addr << " instr x"
<< simref->instr.size << "\n";
}
l1_icaches_[core]->request(memref);
} else if (memref.data.type == TRACE_TYPE_READ ||
memref.data.type == TRACE_TYPE_WRITE ||
l1_icaches_[core]->request(*simref);
} else if (simref->data.type == TRACE_TYPE_READ ||
simref->data.type == TRACE_TYPE_WRITE ||
// We may potentially handle prefetches differently.
// TRACE_TYPE_PREFETCH_INSTR is handled above.
type_is_prefetch(memref.data.type)) {
type_is_prefetch(simref->data.type)) {
if (knobs_.verbose >= 3) {
std::cerr << "::" << memref.data.pid << "." << memref.data.tid << ":: "
<< " @" << (void *)memref.data.pc << " "
<< trace_type_names[memref.data.type] << " "
<< (void *)memref.data.addr << " x" << memref.data.size << "\n";
std::cerr << "::" << simref->data.pid << "." << simref->data.tid << ":: "
<< " @" << (void *)simref->data.pc << " "
<< trace_type_names[simref->data.type] << " "
<< (void *)simref->data.addr << " x" << simref->data.size << "\n";
}
l1_dcaches_[core]->request(memref);
} else if (memref.flush.type == TRACE_TYPE_INSTR_FLUSH) {
l1_dcaches_[core]->request(*simref);
} else if (simref->flush.type == TRACE_TYPE_INSTR_FLUSH) {
if (knobs_.verbose >= 3) {
std::cerr << "::" << memref.data.pid << "." << memref.data.tid << ":: "
<< " @" << (void *)memref.data.pc << " iflush "
<< (void *)memref.data.addr << " x" << memref.data.size << "\n";
std::cerr << "::" << simref->data.pid << "." << simref->data.tid << ":: "
<< " @" << (void *)simref->data.pc << " iflush "
<< (void *)simref->data.addr << " x" << simref->data.size << "\n";
}
l1_icaches_[core]->flush(memref);
} else if (memref.flush.type == TRACE_TYPE_DATA_FLUSH) {
l1_icaches_[core]->flush(*simref);
} else if (simref->flush.type == TRACE_TYPE_DATA_FLUSH) {
if (knobs_.verbose >= 3) {
std::cerr << "::" << memref.data.pid << "." << memref.data.tid << ":: "
<< " @" << (void *)memref.data.pc << " dflush "
<< (void *)memref.data.addr << " x" << memref.data.size << "\n";
std::cerr << "::" << simref->data.pid << "." << simref->data.tid << ":: "
<< " @" << (void *)simref->data.pc << " dflush "
<< (void *)simref->data.addr << " x" << simref->data.size << "\n";
}
l1_dcaches_[core]->flush(memref);
} else if (memref.exit.type == TRACE_TYPE_THREAD_EXIT) {
handle_thread_exit(memref.exit.tid);
l1_dcaches_[core]->flush(*simref);
} else if (simref->exit.type == TRACE_TYPE_THREAD_EXIT) {
handle_thread_exit(simref->exit.tid);
last_thread_ = 0;
} else if (memref.marker.type == TRACE_TYPE_INSTR_NO_FETCH) {
} else if (memref.marker.type == TRACE_TYPE_MARKER &&
memref.marker.marker_type == TRACE_MARKER_TYPE_CPU_ID) {
last_thread_ = 0;
} else if (simref->marker.type == TRACE_TYPE_INSTR_NO_FETCH) {
// Just ignore.
if (knobs_.verbose >= 3) {
std::cerr << "::" << memref.data.pid << "." << memref.data.tid << ":: "
<< " @" << (void *)memref.instr.addr << " non-fetched instr x"
<< memref.instr.size << "\n";
std::cerr << "::" << simref->data.pid << "." << simref->data.tid << ":: "
<< " @" << (void *)simref->instr.addr << " non-fetched instr x"
<< simref->instr.size << "\n";
}
} else {
error_string_ = "Unhandled memref type " + std::to_string(memref.data.type);
error_string_ = "Unhandled memref type " + std::to_string(simref->data.type);
return false;
}

4 changes: 3 additions & 1 deletion clients/drcachesim/simulator/cache_simulator_create.h
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2017-2018 Google, Inc. All rights reserved.
* Copyright (c) 2017-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -67,6 +67,7 @@ struct cache_simulator_knobs_t {
, warmup_fraction(0.0)
, sim_refs(1ULL << 63)
, cpu_scheduling(false)
, use_physical(false)
, verbose(0)
{
}
@@ -87,6 +88,7 @@ struct cache_simulator_knobs_t {
double warmup_fraction;
uint64_t sim_refs;
bool cpu_scheduling;
bool use_physical;
unsigned int verbose;
};

91 changes: 85 additions & 6 deletions clients/drcachesim/simulator/simulator.cpp
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2015-2020 Google, Inc. All rights reserved.
* Copyright (c) 2015-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
@@ -42,10 +42,10 @@

simulator_t::simulator_t(unsigned int num_cores, uint64_t skip_refs, uint64_t warmup_refs,
double warmup_fraction, uint64_t sim_refs, bool cpu_scheduling,
unsigned int verbose)
bool use_physical, unsigned int verbose)
{
init_knobs(num_cores, skip_refs, warmup_refs, warmup_fraction, sim_refs,
cpu_scheduling, verbose);
cpu_scheduling, use_physical, verbose);
}

simulator_t::~simulator_t()
@@ -55,14 +55,15 @@ simulator_t::~simulator_t()
void
simulator_t::init_knobs(unsigned int num_cores, uint64_t skip_refs, uint64_t warmup_refs,
double warmup_fraction, uint64_t sim_refs, bool cpu_scheduling,
unsigned int verbose)
bool use_physical, unsigned int verbose)
{
knob_num_cores_ = num_cores;
knob_skip_refs_ = skip_refs;
knob_warmup_refs_ = warmup_refs;
knob_warmup_fraction_ = warmup_fraction;
knob_sim_refs_ = sim_refs;
knob_cpu_scheduling_ = cpu_scheduling;
knob_use_physical_ = use_physical;
knob_verbose_ = verbose;
last_thread_ = 0;
last_core_ = 0;
@@ -80,8 +81,9 @@ simulator_t::init_knobs(unsigned int num_cores, uint64_t skip_refs, uint64_t war
bool
simulator_t::process_memref(const memref_t &memref)
{
if (memref.marker.type == TRACE_TYPE_MARKER &&
memref.marker.marker_type == TRACE_MARKER_TYPE_CPU_ID && knob_cpu_scheduling_) {
if (memref.marker.type != TRACE_TYPE_MARKER)
return true;
if (memref.marker.marker_type == TRACE_MARKER_TYPE_CPU_ID && knob_cpu_scheduling_) {
int cpu = (int)(intptr_t)memref.marker.marker_value;
if (cpu < 0)
return true;
@@ -104,9 +106,86 @@ simulator_t::process_memref(const memref_t &memref)
++thread_counts_[min_core];
++thread_ever_counts_[min_core];
}
if (!knob_use_physical_)
return true;
if (memref.marker.marker_type == TRACE_MARKER_TYPE_PAGE_SIZE) {
if (page_size_ != 0 && page_size_ != memref.marker.marker_value) {
ERRMSG("Error: conflicting page size markers");
return false;
}
page_size_ = memref.marker.marker_value;
if (!IS_POWER_OF_2(page_size_)) {
ERRMSG("Error: page size %zu is not power of 2", page_size_);
return false;
}
} else if (memref.marker.marker_type == TRACE_MARKER_TYPE_PHYSICAL_ADDRESS) {
prior_phys_addr_ = memref.marker.marker_value;
} else if (memref.marker.marker_type == TRACE_MARKER_TYPE_VIRTUAL_ADDRESS) {
virt2phys_[page_start(memref.marker.marker_value)] = page_start(prior_phys_addr_);
} else if (memref.marker.marker_type ==
TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE) {
addr_t virt = memref.marker.marker_value;
virt2phys_[page_start(virt)] = page_start(synthetic_virt2phys(virt));
}
return true;
}

addr_t
simulator_t::synthetic_virt2phys(addr_t virt) const
{
// For a missing translation, we drop the upper bits of the virtual address
// to create a synthetic physical address from (arbitrarily) the bottom 28 bits.
// XXX i#4014: Ideally we would detect a collision with an existing translation
// (both when adding new synthetic entries and, by keeping a bit that marks which
// entries are synthetic, when adding new legitimate entries). We currently
// live with collisions with real translations under the assumption that missing
// translations are rare.
const addr_t SYNTHETIC_PHYS_BITS = 0xfffffff;
return virt & SYNTHETIC_PHYS_BITS;
}

addr_t
simulator_t::virt2phys(addr_t virt) const
{
addr_t phys_page = 0;
auto it = virt2phys_.find(page_start(virt));
if (it == virt2phys_.end()) {
// We handled TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE so this
// should not happen.
ERRMSG("Missing physical address marker for 0x%zx\n", virt);
phys_page = page_start(synthetic_virt2phys(virt));
} else
phys_page = it->second;
addr_t phys = phys_page | (virt & (page_size_ - 1));
if (knob_verbose_ >= 3) {
std::cerr << "translating virtual 0x" << std::hex << virt << " to 0x" << phys
<< std::dec << "\n";
}
return phys;
}

memref_t
simulator_t::memref2phys(memref_t memref) const
{
if (!type_has_address(memref.data.type))
return memref;
memref_t out = memref;
if (type_is_instr(memref.instr.type) ||
memref.instr.type == TRACE_TYPE_INSTR_NO_FETCH) {
out.instr.addr = virt2phys(memref.instr.addr);
} else if (memref.data.type == TRACE_TYPE_READ ||
memref.data.type == TRACE_TYPE_WRITE ||
type_is_prefetch(memref.data.type)) {
out.data.addr = virt2phys(memref.data.addr);
out.data.pc = virt2phys(memref.data.pc);
} else if (memref.data.type == TRACE_TYPE_INSTR_FLUSH ||
memref.data.type == TRACE_TYPE_DATA_FLUSH) {
out.flush.addr = virt2phys(memref.flush.addr);
out.flush.pc = virt2phys(memref.flush.pc);
}
return out;
}

int
simulator_t::find_emptiest_core(std::vector<int> &counts) const
{