Skip to content

Commit

Permalink
i#4014 dr$sim phys: Add offline -use_physical support (#5516)
Browse files Browse the repository at this point in the history
Adds support for physical addresses to offline dr$sim traces.  To
support simulators wanting both virtual and physical addresses, and to
simplify post-processing where the virtual PC values are needed, the
regular trace entries remain all virtual.  A new marker type
TRACE_MARKER_TYPE_PHYSICAL_ADDRESS listing the corresponding physical
address is added.  The mappings are assumed to not change, allowing
just one marker for each newly-observed page.  This is done
per-thread.

An explicit TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE marker is
inserted on failure to translate, to prevent analyzers from having to
infer this due to the lack of the already-sparse markers.

Separately emitted pairs of virtual and physical address markers were
considered, with raw2trace inserting the physical at the right place,
but that presents complexities with buffer handoff and with the first
buffer.  Instead, the physical are inserted via memmove directly into
the buffer.  This does not seem to be a performance concern: the
translation lookup is the bottleneck.  Since the memmoves occur only on the
first instance of each page, they are much rarer than all the virtual-to-physical
translations.

Adds support for the new markers to the view tool.

Adds a Linux x86_64 test that runs a tiny asm app and ensures a
physical address marker is inserted.  The test needs to run as sudo,
along with its pre- and post- commands.  To avoid a confusing blocking
password query in local runs, a new set of tests controlled by a new
CMake option RUN_SUDO_TESTS is added.  It is set only for automated_ci,
where we assume a passwordless sudo.

A number of items remain for further work:
+ Performance is poor: the hashtable and caching need improvement.
+ There is a hardcoded limit on how many markers can be added
  per buffer.  Once this is exceeded, further markers are dropped.
  We should split the buffer to handle this.
+ We may want to add a mode that checks for mapping changes.
+ Missing privileges results in every physical address being 0 instead
  of showing the failure.  We need to check the capabilities to distinguish.
+ Better testing that we're actually getting physical addresses for online
  tests.
+ Better offline testing with larger apps.
+ Basic blocks that cross a page have only the first one translated.
+ A file descriptor per thread is used, which will not scale well with
  DR's descriptor protection and might hit rlimits.
+ Online traces still replace all virtual addresses with physical.
  We should break compatibility and transition them to use these markers,
  with dr$sim computing the physical addresses from the markers.

Issue: #4014
  • Loading branch information
derekbruening authored Jun 7, 2022
1 parent 4091617 commit d96d930
Show file tree
Hide file tree
Showing 17 changed files with 293 additions and 70 deletions.
3 changes: 3 additions & 0 deletions api/docs/release.dox
Original file line number Diff line number Diff line change
Expand Up @@ -140,6 +140,9 @@ Further non-compatibility-affecting changes include:
choices under the -raw_compress option. Compressing with lz4 is now the
default (if built with lz4 support).
- Added drmodtrack_lookup_pc_from_index().
- Added -use_physical support to drcachesim offline traces using two new
marker types: #TRACE_MARKER_TYPE_PHYSICAL_ADDRESS and
#TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE.

The changes between version 9.0.1 and 9.0.0 include the following compatibility
changes:
Expand Down
25 changes: 25 additions & 0 deletions clients/drcachesim/common/trace_entry.h
Original file line number Diff line number Diff line change
Expand Up @@ -334,6 +334,20 @@ typedef enum {
*/
TRACE_MARKER_TYPE_WINDOW_ID,

/**
* The marker value contains the physical address corresponding to the subsequent
* entry's instruction fetch PC or data address. If translation failed, a
* #TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE will be present instead.
*/
TRACE_MARKER_TYPE_PHYSICAL_ADDRESS,

/**
* Indicates a failure to obtain the physical address corresponding to the
* subsequent entry's instruction fetch PC or data address. The marker value is
* undefined.
*/
TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE,

// ...
// These values are reserved for future built-in marker types.
// ...
Expand Down Expand Up @@ -371,6 +385,17 @@ type_is_prefetch(const trace_type_t type)
type == TRACE_TYPE_HARDWARE_PREFETCH;
}

/** Returns whether the type contains an address. */
static inline bool
type_has_address(const trace_type_t type)
{
return type_is_instr(type) || type == TRACE_TYPE_INSTR_NO_FETCH ||
type == TRACE_TYPE_INSTR_MAYBE_FETCH || type_is_prefetch(type) ||
type == TRACE_TYPE_READ || type == TRACE_TYPE_WRITE ||
type == TRACE_TYPE_INSTR_FLUSH || type == TRACE_TYPE_INSTR_FLUSH_END ||
type == TRACE_TYPE_DATA_FLUSH || type == TRACE_TYPE_DATA_FLUSH_END;
}

// This is the data format generated by the online tracer and produced after
// post-processing of raw offline traces.
// The reader_t class transforms this into memref_t before handing to analysis tools.
Expand Down
9 changes: 8 additions & 1 deletion clients/drcachesim/common/utils.h
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2015-2019 Google, Inc. All rights reserved.
* Copyright (c) 2015-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
Expand Down Expand Up @@ -47,6 +47,13 @@
// XXX: can we share w/ core DR?
#define IS_POWER_OF_2(x) ((x) != 0 && ((x) & ((x)-1)) == 0)

// XXX i#4399: DR should define a DEBUG-only assert.
#ifdef DEBUG
# define ASSERT(x, msg) DR_ASSERT_MSG(x, msg)
#else
# define ASSERT(x, msg) /* Nothing. */
#endif

#define BUFFER_SIZE_BYTES(buf) sizeof(buf)
#define BUFFER_SIZE_ELEMENTS(buf) (BUFFER_SIZE_BYTES(buf) / sizeof(buf[0]))
#define BUFFER_LAST_ELEMENT(buf) buf[BUFFER_SIZE_ELEMENTS(buf) - 1]
Expand Down
26 changes: 26 additions & 0 deletions clients/drcachesim/tests/offline-phys.templatex
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Hello world!
Output format:
<record#>: T<tid> <record details>
------------------------------------------------------------
1: T[0-9]+ <marker: version 3>
2: T[0-9]+ <marker: filetype 0x40>
3: T[0-9]+ <marker: cache line size [0-9]+>
4: T[0-9]+ <marker: tid [0-9]+ on core [0-9]+>
5: T[0-9]+ <marker: timestamp [0-9]+>
6: T[0-9]+ <marker: physical address for following entry: 0x[0-9a-f][0-9a-f]+>
7: T[0-9]+ ifetch .*
3 changes: 2 additions & 1 deletion clients/drcachesim/tests/raw2trace_unit_tests.cpp
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/* **********************************************************
* Copyright (c) 2021 Google, Inc. All rights reserved.
* Copyright (c) 2021-2022 Google, Inc. All rights reserved.
* **********************************************************/

/*
Expand Down Expand Up @@ -38,6 +38,7 @@
#include <iostream>
#include <sstream>

#undef ASSERT
#define ASSERT(cond, msg, ...) \
do { \
if (!(cond)) { \
Expand Down
1 change: 1 addition & 0 deletions clients/drcachesim/tests/view_test.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,7 @@
#include "../tracer/raw2trace.h"
#include "memref_gen.h"

#undef ASSERT
#define ASSERT(cond, msg, ...) \
do { \
if (!(cond)) { \
Expand Down
10 changes: 9 additions & 1 deletion clients/drcachesim/tools/invariant_checker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -200,6 +200,14 @@ invariant_checker_t::parallel_shard_memref(void *shard_data, const memref_t &mem
"Incorrect instr count marker value");
}
}
if (shard->prev_entry_.marker.type == TRACE_TYPE_MARKER &&
(shard->prev_entry_.marker.marker_type == TRACE_MARKER_TYPE_PHYSICAL_ADDRESS ||
shard->prev_entry_.marker.marker_type ==
TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE)) {
// Physical address markers must be immediately prior to the virtual entry.
report_if_false(shard, type_has_address(memref.data.type),
"Physical addr marker not immediately prior to virtual addr");
}
if (type_is_instr(memref.instr.type) ||
memref.instr.type == TRACE_TYPE_PREFETCH_INSTR ||
memref.instr.type == TRACE_TYPE_INSTR_NO_FETCH) {
Expand Down Expand Up @@ -350,8 +358,8 @@ invariant_checker_t::parallel_shard_memref(void *shard_data, const memref_t &mem

#ifdef UNIX
shard->prev_prev_entry_ = shard->prev_entry_;
shard->prev_entry_ = memref;
#endif
shard->prev_entry_ = memref;

return true;
}
Expand Down
2 changes: 1 addition & 1 deletion clients/drcachesim/tools/invariant_checker.h
Original file line number Diff line number Diff line change
Expand Up @@ -73,13 +73,13 @@ class invariant_checker_t : public analysis_tool_t {
prev_xfer_marker_.marker.marker_type = TRACE_MARKER_TYPE_VERSION;
last_xfer_marker_.marker.marker_type = TRACE_MARKER_TYPE_VERSION;
}
memref_t prev_entry_ = {};
memref_t prev_instr_ = {};
memref_t prev_xfer_marker_ = {}; // Cleared on seeing an instr.
memref_t last_xfer_marker_ = {}; // Not cleared: just the prior xfer marker.
#ifdef UNIX
// We only support sigreturn-using handlers so we have pairing: no longjmp.
std::stack<addr_t> prev_xfer_int_pc_;
memref_t prev_entry_ = {};
memref_t prev_prev_entry_ = {};
std::stack<memref_t> pre_signal_instr_;
// These are only available via annotations in signal_invariants.cpp.
Expand Down
8 changes: 8 additions & 0 deletions clients/drcachesim/tools/view.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -308,6 +308,13 @@ view_t::process_memref(const memref_t &memref)
std::cerr << "<marker: window " << memref.marker.marker_value << ">\n";
last_window_[memref.marker.tid] = memref.marker.marker_value;
break;
case TRACE_MARKER_TYPE_PHYSICAL_ADDRESS:
std::cerr << "<marker: physical address for following entry: 0x" << std::hex
<< memref.marker.marker_value << std::dec << ">\n";
break;
case TRACE_MARKER_TYPE_PHYSICAL_ADDRESS_NOT_AVAILABLE:
std::cerr << "<marker: physical address not available>\n";
break;
default:
std::cerr << "<marker: type " << memref.marker.marker_type << "; value "
<< memref.marker.marker_value << ">\n";
Expand Down Expand Up @@ -370,6 +377,7 @@ view_t::process_memref(const memref_t &memref)
case TRACE_TYPE_INSTR_SYSENTER: std::cerr << "sysenter\n"; break;
default: error_string_ = "Uknown instruction type\n"; return false;
}
++num_disasm_instrs_;
return true;
}

Expand Down
6 changes: 3 additions & 3 deletions clients/drcachesim/tracer/instru.h
Original file line number Diff line number Diff line change
Expand Up @@ -199,7 +199,7 @@ class instru_t {
virtual int
get_instr_count(byte *buf_ptr) const = 0;
virtual addr_t
get_entry_addr(byte *buf_ptr) const = 0;
get_entry_addr(void *drcontext, byte *buf_ptr) const = 0;
virtual void
set_entry_addr(byte *buf_ptr, addr_t addr) = 0;

Expand Down Expand Up @@ -312,7 +312,7 @@ class online_instru_t : public instru_t {
int
get_instr_count(byte *buf_ptr) const override;
addr_t
get_entry_addr(byte *buf_ptr) const override;
get_entry_addr(void *drcontext, byte *buf_ptr) const override;
void
set_entry_addr(byte *buf_ptr, addr_t addr) override;

Expand Down Expand Up @@ -382,7 +382,7 @@ class offline_instru_t : public instru_t {
int
get_instr_count(byte *buf_ptr) const override;
addr_t
get_entry_addr(byte *buf_ptr) const override;
get_entry_addr(void *drcontext, byte *buf_ptr) const override;
void
set_entry_addr(byte *buf_ptr, addr_t addr) override;

Expand Down
13 changes: 9 additions & 4 deletions clients/drcachesim/tracer/instru_offline.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -248,12 +248,17 @@ offline_instru_t::get_instr_count(byte *buf_ptr) const
}

addr_t
offline_instru_t::get_entry_addr(byte *buf_ptr) const
offline_instru_t::get_entry_addr(void *drcontext, byte *buf_ptr) const
{
// TODO i#4014: To support -use_physical we would need to handle a PC
// entry here.
DR_ASSERT(!type_is_instr(get_entry_type(buf_ptr)));
offline_entry_t *entry = (offline_entry_t *)buf_ptr;
if (entry->addr.type == OFFLINE_TYPE_PC) {
// XXX i#4014: Use caching to avoid looup for last queried modbase.
app_pc modbase;
if (drmodtrack_lookup_pc_from_index(drcontext, entry->pc.modidx, &modbase) !=
DRCOVLIB_SUCCESS)
return 0;
return reinterpret_cast<addr_t>(modbase) + static_cast<addr_t>(entry->pc.modoffs);
}
return entry->addr.addr;
}

Expand Down
2 changes: 1 addition & 1 deletion clients/drcachesim/tracer/instru_online.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@ online_instru_t::get_instr_count(byte *buf_ptr) const
}

addr_t
online_instru_t::get_entry_addr(byte *buf_ptr) const
online_instru_t::get_entry_addr(void *drcontext, byte *buf_ptr) const
{
trace_entry_t *entry = (trace_entry_t *)buf_ptr;
return entry->addr;
Expand Down
17 changes: 14 additions & 3 deletions clients/drcachesim/tracer/physaddr.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -116,15 +116,19 @@ physaddr_t::init()
}

bool
physaddr_t::virtual2physical(addr_t virt, OUT addr_t *phys)
physaddr_t::virtual2physical(addr_t virt, OUT addr_t *phys, OUT bool *from_cache)
{
#ifdef LINUX
if (phys == nullptr)
return false;
addr_t vpage = page_start(virt);
bool use_cache = true;
if (from_cache != nullptr)
*from_cache = false;
if (op_virt2phys_freq.get_value() > 0 && ++count_ >= op_virt2phys_freq.get_value()) {
// Flush the cache and re-sync with the kernel
// Flush the cache and re-sync with the kernel.
// XXX i#4014: Provide a similar option that doesn't flush and just checks
// whether mappings have changed?
use_cache = false;
last_vpage_ = PAGE_INVALID;
hashtable_clear(&v2p_);
Expand All @@ -134,6 +138,8 @@ physaddr_t::virtual2physical(addr_t virt, OUT addr_t *phys)
// Use cached values on the assumption that the kernel hasn't re-mapped
// this virtual page.
if (vpage == last_vpage_) {
if (from_cache != nullptr)
*from_cache = true;
*phys = last_ppage_ + page_offs(virt);
return true;
}
Expand All @@ -147,6 +153,8 @@ physaddr_t::virtual2physical(addr_t virt, OUT addr_t *phys)
ppage = 0;
last_vpage_ = vpage;
last_ppage_ = ppage;
if (from_cache != nullptr)
*from_cache = true;
*phys = last_ppage_ + page_offs(virt);
return true;
}
Expand Down Expand Up @@ -178,8 +186,11 @@ physaddr_t::virtual2physical(addr_t virt, OUT addr_t *phys)
NOTIFY(1, "v2p failure: entry is invalid for %p\n", vpage);
return false;
}
last_ppage_ = (addr_t)((entry & PAGEMAP_PFN) << page_bits_);
// TODO i#4014: On recent kernels unprivileged reads succeed but are passed
// a 0 PFN. Since that could be valid, we need to check our capabilities
// to decide. Until then, we're currently returning 0 for every unprivileged query.
// Store 0 as a sentinel since 0 means no entry.
last_ppage_ = (addr_t)((entry & PAGEMAP_PFN) << page_bits_);
hashtable_add(
&v2p_, reinterpret_cast<void *>(vpage),
reinterpret_cast<void *>(last_ppage_ == 0 ? ZERO_ADDR_PAYLOAD : last_ppage_));
Expand Down
6 changes: 4 additions & 2 deletions clients/drcachesim/tracer/physaddr.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,12 +51,14 @@ class physaddr_t {
bool
init();

// If translation from "virt" to its corresponding physicall address is
// If translation from "virt" to its corresponding physical address is
// successful, returns true and stores the physical address in "phys".
// 0 is a possible valid physical address, as are large values beyond
// the amount of RAM due to holes in the physical address space.
// Returns in "from_cache" whether the physical address had been queried before
// and was available in a local cache (which is cleared at -virt2phys_freq).
bool
virtual2physical(addr_t virt, OUT addr_t *phys);
virtual2physical(addr_t virt, OUT addr_t *phys, OUT bool *from_cache = nullptr);

private:
#ifdef LINUX
Expand Down
Loading

0 comments on commit d96d930

Please sign in to comment.