-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AMD PMC/U support #222
Conversation
src/data/perf_stat.rs
Outdated
/* Get Model specific events */ | ||
platform_specific_counter = match cpu_info.model_name.as_str() { | ||
"AMD EPYC 9R14" => GENOA_CTRS.to_vec(), | ||
"AMD EPYC 7R13 Processor" => MILAN_CTRS.to_vec(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some processors may have different model name. For example, ""AMD EPYC 7R13 48-Core Processor" on c6a metal instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that because you're planning on making changes here to do, for example, a substring or prefix match? Or are you saying it because you want to know whether it matters if metal works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wanted to note it here. Have a prefix match for the first 13 characters and will push changes soon.
src/data/amd_genoa_perf_events.rs
Outdated
@@ -0,0 +1,24 @@ | |||
use crate::data::perf_stat::{NamedCtr, NamedTypeCtr, PerfType}; | |||
|
|||
static STALL_BACKEND_PKC: NamedTypeCtr = NamedTypeCtr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't PKC, this is a single event, it's just backend stalls.
name: "stall_backend_pkc", | ||
nrs: vec![STALL_BACKEND_PKC], | ||
drs: vec![CYCLES], | ||
scale: 167 //~= 1000/6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is really weird, why this value instead of just 1000? The ones below for milan are just 1000.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, I think its from each cycle being able to dispatch 6 instructions? This scale was from our documentation and AMD documentation.
src/data/amd_milan_perf_events.rs
Outdated
@@ -0,0 +1,35 @@ | |||
use crate::data::perf_stat::{NamedCtr, NamedTypeCtr, PerfType}; | |||
|
|||
static STALL_BACKEND_PKC1: NamedTypeCtr = NamedTypeCtr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not PKC
src/data/amd_milan_perf_events.rs
Outdated
name: "Backend-Stalls-1", | ||
config: 0xf7ae, | ||
}; | ||
static STALL_BACKEND_PKC2: NamedTypeCtr = NamedTypeCtr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not PKC
}; | ||
static STALL_BACKEND_PKC2: NamedTypeCtr = NamedTypeCtr { | ||
perf_type: PerfType::RAW, | ||
name: "Backend-Stalls-2", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any better specifiers for these besides just -1 and -2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, each one counts several backend stall types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not documented what exactly each one counts, perhaps the better thing to do is add them together. That assumes there's not overlap in what each one counts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked with Geoff, the sum of the two is the total backend stalls. Pushed changes
src/data/amd_perf_events.rs
Outdated
}; | ||
static L1_DATA: NamedTypeCtr = NamedTypeCtr { | ||
perf_type: PerfType::RAW, | ||
name: "L1-Data", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an L1 data access, or a L1 data miss?
In general a lot of these counters have names that are not specific enough. I haven't looked at our existing definitions, hopefully they aren't all like this.
src/data/perf_stat.rs
Outdated
/* Get Model specific events */ | ||
platform_specific_counter = match cpu_info.model_name.as_str() { | ||
"AMD EPYC 9R14" => GENOA_CTRS.to_vec(), | ||
"AMD EPYC 7R13 Processor" => MILAN_CTRS.to_vec(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that because you're planning on making changes here to do, for example, a substring or prefix match? Or are you saying it because you want to know whether it matters if metal works?
src/data/amd_perf_events.rs
Outdated
NamedCtr { | ||
name: "l3-mpki", | ||
nrs: vec![L3], | ||
nrs: vec![L1_ANY_FILLS_DRAM], | ||
drs: vec![INSTRUCTIONS], | ||
scale: 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name says L3, nrs says L1. Are you sure this is right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The counter is approximating L3 misses. "This is sorta l3 mpki, but ellides Prefetch misses from L2" - from our docs.
src/data/amd_milan_perf_events.rs
Outdated
@@ -19,17 +19,11 @@ static CYCLES: NamedTypeCtr = NamedTypeCtr { | |||
lazy_static! { | |||
pub static ref MILAN_CTRS: Vec<NamedCtr<'static>> = [ | |||
NamedCtr { | |||
name: "stall_backend_pkc1", | |||
nrs: vec![STALL_BACKEND_1], | |||
name: "stall_backend", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it still have the pkc in its name? Do whatever we do for other processors on this metric.
But having a metric without showing what its units/scale are seems incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cupcakes had nothing to do with this approval ...
Description of changes:
perf_stat
Testing:
Ran benchmarks on AMD milan and genoa family instances and compared output to results generated by perfrunbook and Intel/graviton based instances.
Example of ipc comparison between perfrunbook and aperf:
Note: We observed some incorrect data presumably from event multiplexing (up to 13 groups of 14 events). Values may be different than true counts due to this scaling, but trends remain accurate.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.