-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(minor): Add Benchmark parameters #274
feat(minor): Add Benchmark parameters #274
Conversation
Thanks! Will be a little bit of time before being able to review properly - mostly out this week - but one quick comment: do you think it would make sense to support ranges for int/doubles too? |
You're thinking something like this?
I don't see why we couldn't support something like that. So likely they could get the behavior they want by just string encoding the range and using that string as the value. For example: for range in [0...1, 1...10, 10...100] {
Benchmark(
"testFoo",
parameters: [
BenchmarkParameter("range", String(describing: range) )
]
) { /* do something with `range` */ }
} In summary, I'm certainly not opposed - though I'm not sure about the value proposition yet? Admittedly, my argument here begs the question why we bother supporting anything besides |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #274 +/- ##
==========================================
+ Coverage 65.32% 65.67% +0.35%
==========================================
Files 33 34 +1
Lines 3610 3653 +43
==========================================
+ Hits 2358 2399 +41
- Misses 1252 1254 +2
Continue to review full report in Codecov by Sentry.
|
c957cba
to
5c050b5
Compare
Hi, started to have a look at the PR now - I'd like to revert the discussion to the use site a bit, if this was the old approach to paramterization: let parameterization = (0...5).map { 1 << $0 } // 1, 2, 4, ...
parameterization.forEach { count in
Benchmark("ParameterizedWith\(count)") { benchmark in
for _ in 0 ..< count {
blackHole(Int.random(in: benchmark.scaledIterations))
}
}
} Then with the additional API, I would expect to do just the minimal change: parameterization.forEach { count in
Benchmark("ParameterizedTest", configuration: .init(parameters: [.init("count", count)])) { benchmark in
for _ in 0 ..< count {
blackHole(Int.random(in: benchmark.scaledIterations))
}
}
} To basically record the parameters in use in the new parameters array - instead of encoding it into the name as in the previous current sample - for use in display and later in the export. It's not a literal in this case, but a parameter that can vary - just as the currently supported parameterisation (just formalising the naming of the parameter that can vary for display/export purposes). Also, I'm not quite sure what the infinite/finite range is required for, would not expect to need that? Also, writing the code above, perhaps it's really a dictionary that makes sense rather than an array (of string:string)? E.g. parameterization.forEach { count in
Benchmark("ParameterizedTest", configuration: .init(parameters: ["count" : count.description])) { benchmark in
for _ in 0 ..< count {
blackHole(Int.random(in: benchmark.scaledIterations))
}
}
} Just to avoid adding multiple entries for the same key. And if we go this way, the value is just a string description of the parameter that was in use - as this will not be used in practice by the code - the actual parameter value is captured when the benchmark is created, so it can be any type then? Then a range works too; let range = 1...100
Benchmark("ParameterizedTest", configuration: .init(parameters: ["count" : range.description])) { benchmark in
for _ in range {
blackHole(Int.random(in: benchmark.scaledIterations))
}
}
} |
I agree with everything you've outlined.
This is worth talking about as it is something that will be crucial for the Influx exporter.
Now, the important distinction between The consequence of this distinction is that:
In summary, whatever the design for parameterization - it's important for at least my use case to have information about whether the parameters are finite or infinite range. I'm totally open to suggestions on how we might designate this better. |
Alright - I think we should also actually go back to your original proposal with a I don't think we want to leak details of influx formats though - I would probably suggest that you in that case adopt a convention on tags (either key naming or similar) that is used by the influx exporter - or consider a post processing step external that marks up the rows using such a convention/configuration after exporting the benchmark data? |
I agree with not leaking Influx related details. I'll return this to the tags nomenclature. I'll have to keep thinking about how to do the disambiguation of tags/fields in a post processing step or otherwise. It seems equally awkward to have a special implicit naming convention that only the influx exporter knows what to do with. I'll think about it some more. |
I would probably do a post processing step in your pipeline to handle that separately. |
5c050b5
to
e327a91
Compare
Alright, I made the changes we discussed to put the nomenclature back to Could you elaborate a bit on what the post-processing step you are suggesting might look like?
Neither one of these is ideal. # 2 is what I believe you are suggesting and I think it's doable - just means some more scripting in my CI pipeline. Was hoping we could get to a point where the InfluxExporter native to this repo more fully supports Influx outputs. But maybe there just really isn't an appropriate generalization that is still relevant to the other exporters. What do you think about a mechanism (I'd make a separate MR for this) for injecting exporter-specific information for a Benchmark into the Benchmark initializer? Something like: Benchmark(
/* params */,
exportConfigs: [ // [ExportConfiguration] where `ExportConfiguration` is a protocol
// This is where I could specify what `tag`s should be considered as `field`s during export
InfluxExportConfiguration(...)
]
) OR Benchmark(
/* params */,
exportConfigs: [
// This would better ensure no duplicate configs get passed.
.influx: InfluxExportConfiguration(...)
]
) This would allow the Benchmark implementer to specify some salient information about a benchmark to the exporter that may only be relevant to that may only be relevant to that particular exporter. This would help get around your concern (which I think is very valid) of leaking Influx specific information into the base Benchmark config. |
I was thinking of option #2, yep - I assume it would be a fairly straightforward script step - but I think we could also consider a separate PR for export configurations as you suggest (second one seems to make more sense then), e.g. Benchmark(
/* params */,
exportConfigurations: [
.influx: InfluxExportConfiguration(...)
]
) |
e327a91
to
08940ee
Compare
Awesome - I'll work on a followup PR after this one. I think this is ready for the final review. There are some CI checks that have been failing that are not clear to me how to address - I'd appreciate any pointers you have on running these checks locally. |
BenchmarkTool.swift needs to use the new
(and then baseName needs to be public). Otherwise a benchmark using tags will not properly match the regex matching. E.g. I tried adding tags to the Basic benchmark in the Benchmarks directory thus: Benchmark("Basic",
configuration: .init(metrics: [.wallClock, .throughput, .instructions], tags: ["a" : "b", "c" : "d"])) { _ in
} Then this broke:
and required this:
to match. Changing the regex matching to baseName instead sorts that. |
The SwiftLint should be run locally (you need to kill the .build directory first): hassila@ice ~/package-benchmark (feat(minor)/benchmark-parameters)> swiftlint .
Linting Swift files at paths .
Linting 'Int+Extensions.swift' (1/24)
Linting 'Benchmark.swift' (2/24)
Linting 'Statistics.swift' (3/24)
Linting 'BenchmarkRunner+ReadWrite.swift' (4/24)
Linting 'ARCStatsProducer.swift' (5/24)
Linting 'ARCStats.swift' (6/24)
Linting 'BenchmarkThresholds.swift' (7/24)
Linting 'BenchmarkRunner.swift' (9/24)
Linting 'Benchmark+ConvenienceInitializers.swift' (8/24)
Linting 'BenchmarkResult.swift' (10/24)
Linting 'BenchmarkInternals.swift' (12/24)
Linting 'BenchmarkMetric+Defaults.swift' (13/24)
Linting 'BenchmarkThresholds+Defaults.swift' (11/24)
Linting 'BenchmarkExecutor+Extensions.swift' (14/24)
Linting 'Blackhole.swift' (15/24)
Linting 'MallocStats.swift' (16/24)
Linting 'MallocStatsProducer+jemalloc.swift' (17/24)
Linting 'MallocStats+jemalloc-support.swift' (18/24)
Linting 'BenchmarkClock.swift' (19/24)
Linting 'OperatingSystemStatsProducer+Darwin.swift' (20/24)
Linting 'OperatingSystemStats.swift' (21/24)
Linting 'OperatingSystemStatsProducer+Linux.swift' (22/24)
Linting 'BenchmarkExecutor.swift' (23/24)
Linting 'BenchmarkMetric.swift' (24/24)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:13:36: warning: Blanket Disable Command Violation: The disabled 'type_body_length' rule should be re-enabled before the end of the file (blanket_disable_command)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:13:23: warning: Blanket Disable Command Violation: The disabled 'file_length,' rule should be re-enabled before the end of the file (blanket_disable_command)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:475:1: warning: File Length Violation: File should contain 400 lines or less: currently contains 475 (file_length)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:20:55: warning: Prefer Self in Static References Violation: Use `Self` to refer to the surrounding type name (prefer_self_in_static_references)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:24:60: warning: Prefer Self in Static References Violation: Use `Self` to refer to the surrounding type name (prefer_self_in_static_references)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:28:63: warning: Prefer Self in Static References Violation: Use `Self` to refer to the surrounding type name (prefer_self_in_static_references)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:32:68: warning: Prefer Self in Static References Violation: Use `Self` to refer to the surrounding type name (prefer_self_in_static_references)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:13:52: warning: Superfluous Disable Command Violation: 'file_length,' is not a valid SwiftLint rule; remove it from the disable command (superfluous_disable_command)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:436:37: warning: Superfluous Disable Command Violation: 'file_length,' is not a valid SwiftLint rule; remove it from the disable command (superfluous_disable_command)
/Users/hassila/package-benchmark/Sources/Benchmark/Benchmark.swift:447:36: warning: Superfluous Disable Command Violation: 'file_length,' is not a valid SwiftLint rule; remove it from the disable command (superfluous_disable_command)
Linting 'P90AbsoluteThresholds.swift' (1/13)
Linting 'DateTime.swift' (2/13)
Linting 'Basic+SetupTeardown.swift' (3/13)
Linting 'BenchmarkRunner+Basic.swift' (4/13)
Linting 'Histogram.swift' (5/13)
Linting 'Package.swift' (6/13)
Linting 'BenchmarkTests.swift' (7/13)
Linting 'OperatingSystemAndMallocTests.swift' (8/13)
Linting 'BenchmarkMetricsTests.swift' (9/13)
Linting 'StatisticsTests.swift' (10/13)
Linting 'AdditionalTests.swift' (11/13)
Linting 'BenchmarkResultTests.swift' (12/13)
Linting 'BenchmarkRunnerTests.swift' (13/13) |
08940ee
to
a7b3292
Compare
Sometimes it is convenient to run the same benchmark multiple times under slightly different scenarios. For example, benchmarking `func foo(_ n: Int)` for various parameterizations of `n`. This adds support for specifying tags for a benchmark but makes no functional changes beyond the name/identifier of a benchmark now including a parameter list when one was provided. This unlocks the potential for the benchmark exporters (such as the Influx exporter) to leverage these tags (by emitting appropriate tags/fields in the case of Influx) in their output.
a7b3292
to
4b34c19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! LGTM now - the remaining failing CI is due to other issues which we are investigating - something has broken API and code coverage checks recently.
Description
Sometimes it is convenient to run the same benchmark multiple times under slightly different scenarios. For example, benchmarking
func foo(_ n: Int)
for various parameterizations ofn
. This adds support for specifying the parameters of a benchmark but makes no functional changes beyond the name/identifier of a benchmark now including a parameter list when one was provided.This unlocks the potential for the benchmark exporters (such as the Influx exporter) to leverage these parameters (by emitting appropriate tags/fields in the case of Influx) in their output.
Resolves #267
How Has This Been Tested?
Added a new test to verify the output of
Benchmark.name
.Also ran the remaining tests to make sure no other behavior changed.
Minimal checklist:
DocC
code-level documentation for any public interfaces exported by the package