[RFC] Control plane for submitting benchmark runs. #4231
Comments
[Triage] @rishabh6788 Will you be working on this issue?
@dblock @reta Please review the proposal and provide your comments.
This is a well-thought-through proposal that basically boils down to whether we want to reuse Jenkins or build something new. The cons of building something new easily outweigh any of the advantages, so I agree with the recommendation.

A question about development of features in private: can I easily reproduce this infrastructure for private setups? For example, if I use open source OpenSearch with an additional private plugin X, how can I reuse this setup in this workflow for testing changes in my version of the product that includes plugin X?

Unrelated to the question above, the typical performance improvement cycle that I see is something like this. Help me understand how I'll interact with what you're proposing.
Thank you for the review and feedback @dblock.

Regarding all the components used in the recommended approach and testing your local development changes with our platform: all you need to provide is a publicly available URL from which your artifact (x64 tarball for now) can be downloaded. At present many developers upload their local tarball to GitHub by creating a dummy release in their fork, uploading the tarball as an attachment (5 GB limit), and then using that link with opensearch-cluster-cdk to set up a cluster with their changes. Hope this answers your question. This will be the starting point for us, and we can later come up with proper automation where a developer just uploads the artifact to a custom UI along with the other required parameters and is able to run the benchmark.

Here are the responses to the other points you mentioned:

2 & 5: We are using https://github.com/opensearch-project/opensearch-benchmark-workloads, which provides all the benchmark test specifications and data, so if any new performance tests are added to that repository they will automatically be picked up by the ad-hoc or nightly benchmark runs. With respect to changes on your branch, you just need to provide the artifact via a public URL.

4, 6 & 7: As mentioned, we already have nightly benchmarks running on the same setup for the past 8 months across released versions of OpenSearch, 2.x, and main, so any new code committed to mainline or 2.x is picked up by our nightly benchmark runs and reflected in the dashboards immediately. We use the same dashboards to create alerts and notifications for catching any regression or improvement. For example, the recent PR opensearch-project/OpenSearch#11390 to mainline and 2.x showed significant improvements to aggregate queries and was picked up by our public dashboards. We can definitely improve our notification delivery to broadcast such events. See https://tinyurl.com/ukavvvhh for the 2.x branch improvement and https://tinyurl.com/2wxw2t89 for mainline.

Hope I was able to answer your queries.
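For readers unfamiliar with that workaround, here is a rough sketch of the dummy-release flow using the GitHub REST API; the owner, token, tag name, and tarball path are placeholders, not part of any official tooling:

```python
import requests

# Hypothetical values -- replace with your fork and a personal access token
# that has the "repo" scope.
OWNER, REPO = "your-user", "OpenSearch"
TOKEN = "ghp_..."
TARBALL = "opensearch-min-SNAPSHOT-linux-x64.tar.gz"

headers = {"Authorization": f"token {TOKEN}",
           "Accept": "application/vnd.github+json"}

# 1. Create a throwaway (pre-)release in the fork.
release = requests.post(
    f"https://api.github.com/repos/{OWNER}/{REPO}/releases",
    headers=headers,
    json={"tag_name": "benchmark-artifact", "name": "benchmark artifact",
          "prerelease": True},
).json()

# 2. Upload the tarball as a release asset. The upload URL is templated,
#    so strip the "{?name,label}" suffix and pass the file name explicitly.
upload_url = release["upload_url"].split("{")[0]
with open(TARBALL, "rb") as f:
    asset = requests.post(
        upload_url,
        headers={**headers, "Content-Type": "application/gzip"},
        params={"name": TARBALL},
        data=f,
    ).json()

# 3. This public URL is what gets handed to opensearch-cluster-cdk.
print(asset["browser_download_url"])
```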
Tagging @msfroh @peternied @jainankitk @rishabhmaurya to get more feedback.
Thanks for the proposal @rishabh6788, +1 in favour of the recommended approach.

This is a somewhat surprising procedure, taking into account where those changes should be coming from. I would have imagined that a self-service platform would not be asking for arbitrary bits, but would accept only trusted sources of changes like the above (it also could be automated with …).
Thank you for the feedback @reta.

With respect to your concern about trusted sources of changes: hope this answers your query.
Thanks @rishabh6788, I understand the solution but not the reasoning that leads to it. This basically means that, as it stands today, only AWS employees would have access (if I am not mistaken about how GitHub teams work, the team members must be part of the organization).
For example, if you are working on a PR and want to run a benchmark against it and compare how it is doing against the baseline, you should be able to do so using this platform: since you are a validated contributor to the repo, we should be able to add you to the execution role. The same goes for any valid contributor to the OpenSearch repository. @reta In the future, you should just be able to initiate a run by adding a label or a comment on your PR. I think the goal of this project is to make performance benchmarking an essential part of most of the work happening on the OpenSearch repo, and at the same time make it easier and more accessible for all the community members who contribute to it. Hope this helps.
This is exactly what I was referring to here. It is super straightforward to build the distribution out of a pull request using tooling (this is a Gradle task), and I think this would simplify onboarding even more.
This is currently waiting on the Jenkins version upgrade and the splitting of the single Jenkins infra into dedicated Jenkins instances for gradle-check, build/test/release, and benchmark use cases. Working with @Divyaasm to help complete the above-mentioned prerequisites.
Given that we are waiting on Jenkins to be split into separate use cases and on the security review for integrating GitHub OIDC with Jenkins, we are proposing a slight change in the authentication and authorization mechanism to keep things simple yet secure and to move faster. The only requirement is that the user has a valid AWS account and an IAM user/role. There is no cost incurred for using the IAM service, as it is free; see https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html. @dblock @reta
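To make the proposal concrete, here is a minimal sketch of how a caller might submit a run to an IAM-protected API Gateway endpoint using SigV4 signing; the endpoint URL and request payload are hypothetical, since the actual API contract has not been defined in this thread:

```python
import json

import boto3
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

# Hypothetical endpoint and payload -- the real API contract is still TBD.
ENDPOINT = "https://benchmark.example.execute-api.us-east-1.amazonaws.com/prod/submit"
payload = json.dumps({
    "artifact_url": "https://github.com/you/OpenSearch/releases/download/benchmark-artifact/opensearch.tar.gz",
    "workload": "nyc_taxis",
})

# Sign the request with whatever IAM user/role the caller has configured
# locally; API Gateway with IAM auth verifies the signature and the policy.
credentials = boto3.Session().get_credentials().get_frozen_credentials()
request = AWSRequest(method="POST", url=ENDPOINT, data=payload,
                     headers={"Content-Type": "application/json"})
SigV4Auth(credentials, "execute-api", "us-east-1").add_auth(request)

response = requests.post(ENDPOINT, data=payload, headers=dict(request.headers))
print(response.status_code, response.text)
```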
Another option on the table is to use a Lambda authorizer with API Gateway and use GitHub OAuth to authenticate the user, but I do not want just any GitHub user to be able to submit a job. So I have the option to either maintain an internal database or file that I check before authorizing the user, or create a team in our GitHub project and check whether the authenticated user is part of that team. Let me muck around a bit and get back.
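For illustration, a sketch of what the Lambda authorizer option could look like, assuming a hypothetical `benchmark-users` team and a service token with `read:org` scope; none of these names are settled:

```python
import json
import os
import urllib.request

ORG = "opensearch-project"
TEAM_SLUG = "benchmark-users"  # hypothetical team gating access


def _github(url, token):
    req = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    with urllib.request.urlopen(req) as resp:
        return resp.status, json.loads(resp.read())


def handler(event, context):
    """API Gateway TOKEN authorizer: allow only members of ORG/TEAM_SLUG."""
    user_token = event["authorizationToken"]
    effect, login = "Deny", "unknown"
    try:
        # Who is calling? (GitHub OAuth access token presented by the client.)
        _, user = _github("https://api.github.com/user", user_token)
        login = user["login"]
        # Is the caller on the gating team? This lookup typically needs a
        # service credential with read:org, not the caller's own token.
        status, membership = _github(
            f"https://api.github.com/orgs/{ORG}/teams/{TEAM_SLUG}/memberships/{login}",
            os.environ["SERVICE_TOKEN"])
        if status == 200 and membership.get("state") == "active":
            effect = "Allow"
    except Exception:
        pass  # any failure falls through to Deny
    return {
        "principalId": login,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{"Action": "execute-api:Invoke",
                           "Effect": effect,
                           "Resource": event["methodArn"]}],
        },
    }
```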
Thanks @rishabh6788.

So at the moment, none of the external contributors have AWS accounts to access the build infra (Jenkins, etc.). I suspect that with the suggested approach this stays the same, so it is not clear to me how those contributors would benefit from a self-serviceable platform?
@rishabh6788 In an ideal configuration, who would have access to submit these runs? If that group is maintainers, we've got ways to extract the lists that we consider authoritative via an API check that could be done as part of the authorization step; this would allow anyone who is a maintainer to have access.

If that group is broader, I think GitHub's organization is the best-aligned mechanism, with a team/special permission moderated for this kind of access. Decoupling the opensearch-project org from AWS seems like a prerequisite. @getsaurabh02 do you know if we have any updates? It seems we are hitting another access-management-related issue.
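As a rough illustration of the kind of API check being discussed, the sketch below treats write/admin access on the repository as the authorization signal; whether that matches the authoritative maintainer list mentioned above is an open question:

```python
import requests


def can_submit(username: str, token: str,
               owner: str = "opensearch-project", repo: str = "OpenSearch") -> bool:
    """Hypothetical check: allow anyone with write/admin access to the repo."""
    resp = requests.get(
        f"https://api.github.com/repos/{owner}/{repo}/collaborators/{username}/permission",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/vnd.github+json"},
    )
    if resp.status_code != 200:
        return False
    return resp.json().get("permission") in ("admin", "write")
```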
Thank you @reta and @peternied for your feedback. I am deliberating using the list-contributors API for the authorization check.
Okay, the list_contributors API fetches users who contributed to OpenSearch before it was forked from Elastic, so it cannot be trusted as the authoritative list for granting authorization.
How do you feel about a flow similar to releases?
Thanks for the feedback Db, yes, this is the expected flow once we have a REST API ready for submitting benchmark runs. Once that is out and working as expected, I will start a proposal on how to submit benchmark runs just by commenting or adding a label to PRs.
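A sketch of what that future comment/label-triggered flow might look like on the receiving end of a GitHub `issue_comment` webhook; the `/benchmark` trigger phrase and the `submit_benchmark` helper are placeholders, not an agreed-upon interface:

```python
TRIGGER = "/benchmark"
ALLOWED = {"OWNER", "MEMBER", "COLLABORATOR"}


def on_issue_comment(event: dict) -> None:
    """Handle a GitHub `issue_comment` webhook payload."""
    comment = event["comment"]
    issue = event["issue"]
    if "pull_request" not in issue:                  # only react to PR comments
        return
    if not comment["body"].strip().startswith(TRIGGER):
        return
    if comment["author_association"] not in ALLOWED:  # ignore drive-by comments
        return
    submit_benchmark(pr_number=issue["number"])       # placeholder for the real call


def submit_benchmark(pr_number: int) -> None:
    raise NotImplementedError("wire this to the benchmark control plane")
```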
Self-serviceable Performance Benchmark Platform
Purpose
The purpose of this issue is to brainstorm different approaches for building a self-serviceable platform for running ad-hoc benchmark tests. We will go over the current state of affairs and then propose alternatives for achieving that goal. This will enable developers to move away from manually setting up infrastructure to run benchmark tests against their local changes, and make the whole experience of running benchmarks seamless and less cumbersome.
Tenets
Background
In the rapidly changing analytics marketplace, where different providers offer broadly similar ingestion and search solutions, what sets them apart is how well they perform against each other and across a product's release cycle. While performance has always been at the core of the OpenSearch development cycle, the effort was never centralized and there was no unified platform to track OpenSearch performance over its release cycle.
At the start of this year we undertook the objective of streamlining the performance benchmarking process and creating a centralized portal to view and analyze performance metrics across various versions of OpenSearch. This solved the long-standing problem of consistently running benchmarks on a daily basis across released and in-development versions of OpenSearch, and of publishing the metrics publicly for anyone to view. We added various enhancements and features to get consistent results and reduce variance. We can now run indexing-specific benchmarks to track any regressions/improvements in the indexing path, and the same can be achieved for the search path, where we use data snapshots to test search metrics purely, without any variance coming from index writers during the benchmark run.
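As an illustration of the snapshot-based approach, the sketch below restores a pre-built data snapshot through the OpenSearch snapshot API before a search-only benchmark; the repository and snapshot names are placeholders for whatever the nightly setup registers:

```python
import requests

# Illustrative only: restore a pre-built data snapshot so a search-only
# benchmark runs against identical data every time.
CLUSTER = "https://localhost:9200"
AUTH = ("admin", "admin")          # demo credentials, not for real clusters

resp = requests.post(
    f"{CLUSTER}/_snapshot/benchmark-repo/nyc-taxis-snapshot/_restore",
    json={"indices": "nyc_taxis", "include_global_state": False},
    params={"wait_for_completion": "true"},
    auth=AUTH,
    verify=False,                  # self-signed demo certs only
)
resp.raise_for_status()
# After the restore completes, point opensearch-benchmark at the cluster with
# --pipeline=benchmark-only and a search-only test procedure.
```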
Opportunity
While we have made tremendous progress in setting up the nightly benchmark runs, it is still not straightforward for developers to do performance runs on an ad-hoc basis. Developers still have to manage their own infrastructure to set up an OpenSearch cluster and then use opensearch-benchmark to run benchmarks against their local changes. Learning to run the opensearch-benchmark tool efficiently is a learning curve in itself.
Even though the nightly benchmark platform supports submitting ad-hoc performance runs against locally developed OpenSearch artifacts (x64 tarball), it is still not possible to open it up for developers to start using, mainly due to:
Recently we have been getting a lot of feedback asking for a platform that lets developers submit ad-hoc benchmark runs themselves, rather than depending on the engineering-effectiveness team to submit runs on their behalf.
The nightly benchmark platform gives us the opportunity to build upon the existing work and come up with a self-serviceable platform that developers can use to run ad-hoc benchmarks without worrying about setting up infrastructure or learning how the opensearch-benchmark tool works. This way they will be able to leverage all the work we have done to efficiently test specific code paths, i.e. the indexing path or the search path (using data restored from snapshots).
Proposed Solutions
Separate Jenkins Instance as core execution engine (Recommended)
The idea behind this proposal is to set up a dedicated Jenkins instance to orchestrate and execute performance runs. Below are the reasons why:
Next steps after Jenkins has been set up and starts running benchmarks:
Pros:
Cons:
Create Benchmark Orchestrator from Scratch
In this approach we build a new benchmark orchestrator service from scratch. Below are the high-level components that will be required:
Pros:
Cons:
Both of the proposed solutions will pave the way for micro-benchmarks, and for running benchmarks against a PR and updating the results merely by adding a comment or label to the PR. We need to choose the solution with the least operational overhead.