Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize clone method #187

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Dhruvit96
Copy link

What does this PR do?

This PR improves the performance of the Clone method for the M type in mapstr package.
It uses an inbuilt method for cloning maps which is more efficient.
ref: golang/go#58740

Why is it important?

This change is important as the Clone method is used before applying processors in the beats repo each time.
This small optimization will improve the performance of the Beats pipeline by a substantial margin.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works

@Dhruvit96 Dhruvit96 requested a review from a team as a code owner February 25, 2024 14:04
@Dhruvit96 Dhruvit96 requested review from faec and leehinman and removed request for a team February 25, 2024 14:04
Copy link

cla-checker-service bot commented Feb 25, 2024

💚 CLA has been signed

@pierrehilbert pierrehilbert added the Team:Elastic-Agent Label for the Agent team label Feb 26, 2024
@pierrehilbert pierrehilbert requested a review from rdner February 26, 2024 16:35
@faec
Copy link

faec commented Feb 26, 2024

Thanks for pointing this out! I'm setting up some benchmarks to see how this compares to the current version.

@dhruvit-crest
Copy link

Thanks for pointing this out! I'm setting up some benchmarks to see how this compares to the current version.

Thank you please let me know if I can help any way.

@faec
Copy link

faec commented Mar 1, 2024

I've done some benchmarks and the results seem odd. All the CPU numbers are neutral-leaning-higher with the new code. I've taken profiles and confirmed that everything is using the new optimized stdlib, and it does save ~10% of copying time (~1% of overall cpu in the benchmarks I ran), but then it loses that time back in increased GC costs.

The reported CPU reduction in the stdlib change is more like 85%, so this is a pretty big mismatch. I'm going to do some more testing to try and track this down.

One thing to add to the PR, though: the baseline version of Clone always returns a non-nil result, even when called on nil, whereas your version returns nil, and that causes crashes since Filebeat assumes the return value is non-nil. (It maybe shouldn't, when cloning a nil pointer, but that's the current behavior.) To avoid any surprise crashes can you add a nil check on the input to explicitly return a non-nil empty map?

@Dhruvit96
Copy link
Author

Thanks @faec for looking into it, I'm curious about what tests you have run. So far I have only used Go's framework for benchmarking and pprof library. If you could point to articles or tools that you're using or referring to, it would be a great help to my learning.

Also thanks for pointing out behavior mismatch. I'll add a nil check to maintain the current behavior.

@rdner rdner removed their request for review March 22, 2024 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants