Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pkg/stanza] Hash fingerprints #31317

Closed

Conversation

djaglowski
Copy link
Member

@djaglowski djaglowski commented Feb 19, 2024

Resolves #29617

Replaces #29691

Hashes fingerprints in order to make comparisons more efficient. The hash value is stored on the fingerprint struct but does not need to be saved to storage as it can be restored based on the bytes.

@djaglowski djaglowski force-pushed the pkg-stanza-hash-fingerprints branch 2 times, most recently from 520fd63 to 869976b Compare February 21, 2024 19:09
@djaglowski djaglowski force-pushed the pkg-stanza-hash-fingerprints branch from 869976b to 9b23404 Compare February 27, 2024 18:28
djaglowski added a commit that referenced this pull request Feb 28, 2024
Depends on #31251

This is in preparation for adding additional fields to the fingerprint
struct. The goal is to write the same data to storage while ensuring
consistency between the fields of the struct. This problem does not
present until there is a second field in the struct, but this PR
prepares for that problem without adding the new field. See #31317 for
additional detail.
@djaglowski djaglowski force-pushed the pkg-stanza-hash-fingerprints branch from 9b23404 to f9f7a3a Compare February 28, 2024 18:53
@djaglowski djaglowski force-pushed the pkg-stanza-hash-fingerprints branch from f9f7a3a to e282966 Compare February 28, 2024 18:57
@djaglowski djaglowski removed exporter/datadog Datadog components cmd/configschema configschema command cmd/otelcontribcol otelcontribcol command cmd/oteltestbedcol labels Feb 28, 2024
XinRanZhAWS pushed a commit to XinRanZhAWS/opentelemetry-collector-contrib that referenced this pull request Mar 13, 2024
…ry#31346)

Depends on open-telemetry#31251

This is in preparation for adding additional fields to the fingerprint
struct. The goal is to write the same data to storage while ensuring
consistency between the fields of the struct. This problem does not
present until there is a second field in the struct, but this PR
prepares for that problem without adding the new field. See open-telemetry#31317 for
additional detail.
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Mar 14, 2024
Copy link
Contributor

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Mar 28, 2024
@djaglowski
Copy link
Member Author

Now that #31516 is merged, I will revisit whether this actually improves performance.

@djaglowski djaglowski reopened this Mar 29, 2024
@djaglowski
Copy link
Member Author

Unfortunately, at least by this benchmark, the change does not improve performance. In fact it appears to degrade performance when working with many files.

- main -

BenchmarkFileInput/Single-10         	   11214	    103957 ns/op	      18 B/op	       0 allocs/op
BenchmarkFileInput/Glob-10           	   10406	    122575 ns/op	      40 B/op	       0 allocs/op
BenchmarkFileInput/MultiGlob-10      	    9894	    132695 ns/op	      23 B/op	       0 allocs/op
BenchmarkFileInput/MaxConcurrent-10  	    2606	    409411 ns/op	     204 B/op	       0 allocs/op
BenchmarkFileInput/FngrPrntLarge-10  	   12030	    100777 ns/op	      46 B/op	       0 allocs/op
BenchmarkFileInput/FngrPrntSmall-10  	   12045	    103833 ns/op	       3 B/op	       0 allocs/op
BenchmarkFileInput/NoFlush-10        	   10000	    106679 ns/op	      10 B/op	       0 allocs/op
BenchmarkFileInput/Many-10           	     496	   2268852 ns/op	   48817 B/op	      49 allocs/op


- w/ fingerprint hashing -

BenchmarkFileInput/Single-10         	    9274	    108644 ns/op	      11 B/op	       0 allocs/op
BenchmarkFileInput/Glob-10           	    9841	    121266 ns/op	      24 B/op	       0 allocs/op
BenchmarkFileInput/MultiGlob-10      	    9830	    129439 ns/op	      59 B/op	       0 allocs/op
BenchmarkFileInput/MaxConcurrent-10  	    2640	    444330 ns/op	     552 B/op	       0 allocs/op
BenchmarkFileInput/FngrPrntLarge-10  	   11170	    109383 ns/op	      55 B/op	       0 allocs/op
BenchmarkFileInput/FngrPrntSmall-10  	    9736	    106717 ns/op	       4 B/op	       0 allocs/op
BenchmarkFileInput/NoFlush-10        	    9637	    105138 ns/op	       7 B/op	       0 allocs/op
BenchmarkFileInput/Many-10           	     170	   7527812 ns/op	  143423 B/op	     145 allocs/op

It's certainly possible the additional changes would yield a better result but I have spent quite some time on this without any benefit so will close the PR and associated issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Modify Fingerprint to Hashed Values
1 participant