
[target allocator] Data race when writing / reading jobToScrapeConfig in Discoverer #1359

Closed
matej-g opened this issue Jan 10, 2023 · 0 comments · Fixed by #1413

matej-g commented Jan 10, 2023

When we ran the target allocator in our test environment, it would often crash with the following output (truncated to the most relevant part):

{"level":"info","ts":1672998710.9367316,"logger":"allocator","msg":"Successfully started a collector pod watcher","component":"opentelemetry-targetallocator"}
fatal error: concurrent map read and map write

goroutine 6072 [running]:
reflect.mapaccess_faststr(0x23078e8?, 0x7f8e9acbcf38?, {0xc17459e450?, 0x34fba60?})
	/usr/local/go/src/runtime/map.go:1343 +0x1e
reflect.Value.MapIndex({0x2966a20?, 0xc000c80fc0?, 0xc17aa5f4a0?}, {0x273e3c0, 0xc1fd84bc30, 0x98})
	/usr/local/go/src/reflect/value.go:1664 +0xc5
github.com/mitchellh/hashstructure.(*walker).visit(0xc190f7f790, {0x2966a20?, 0xc000c80fc0?, 0xc000291d60?}, 0x0)
	/go/pkg/mod/github.com/mitchellh/[email protected]/hashstructure.go:225 +0x1685
github.com/mitchellh/hashstructure.Hash({0x2966a20?, 0xc000c80fc0}, 0x8?)
	/go/pkg/mod/github.com/mitchellh/[email protected]/hashstructure.go:108 +0x213
github.com/open-telemetry/opentelemetry-operator/cmd/otel-allocator/server.(*Server).ScrapeConfigsHandler(0xc000165860, {0x34f3110, 0xc3e789e460}, 0xc1fd84b5d0?)
	/app/server/server.go:93 +0x5c
net/http.HandlerFunc.ServeHTTP(0x100?, {0x34f3110?, 0xc3e789e460?}, 0x0?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/open-telemetry/opentelemetry-operator/cmd/otel-allocator/server.(*Server).PrometheusMiddleware.func1({0x34f3110, 0xc3e789e460}, 0xc5d6c20c60?)
	/app/server/server.go:138 +0x13f
net/http.HandlerFunc.ServeHTTP(0xc190c4ba00?, {0x34f3110?, 0xc3e789e460?}, 0x800?)
	/usr/local/go/src/net/http/server.go:2109 +0x2f
github.com/gorilla/mux.(*Router).ServeHTTP(0xc0000f43c0, {0x34f3110, 0xc3e789e460}, 0xc190c4b900)
	/go/pkg/mod/github.com/gorilla/[email protected]/mux.go:210 +0x1cf
net/http.serverHandler.ServeHTTP({0xc5d6c20ae0?}, {0x34f3110, 0xc3e789e460}, 0xc190c4b900)
	/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0a4573360, {0x34f46c0, 0xc0004a4180})
	/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:3102 +0x4db

After taking a look, it seems to me that the Discoverer's ApplyConfig writes to the jobToScrapeConfig map concurrently with reads of that same map via GetScrapeConfigs() in the scrape config handler, which hashes the config.
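
For illustration, here is a minimal sketch of the racy pattern (the struct, fields, and method bodies are simplified stand-ins, not the actual implementation): one goroutine updates the map through ApplyConfig while another reads the live map returned by GetScrapeConfigs(). Running it with `go run -race` reports the data race; without the race detector it eventually dies with the same "concurrent map read and map write" fatal error as above.

```go
package main

import "time"

// Simplified stand-in for the real Discoverer; the actual struct holds
// Prometheus scrape configs rather than strings.
type Discoverer struct {
	jobToScrapeConfig map[string]string
}

// ApplyConfig writes to the map (roughly what happens on a config update).
func (d *Discoverer) ApplyConfig(job, cfg string) {
	d.jobToScrapeConfig[job] = cfg // unsynchronized write
}

// GetScrapeConfigs hands out the live map, which the HTTP handler then hashes.
func (d *Discoverer) GetScrapeConfigs() map[string]string {
	return d.jobToScrapeConfig // unsynchronized read
}

func main() {
	d := &Discoverer{jobToScrapeConfig: map[string]string{}}

	// Writer: config updates arriving over time.
	go func() {
		for {
			d.ApplyConfig("job", "scrape config")
		}
	}()

	// Reader: the scrape config handler iterating / hashing the map.
	go func() {
		for {
			for range d.GetScrapeConfigs() {
			}
		}
	}()

	// With -race this is flagged almost immediately; without it, the runtime
	// eventually fails with "concurrent map read and map write".
	time.Sleep(2 * time.Second)
}
```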

I'd suggest protecting the map with a mutex and/or returning only a copy of the map for reading.
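
A minimal sketch of what that could look like (the mtx field name is illustrative, and this is not necessarily the change made in #1413): ApplyConfig takes the write lock, and GetScrapeConfigs() returns a copy made under the read lock, so callers can hash or iterate it without ever touching the live map.

```go
package main

import "sync"

type Discoverer struct {
	mtx               sync.RWMutex
	jobToScrapeConfig map[string]string
}

// ApplyConfig takes the write lock before mutating the map.
func (d *Discoverer) ApplyConfig(job, cfg string) {
	d.mtx.Lock()
	defer d.mtx.Unlock()
	d.jobToScrapeConfig[job] = cfg
}

// GetScrapeConfigs returns a copy made under the read lock, so callers
// (e.g. the scrape config handler hashing the result) never read the
// live map while a writer is updating it.
func (d *Discoverer) GetScrapeConfigs() map[string]string {
	d.mtx.RLock()
	defer d.mtx.RUnlock()
	out := make(map[string]string, len(d.jobToScrapeConfig))
	for k, v := range d.jobToScrapeConfig {
		out[k] = v
	}
	return out
}

func main() {
	d := &Discoverer{jobToScrapeConfig: map[string]string{}}

	go func() {
		for {
			d.ApplyConfig("job", "scrape config")
		}
	}()

	// The reader now only ever sees its own copy of the map.
	for i := 0; i < 1_000_000; i++ {
		for range d.GetScrapeConfigs() {
		}
	}
}
```

Swapping these two methods into the sketch above makes the race detector report go away.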

Somewhat related to the similar issue in #1040.
