[service/proctelemetry] Unset HOST_PROC to make sure we use and report the actual process and introduce an option to allow user to overwrite the unset logic and use the value set in the environment variable #7434
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: bug_fix | ||
|
||
# The name of the component, or a single word describing the area of concern, (e.g. otlpreceiver) | ||
component: service/proctelemetry | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Make sure OTEL internal metrics read and report the collector's own process data when HOST_PROC is set | ||
|
||
# One or more tracking issues or pull requests related to the change | ||
issues: [7434] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: If users want to use the path defined in the HOST_PROC environment variable, they can set `useHostProcEnvVar` to true |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -63,18 +63,27 @@ type processMetrics struct { | |
|
||
// RegisterProcessMetrics creates a new set of processMetrics (mem, cpu) that can be used to measure | ||
// basic information about this process. | ||
func RegisterProcessMetrics(ocRegistry *metric.Registry, mp otelmetric.MeterProvider, useOtel bool, ballastSizeBytes uint64) error { | ||
func RegisterProcessMetrics(ocRegistry *metric.Registry, mp otelmetric.MeterProvider, useOtel bool, ballastSizeBytes uint64, useHostProcEnvVar bool) error { | ||
var err error | ||
pm := &processMetrics{ | ||
startTimeUnixNano: time.Now().UnixNano(), | ||
ballastSizeBytes: ballastSizeBytes, | ||
ms: &runtime.MemStats{}, | ||
} | ||
|
||
hostproc, exists := os.LookupEnv("HOST_PROC") | ||
// unset HOST_PROC env variable so the process search occurs locally | ||
if !useHostProcEnvVar && exists { | ||
os.Unsetenv("HOST_PROC") | ||
} | ||
pm.proc, err = process.NewProcess(int32(os.Getpid())) | ||
if err != nil { | ||
return err | ||
} | ||
// restore immediately if it was previously set, don't use defer | ||
Review comment: If we don't use I think a better approach for this would be for |
||
if !useHostProcEnvVar && exists { | ||
os.Setenv("HOST_PROC", hostproc) | ||
Review comment: Do you know if there's a chance of memoization, or is there some guarantee from gopsutil that subsequent lookups with the restored env var will reliably use the hostfs procfs?
Review comment: Also, this should probably be deferred so that it always happens going forward, regardless of changes in error handling.
Review comment: Sorry, just realized that subsequent stat updates all go through gopsutil, so this would need to happen for each update cycle, which is almost certainly going to cause race conditions with the host metrics receiver.
Review comment: Yep, this is a bust! gopsutil re-uses the hostproc on every call :(
|
||
} | ||
|
||
if useOtel { | ||
return pm.recordWithOtel(mp.Meter(scopeName)) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,7 @@ package telemetry // import "go.opentelemetry.io/collector/service/telemetry" | |
|
||
import ( | ||
"fmt" | ||
"os" | ||
|
||
"go.uber.org/zap/zapcore" | ||
|
||
|
@@ -116,6 +117,10 @@ type MetricsConfig struct { | |
|
||
// Address is the [address]:port that metrics exposition should be bound to. | ||
Address string `mapstructure:"address"` | ||
|
||
// UseHostProcEnvVar, when set to true, makes the metrics server (through gopsutil) | ||
// look up the otel process in the proc path defined by the HOST_PROC environment variable. | ||
UseHostProcEnvVar bool `mapstructure:"useHostProcEnvVar"` | ||
Review comment: Do we actually need to expose this? If it's unclear, I would remove this from the PR to avoid extending the public API surface. |
||
} | ||
|
||
// TracesConfig exposes the common Telemetry configuration for collector's internal spans. | ||
|
@@ -134,6 +139,13 @@ func (c *Config) Validate() error { | |
if c.Metrics.Level != configtelemetry.LevelNone && c.Metrics.Address == "" { | ||
return fmt.Errorf("collector telemetry metric address should exist when metric level is not none") | ||
} | ||
// Validate that the hostproc env variable is set when metric server is enabled | ||
if c.Metrics.Level != configtelemetry.LevelNone && c.Metrics.Address != "" && c.Metrics.UseHostProcEnvVar { | ||
if _, exists := os.LookupEnv("HOST_PROC"); !exists { | ||
return fmt.Errorf("collector telemetry metric UseHostProcEnvVar " + | ||
"is set to true, but HOST_PROC env variable is not set") | ||
} | ||
} | ||
|
||
return nil | ||
} |
Review comment: I think the issue is that this pid needs to be from the host's perspective and not gathered from the container's syscall.