CBG-3563 Implement automatic memory profiling #6904
Generally looks good, just a few small things I noticed. I tested locally by running the branch and it seems to work well!
A few minor suggestions.
@@ -73,6 +76,20 @@ func DefaultStartupConfig(defaultLogFilePath string) StartupConfig {
},
MaxFileDescriptors: DefaultMaxFileDescriptors,
}
I feel like we shouldn't be doing the memlimit/mem calls when populating the config; instead, this should be done in NewServerContext when the server context is created (in the scenario where config.API.HeapProfileCollectionThreshold is not specified). That avoids logging here before logging is initialized, and also avoids this work altogether if the user's bootstrap config has explicitly set HeapProfileCollectionThreshold.
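A minimal sketch of that suggestion, assuming the default is resolved lazily at server context creation. The names `resolveHeapProfileThreshold`, `lesserLimit`, and `detectLimit` are hypothetical and not taken from the PR; the detection callback stands in for the memlimit/mem calls, and a cgroup limit of 0 is assumed to mean "no limit detected".

```go
package main

import "fmt"

// defaultThresholdPercent mirrors the "85% of the lesser of cgroup or system
// memory" default described in the config help text.
const defaultThresholdPercent = 85

// lesserLimit returns the smaller of the two limits. A cgroupLimit of 0 is
// assumed (for this sketch) to mean that no cgroup limit was detected.
func lesserLimit(cgroupLimit, systemMemory uint64) uint64 {
	if cgroupLimit != 0 && cgroupLimit < systemMemory {
		return cgroupLimit
	}
	return systemMemory
}

// resolveHeapProfileThreshold is a hypothetical helper. It returns the
// explicitly configured value when present and only calls detectLimit
// (a stand-in for the memory detection calls) when no value was configured.
func resolveHeapProfileThreshold(configured *uint64, detectLimit func() uint64) uint64 {
	if configured != nil {
		return *configured
	}
	// Divide before multiplying so a very large limit cannot overflow uint64.
	// This rounds down by at most 99 bytes' worth of the limit.
	return detectLimit() / 100 * defaultThresholdPercent
}

func main() {
	detect := func() uint64 { return lesserLimit(1<<30, 8<<30) }

	explicit := uint64(512 << 20)
	fmt.Println("explicit:", resolveHeapProfileThreshold(&explicit, detect))
	fmt.Println("default: ", resolveHeapProfileThreshold(nil, detect))
}
```

With this shape, populating the default config does no memory detection or logging; the work happens only when the server context needs the value.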
rest/config_startup.go
Outdated
CORS *auth.CORSConfig `json:"cors,omitempty"`
HTTPS HTTPSConfig `json:"https,omitempty"`
CORS *auth.CORSConfig `json:"cors,omitempty"`
HeapProfileCollectionThreshold *uint64 `json:"heap_profile_collection_threshold,omitempty" help:"Threshold in bytes for collecting heap profiles automatically. If set, Sync Gateway will collect a memory profile when it exceeds this value. The default value will be set to 85% of the lesser of cgroup or system memory."`
I'm not sure these belong in APIConfig (which has to do with the REST APIs). Was there a reason to put it here, as opposed to (say) BootstrapConfig?
I think it doesn't belong in the bootstrap config because it doesn't have to do with Couchbase Server. I moved it to the top level.
I don't love that either, but I agree it doesn't belong in API.
rest/server_context.go
Outdated
@@ -1666,7 +1667,18 @@ func (sc *ServerContext) logStats(ctx context.Context) error {
// Marshal expvar map w/ timestamp to string and write to logs
base.RecordStats(string(marshalled))

return nil
if sc.Config.API.HeapProfileDisableCollection {
The early returns here (1671 and 1677) feel out of place - it would be easy for someone to need to add something to logStats in future and just stick it at the end of the function. Maybe move all of this into a sc.collectMemoryProfile() function to encapsulate this?
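A rough sketch of the encapsulation being suggested. The `serverContext` type and its fields are stand-ins for illustration only, not the real `ServerContext`; the counters replace the actual stats logging and profile writing.

```go
package main

import "fmt"

// serverContext is a simplified stand-in type, not the real ServerContext.
type serverContext struct {
	disableCollection bool
	threshold         int64
	heapInUse         int64
	statsLogged       int // counts stats writes (placeholder for real logging)
	profilesCollected int // counts profile writes (placeholder for real I/O)
}

// collectMemoryProfile owns every profiling-related early return, so callers
// do not need to know about the disable flag or the threshold check.
func (sc *serverContext) collectMemoryProfile() error {
	if sc.disableCollection {
		return nil
	}
	if sc.heapInUse <= sc.threshold {
		return nil
	}
	sc.profilesCollected++
	return nil
}

// logStats has a single exit path after the stats are recorded, so code
// appended to the end of the function is not skipped by a profiling check.
func (sc *serverContext) logStats() error {
	sc.statsLogged++
	if err := sc.collectMemoryProfile(); err != nil {
		return err
	}
	return nil
}

func main() {
	sc := &serverContext{threshold: 10, heapInUse: 100}
	if err := sc.logStats(); err != nil {
		fmt.Println("logStats failed:", err)
		return
	}
	fmt.Println("stats logged:", sc.statsLogged, "profiles collected:", sc.profilesCollected)
}
```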
rest/server_context.go
Outdated
@@ -178,6 +181,18 @@ func NewServerContext(ctx context.Context, config *StartupConfig, persistentConf
sc.DatabaseInitManager = &DatabaseInitManager{}
}

if config.HeapProfileCollectionThreshold != nil {
sc.statsContext.heapProfileCollectionThreshold = int64(*config.HeapProfileCollectionThreshold)
It feels like we should either define this as int64 or uint64 in both places, to avoid the risk of an unsafe cast when a user specifies a very large value in the config.
I like using uint64 because that matches the values returned from gopsutil or memlimit. The reason for not using it everywhere is that our stats are int64, so GoMemStatsHeapInUse is int64.
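If both types stay, one option is to clamp at the conversion point so a very large configured value cannot wrap to a negative int64. This is only a sketch; `thresholdToInt64` is a hypothetical helper name, not something in the PR.

```go
package main

import (
	"fmt"
	"math"
)

// thresholdToInt64 converts a uint64 threshold to int64 without wrapping.
// Values above math.MaxInt64 are clamped, which for a memory threshold
// effectively means "never reached".
func thresholdToInt64(v uint64) int64 {
	if v > math.MaxInt64 {
		return math.MaxInt64
	}
	return int64(v)
}

func main() {
	huge := uint64(math.MaxUint64)
	fmt.Println("plain cast:", int64(huge))         // wraps to a negative value
	fmt.Println("clamped:   ", thresholdToInt64(huge)) // stays positive
}
```

A plain `int64(x)` cast of a value above `math.MaxInt64` yields a negative threshold, which any positive heap reading would exceed on every stats interval.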
Mostly looks good, just one thing I noticed, then it should be ready to go.
}
base.InfofCtx(ctx, base.KeyAll, "Memory usage %d exceeds threshold %d, collecting memory profile", currentMemory, profileCollectionThreshold)

return sc.statsContext.collectMemoryProfile(ctx, sc.Config.Logging.LogFilePath, timestamp)
Looks like these lines (lines 1685 to 1695) can be removed, since these checks are already done in collectMemoryProfile?
This should write memory profiles like pprof_heap_high_{timestamp}.pb.gz to the log file directory, so sgcollect can pick them up.
api.heap_profile_collection_threshold is configurable, and collection can be disabled with api.disable_heap_profile_collection.
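For illustration, writing a heap profile under that naming pattern could look roughly like the following, using the standard library's `runtime/pprof`. The helper names and the timestamp format are assumptions for this sketch and are not taken from the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
	"time"
)

// heapProfileName follows the pprof_heap_high_{timestamp}.pb.gz pattern.
func heapProfileName(timestamp string) string {
	return fmt.Sprintf("pprof_heap_high_%s.pb.gz", timestamp)
}

// writeHeapProfile is a hypothetical helper that writes the current heap
// profile into logDir and returns the path of the file it created.
func writeHeapProfile(logDir, timestamp string) (string, error) {
	path := filepath.Join(logDir, heapProfileName(timestamp))
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	// WriteHeapProfile emits the gzip-compressed protobuf format that
	// `go tool pprof` reads.
	if err := pprof.WriteHeapProfile(f); err != nil {
		_ = f.Close()
		return "", err
	}
	return path, f.Close()
}

func main() {
	// The timestamp layout here is an assumption chosen to be filename-safe.
	ts := time.Now().UTC().Format("20060102T150405")
	path, err := writeHeapProfile(os.TempDir(), ts)
	if err != nil {
		fmt.Println("failed to write heap profile:", err)
		return
	}
	fmt.Println("wrote", path)
}
```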
https://docs.google.com/document/d/1Z8pqW-CEGpAdxSSQCakBf4tSKbmzZJGz2Ojlm2JmQ8o
Before merging this code, I want to test it in Capella using their cgroup setup. Once this is implemented, the configuration should be documented (via DOC ticket).
Pre-review checklist
Debug logging removed (fmt.Print, log.Print, ...)
Sensitive data in logs tagged (base.UD(docID), base.MD(dbName))
API specifications updated in docs/api
Integration Tests
GSI=true,xattrs=true
https://jenkins.sgwdev.com/job/SyncGateway-Integration/2524/