Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mo-service: track active connections #18788

Merged
merged 2 commits into from
Sep 14, 2024
Merged

Conversation

reusee
Copy link
Contributor

@reusee reusee commented Sep 13, 2024

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #18725

What this PR does / why we need it:

track connections to help debug.


PR Type

Enhancement


Description

  • Implemented a new feature to track active connections in the mo-service using the conntrack library.
  • Added logging for connection statistics and alerts when the number of connections exceeds a specified threshold.
  • Updated go.mod and go.sum to include new dependencies for connection tracking and updated existing dependencies.

Changes walkthrough 📝

Relevant files
Enhancement
debug_connections.go
Implement active connection tracking and logging                 

cmd/mo-service/debug_connections.go

  • Added a new Go file for tracking active connections.
  • Implemented connection tracking using conntrack and netfilter.
  • Logs connection statistics and alerts when connections exceed a
    threshold.
  • Utilizes goroutines for concurrent processing of connection events.
  • +120/-0 
    Dependencies
    go.mod
    Update dependencies for connection tracking                           

    go.mod

  • Added new dependencies for conntrack and netfilter.
  • Updated existing dependencies to newer versions.
  • +10/-4   
    go.sum
    Update dependency checksums                                                           

    go.sum

    • Updated checksums for new and updated dependencies.
    +22/-10 

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @matrix-meow matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Sep 13, 2024
    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Performance Concern
    The connection tracking goroutine runs indefinitely without a way to stop it, which may lead to resource leaks if the service is restarted or needs to be shut down gracefully.

    Error Handling
    The error returned from c.Listen() is not properly handled. If an error occurs, the function returns without closing the connection or stopping the goroutine.

    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Best practice
    Implement graceful shutdown using context cancellation

    Consider using a context with cancellation for graceful shutdown. This allows for
    proper cleanup of resources and goroutine termination when the service is stopped.

    cmd/mo-service/debug_connections.go [58-64]

    +ctx, cancel := context.WithCancel(context.Background())
     go func() {
         defer c.Close()
    +    defer cancel()
     
         activeConns := make(map[uint32]*conntrack.Flow)
         ticker := time.NewTicker(time.Second * 7)
    +    defer ticker.Stop()
     
         for {
             select {
    +        case <-ctx.Done():
    +            return
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Using a context with cancellation is a best practice for managing goroutines and ensuring resources are properly cleaned up during shutdown, which is crucial for robust service operation.

    9
    Performance
    Use a concurrent map for better performance in high-concurrency scenarios

    Consider using a more efficient data structure for activeConns, such as a concurrent
    map (sync.Map) or a custom thread-safe map implementation. This can improve
    performance in scenarios with high concurrency.

    cmd/mo-service/debug_connections.go [61]

    -activeConns := make(map[uint32]*conntrack.Flow)
    +activeConns := sync.Map{}
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Switching to a concurrent map like sync.Map can improve performance in high-concurrency scenarios by reducing contention, making it a valuable enhancement for the connection tracking logic.

    8
    Reduce the buffer size of the events channel to optimize memory usage

    Consider using a buffered channel with a smaller capacity for events. The current
    capacity of 65536 might be excessive and could potentially lead to high memory usage
    if the channel fills up faster than it can be processed.

    cmd/mo-service/debug_connections.go [46]

    -events := make(chan conntrack.Event, 65536)
    +events := make(chan conntrack.Event, 1024)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The suggestion to reduce the buffer size of the events channel is valid as it can help optimize memory usage. However, the exact buffer size should be determined based on expected load and performance testing.

    7
    Enhancement
    Separate thresholds for total connections and per-port connections

    The tooManyThreshold constant is used for both connection count and individual port
    statistics. Consider separating these thresholds for more fine-grained control over
    alerting and logging.

    cmd/mo-service/debug_connections.go [35-37]

     const (
    -    tooManyThreshold = 4096
    +    tooManyConnectionsThreshold = 4096
    +    tooManyPortConnectionsThreshold = 409 // 10% of total connections threshold
     )
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Separating thresholds allows for more precise control over connection management and alerting, which can enhance the monitoring and management of network connections.

    8

    @mergify mergify bot merged commit 9064eab into matrixorigin:main Sep 14, 2024
    18 checks passed
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    3 participants