-
Notifications
You must be signed in to change notification settings - Fork 773
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
node pod probe daemon #1077
node pod probe daemon #1077
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1077 +/- ##
==========================================
- Coverage 50.29% 49.77% -0.52%
==========================================
Files 126 130 +4
Lines 17855 18348 +493
==========================================
+ Hits 8980 9133 +153
- Misses 7886 8215 +329
- Partials 989 1000 +11
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
) | ||
updateQueue := workqueue.NewNamedRateLimitingQueue( | ||
// Backoff duration from 500ms to 50~55s | ||
workqueue.NewItemExponentialFailureRateLimiter(500*time.Millisecond, 50*time.Second+time.Millisecond*time.Duration(rand.Intn(5000))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
G404: Use of weak random number generator (math/rand instead of crypto/rand)
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
} | ||
queue := workqueue.NewNamedRateLimitingQueue( | ||
// Backoff duration from 500ms to 50~55s | ||
workqueue.NewItemExponentialFailureRateLimiter(500*time.Millisecond, 50*time.Second+time.Millisecond*time.Duration(rand.Intn(5000))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
G404: Use of weak random number generator (math/rand instead of crypto/rand)
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
2ec676b
to
4b19381
Compare
InitialDelaySeconds: 100, | ||
}, | ||
}, | ||
PodProbeMarkerName: "ppm-1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typecheck: unknown field PodProbeMarkerName
in struct literal
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
InitialDelaySeconds: 100, | ||
}, | ||
}, | ||
PodProbeMarkerName: "ppm-1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typecheck: unknown field PodProbeMarkerName
in struct literal
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
4b19381
to
7e07dba
Compare
// current only support exec | ||
// todo: http, tcp | ||
if p.Exec != nil { | ||
return pb.exec.Probe(pb.newExecInContainer(containerID, p.Exec.Command, timeout)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plz log the command, container and pod information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time the probe is executed, it prints whether the log is too large. Currently, the details are printed when the probe worker run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
at least plz logging the command if exec fails
pkg/daemon/podprobe/prober.go
Outdated
func (pb *prober) newExecInContainer(containerID string, cmd []string, timeout time.Duration) exec.Cmd { | ||
return &execInContainer{run: func() ([]byte, error) { | ||
// current ignore stdout | ||
_, stderr, err := pb.runtimeService.ExecSync(containerID, cmd, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider mix stdout and stderr together as the built-in probe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If probe success, return stdout, otherwise return stderr?
msg = msg[:maxMessageLength] | ||
} | ||
if err != nil || (result != probe.Success && result != probe.Warning) { | ||
return appsv1alpha1.ProbeFailed, msg, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider record event for probing result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is event necessary here? Currently kruise daemon prints the log.
7e07dba
to
0c70034
Compare
for key, worker := range c.workers { | ||
if _, ok := validProbe[key]; !ok { | ||
klog.Infof("NodePodProbe stop pod(%s/%s) container(%s) probe(%s) worker", key.podUID, key.containerName, key.probeName) | ||
worker.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove worker from c.workers ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This time worker may be probing, so when the worker exits, it will be removed from c.workers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be race condition that the workers map will be concurrently write and delete by different goroutines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, currently relies on a locking mechanism to ensure. The current logic also refers to the kubelet implementation.
0c70034
to
f186e69
Compare
test/e2e/apps/podprobemarker.go
Outdated
nodeList, err := c.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{}) | ||
gomega.Expect(err).NotTo(gomega.HaveOccurred()) | ||
if len(nodeList.Items) == 0 { | ||
ginkgo.By(fmt.Sprintf("pod probe markers list nodeList is zero")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S1039: unnecessary use of fmt.Sprintf
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
f186e69
to
be4ba70
Compare
be4ba70
to
fcb8e1f
Compare
// Let the worker wait for a random portion of tickerPeriod before probing. | ||
// Do it only if the kruise daemon has started recently. | ||
if probeTickerPeriod > time.Since(w.probeController.start) { | ||
time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
G404: Use of weak random number generator (math/rand instead of crypto/rand)
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
@@ -23,6 +23,8 @@ import ( | |||
"net/http" | |||
"sync" | |||
|
|||
"github.com/openkruise/kruise/pkg/daemon/podprobe" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort import
} | ||
|
||
klog.Infof("Starting NodePodProbe controller") | ||
// Launch one workers to process resources, for there is only one nodePodProbe per Node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one workers
a worker
// guards the cache | ||
sync.RWMutex | ||
// map of container ID -> probe Result | ||
cache map[string]Update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about sync.Map{}
without RWMutex?
fcb8e1f
to
5b65270
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
daemonruntime "github.com/openkruise/kruise/pkg/daemon/criruntime" | ||
"github.com/openkruise/kruise/pkg/daemon/imagepuller" | ||
daemonoptions "github.com/openkruise/kruise/pkg/daemon/options" | ||
"github.com/openkruise/kruise/pkg/daemon/podprobe" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typecheck: could not import github.com/openkruise/kruise/pkg/daemon/podprobe (-: could not load export data: no export data for "github.com/openkruise/kruise/pkg/daemon/podprobe")
ℹ️ Learn about @sonatype-lift commands
You can reply with the following commands. For example, reply with @sonatype-lift ignoreall to leave out all findings.
Command | Usage |
---|---|
@sonatype-lift ignore |
Leave out the above finding from this PR |
@sonatype-lift ignoreall |
Leave out all the existing findings from this PR |
@sonatype-lift exclude <file|issue|path|tool> |
Exclude specified file|issue|path|tool from Lift findings by updating your config.toml file |
Note: When talking to LiftBot, you need to refresh the page to see its response.
Click here to add LiftBot to another repo.
Was this a good recommendation?
[ 🙁 Not relevant ] - [ 😕 Won't fix ] - [ 😑 Not critical, will fix ] - [ 🙂 Critical, will fix ] - [ 😊 Critical, fixing now ]
Signed-off-by: liheng.zms <[email protected]>
5b65270
to
92b94b8
Compare
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: FillZpp The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
Signed-off-by: peng.xin <[email protected]> optimize workloadspread when suitable subset maxReplicas is niil (openkruise#1066) Signed-off-by: mingzhou.swx <[email protected]> Signed-off-by: mingzhou.swx <[email protected]> Co-authored-by: mingzhou.swx <[email protected]> Optimize performance of LabelSelector conversion (openkruise#1068) Signed-off-by: FillZpp <[email protected]> Signed-off-by: FillZpp <[email protected]> Support timezone for AdvancedCronJob (openkruise#1070) Signed-off-by: FillZpp <[email protected]> Signed-off-by: FillZpp <[email protected]> ignore if pod condition has been updated (openkruise#1074) Signed-off-by: mingzhou.swx <[email protected]> Signed-off-by: mingzhou.swx <[email protected]> Co-authored-by: mingzhou.swx <[email protected]> pod probe marker apis (openkruise#1073) * pod probe marker apis Signed-off-by: liheng.zms <[email protected]> * fix statefulset truncateHistory panic Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> pod probe marker controller (openkruise#1075) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> consider whether patch field is matched when assign existing pods to subset (openkruise#1083) Signed-off-by: mingzhou.swx <[email protected]> Signed-off-by: mingzhou.swx <[email protected]> Co-authored-by: mingzhou.swx <[email protected]> pod probe marker webhook (openkruise#1078) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> sidecarset support pods ns(kube-system, kube-public) (openkruise#1084) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> pod probe marker proposal (openkruise#1053) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> node pod probe daemon (openkruise#1077) Signed-off-by: liheng.zms <[email protected]> sidecarset support patch pod metadata proposal (openkruise#993) Signed-off-by: liheng.zms <[email protected]> Support predownload image in ads (openkruise#1057) Signed-off-by: Abner <[email protected]> * support predownload image for ads Add PreDownloadImageForDaemonSetUpdate featureGate (openkruise#1093) Signed-off-by: FillZpp <[email protected]> Signed-off-by: FillZpp <[email protected]> Add changelog for v1.3.0 (openkruise#1092) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> fix sidecarset inject annotations abnormal (openkruise#1101) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> fix InPlaceUpdateEnvFromMetadata bug (openkruise#1108) Co-authored-by: 郭已钦 <[email protected]> support uniteddeployment persistentVolumeClaimRetentionPolicy inherit from advancestatefulset template (openkruise#1110) Signed-off-by: cheyuexian <[email protected]> Signed-off-by: cheyuexian <[email protected]> Co-authored-by: cheyuexian <[email protected]> Add PreNormal Lifecycle Hook for CloneSet (openkruise#1071) * add pre-normal(pre-available) hook Signed-off-by: mingzhou.swx <[email protected]> * add pre-normal(pre-available) hook Signed-off-by: mingzhou.swx <[email protected]> Signed-off-by: mingzhou.swx <[email protected]> Co-authored-by: mingzhou.swx <[email protected]> partition support float percent (openkruise#1124) Signed-off-by: shiyan2016 <[email protected]> Signed-off-by: shiyan2016 <[email protected]> improve error hanlding of inplace update env from metadata (openkruise#1125) Signed-off-by: mingzhou.swx <[email protected]> Signed-off-by: mingzhou.swx <[email protected]> Co-authored-by: mingzhou.swx <[email protected]> pub support to configure Evict,Delete,Update Operation (openkruise#1126) Signed-off-by: liheng.zms <[email protected]> Signed-off-by: liheng.zms <[email protected]> feat: add watch whitelist feat: add watch whitelist add watch whitelist check add dynamic watch option feat: add persistent annotations format feat: add pps support custom workload Signed-off-by: peng.xin <[email protected]> fix: nodeSelector and nodeAffinity have not been modified Signed-off-by: peng.xin <[email protected]> fix: error string Signed-off-by: peng.xin <[email protected]> fix: error string Signed-off-by: peng.xin <[email protected]> fix: goimports Signed-off-by: peng.xin <[email protected]> refactor: refactoring ersistentPodAnnotations type Signed-off-by: peng.xin <[email protected]> - add statefulet group check - modify the pod create processing logic Signed-off-by: peng.xin <[email protected]> refactor: pps watch of whitelist workload Signed-off-by: peng.xin <[email protected]> fix: add statefulsetlike's pod controller Signed-off-by: peng.xin <[email protected]>
Signed-off-by: liheng.zms [email protected]
Ⅰ. Describe what this PR does
Ⅱ. Does this pull request fix one issue?
Ⅲ. Describe how to verify it
Ⅳ. Special notes for reviews