Logging as a dependency (with context) #714
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chuckha The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
}
kp.Cert = EncodeCertPEM(x509Cert)
kp.Key = EncodePrivateKeyPEM(privKey)
func generateCACert(kp *v1alpha1.KeyPair, user string) (v1alpha1.KeyPair, error) {
I changed this function to only do one thing (generate certs) instead of one or the other
Might be worthwhile to add a comment / description as part of the change
I don't understand the request. I like to reserve comments for when something unexpected is happening, but that's not really the case here. Can you elaborate on what you'd like changed?
Oh, I noticed the functions don't have any comments on what they do, just adding a small comment might be useful :)
cmd/manager/main.go
Outdated
@@ -71,12 +71,13 @@ func main() {
// Initialize cluster actuator.
clusterActuator := cluster.NewActuator(cluster.ActuatorParams{
Client: cs.ClusterV1alpha1(),
LoggingContext: "[cluster actuator]",
LoggingContext: "[\033[32mcluster actuator\033[0m]",
the color makes it super easy to read, but I'm ok if this isn't good for whatever reason
I do like the idea of adding color, but I'm not necessarily sure we can rely on the user's shell supporting it.
This should probably be configurable
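A minimal sketch of what a configurable version could look like, using only the standard library. The helper names `loggingContext` and `useColor`, and the policy of honoring a `NO_COLOR` environment variable and checking for a character device, are assumptions made for illustration; none of them are part of this PR.

```go
package main

import (
	"fmt"
	"os"
)

// ANSI escape sequences matching the ones used in the diff above.
const (
	ansiGreen = "\033[32m"
	ansiReset = "\033[0m"
)

// loggingContext is a hypothetical helper: it wraps name in brackets and
// colors it green only when color is true.
func loggingContext(name string, color bool) string {
	if color {
		return "[" + ansiGreen + name + ansiReset + "]"
	}
	return "[" + name + "]"
}

// useColor is an assumed policy, not project behavior: color is enabled only
// when NO_COLOR is unset or empty and f is a character device (typically a
// terminal rather than a file or pipe).
func useColor(f *os.File) bool {
	if os.Getenv("NO_COLOR") != "" {
		return false
	}
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

func main() {
	fmt.Println(loggingContext("cluster actuator", useColor(os.Stdout)))
}
```

With a helper like this, the actuator parameters could receive the plain or colored context string depending on where the manager's output is going, so log files and pipes would not collect escape sequences.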
@detiber I made a first pass at adding namespaces. If I've missed any places, please add a comment to the PR.
Thanks for working on this!
Left a few comments.
pkg/cloud/aws/actuators/scope.go
Outdated
@@ -76,6 +77,7 @@ func NewScope(params ScopeParams) (*Scope, error) {
ClusterClient: clusterClient,
ClusterConfig: clusterConfig,
ClusterStatus: clusterStatus,
Logger: params.Logger,
We could use the information we have in the cluster and machine Scope to prefix WithName with apigroup + /namespace + [/machine/ or /cluster/] and avoid the repetition.
Actually, I really like this idea, except I would say:
apigroup + /namespace + /cluster/ [ + machine/ ], since the machine will have a cluster (at least with the current code).
Even better!
yeeee that sounds great
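The naming scheme agreed on in this thread can be sketched as a small string helper. `scopeLoggerName` and the sample values are hypothetical and only illustrate the ordering apigroup, namespace, cluster, then an optional machine; the PR itself builds the name by chaining `WithName` calls.

```go
package main

import (
	"fmt"
	"strings"
)

// scopeLoggerName is a hypothetical helper that builds a logger name of the
// form apigroup/namespace/cluster[/machine]. Empty segments are skipped, so a
// cluster scope simply leaves the machine argument empty.
func scopeLoggerName(apiGroup, namespace, cluster, machine string) string {
	segments := make([]string, 0, 4)
	for _, s := range []string{apiGroup, namespace, cluster, machine} {
		if s != "" {
			segments = append(segments, s)
		}
	}
	return strings.Join(segments, "/")
}

func main() {
	// Sample values are made up for illustration.
	fmt.Println(scopeLoggerName("cluster.k8s.io", "default", "test1", ""))
	fmt.Println(scopeLoggerName("cluster.k8s.io", "default", "test1", "controlplane-0"))
}
```

Because the machine segment always comes last, a machine's log lines share the prefix of the cluster that owns it, which is what makes filtering by cluster straightforward.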
Some other comments, they're mostly minor ones
@@ -115,7 +118,7 @@ func (a *Actuator) Delete(cluster *clusterv1.Cluster) error {
}

if err := ec2svc.DeleteNetwork(); err != nil {
klog.Errorf("Error deleting cluster %v: %v.", cluster.Name, err)
a.log.Error(err, "Error deleting cluster", "cluster-name", cluster.Name, "cluster-namespace", cluster.Namespace)
Could this use the scoped logger?
yes it could
@@ -112,7 +117,7 @@ func (a *Actuator) isNodeJoin(scope *actuators.MachineScope, controlPlaneMachine
return false, errors.Wrapf(err, "failed to verify existence of machine %q", m.Name())
}

klog.V(2).Infof("Machine %q should join the controlplane: %t", scope.Machine.Name, ok)
a.log.V(2).Info("Machine joining control plane", "machine-name", scope.Machine.Name, "machine-namespace", scope.Machine.Namespace, "should-join-control-plane", ok)
Same here, could this use the scoped logger?
}
}

// MachineConfigFromProviderSpec tries to decode the JSON-encoded spec, falling back on getting a MachineClass if the value is absent.
func MachineConfigFromProviderSpec(clusterClient client.MachineClassesGetter, providerConfig clusterv1.ProviderSpec) (*v1alpha1.AWSMachineProviderSpec, error) {
func MachineConfigFromProviderSpec(clusterClient client.MachineClassesGetter, providerConfig clusterv1.ProviderSpec, log logr.Logger) (*v1alpha1.AWSMachineProviderSpec, error) {
Can this function be made a method? Is it used anywhere else? If so, I'm not sure whether the log lines should actually be in here or whether the caller should be expected to log.
Yes, this needs refactoring. I'll add it to the list of follow-on issues.
}
}

return &config, nil
}

func unmarshalProviderSpec(spec *runtime.RawExtension) (*v1alpha1.AWSMachineProviderSpec, error) {
func unmarshalProviderSpec(spec *runtime.RawExtension, log logr.Logger) (*v1alpha1.AWSMachineProviderSpec, error) {
Same as above
also added
return &Scope{
AWSClients: params.AWSClients,
Cluster: params.Cluster,
ClusterClient: clusterClient,
ClusterConfig: clusterConfig,
ClusterStatus: clusterStatus,
Logger: params.Logger.WithName(params.Cluster.APIVersion).WithName(params.Cluster.Namespace).WithName(params.Cluster.Name),
How does chaining `WithName` work here? It'd be useful to form a path here with proper prefixes like in @detiber's other comment
Chaining adds `/` between each `.WithName`
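The chaining behavior described here can be sketched with a tiny stand-in type. `namedLogger` below is hypothetical and is not the logr API: it only mimics an implementation whose `WithName` appends a segment with `/` and returns a new logger, which is the behavior the author describes. The separator a real logr backend uses depends on that backend.

```go
package main

import "fmt"

// namedLogger is a hypothetical stand-in for a logger implementation that
// joins name segments with "/". It is not the logr API itself.
type namedLogger struct {
	name string
}

// WithName returns a copy with segment appended; the receiver is unchanged,
// so a parent logger can be shared safely between scopes.
func (l namedLogger) WithName(segment string) namedLogger {
	if l.name == "" {
		return namedLogger{name: segment}
	}
	return namedLogger{name: l.name + "/" + segment}
}

// Info prints the accumulated name followed by the message.
func (l namedLogger) Info(msg string) {
	fmt.Printf("%s: %s\n", l.name, msg)
}

func main() {
	// Sample values are made up for illustration.
	base := namedLogger{}
	cluster := base.WithName("cluster.k8s.io").WithName("default").WithName("test1")
	machine := cluster.WithName("controlplane-0")

	cluster.Info("reconciling cluster")
	machine.Info("reconciling machine")
}
```

Since each call appends to whatever name the logger already has, the order of the calls fixes the order of the path segments: names added by an outer scope always precede names added by an inner one.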
return false
}
secondTime, err := time.Parse(createDateTimestampFormat, aws.StringValue(i[j].CreationDate))
if err != nil {
klog.Infof("unable to parse an AMI creation timestamp: %q", aws.StringValue(i[j].CreationDate))
Removing these might make it harder for users to debug. Is there any way to bubble this up? Opening a related issue if there is no simple solution is completely OK.
Done. This is part of #716.
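One possible shape for bubbling the parse failure up, sketched under assumptions: timestamps are parsed before sorting, so a bad value is returned to the caller as an error instead of being handled inside the comparison function. The `image` type, the `sortByCreationDate` helper, the oldest-first order, and the value given to `createDateTimestampFormat` are all hypothetical here; this is not necessarily how #716 resolves it.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// createDateTimestampFormat is an assumed layout for this sketch; the real
// constant in the project may differ.
const createDateTimestampFormat = "2006-01-02T15:04:05.000Z"

// image is a hypothetical stand-in for the AWS SDK image type.
type image struct {
	ID           string
	CreationDate string
}

// sortByCreationDate sorts images oldest first. Every timestamp is parsed up
// front, so the caller gets an error it can log or return, and the comparison
// function never has to deal with a parse failure.
func sortByCreationDate(images []image) error {
	type dated struct {
		img     image
		created time.Time
	}
	entries := make([]dated, len(images))
	for i, img := range images {
		created, err := time.Parse(createDateTimestampFormat, img.CreationDate)
		if err != nil {
			return fmt.Errorf("unable to parse creation timestamp %q of image %s: %w", img.CreationDate, img.ID, err)
		}
		entries[i] = dated{img: img, created: created}
	}
	sort.SliceStable(entries, func(i, j int) bool {
		return entries[i].created.Before(entries[j].created)
	})
	for i, e := range entries {
		images[i] = e.img
	}
	return nil
}

func main() {
	// Sample data is made up for illustration.
	images := []image{
		{ID: "ami-newer", CreationDate: "2019-04-02T10:00:00.000Z"},
		{ID: "ami-older", CreationDate: "2019-03-01T10:00:00.000Z"},
	}
	if err := sortByCreationDate(images); err != nil {
		fmt.Println("sort failed:", err)
		return
	}
	fmt.Println(images[0].ID, images[1].ID)
}
```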
return nil
}

instance, err := s.describeBastionInstance()
if err != nil {
if awserrors.IsNotFound(err) {
klog.V(2).Info("bastion instance does not exist")
s.scope.V(2).Info("bastion instance does not exist")
These bastion logs should probably have a reference to the id or instance name
yeah, agreed, will add that
sure
Adds context for logs and removes excessive logging Signed-off-by: Chuck Ha <[email protected]>
scope, err := NewScope(ScopeParams{
AWSClients: params.AWSClients,
Client: params.Client,
Cluster: params.Cluster,
Logger: params.Logger,
Should the `WithName` context be added here instead of below?
No, otherwise machine will come before the APIVersion
/lgtm
* Update the releasing docs (#689)
* Add error reason to output if fail to checkout an account from boskos (#698)
* Temporary workaround a data issue in boskos service (#699)
* Update checkout_account.py to not reuse connections (#700)
* Fix checkout_account.py (#702)
* Make hack/checkin_account.py executable (#703)
* Fix: all traffic ingress rule triggers fatal nil dereference (#697)
  * fix: respect all traffic security group rules (and others) For anything besides tcp, udp, icmp, and icmpv6 there is no applicable notion of "port range." AWS omits FromPort and ToPort in its responses, causing a fatal nil dereference when attempting to read any security groups with e.g. an "all traffic" rule.
  * fix: omit description when empty string
  * fix: handle more security groups without crashing This commit cleans up and clarifies a few of the less obvious components of the previous work.
  * fix: handle more security groups without crashing Address linter failures.
  * fix: handle more security groups without crashing Usage needs to match declaration. Computers are sticklers about that sort of thing.
  * fix: handle more security groups without crashing Add clarifying comment to serializer function.
* Fixes a bug and adds tests for kubeadm defaults (#707) The pointers were not working as expected so the API is changing to be more functional and leverage kubernetes' DeepCopy function.
* Update listed v1.14 AMIs to v1.14.1 (#708)
  * Update listed v1.14 AMIs to v1.14.1
  * Update README with list of published AMIs/Kubernetes versions
* GZIP user-data (#710) Signed-off-by: Vince Prignano <[email protected]>
* Make sure Calico can talk IP-in-IP (#701)
  * MAke sure Calico can talk IP-in-IP
  * Add IP in IP protocol to the control plane security group
  * Add IPv4 protocol definition and make sure it's handled properly.
  * Make port ranges AWS complient and security groups more restrictive.
  * Fix security groups
* Adds tests to kubeadm defaults (#709) Attempt at documenting the assumptions made in the kubeadm defaults code. Signed-off-by: Chuck Ha <[email protected]>
* Logging (#713)
  * Adds logr as dependency Signed-off-by: Chuck Ha <[email protected]>
  * Use logr in the cluster actuator This only creates the logger. Does not yet swap out actual klog calls. Signed-off-by: Chuck Ha <[email protected]>
  * update bazel Signed-off-by: Chuck Ha <[email protected]>
  * update Signed-off-by: Chuck Ha <[email protected]>
* Switch dep to use release-0.1 branch instead of version (#715)
* Adds logr as dependency (#714) Adds context for logs and removes excessive logging Signed-off-by: Chuck Ha <[email protected]>
* Ensure `make manifests` generates machines file for HA control plane too. (#720)
  * Add HA machines template
  * Introduce HA machines file in `make manifests` target
* Add clusterawsadm as make dependency to manifests make target. (#721) Ensures manifests are generated from the current state of the source. Assuming $GOPATH/bin is in the $PATH
* Update to Go 1.12 (#719) Signed-off-by: Vince Prignano <[email protected]>
* Add ability to override Organization ID for image lookups (#723)
  * Add ability to override Organization ID for image lookups
  * Update pkg/cloud/aws/services/ec2/ami.go Co-Authored-By: detiber <[email protected]>
  * Add updated generated crd
* feat: support customizing root device size (#718)
  * feat: support customizing root device size
  * chore: re-generate CRDs
  * fix: update formatting
  * chore: add comment describing Service.sdkToInstance
  * chore: make service.SDKToInstance public
* Rename BUILD -> BUILD.bazel for consistency (#724) find . -type file -name BUILD -not -path "./vendor/*" | xargs -n1 -I{} -- git mv {} {}.bazel Preferred build name changed in 3788fb1 Fixes #722
* Adds retry-on-conflict during updates (#725)
  * Adds retry-on-conflict during updates Signed-off-by: Chuck Ha <[email protected]>
  * adds note about status update caveat Signed-off-by: Chuck Ha <[email protected]>
  * clarify errors/comments Signed-off-by: Chuck Ha <[email protected]>
* Add the HA machines configuration to bazel (#733) Signed-off-by: Chuck Ha <[email protected]>
* Ensure bazel is the correct version (#731) Signed-off-by: Chuck Ha <[email protected]>
* Update OWNERS_ALIASES and SECURITY_CONTACTS (#712)
* Fix the prow jobs (#735) Signed-off-by: Chuck Ha <[email protected]>
* Fix markdown formatting (#736)
* extract fmt from release tool (#738) Signed-off-by: Chuck Ha <[email protected]>
* Use DEFAULT_REGION as the default and REGION as the supplied (#739) Signed-off-by: Chuck Ha <[email protected]>
* e2e testing improvement (#743)
  * Bump kind version
  * Remove docker load in favor of kind load for e2e cluster Signed-off-by: Chuck Ha <[email protected]>
* fix: Don't try to update root size when it's unset (#726)
  * fix: Don't try to update root size when it's unset This commit looks for empty RootDeviceSize in the spec and ignores it. Otherwise, none of our control plane machines were updating with this error:
    ```
    E0418 23:07:48.250925 1 controller.go:214] Error updating machine "ns/controlplane-2": found attempt to change immutable state for machine "controlplane-2": ["Root volume size cannot be mutated from 8 to 0"]
    ```
  * fix: updates without specifying a root volume size Add unit test.
  * fix: updates without specifying a root volume size Fix gofmt.
* Scope nodeRef to workload cluster (#744) Signed-off-by: Vince Prignano <[email protected]>
* Fix NPE on delete bastion host (#746) Signed-off-by: Vince Prignano <[email protected]>
* Documentation for creating a new cluster on a different AWS account (#728)
  * Initial draft of documentation for Cluster creation using cross account role assumption
  * Update roleassumption.md Complete the document.
  * cleanup the documentation for roleassumption
  * Resolved the comments: role assumption documentation.
  * Fix minor issues - roleassumption.md
  * resolve more comments to roleassumption.md
  * Resolve more comments - roleassumption.md
* include machines-ha.yaml.template in release artifacts (#741)
* Update AWS sdk, improve log in machine actuator delete (#747) Signed-off-by: Vince Prignano <[email protected]>
* Fixes the infinite reconcile loop (#748)
  * Uses patch for updating the cluster and machine specs - patch does not cause a re-reconcile in the capi controller
  * Uses update for updating the cluster and machine status - update for status is ok since it does not update any of the metadata no re-reconcile is necessary for the capi controller Signed-off-by: Chuck Ha <[email protected]>
* Update Gopkg.lock and cleanup Makefile (#751)
* Update cluster-api release-0.1 vendor (#750) Signed-off-by: Vince Prignano <[email protected]>
* Reduce the number of re-reconciles (#752) Signed-off-by: Chuck Ha <[email protected]>
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):
Fixes #641
Fixes #323
Special notes for your reviewer:
I'm adding comments to the parts you should pay attention to. Generally I've tried to make the logging a bit more consistent and have changed a few things, but left a few things unchanged that could use improvement. This now logs objects as JSON, and we should probably clean up the logging further, but it is much easier to understand what the manager is actually doing now.
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
Release note: