Skip to content

Latest commit

 

History

History
597 lines (440 loc) · 21.2 KB

CHANGELOG.md

File metadata and controls

597 lines (440 loc) · 21.2 KB

unreleased

v0.19.2

  • build(deps): bump github.com/spf13/viper from 1.18.1 to 1.18.2
  • build(deps): bump github/codeql-action from 2 to 3
  • build(deps): bump github.com/spf13/viper from 1.18.0 to 1.18.1
  • build(deps): bump github.com/spf13/viper from 1.17.0 to 1.18.0
  • chore: remove codeql workflow as nobody uses it
  • fix: config test with no config, should get error
  • build(deps): bump github.com/circonus-labs/go-apiclient from 0.7.23 to 0.7.24
  • build: add changelog config
  • build: skip go in lint workflow

v0.19.1

  • build: add after hook for grype on generated sboms
  • build: add .sbom for archive artifacts
  • build: add before hooks for go mod tidy, govulncheck and golangci-lint

v0.19.0

  • chore(goreleaser): remove archives.rlcp -- deprecated
  • fix: binary location for copy (docker)
  • fix: changelog typos and h1>h2
  • build(deps): bump github.com/spf13/cobra from 1.6.1 to 1.8.0
  • build(deps): bump github.com/circonus-labs/circonus-gometrics/v3 from 3.4.6 to 3.4.7
  • build(deps): bump github.com/spf13/viper from 1.15.0 to 1.17.0
  • fix: vulnerability golang.org/x/net to v0.17.0
  • build(deps): bump github.com/rs/zerolog from 1.29.0 to 1.31.0
  • fix(cv1): more implicit memory usage in loop
  • fix(cv1): implicit memory access
  • build(deps): bump go.uber.org/automaxprocs from 1.5.2 to 1.5.3

v0.18.0

  • feat: add interval, collect deadline, submit deadline to collect config startup message
  • feat: add uncompressed data size to stats
  • fix: clarify tracking messages
  • fix: typo in key name
  • feat: add kubelet ver and ver mode info msg
  • chore: add rlcp:true for future deprecation
  • chore: debug msgs for deadlines
  • feat: add --submit-deadline option
  • feat: add --collect-deadline option
  • chore: use submit and collect deadline options
  • chore: switch from submit metrics to flush collector metrics wrapper
  • chore: improved stats messaging
  • fix: make metric submitter private and add public wrapper with deadline

v0.17.0

  • fix(goreleaser): deprecated syntax
  • build(deps): bump github.com/circonus-labs/go-apiclient from 0.7.22 to 0.7.23
  • feat: implement 30s deadline with retry on metric submission
  • fix: update deps for security vulnerabilities
  • fix: tags for submission metrics

v0.16.1

  • feat: add start/finish status messages around collectors

v0.16.0

  • feat: add summary/stats into v2 collection (was supposed to have been deprecated in 1.18) but is still present and user is looking for usageNanoCores.

v0.15.0

  • chore: update warning when alert config file not found to not stutter the file nam
  • fix: clean non-semver GKE k8s version (metric filters)
  • feat: update skydns filter to include all metrics
  • chore: update warning when metric filter file not found to not stutter the file name
  • feat: debug messages for using config vs annotations
  • feat: emit collection config info msg
  • fix: struct alignment
  • fix: check K8SEnableDNSMetrics if scrape is false
  • fix: port message when using prot from config instead of annotation
  • fix: skip events older than 1m
  • fix: clean non-semver GKE k8s version (node collectors)
  • chore: clean up imports
  • fix: add GKE skydns filter
  • fix: drop NaN metrics when queueing
  • feat: go1.19 for strings.Cut

v0.14.0

  • feat: add node_name tag for pods
  • feat: add owner tag for pods
  • feat: add support for label_tags=* to turn all labels into tags for the given object

v0.13.0

2022-08-08

(833c904b)
Tags: v0.13.0

(ea5236bb)
Add support for CoreDNS. Add auto-detection of CoreDNS if detection of kube-dns fails. Add configuration key to enable users to set the default CoreDNS metrics scrape port.

(34650875)
Fix agent trying to add an erroneous tag to the check bundle.

(3c1bb387)
Revert change to move metric creation into conditional in cluster collector.

(328f105d)
Fix logic in creating default rulesets and metric filters to return required vals and determine cluster version before creating them.

(36652400)
Refactor code to be more easily readable and and check all errors.

(19fbcc25)
Fix whitespace issues in self-contained metricfilters. Fix error wrapping in k8s api call. Add k8s version const for comparison usage.

(3a0637ae)
Fix default metric filters for k8s v1.20+ to search for the correct metric names.

(4f532ab5)
This update adds support for CoreDNS, which is automatically detected if kube-dns does not exist. It also adds a configuration key to enable users to set the default CoreDNS metrics scrape port. We can expect that users will run kube-dns OR coredns, but not both.

(5576f42e)
This change will resolve the agent trying to add an erroneous tag to the check bundle.

(619f9973)
This just moves a call to AddText out of a conditional

(6820aa34)
This change will fix the logic to return the values required and correctly determine the cluster version for creating the default rules and filters.

(70216e56)
This change refactors code to be simpler (more easily readable) and check all errors

(8b8935a8)
This change is mostly for code quality reasons and will likely have minimal visibility to our users (aside from the ones who decide to audit the k8s agent).

(af4bcb01)
This change will fix the v1.20+ MetricFilters to search for the correct metric names.

(1d64029c)
dependency-name: github.com/spf13/cobra dependency-type: direct:production update-type: version-update:semver-minor

(f7aa1308)
dependency-name: actions/checkout dependency-type: direct:production update-type: version-update:semver-major

v0.12.6

  • upd: log error message on cn mismatch in tls verify
  • upd: lint issues
  • upd: add pre-build lint back in
  • add: build_lint.sh script
  • upd: lint config
  • fix: ensure probe and resource metrics collected in sequential mode
  • build(deps): bump github.com/spf13/viper from 1.10.0 to 1.10.1
  • build(deps): bump github.com/rs/zerolog from 1.26.0 to 1.26.1

v0.12.5

  • upd: update dependencies to latest versions

v0.12.4

  • add: namespace tag to all dc targets

v0.12.3

  • add: metric_filter configuration to helm chart files
  • build(deps): bump github.com/spf13/viper from 1.8.1 to 1.9.0

v0.12.2

  • upd: bump github.com/rs/zerolog from 1.24.0 to 1.25.0
  • upd: disable rule sets by default
  • upd: cgm v3.4.6
  • upd: ignore test/
  • upd: tighten tag limit enforcement
  • upd: struct align
  • build(deps): bump github.com/rs/zerolog from 1.23.0 to 1.24.0
  • build(deps): bump github.com/pelletier/go-toml from 1.9.3 to 1.9.4

v0.12.1

  • fix: broker cluster support for CA/CN validation
  • upd: struct align
  • upd: contributed helm chart

v0.12.0

  • add: configuration options for >v1.18 k8s node metrics
  • add: support v1.18+ deprecation of cadvisor endpoints
  • upd: min tls ver
  • upd: lint struct size
  • upd: dependencies
  • add: dependabot
  • upd: lint ver
  • upd: dependencies (viper/cobra/zerolog/etc.)

v0.11.7

  • upd: add metric type to nan detection err msg
  • upd: dc request timeouts
  • add: collect_deadline_timeout tracking metric
  • add: collection deadline tied to collection interval
  • upd: dependencies
  • add: automaxprocs
  • upd: syntax change for docker
  • upd: lint version 1.38
  • mrg: PR59 - custom/deployment.yaml mismatch with args_kubernetes.go

v0.11.6

v0.11.5

  • doc: update dynamic collector documentation
  • upd: centralize config parsing
  • upd: control setting, use label/annotation value as a boolean rather than comparison
  • add: more logging on setting parse failures
  • add: node/pod status skip if not ready

v0.11.4

  • add: dynamic collector filter to allow all dc metrics by default
  • add: only apply local filters if rule enabled
  • add: enable flag on filter rules
  • add: disable dynamic collector rule with no local filter
  • upd: dependencies (specifically cgm for better invalid tag messages)
  • upd: pass cluster config to check
  • fix: rollup setting parsing
  • fix: filter and log NaNs

v0.11.3

  • fix: default dynamic collector file extension .json->.yaml (must be yaml)

v0.11.2

  • add: support annotation/label/value for dynamic collection rollup

v0.11.1

  • add: rollup setting for dynamic collection
  • add: support annotation/label/value for schema

v0.11.0

  • add: dynamic collectors - define objects (endpoints, nodes, pods, services) to collect metrics from in configuration CIRC-5871
  • upd: refactor ksm collection to be more intelligent w/re to port used (not all deployment methods name ports the same) CIRC-5890
  • add: static ksm port option to configuration CIRC-5890
  • upd: deprecate ksm mode and telemetry port options CIRC-5890
  • upd: pod_pending, network_unavailable and cpu_utilization rulesets with windowing_min_duration CIRC-5875
  • upd: return error when no metrics received from ksm so it can be expose in dashboard
  • upd: add check for NaN values (skip) in metric processing
  • upd: emit warning when no metrics to submit, with number processed (e.g. locally filtered)
  • upd: use epoch for log timestamps (performance)
  • upd: refactor cli arg handling

v0.10.4

  • add: additional logging for ksm collection/processing
  • upd: add field selector to ksm errors for service and endpoint queries
  • upd: example args (debug) to default deployment
  • upd: switching to main errors pkg and new error handling
  • upd: latest lint release

v0.10.3

  • upd: alter default network_unavailable ruleset to help with spurious alerts CIRC-5849

v0.10.2

  • upd: add on absence rule to all default rulesets in order to clear stale alerts
  • upd: remove previously created ruleset if configuration updated to disable a default ruleset

v0.10.1

  • upd: send node conditions when status changes
  • upd: remove unused text metrics
  • add: usage millicores for res req/lim comparison
  • fix: need ':' when category only tag
  • upd: ensure sorted tag list
  • add: resource request/limit metric filters
  • add: hpa metric filters
  • upd: add rollup:false to events

v0.10.0

  • doc: add observation deployment instructions
  • add: observation/sizing mode deployment manifests deploy/observation
  • upd: remove log warning when no metrics sent for specific collectors for sizing mode
  • add: total events counter metric
  • upd: rename internal tracking stats for clarity when logged
  • add: ability to disable default alert rulesets
  • upd: expose threshold & window settings for all default rulesets
  • upd: refactor stream tag handling
  • add: agent metric mutex
  • upd: reset any observed event counters
  • add: event counter metrics
  • upd: validate tag category and value lengths to match broker rules
  • add: max tag len, max category len
  • add: timestamp to derived ksm metrics
  • add: collect_k8s_pod_count metric
  • add: collect_k8s_node_count metric
  • fix: whitespace

v0.9.10

  • fix: separate ksm and agent metrics
  • upd: refactor submission stats
  • add: --log-agent-metrics for debugging

v0.9.9

  • upd: refactor, common clientset func
  • add: k8s version metric collect_k8s_ver
  • upd: switch to go-client for in-cluster api interactions
  • upd: default k8s url host kubernetes.default.svc
  • fix: downgrade to go1.14

v0.9.8

  • fix: revert back to pod_status_ready and pod_status_scheduled

v0.9.7

  • add: pod_status text metric filter
  • upd: pod_container_status and pod_status_phase filters
  • fix: lint min tls ver
  • upd: go1.15

v0.9.6

  • add: pod_status_phase metric filter
  • add: pod_container_status text metrics and metric filter

v0.9.5

  • add: derived metrics to enhance dashboard performance
  • add: IncrementCounterByValue method
  • upd: force lowercase tag categories
  • upd: dependencies
  • upd: stub hpa endpoints
  • fix: binary name for updated goreleaser
  • upd: refactor dns annotation and explicit port use

v0.9.4

  • add: kube-dns-metrics-port - used when scrape/port annotations are NOT defined on the kube-dns service (e.g. GKE)

v0.9.3

  • fix: broker cn check both ip and external_host
  • upd: add debug line when using custom api ca cert
  • upd: use latest for default (simple) deployment
  • upd: remove path validation from api url
  • upd: add /v2 path to default api url

v0.9.2

  • upd: explicit cases for prometheus metric types
  • add: golangci-lint action
  • add: ksm collection state metric
  • upd: collect immediately then start intervals
  • add: dns collection state metric
  • fix: correct dns config option name
  • doc: update with dns configuration information
  • add: support kubedns* metrics if backing kube-dns service
  • add: initial event if event watching is enabled
  • add: support for broker's "filtered" back into per submission stat
  • add: logging of result if broker returned an err msg

v0.9.1

  • add: lookup_key to rule_sets
  • upd: dependencies (apiclient, cgm, toml, yaml, viper, zerolog)

v0.9.0

  • upd: deployment configurations to v0.9.0
  • add: ksm request mode (direct or proxy)
  • mrg: ksm field selector update v0.8.0
  • upd: use json for default rules, reduce friction of maintenance (both configuration.yaml and check.go use json)
  • upd: metric filter rules for coredns health metrics
  • add: default alerting and custom rules support
  • add: _avg for prom histograms (sum/count) for health dashboard dns metrics
  • add: config items for configmap json files (metric-filters.json, default-rules.json, custom-rules.json)
  • upd: default settings enable required collections for dashboard
  • upd: split deployment configurations into two: deploy/default/ (simplified) and deploy/custom/ full control
  • upd: dns metrics, use pod for tag
  • upd: metric filters
  • add: health dashboard specific metrics
  • upd: remove nodeSelector from deployment.yaml (old k8s versions lack kubernetes.io/os: linux label)
  • upd: configuration.yaml to enable needed collection to support dashboard
  • add: metric filter rules for health dashboard
  • add: node cpu utilization for health dashboard
  • add: cluster name to check for tagging check, rules, contacts

v0.8.0

  • add: kube-state-metrics field selector

v0.7.1

  • add: contributed helm chart

v0.7.0

  • NOTE: metrics-server option is deprecated
  • add: metric filter rules to configuration for dns, api errors, and api auth
  • upd: refactor sequencing and number of go routines for metric collection
  • add: basic local filtering of metrics by namerx in rules (reduce memory utilization, bandwidth, and broker load)
  • add: --nodecc argument to turn on concurrent collection for node/pod/container metrics (mem vs speed/cpu tradeoff) default is off
  • upd: collect dns metrics from each kube-dns pod, default true, for new health dashboard - can be turned off in configuration
  • add: api-server metrics collection, default true, for new health dashboard - can be turned off in configuration

v0.6.6

  • fix: force float64 for used percentages
  • upd: include units:percent for fs metric filters

v0.6.5

  • add: update metric filters from configuration on every start. deployment configuration is definitive source for metric filters.
  • fix: add default tags to internal collect_* metrics
  • fix: remove escaping quotes in string metric values
  • upd: remove unused methods
  • add: used percent for fs/volume metrics
  • add: metrics.k8s.io to rbac

v0.6.4

  • add: support https for certain kube-state-metrics configurations. port names prefixed with https- will trigger api server proxy urls using https:.

v0.6.3

  • add: make kube-state-metrics port names for metrics and telemetry configurable. Default from 'standard' service deployment. metrics=http-metrics and telemetry=telemetry.

v0.6.2

  • add: optional, metric collection for kube-dns
  • upd: default check.target to cluster.name if target is unset
  • add: error if neither metric or telemetry ports found in ksm service definition
  • add: warn if metric port not found in ksm service definition
  • add: warn if telemetry port not found in ksm service definition
  • add: debug message for ksm urls being used

v0.6.1

  • add: __rollup:false stream tag to remaining high cardinality metrics

v0.6.0

  • Switch to httptrap:kubernetes check type. To preserve metric continuity - if an httptrap:kubernetes check is not found, the agent will search for an httptrap check and use that if found. Otherwise, it will create a new check using the new check with correct sub-type.

v0.5.8

  • add: optional collection of cadvisor metrics from kubelet

v0.5.7

  • add: option --serial-submissions to disable concurrent submissions
  • upd: default to concurrent submissions w/timestamp metrics
  • upd: deprecate streaming metrics as an option
  • fix: use metric queue for events
  • fix: ensure all metrics have timestamp, address drift on retries
  • fix: /health output

v0.5.6

  • upd: sequential to use contextual logger for messages when submitting
  • add: use sequential for stream if not concurrent
  • add: status code to api response errors
  • add: sequential metric submitter (gaps)
  • add: option for concurrent metric submission
  • add: option for max metric bucket size for promtext
  • upd: use config option for max metric bucket size
  • fix: remove redundant call parameter for promtext

v0.5.5

  • upd: increase metric bucket size to 1000
  • fix: typo in metric name
  • upd: use resultLogger to identify source of submission errors/retries
  • add: set collect_submit_retries to 0 at start of each collection run

v0.5.4

  • add: liveness probe support /health
  • upd: increase default pool size to 2
  • upd: resource request/limit example (commented out)
  • add: ksm service error collect metric

v0.5.3

  • fix: async_metrics (queue/stream)
  • fix: drain streamed metrics in bucket
  • fix: use cluster ctx to honor signals

v0.5.2

  • add: collect submit counters (success,fail,error)

v0.5.1

  • upd: normalize collection metric tags for more clarity

v0.5.0

  • add: logging on submit retries and non-200 responses
  • add: api request time limit (default:10s)
  • add: kube_pod_deleted to metric filters
  • add: agent metric
  • add: api request error metrics
  • add: collection duration, api latency, and metric submission latency histograms
  • upd: failure message when cluster(s) can't be initialized resulting in 0 clusters
  • fix: ensure default tags used in queued metrics

v0.4.5

  • fix: handle empty tag lists better, e.g. events with no tags
  • doc: elaborate on need for settings to be uncommented in both configuration and deployment
  • fix: remove available memory metric for pods and containers since it is not provided

v0.4.4

  • fix: typo in network errors rule

v0.4.3

  • upd: metric filter rule to include storage capacity for pod volumes
  • add: per-interface network metrics
  • add: node capacity_ephemeral_storage metric

v0.4.2

  • add: node capacity metrics: capacity_cpu, capacity_memory, and capacity_pods

v0.4.1

  • add: collect_interval metric
  • add: default_streamtags option to apply a set of tags to all metrics

v0.4.0

  • upd: implement check metric_filters to collect only metrics in dashboard
  • add: chunk large arbitrary metric collectors (ksm and ms)
  • upd: collection metrics - remove accept/filter, add agent memory metrics
  • add: collection metrics - agent memory and goroutine metrics
  • upd: dependencies

v0.3.1

  • add: collection summary metrics (sent,accept,filter,bytes,duration)

v0.3.0

  • add: --no-base64 switch to disable for test/debug (base64 stream tags)
  • add: --no-gzip switch to disable for test/debug (gzip trap submissions)
  • upd: switch to using gzip compression for trap submissions as default
  • fix: stream tag quote escaping in printf

v0.2.0

  • add: abridged events
  • add: pod filtering by label key/val

v0.1.0

  • initial preview release