Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Automatically generated regex validation for Quantity does not match the validation used by unmarshalerDecoder #665

Closed
hoyhbx opened this issue Apr 26, 2022 · 19 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@hoyhbx
Copy link

hoyhbx commented Apr 26, 2022

rabbitmq's cluster-operator(https://github.com/rabbitmq/cluster-operator) and a lot of other operators use controller-gen to automatically generate their CRDs.
For the Quantity field, controller-gen automatically generates the following JSON schema:

additionalProperties:
  anyOf:
  - type: integer
  - type: string
  pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
  x-kubernetes-int-or-string: true
description: 'Limits describes the maximum amount of compute resources
  allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/'
type: object

However, the automatically generated pattern does not match the validation pattern used by the unmarshalerDecoder, causing cases where the value is valid against the CRD schema, but invalid against the unmarshalerDecoder. This causes the invalid inputs being silently rejected by the operator, and all the subsequent inputs are silently rejected without fixing the invalid value.

Example

Concretely, when specifying the value as .07893E985.90504 for the field Quantity, there will be an error message from client-go shown in the operator's log:

E0421 07:19:52.910713       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1beta1.RabbitmqCluster: failed to list *v1beta1.RabbitmqCluster: v1beta1.RabbitmqClusterList.Items: []v1beta1.RabbitmqCluster: v1beta1.RabbitmqCluster.Spec: v1beta1.RabbitmqClusterSpec.Rabbitmq: v1beta1.RabbitmqClusterConfigurationSpec.Persistence: v1beta1.RabbitmqClusterPersistenceSpec.Storage: unmarshalerDecoder: quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$', error found in #10 byte of ...|985.90504"},"rabbitm|..., bigger context ...|de":{},"persistence":{"storage":".07893E985.90504"},"rabbitmq":{"additionalConfig":"cluster_partitio|...

The value .07893E985.90504 satisfies the regex generated by controller-gen ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$, but does not satisfy the regex required by the unmarshalerDecoder: ^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$. The input CR is silently rejected by the operator without any error message from kubectl, and all the subsequent CRs are rejected until the invalid value is fixed.

Additional comments

The regex validation generated by controller-gen seems to be looser than the validation used by the unmarshalerDecoder, which causes the apiserver fails to reject the invalid values.

Would it make more sense to changed the regex generated by controller-gen to match the one used by the unmarshalerDecoder here?:

Pattern: "^(\\+|-)?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\\+|-)?(([0-9]+(\\.[0-9]*)?)|(\\.[0-9]+))))?$",

@alvaroaleman
Copy link
Member

Yeah, should match what upstreams uses. Feel free to open a PR where you update it and reference the upstream source.

@Porges
Copy link
Contributor

Porges commented May 2, 2022

Unfortunately, upstream does not use a regexp for validation so it can’t be simply included: https://github.com/kubernetes/apimachinery/blob/3b8fb46ed6f145544bd363fab539decff47c3753/pkg/api/resource/quantity.go#L147

@hoyhbx
Copy link
Author

hoyhbx commented May 3, 2022

The validation does not come from the kubernetes' parsing, but from a component called unmarshalerDecoder. I am still trying to find where exactly this component is implemented, but there is definitely some validation being used according to the error message from client-go.

@alvaroaleman
Copy link
Member

Unmarshaler decoder is here: https://github.com/kubernetes/kubernetes/blob/f02682c628c530219966a00ae002d799f0d813dc/vendor/github.com/json-iterator/go/reflect_marshaler.go#L201

Regex is here but it is documented as being overly permissive: https://github.com/kubernetes/kubernetes/blob/f02682c628c530219966a00ae002d799f0d813dc/staging/src/k8s.io/apimachinery/pkg/api/resource/quantity.go#L136

If it is an improvement over what we currently have, we should probably still update - But first look into the history of what we have today and how it came to be.

@hoyhbx
Copy link
Author

hoyhbx commented May 3, 2022

Thanks for the pointers, sorry I didn't see the error message that the parseQuantityString was returning.
I think we should use a regex that conform to the validation used by the upstream, and the current one used by controller-gen does not because it's over permissive. The question is if the one in the upstream is good enough or there is a better regex that can express the validation better.

@Porges
Copy link
Contributor

Porges commented May 3, 2022

I'd also note that the documentation of what is permitted is incorrect with regards to what parseQuantityString actually allows. (e.g. n and u are permitted as suffixes)

@Porges
Copy link
Contributor

Porges commented May 3, 2022

Here's a first-pass take on a regex that matches upstream:

package tests

import (
	"fmt"
	"regexp"
	"testing"
)

// copied from: https://github.com/kubernetes/apimachinery/blob/3b8fb46ed6f145544bd363fab539decff47c3753/pkg/api/resource/quantity.go#L31-L49

// <digit>           ::= 0 | 1 | ... | 9
// <digits>          ::= <digit> | <digit><digits>
// <number>          ::= <digits> | <digits>.<digits> | <digits>. | .<digits>
var number = `((\.[0-9]+)|([0-9](\.[0-9]+)?)|([0-9]+\.))`

// <sign>            ::= "+" | "-"
var sign = `[-+]`

// <signedNumber>    ::= <number> | <sign><number>
var signedNumber = sign + "?" + number

// <binarySI>        ::= Ki | Mi | Gi | Ti | Pi | Ei
var binarySI = `([KMGTPE]i)`

// <decimalSI>       ::= m | "" | k | M | G | T | P | E
var decimalSI = `([numkMGTPE]|)` // note empty permitted, and also 'u' and 'n' are permitted, contrary to the description, see: https://github.com/kubernetes/apimachinery/blob/57f2a0733447cfd41294477d833cce6580faaca3/pkg/api/resource/suffix.go#L108-L135

// <decimalExponent> ::= "e" <signedNumber> | "E" <signedNumber>
var decimalExponent = `([eE]` + signedNumber + `)`

// <suffix>          ::= <binarySI> | <decimalExponent> | <decimalSI>
var suffix = `(` + binarySI + `|` + decimalExponent + `|` + decimalSI + `)`

// <quantity>        ::= <signedNumber><suffix>
var quantity = `^` + signedNumber + suffix + `$`

var quantityRegexp = regexp.MustCompile(quantity)

func ExampleFullForm() {
	fmt.Println(quantity)

	// Output:
	// ^[-+]?((\.[0-9]+)|([0-9](\.[0-9]+)?)|([0-9]+\.))(([KMGTPE]i)|([eE][-+]?((\.[0-9]+)|([0-9](\.[0-9]+)?)|([0-9]+\.)))|([mkMGTPE]|))$
}

func TestQuantityRegexpValid(t *testing.T) {
	validInputs := []string{
		"+1",
		"-1",
		"1",
		"1.",
		".1",
		"1.1",
		"12e6",
		"12E6",
		"12e-6",
		"12E-6",
		"12e+6",
		"12E+6",
		"12Mi",
		"12M",
		"0.1",
		"0.12",
		"0.123",
		"1u",
		"1n",
	}

	for _, validInput := range validInputs {
		if !quantityRegexp.MatchString(validInput) {
			t.Errorf("Input '%s' should be valid", validInput)
		}
	}
}

func TestQuantityRegexpInvalid(t *testing.T) {
	validInputs := []string{
		"1.x",
		"x.1",
		".",
		"12e--6",
		"12E++6",
		"12MiMi",
                "",
	}

	for _, validInput := range validInputs {
		if quantityRegexp.MatchString(validInput) {
			t.Errorf("Input '%s' should be invalid", validInput)
		}
	}
}

@hoyhbx
Copy link
Author

hoyhbx commented May 4, 2022

I'd also note that the documentation of what is permitted is incorrect with regards to what parseQuantityString actually allows. (e.g. n and u are permitted as suffixes)

Yeah I agree. And the parseQuantityString does not conform to the decimalExponent production rule. The production rule permits signed number, but parseQuantityString seems to only permit signed integer: https://github.com/kubernetes/kubernetes/blob/f02682c628c530219966a00ae002d799f0d813dc/staging/src/k8s.io/apimachinery/pkg/api/resource/quantity.go#L252, which makes .07893E985.90504 to fail the validation.

We should probably report to upstream of this inconsistency and hear their opinion.

@Porges
Copy link
Contributor

Porges commented May 5, 2022

@hoyhbx
Copy link
Author

hoyhbx commented Jun 14, 2022

Hi @Porges , I realized the regex above has a little typo.
The regex for number should be

var number = `((\.[0-9]+)|([0-9]+(\.[0-9]+)?)|([0-9]+\.))`

instead of

var number = `((\.[0-9]+)|([0-9](\.[0-9]+)?)|([0-9]+\.))`

Otherwise, 992m would fail the validation

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 12, 2022
@hoyhbx
Copy link
Author

hoyhbx commented Sep 19, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 19, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 18, 2022
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 17, 2023
@camilamacedo86 camilamacedo86 added kind/bug Categorizes issue or PR as related to a bug. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jan 31, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 1, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 31, 2023
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 30, 2023
@Porges
Copy link
Contributor

Porges commented Jul 4, 2023

lol. lmao.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
Projects
None yet
Development

No branches or pull requests

6 participants