Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix validation of replication factor changes across APIs #13442

Closed
dotnwat opened this issue Sep 14, 2023 · 2 comments · Fixed by #14020
Closed

Fix validation of replication factor changes across APIs #13442

dotnwat opened this issue Sep 14, 2023 · 2 comments · Fixed by #14020
Assignees
Labels
area/kafka kind/bug Something isn't working

Comments

@dotnwat
Copy link
Member

dotnwat commented Sep 14, 2023

In this ticket #13422 it is reported that alter topic config allows replication factor to be changed to 2. It may be the case that this does not take affect, but no error is returned, and the change should probably be rejected as it is when requesting replication factor 2 when creating topics.

There are 4 places where validation may take place:

  1. Create topics API
  2. Alter topics API
  3. Incremental alter topics API
  4. The Redpanda controller itself

To some extent the controller should have the final say in the matter, but we still do validation in some cases at the kafka layer for simplicity.

This ticket should cover:

  1. Determine what the expected validation criteria is in each of the 4 locations above (roughly ownership is 1-3 enterprise team 4 replication team) and ensure that that validation is done correctly.
  2. Add tests that cover the cases in (1)

It is tempting to unify the validation across at least locations 1-3, but that might turn into a larger project. The validation in create topics api is simple, but it appears that validation in alter/incremental-alter is far more complicated and nuanced.

Additional context https://redpandadata.slack.com/archives/C01ND4SVB6Z/p1694596910870699?thread_ts=1694596910.870699&cid=C01ND4SVB6Z

@michael-redpanda
Copy link
Contributor

  • Create topics:

    • valid_range_end = validate_requests_range(
      begin,
      valid_range_end,
      std::back_inserter(response.data.topics),
      validators{});
    • Uses validators from here:
      using validators = make_validator_types<
      creatable_topic,
      custom_partition_assignment_negative_partition_count,
      partition_count_must_be_positive,
      replication_factor_must_be_positive,
      replication_factor_must_be_odd,
      replicas_diversity,
      compression_type_validator,
      compaction_strategy_validator,
      timestamp_type_validator,
      cleanup_policy_validator,
      remote_read_and_write_are_not_supported_for_read_replica,
      batch_max_bytes_limits,
      subject_name_strategy_validator>;
    • Validation structure:
      struct replication_factor_must_be_odd {
      static constexpr error_code ec = error_code::invalid_replication_factor;
      static constexpr const char* error_message
      = "Replication factor must be an odd number - 1,3,5,7,9,11...";
      static bool is_valid(const creatable_topic& c) {
      if (!c.assignments.empty()) {
      return true;
      }
      return (c.replication_factor % 2) == 1;
      }
      };
  • Alter topics incremental

    • no validation (tested using rpk topic alter-config test --set replicas=2)
  • Alter topics

    • no validation either (scanned through code until both the path for alter configs and incremental alter configs met)
  • Controller

    • no validation, even on create topics (removed validator check mentioned above and create topics with rf=2 passed through fine)

From the above list, the only place that performs validation across all the configs is create_topics. These validators are:

  • If performing custom partition assignment, ensure rf=-1 and partitionCount=-1
  • If not custom, ensure partitionCount > 0
  • If not custom, ensure rf>0
  • If not custom, ensure rf is odd
  • If custom assignment, ensure replica set is a set of unique nodes
  • Compression type validation
  • Compaction type validation
  • Timestamp type validation
  • Clean up policy validation
  • Ensure remote read and remote write are not enabled for read replicas
  • Validate value of max.message.bytes
  • Validate SSSV SNS policy

I believe all of these items should also be included as validation for items 2 & 3 above.

@michael-redpanda
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kafka kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants