Support changing replication factor #4

Open
lacarvalho91 opened this issue Jun 12, 2017 · 3 comments

@lacarvalho91
Contributor

Currently we don't support changing the replication factor for an existing topic as it isn't as trivial as the other configurations.

The second answer here shows how it can be done.

@oliverlockwood
Contributor

I've experimented with this. For example, a candidate configuration for the command line kafka-reassign-partitions operation for a topic named map.raw-sq.it with 10 partitions and a replication factor of 3 (copied from one which Kafka created in this manner) would be:

{
  "version":1,
  "partitions":[
     {"topic":"map.raw-sq.it","partition":0,"replicas":[1,2,3]},
     {"topic":"map.raw-sq.it","partition":1,"replicas":[2,3,4]},
     {"topic":"map.raw-sq.it","partition":2,"replicas":[3,4,5]},
     {"topic":"map.raw-sq.it","partition":3,"replicas":[4,5,1]},
     {"topic":"map.raw-sq.it","partition":4,"replicas":[5,1,2]},
     {"topic":"map.raw-sq.it","partition":5,"replicas":[1,3,4]},
     {"topic":"map.raw-sq.it","partition":6,"replicas":[2,4,5]},
     {"topic":"map.raw-sq.it","partition":7,"replicas":[3,5,1]},
     {"topic":"map.raw-sq.it","partition":8,"replicas":[4,1,2]},
     {"topic":"map.raw-sq.it","partition":9,"replicas":[5,2,3]}
  ]
}

This mechanism works fine if the number of partitions is unchanged. (So obviously any change in the number of partitions must be made first.)
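
As a rough illustration of how such a file could be generated rather than hand-written, here is a minimal sketch in plain Scala. The `ReassignmentJson` object and its methods are hypothetical names, not part of Kafka, and the placement is a plain round-robin. With brokers 1 to 5 it reproduces partitions 0 to 4 of the example above, but not partitions 5 to 9, where the example shifts the follower replicas.

```scala
// Hypothetical helper, not part of Kafka: builds a reassignment JSON document
// using a simple round-robin placement (an assumption, not Kafka's own algorithm).
object ReassignmentJson {
  final case class PartitionReplicas(topic: String, partition: Int, replicas: Seq[Int])

  // Partition p starts at broker index (p mod n) and takes the next
  // `replicationFactor` brokers, wrapping around the broker list.
  def roundRobin(topic: String,
                 brokerIds: Seq[Int],
                 numPartitions: Int,
                 replicationFactor: Int): Seq[PartitionReplicas] = {
    require(brokerIds.distinct.size == brokerIds.size, "broker ids must be unique")
    require(replicationFactor >= 1 && replicationFactor <= brokerIds.size,
      "replication factor must be between 1 and the number of brokers")
    (0 until numPartitions).map { p =>
      val replicas = (0 until replicationFactor).map(i => brokerIds((p + i) % brokerIds.size))
      PartitionReplicas(topic, p, replicas)
    }
  }

  // Renders the assignment in the {"version":1,"partitions":[...]} shape shown above.
  def toJson(assignment: Seq[PartitionReplicas]): String = {
    val entries = assignment.map { a =>
      val replicas = a.replicas.mkString(",")
      s"""{"topic":"${a.topic}","partition":${a.partition},"replicas":[$replicas]}"""
    }
    val body = entries.mkString(",")
    s"""{"version":1,"partitions":[$body]}"""
  }
}
```

Calling `ReassignmentJson.toJson(ReassignmentJson.roundRobin("map.raw-sq.it", Seq(1, 2, 3, 4, 5), 10, 3))` yields a single-line document that could be saved to a file. Because each replica list is built from consecutive distinct brokers, no partition can end up with the same broker listed twice.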

In terms of doing this work, I suggest:

  • AdminUtils.assignReplicasToBrokers seems like the way forward to generate the partition-replica mapping - using the manner that it's called by AdminUtils.createTopic
  • Then we can call AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK() in the way that it's called by AdminUtils.addPartitions().

I think this would generate a fairly "clean" setup.

@oliverlockwood
Contributor

In manual testing of something knocked together along these lines:

  def updateReplicationFactor(topicName: String, numPartitions: Int, replicationFactor: Int): Try[Unit] =
    for {
      brokerMetadatas <- getBrokerMetadatas
      replicaAssignment <- assignReplicasToBrokers(brokerMetadatas, numPartitions, replicationFactor)
      // Run the ZooKeeper update as a generator so the result is Try[Unit] rather than Try[Try[Unit]]
      _ <- Try(AdminUtils.createOrUpdateTopicPartitionAssignmentPathInZK(zkUtils, topicName, replicaAssignment, update = true))
    } yield ()

  private def getBrokerMetadatas: Try[Seq[BrokerMetadata]] = Try {
    AdminUtils.getBrokerMetadatas(zkUtils)
  }

  private def assignReplicasToBrokers(brokerMetadatas: Seq[BrokerMetadata], numPartitions: Int, replicationFactor: Int) = Try {
    AdminUtils.assignReplicasToBrokers(brokerMetadatas, numPartitions, replicationFactor)
  }

we went from:

Topic:map.raw-sq.it	PartitionCount:1	ReplicationFactor:2	Configs:min.compaction.lag.ms=3600000,delete.retention.ms=604800000,min.insync.replicas=2,cleanup.policy=compact
	Topic: map.raw-sq.it	Partition: 0	Leader: 1	Replicas: 1,2	Isr: 2,1

and ended up in a weird state:

Topic:map.raw-sq.it	PartitionCount:10	ReplicationFactor:3	Configs:min.compaction.lag.ms=3600000,delete.retention.ms=604800000,min.insync.replicas=2,cleanup.policy=compact
	Topic: map.raw-sq.it	Partition: 0	Leader: 1	Replicas: 3,5,1	Isr: 2,1
	Topic: map.raw-sq.it	Partition: 1	Leader: 4	Replicas: 4,1,2	Isr: 4,1,2
	Topic: map.raw-sq.it	Partition: 2	Leader: 5	Replicas: 5,2,3	Isr: 5,2,3
	Topic: map.raw-sq.it	Partition: 3	Leader: 1	Replicas: 1,3,4	Isr: 1,3,4
	Topic: map.raw-sq.it	Partition: 4	Leader: 2	Replicas: 2,4,5	Isr: 2,4,5
	Topic: map.raw-sq.it	Partition: 5	Leader: 3	Replicas: 3,1,2	Isr: 3,1,2
	Topic: map.raw-sq.it	Partition: 6	Leader: 4	Replicas: 4,2,3	Isr: 4,2,3
	Topic: map.raw-sq.it	Partition: 7	Leader: 5	Replicas: 5,3,4	Isr: 5,3,4
	Topic: map.raw-sq.it	Partition: 8	Leader: 1	Replicas: 1,4,5	Isr: 1,4,5
	Topic: map.raw-sq.it	Partition: 9	Leader: 2	Replicas: 2,5,1	Isr: 2,5,1

This didn't resolve itself until I forced a fix using kafka-reassign-partitions with a JSON file. So obviously, it's not as simple as I might have hoped.
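
One way to read the "weird state" is that partition 0 still reports broker 2 in its ISR although its replica list is now 3,5,1. A small check along these lines could flag such partitions after an update. This is only a sketch in plain Scala: `AssignmentCheck` and `PartitionState` are hypothetical names, and the values would have to be filled in from the describe output by hand or by some parser.

```scala
// Hypothetical helper, not part of Kafka: flags partitions whose leader or ISR
// refers to a broker that is not in the assigned replica list.
object AssignmentCheck {
  final case class PartitionState(partition: Int, leader: Int, replicas: Seq[Int], isr: Seq[Int])

  def inconsistent(states: Seq[PartitionState]): Seq[PartitionState] =
    states.filter { s =>
      val leaderMissing = !s.replicas.contains(s.leader)
      val isrOutsideReplicas = s.isr.exists(id => !s.replicas.contains(id))
      leaderMissing || isrOutsideReplicas
    }
}
```

Fed the first two rows of the output above, `AssignmentCheck.inconsistent` returns only partition 0, since broker 2 is in its ISR but not in its replicas.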

For context, the logic for kafka-reassign-partitions is here.

@oliverlockwood
Contributor

oliverlockwood commented Aug 29, 2017

I added this commit to the update-replication-factor branch. This works in a simple manual test case, but obviously needs thorough testing before it can be considered for proper inclusion in the project.
