KAFKA-1215: Rack-Aware replica assignment option #132
Conversation
kafka-trunk-git-pr #127 SUCCESS
Conflicts:
core/src/test/scala/unit/kafka/admin/TopicCommandTest.scala
Updated test to remove JUnit3Suite
kafka-trunk-git-pr #145 SUCCESS
* 1 -> 3,1,5
* 2 -> 1,5,4
* 3 -> 5,4,2
* 4 -> 4,2,1
4,2,0?
That's correct. Will fix.
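For context, a minimal sketch of how an assignment mapping like the one quoted above could be produced through the rack-aware path. The `AdminUtils.assignReplicasToBrokers` call and the `BrokerMetadata(id, rack)` shape are assumptions based on the direction of this PR, not a definitive API.

```scala
import kafka.admin.{AdminUtils, BrokerMetadata}

// Hypothetical cluster: brokers 0-5 spread over three racks.
val brokerMetadatas = Seq(
  BrokerMetadata(0, Some("rack1")), BrokerMetadata(1, Some("rack2")),
  BrokerMetadata(2, Some("rack3")), BrokerMetadata(3, Some("rack1")),
  BrokerMetadata(4, Some("rack2")), BrokerMetadata(5, Some("rack3")))

// 6 partitions, replication factor 3. Returns Map[partitionId -> replica broker ids],
// i.e. the kind of "partition -> replica list" mapping shown in the comment above.
val assignment: Map[Int, Seq[Int]] =
  AdminUtils.assignReplicasToBrokers(brokerMetadatas, 6, 3)

assignment.toSeq.sortBy(_._1).foreach { case (partition, replicas) =>
  println(s"$partition -> ${replicas.mkString(",")}")
}
```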
Conflicts:
core/src/main/scala/kafka/server/KafkaApis.scala
core/src/main/scala/kafka/server/KafkaConfig.scala
core/src/test/scala/unit/kafka/server/KafkaConfigTest.scala
Separate out function to get replica list for replica assignment.
kafka-trunk-git-pr #187 FAILURE
The failed test in the last PR build passed locally on my laptop. It seems to be flaky, as it also failed on an earlier build without my changes.
kafka-trunk-git-pr #193 SUCCESS
TestUtils.waitUntilLeaderIsElectedOrChanged(zkClient, topic, 0)
val assignment = ZkUtils.getReplicaAssignmentForTopics(zkClient, Seq(topic))
  .map(p => p._1.partition -> p._2)
val brokerRackMap = Map(0 -> "rack1", 1 -> "rack1", 2 -> "rack2", 3 -> "rack2")
Can you add some negative testing, please? Folks do weird and odd things in their properties by accident, and we want to guard against that too.
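As an illustration of the kind of negative test being asked for, a hedged sketch: if rack information is only partially supplied, assignment should fail loudly rather than silently produce a skewed layout. The exception type, helper names, and signature are assumptions, not code from this PR.

```scala
import kafka.admin.{AdminOperationException, AdminUtils, BrokerMetadata}
import org.junit.Assert.fail
import org.junit.Test

class RackAwareNegativeTest {
  @Test
  def testAssignmentFailsWithPartialRackInformation() {
    // Inconsistent input: broker 1 is missing its rack.
    val brokerMetadatas = Seq(BrokerMetadata(0, Some("rack1")), BrokerMetadata(1, None))
    try {
      // 2 partitions, replication factor 2.
      AdminUtils.assignReplicasToBrokers(brokerMetadatas, 2, 2)
      fail("Expected assignment to fail when only some brokers have rack information")
    } catch {
      case _: AdminOperationException => // expected
    }
  }
}
```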
One overall comment on the implementation: the brokers' properties themselves could carry this information, so that the topic creator doesn't have to know it. A lot of humans still run the topic command, but in many cases it is some software system doing it operationally. In either case, if the broker had a property like rack=0, topic creation could simply distribute information the cluster is already able to gather. Granted, this implementation saves having to store it in ZooKeeper, so rationally speaking it is better than putting in code for that. Sorry if I missed the entire discussion thread on this; I'm just seeing it for the first time. I like it, and would love to see this get into trunk, start to be used, and make the next release. Nice work so far!
Please see KIP-36 for the latest proposal (https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment). The biggest difference is that the rack information is added as broker metadata in ZooKeeper. Consequently, the inter-broker protocol (UpdateMetadataRequest) and the client-to-broker metadata query protocol (TopicMetadataResponse) will be changed to carry rack information. Once the KIP is accepted, I will update this PR to incorporate these new ideas.
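To make the KIP-36 direction concrete, a rough, hypothetical sketch of "rack as broker metadata". The broker.rack property name and the registration JSON layout are assumptions drawn from the KIP discussion, not from this PR as it stands.

```scala
// Each broker would advertise its rack in its own configuration, e.g. in
// server.properties:
//   broker.rack=rack1
// and include it in its ZooKeeper registration, roughly (illustrative only):
//   {"version":3,"host":"broker1","port":9092,
//    "endpoints":["PLAINTEXT://broker1:9092"],"rack":"rack1"}

// Admin tooling could then derive the broker-to-rack mapping from the
// registered brokers instead of requiring the topic creator to supply it.
case class BrokerMetadata(id: Int, rack: Option[String])

def rackMap(brokers: Seq[BrokerMetadata]): Map[Int, String] =
  brokers.collect { case BrokerMetadata(id, Some(rack)) => id -> rack }.toMap
```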
Wouldn't it be nice to have a metric per topic partition showing how many different racks the ISR spans?
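Not part of this PR, but as a minimal sketch of the suggested metric, assuming a broker-id to rack mapping is available:

```scala
// Count how many distinct racks a partition's in-sync replicas span.
// `brokerRacks` is an assumed broker-id -> rack mapping.
def isrRackSpread(isr: Seq[Int], brokerRacks: Map[Int, String]): Int =
  isr.flatMap(brokerRacks.get).distinct.size

// Example: brokers 0,1 on rack1 and 2,3 on rack2; an ISR of {0, 1, 3} spans 2 racks.
val spread = isrRackSpread(Seq(0, 1, 3),
  Map(0 -> "rack1", 1 -> "rack1", 2 -> "rack2", 3 -> "rack2"))
// spread == 2
```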
Conflicts:
clients/src/test/java/org/apache/kafka/test/TestSslUtils.java
core/src/main/scala/kafka/admin/AdminUtils.scala
core/src/main/scala/kafka/admin/ReassignPartitionsCommand.scala
core/src/main/scala/kafka/admin/TopicCommand.scala
core/src/main/scala/kafka/server/KafkaApis.scala
core/src/main/scala/kafka/server/KafkaConfig.scala
… Updated ZkUtils for serializing and deserializing broker information with rack. Refactored AdminUtils.assignReplicasToBrokers to support rack-aware assignment.
…ReassignPartitionsCommand rack aware. Fix JUnit tests.
…seTest.testSerialization
@allenxwang, can you please fix the merge conflicts?
@Test
def testAssignmentWithRackAwareWith12Partitions() {
Is there anything special about 12 partitions?
Probably not. :)
In general, these tests run very fast since all they do is operate on collections in memory. So I have not thought about reducing the number of tests.
@granthenke: For your previous question on the JSON version for ZK registration, my preference is still to do the version change now. This way, our hands are not tied for potential future changes, and it is also easier to document. As for compatibility, most people will probably be on 0.9.0.1 before they upgrade to 0.10.0, so the impact should be limited.
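To make the compatibility point concrete, a hedged sketch of the kind of registration version bump being discussed. The field names and version numbers below are assumptions for illustration, not the exact format.

```scala
// Broker registration JSON before and after the bump (illustrative only):
//   before: {"version":2,"host":"b1","port":9092,"endpoints":["PLAINTEXT://b1:9092"]}
//   after : {"version":3,"host":"b1","port":9092,"endpoints":["PLAINTEXT://b1:9092"],"rack":"rack1"}
// Tooling that validates the version strictly would need to understand the new
// version, which is the compatibility concern for clusters older than 0.9.0.1.
def brokerRegistrationJson(host: String, port: Int, rack: Option[String]): String = {
  val rackField = rack.map(r => ",\"rack\":\"" + r + "\"").getOrElse("")
  s"""{"version":3,"host":"$host","port":$port,"endpoints":["PLAINTEXT://$host:$port"]$rackField}"""
}
```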
Conflicts:
core/src/test/scala/unit/kafka/admin/AdminRackAwareTest.scala
Added logic to prevent assigning replicas twice to the same broker for the same partition and enhanced tests for that.
val numPartitions = 6
val replicationFactor = 4
val brokerMetadatas = toBrokerMetadata(rackInfo)
assertEquals(brokerList, brokerMetadatas.map(_.id))
Nitpick: we don't really need this assertEquals or the brokerList val, since that is checking that toBrokerMetadata works correctly, which is not the purpose of this test.
I think there is value in checking this to make sure the test setup is correct. Otherwise, if toBrokerMetadata is changed, there are two possibilities:
- The test fails and it is difficult to debug why it fails
- The test passes but is actually weakened
@allenxwang, we use toBrokerMetadata in many other tests and we don't check its behaviour in those cases, so it looks a bit inconsistent. In my opinion, if we want to be sure about its behaviour, we should write a test for it instead of checking it inside other tests. In any case, this is a very minor point and I'm fine if we leave it as is.
@junrao Thanks for the confirmation. I understand the JSON version will change.
* apache/trunk:
KAFKA-3013: Display the topic-partition in the exception message for expired batches in recordAccumulator
KAFKA-3375; Suppress deprecated warnings where reasonable and tweak compiler settings
KAFKA-3373; add 'log' prefix to configurations in KIP-31/32
MINOR: Remove unused method, redundant in interface definition and add final for object used in sychronization
KAFKA-3395: prefix job id to internal topic names
KAFKA-2551; Update Unclean leader election docs
KAFKA-3047: Explicit offset assignment in Log.append can corrupt the log
Thanks for the patch. LGTM. Could you rebase?
@allenxwang, the following merges master into your branch: allenxwang#5
KAFKA-1215: Merge master and fix conflicts
Implements a tiered storage aware `ListOffsetRequest`. With tiered storage, the semantics of querying the beginning or end offset using `ListOffsetRequest` remain unchanged: it returns the true log start and end offsets respectively, including any tiered portion of the log. `TierListOffsetRequest` is tiering aware and provides a mechanism to query, for example, the local log start offset. `TierListOffsetRequest` is an internal inter-broker API which will be used to figure out the point from which lagging or new followers begin replication.
TICKET = N/A
LI_DESCRIPTION = Bintray is sunsetting, and the publishing should be migrated to https://linkedin.jfrog.io/
Squashed commits:
- [LI-HOTFIX] add Bintray support to LinkedIn Kafka Github
- [LI-HOTFIX] Migrate bintray publish to JFrog (apache#132)
- [LI-HOTFIX] Use JFrog Api key instead of password (apache#139)
TICKET = N/A
LI_DESCRIPTION = DEPENG-2065. This is advised by dep engineering
EXIT_CRITERIA = When not using JFrog for publishing
Please see https://cwiki.apache.org/confluence/display/KAFKA/KIP-36+Rack+aware+replica+assignment for the overall design.
The update to TopicMetadataRequest/TopicMetadataResponse will be done in a different PR.
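As a rough illustration of the property the design targets (not code from this PR): when the replication factor does not exceed the number of racks, each partition's replicas should land on distinct racks. The assignment shape and the broker-to-rack map below are assumptions for illustration.

```scala
// Verify that no two replicas of any partition share a rack.
def assertReplicasSpanRacks(assignment: Map[Int, Seq[Int]], brokerRacks: Map[Int, String]): Unit =
  assignment.foreach { case (partition, replicas) =>
    val racks = replicas.map(brokerRacks)
    require(racks.distinct.size == racks.size,
      s"Partition $partition has replicas sharing a rack: ${replicas.mkString(",")}")
  }
```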