fix mongo KV range scans #343

Merged
agavra merged 2 commits into main from fix_mongo_scans
Sep 9, 2024

Conversation

@agavra agavra commented Sep 6, 2024

Range scans on KV stores didn't consider the Kafka partition. This patch fixes that by adding the Kafka partition to the data model. I don't love this, since it will make it harder to repartition a Kafka topic, but that only affects topologies that use range or all on KV stores (something the DSL does not leverage, so I consider it an advanced use case).
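To illustrate the bug described above, here is a minimal in-memory sketch of the scan predicate with and without the partition check. It does not use the MongoDB driver: `Doc`, `sampleDocs`, and `allWithoutPartition` are hypothetical names invented for this illustration and stand in for the real `KVDoc` collection, so treat it as a model of the behavior rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for the Mongo collection; not the real KVDoc class.
final class PartitionScanSketch {

  static final class Doc {
    final String key;
    final int kafkaPartition;
    final long timestamp;
    final boolean tombstoned;

    Doc(String key, int kafkaPartition, long timestamp, boolean tombstoned) {
      this.key = key;
      this.kafkaPartition = kafkaPartition;
      this.timestamp = timestamp;
      this.tombstoned = tombstoned;
    }
  }

  // Scan without a partition predicate: live, unexpired docs from every partition match.
  static List<String> allWithoutPartition(List<Doc> docs, long minValidTs) {
    List<String> keys = new ArrayList<>();
    for (Doc d : docs) {
      if (!d.tombstoned && d.timestamp >= minValidTs) {
        keys.add(d.key);
      }
    }
    return keys;
  }

  // Scan with the extra equality check on the stored partition.
  static List<String> all(List<Doc> docs, int kafkaPartition, long minValidTs) {
    List<String> keys = new ArrayList<>();
    for (Doc d : docs) {
      if (!d.tombstoned && d.timestamp >= minValidTs && d.kafkaPartition == kafkaPartition) {
        keys.add(d.key);
      }
    }
    return keys;
  }

  static List<Doc> sampleDocs() {
    return Arrays.asList(
        new Doc("a", 0, 100L, false),  // live, partition 0
        new Doc("b", 1, 100L, false),  // live, partition 1
        new Doc("c", 0, 50L, false),   // older than minValidTs in the example below
        new Doc("d", 0, 100L, true));  // tombstoned
  }

  public static void main(String[] args) {
    List<Doc> docs = sampleDocs();
    System.out.println(allWithoutPartition(docs, 60L)); // [a, b]
    System.out.println(all(docs, 0, 60L));              // [a]
  }
}
```

In this model, a task that owns partition 0 sees key `b` from partition 1 unless the scan also filters on the partition.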

@@ -185,7 +186,8 @@ public KeyValueIterator<Bytes, byte[]> range(
   public KeyValueIterator<Bytes, byte[]> all(final int kafkaPartition, final long minValidTs) {
     final FindIterable<KVDoc> result = docs.find(Filters.and(
         Filters.not(Filters.exists(KVDoc.TOMBSTONE_TS)),
-        Filters.gte(KVDoc.TIMESTAMP, minValidTs)
+        Filters.gte(KVDoc.TIMESTAMP, minValidTs),
+        Filters.eq(KVDoc.KAFKA_PARTITION, kafkaPartition)
Contributor Author

we should consider adding a configuration that will create an index in MongoDB on the kafka partition

Contributor

Well, we're going to have to get rid of this if we want to support partition scaling. That's why I went with the approach (long ago) of using the partitioner to figure out the Kafka partition based on the key.

Unfortunately I never found the time to finish that PR, so it's on me.

Well, technically the PR was finished, but you requested a change that was completely reasonable, and that's when I had to pivot to whatever it was that was more important. Point being, I can dig that old PR up and finally finish it if we want to do this and it needs to happen ASAP. Thoughts?
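For readers unfamiliar with the partitioner-based alternative mentioned above, a rough sketch follows. It derives the partition from the key at read time instead of storing it, so the result tracks the current partition count. Everything here is an assumption made for illustration: `partitionFor` uses `Arrays.hashCode` as a stand-in hash (a real version would have to match whatever partitioner the producer uses), and the filtering is done in memory over a plain list rather than in a MongoDB query. It is not taken from the unfinished PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the partition is computed from the key, not read from the document.
final class DerivedPartitionSketch {

  // Stand-in partitioner (assumption): any deterministic hash of the key bytes, reduced
  // to the range [0, numPartitions). It must agree with the producer's partitioner.
  static int partitionFor(byte[] keyBytes, int numPartitions) {
    return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
  }

  // Return only the keys that map to the requested partition under the current count.
  static List<String> all(List<String> storedKeys, int kafkaPartition, int numPartitions) {
    List<String> result = new ArrayList<>();
    for (String key : storedKeys) {
      byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
      if (partitionFor(bytes, numPartitions) == kafkaPartition) {
        result.add(key);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
    // The same stored keys are split differently when the partition count changes,
    // without rewriting any stored documents.
    for (int numPartitions : new int[] {2, 4}) {
      for (int p = 0; p < numPartitions; p++) {
        System.out.println(numPartitions + " partitions, p=" + p + ": " + all(keys, p, numPartitions));
      }
    }
  }
}
```

The trade-off in this sketch is that a filter applied on the client reads every document, whereas the stored `KAFKA_PARTITION` field lets the database do the filtering.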

Contributor

An aside: how much work is left on that PR, @ableegoldman?

Contributor

@rodesai rodesai left a comment

LGTM!

@agavra agavra merged commit 24c7791 into main Sep 9, 2024
1 check passed
@agavra agavra deleted the fix_mongo_scans branch September 9, 2024 22:46
4 participants