Data loader (sampler component) - Kafka/Kinesis samplers #7566

Merged 3 commits on May 17, 2019

@dclim (Contributor) commented Apr 28, 2019

Implementation of the sampler component of #7502.
Depends on #7531.

Adds additional implementations to support sampling from Kafka and Kinesis.

@clintropolis (Member) left a comment

overall lgtm 👍

{
insertData(generateRecords(TOPIC));

KafkaSupervisorSpec supervisorSpec = new KafkaSupervisorSpec(DATA_SCHEMA, null, new KafkaSupervisorIOConfig(

nit: formatting looks off


replayAll();

KinesisSupervisorSpec supervisorSpec = new KinesisSupervisorSpec(DATA_SCHEMA, null, new KinesisSupervisorIOConfig(

nit: formatting

private void assignAndSeek() throws InterruptedException
{
final Set<StreamPartition<PartitionIdType>> partitions = recordSupplier
.getPartitionIds(ioConfig.getStream()).stream()

nit: .stream() should probably be on newline
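
To illustrate the layout this nit asks for, here is a minimal, self-contained sketch with each chained call on its own line. The `partitionIds` helper, the `"topic"` stream name, and the `String` partition IDs are hypothetical stand-ins, not the PR's actual `RecordSupplier` or `StreamPartition` types, and the `map`/`collect` steps are illustrative only since the original chain is cut off in the diff.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class ChainFormatting
{
  // Hypothetical stand-in for recordSupplier.getPartitionIds(stream).
  static Set<String> partitionIds(String stream)
  {
    return new TreeSet<>(Set.of("0", "1", "2"));
  }

  static Set<String> streamPartitions(String stream)
  {
    // One call per line, so .stream() no longer trails the previous call.
    return partitionIds(stream)
        .stream()
        .map(id -> stream + ":" + id)
        .collect(Collectors.toCollection(TreeSet::new));
  }

  public static void main(String[] args)
  {
    System.out.println(streamPartitions("topic"));
  }
}
```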

private final RecordSupplier<PartitionIdType, SequenceOffsetType> recordSupplier;

private Iterator<OrderedPartitionableRecord<PartitionIdType, SequenceOffsetType>> recordIterator;
private Iterator<byte[]> interRecordIterator;

the starting bit of nextRowWithRaw might be a bit clearer if this variable was named something like recordBytesIterator or recordDataIterator?
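
As a rough sketch of why the suggested name may read better, the pattern below assumes each polled record carries a list of `byte[]` payloads: an outer iterator walks the records and an inner one (here called `recordDataIterator`, following the suggestion) walks the payloads of the current record. The `Record` and `Flattener` classes and the `flattenToStrings` helper are hypothetical, and this is not the PR's `nextRowWithRaw` implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class RecordDataIteration
{
  // Hypothetical stand-in for a polled record that carries several payloads.
  static final class Record
  {
    final List<byte[]> data;

    Record(List<byte[]> data)
    {
      this.data = data;
    }
  }

  // Flattens records into their payloads, one byte[] at a time.
  static final class Flattener implements Iterator<byte[]>
  {
    private final Iterator<Record> recordIterator;
    // Iterates over the payloads of the record currently being read.
    private Iterator<byte[]> recordDataIterator = Collections.emptyIterator();

    Flattener(Iterator<Record> recordIterator)
    {
      this.recordIterator = recordIterator;
    }

    @Override
    public boolean hasNext()
    {
      // Advance to the next record until one with remaining payloads is found.
      while (!recordDataIterator.hasNext() && recordIterator.hasNext()) {
        recordDataIterator = recordIterator.next().data.iterator();
      }
      return recordDataIterator.hasNext();
    }

    @Override
    public byte[] next()
    {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return recordDataIterator.next();
    }
  }

  // Helper for the demo: decode every payload as UTF-8 text.
  static List<String> flattenToStrings(List<Record> records)
  {
    List<String> out = new ArrayList<>();
    Iterator<byte[]> it = new Flattener(records.iterator());
    while (it.hasNext()) {
      out.add(new String(it.next(), StandardCharsets.UTF_8));
    }
    return out;
  }

  public static void main(String[] args)
  {
    List<Record> records = List.of(
        new Record(List.of("a".getBytes(StandardCharsets.UTF_8), "b".getBytes(StandardCharsets.UTF_8))),
        new Record(List.of()),
        new Record(List.of("c".getBytes(StandardCharsets.UTF_8)))
    );
    System.out.println(flattenToStrings(records));
  }
}
```

With the inner iterator named after what it yields, the first lines of a method like this read as "finish the current record's data, then move to the next record".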

@Override
public SamplerResponse sample()
{
return firehoseSampler.sample(new FirehoseFactory()

nit: formatting

@vogievetsky (Contributor) commented

Been testing and using this feature as a "user", seems to work really well. Check it out here: https://youtu.be/tAEp5BXVHYE

@clintropolis (Member) left a comment

lgtm 👍

We really need builders for those indexing config classes, but not this PR
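
For context on the builder remark, a minimal sketch of what a builder for a config class with many constructor arguments could look like. `IOConfig`, its fields, and its defaults are hypothetical and do not mirror the real `KafkaSupervisorIOConfig` or `KinesisSupervisorIOConfig` signatures.

```java
public class IOConfigBuilderSketch
{
  // Hypothetical config class; fields and defaults are illustrative only.
  static final class IOConfig
  {
    final String stream;
    final int replicas;
    final int taskCount;
    final boolean useEarliestOffset;

    private IOConfig(Builder builder)
    {
      this.stream = builder.stream;
      this.replicas = builder.replicas;
      this.taskCount = builder.taskCount;
      this.useEarliestOffset = builder.useEarliestOffset;
    }

    static Builder builder(String stream)
    {
      return new Builder(stream);
    }
  }

  static final class Builder
  {
    private final String stream;
    // Assumed defaults, so callers set only what they care about.
    private int replicas = 1;
    private int taskCount = 1;
    private boolean useEarliestOffset = false;

    private Builder(String stream)
    {
      if (stream == null || stream.isEmpty()) {
        throw new IllegalArgumentException("stream must be set");
      }
      this.stream = stream;
    }

    Builder replicas(int replicas)
    {
      this.replicas = replicas;
      return this;
    }

    Builder taskCount(int taskCount)
    {
      this.taskCount = taskCount;
      return this;
    }

    Builder useEarliestOffset(boolean useEarliestOffset)
    {
      this.useEarliestOffset = useEarliestOffset;
      return this;
    }

    IOConfig build()
    {
      return new IOConfig(this);
    }
  }

  public static void main(String[] args)
  {
    // Named setters replace a long positional argument list padded with nulls.
    IOConfig config = IOConfig.builder("topic")
                              .useEarliestOffset(true)
                              .build();
    System.out.println(config.stream + " " + config.replicas + " " + config.useEarliestOffset);
  }
}
```

In tests like the ones reviewed above, this style would let each spec state only the settings relevant to the case instead of a long constructor call.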

@clintropolis clintropolis merged commit d384579 into apache:master May 17, 2019
@dclim dclim deleted the kafka-kinesis-sampler-only branch May 17, 2019 05:32
clintropolis pushed a commit to clintropolis/druid that referenced this pull request May 17, 2019
* implement Kafka/Kinesis sampler

* add KafkaSamplerSpecTest and KinesisSamplerSpecTest

* code review changes
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 25, 2019
jihoonson pushed a commit to implydata/druid-public that referenced this pull request Jun 26, 2019
gianm pushed a commit to implydata/druid-public that referenced this pull request Jul 3, 2019