
Changes to work with kafka 0.8 #7

Open · wants to merge 8 commits into master

Conversation

@whitedr commented Feb 13, 2014

These changes are based on an initial set of changes from clippPR to get the river working with Kafka 0.8. I made a few changes to account for the consumer offset becoming logical rather than physical, as well as some subtle differences in exception handling between 0.7 and 0.8. I also fixed the dumpStats utility method so that the river reconnects when connection issues occur. One final change adds a new river configuration param named 'startFromNewestOffset' (defaults to false), which lets you configure the river to start from either the oldest or the newest offset.
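For context, a minimal sketch of how a 'startFromNewestOffset' flag could map onto the sentinel timestamps Kafka 0.8 uses in an OffsetRequest (kafka.api.OffsetRequest.LatestTime() is -1, EarliestTime() is -2; these are the -1/-2 values visible in the PartitionOffsetRequestInfo calls in the test output later in this thread). This is a hypothetical illustration, not the PR's actual code:

```java
// Hypothetical sketch (not the PR's code): choosing which Kafka 0.8
// offset-time sentinel to resolve when no stored offset is usable.
public class StartOffsetSketch {
    // Kafka 0.8 sentinels: kafka.api.OffsetRequest.LatestTime() == -1,
    // kafka.api.OffsetRequest.EarliestTime() == -2.
    static final long LATEST_TIME = -1L;
    static final long EARLIEST_TIME = -2L;

    /** Pick the offset-request "time" when starting fresh or after OffsetOutOfRangeException. */
    static long initialOffsetTime(boolean startFromNewestOffset) {
        return startFromNewestOffset ? LATEST_TIME : EARLIEST_TIME;
    }

    public static void main(String[] args) {
        System.out.println(initialOffsetTime(true));   // newest offset (-1)
        System.out.println(initialOffsetTime(false));  // oldest offset (-2, the default)
    }
}
```

The resolved sentinel would then go into a PartitionOffsetRequestInfo, which is why the traces below show PartitionOffsetRequestInfo(-1,1) for the newest offset and PartitionOffsetRequestInfo(-2,1) for the oldest.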

Warner Onstine and others added 3 commits January 14, 2014 12:35
…etch to translate and throw exceptions on error in the FetchResponse. Add support for a new river configuration named 'startFromNewestOffset'. This flag allows the river to be set up to start from either the newest or the oldest partition offset if/when the river encounters an OffsetOutOfRangeException. Fixed a couple of tests that were failing due to the new clientName param in the KafkaClient connect method.
@damienclaveau

+1
This update is very good news.

Dave White and others added 3 commits March 4, 2014 12:04
@adewahyu123

I got this error when I tested:

Test set: org.elasticsearch.river.kafka.KafkaClientTest

Tests run: 9, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 0.456 sec <<< FAILURE!
testGetNewestOffset(org.elasticsearch.river.kafka.KafkaClientTest) Time elapsed: 0.425 sec <<< ERROR!
java.lang.AssertionError:
Unexpected method call getOffsetsBefore(Name: OffsetRequest; Version: 0; CorrelationId: 0; ClientId: null; RequestInfo: [my_replicated_topic,0] -> PartitionOffsetRequestInfo(-1,1); ReplicaId: -1):
at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:45)
at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:73)
at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:92)
at kafka.javaapi.consumer.SimpleConsumer$$EnhancerByCGLIB$$abaf5997.getOffsetsBefore()
at org.elasticsearch.river.kafka.KafkaClient.getNewestOffset(KafkaClient.java:107)
at org.elasticsearch.river.kafka.KafkaClientTest.testGetNewestOffset(KafkaClientTest.java:119)
testGetOldestOffset(org.elasticsearch.river.kafka.KafkaClientTest) Time elapsed: 0 sec <<< ERROR!
java.lang.AssertionError:
Unexpected method call getOffsetsBefore(Name: OffsetRequest; Version: 0; CorrelationId: 0; ClientId: null; RequestInfo: [my_replicated_topic,0] -> PartitionOffsetRequestInfo(-2,1); ReplicaId: -1):
at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:45)
at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:73)
at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:92)
at kafka.javaapi.consumer.SimpleConsumer$$EnhancerByCGLIB$$abaf5997.getOffsetsBefore()
at org.elasticsearch.river.kafka.KafkaClient.getOldestOffset(KafkaClient.java:117)
at org.elasticsearch.river.kafka.KafkaClientTest.testGetOldestOffset(KafkaClientTest.java:133)
testFetch(org.elasticsearch.river.kafka.KafkaClientTest) Time elapsed: 0.008 sec <<< ERROR!
java.lang.AssertionError:
Unexpected method call fetch(Name: FetchRequest; Version: 0; CorrelationId: 0; ClientId: null; ReplicaId: -1; MaxWait: 0 ms; MinBytes: 0 bytes; RequestInfo: [my_replicated_topic,0] -> PartitionFetchInfo(1717,1024)):
at org.easymock.internal.MockInvocationHandler.invoke(MockInvocationHandler.java:45)
at org.easymock.internal.ObjectMethodsFilter.invoke(ObjectMethodsFilter.java:73)
at org.easymock.internal.ClassProxyFactory$MockMethodInterceptor.intercept(ClassProxyFactory.java:92)
at kafka.javaapi.consumer.SimpleConsumer$$EnhancerByCGLIB$$abaf5997.fetch()
at org.elasticsearch.river.kafka.KafkaClient.fetch(KafkaClient.java:127)
at org.elasticsearch.river.kafka.KafkaClientTest.testFetch(KafkaClientTest.java:225)

Can you give me information on how to solve this error?
Many thanks.

@damienclaveau

Hi, I think the 0.8 version submitted by dtabwhite is OK, but there is still a problem with the unit tests in Maven.
By the way, I found that the implementation of the KafkaClient still doesn't support dynamic rebalancing of the partition leaders.
The guidelines for supporting it are here: https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
I will try to work on it.
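The core of the linked SimpleConsumer example is a loop that asks each seed broker for topic metadata and takes the first leader it reports. Here is a Kafka-free sketch of that pattern; Broker discovery and the MetadataSource interface are stand-ins for SimpleConsumer.send(TopicMetadataRequest), not real kafka API names:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, kafka-free sketch of the leader-discovery loop from the
// 0.8.0 SimpleConsumer example: query each seed broker for partition
// metadata and take the first leader reported. MetadataSource stands in
// for SimpleConsumer.send(new TopicMetadataRequest(...)).
public class LeaderFinderSketch {
    interface MetadataSource {
        /** Returns the leader broker id for topic/partition, or null if this seed doesn't know. */
        Integer leaderFor(String seedBroker, String topic, int partition);
    }

    static Integer findLeader(List<String> seedBrokers, String topic, int partition,
                              MetadataSource metadata) {
        for (String seed : seedBrokers) {
            Integer leader = metadata.leaderFor(seed, topic, partition);
            if (leader != null) {
                return leader; // first seed broker that knows the leader wins
            }
            // on a real cluster: close the consumer for this seed and try the next one
        }
        return null; // no seed broker could report a leader; caller should retry later
    }

    public static void main(String[] args) {
        // Fake cluster: seed "b1" has no metadata, seed "b2" reports broker 3 as leader.
        MetadataSource fake = (seed, topic, part) ->
                "b2".equals(seed) ? Integer.valueOf(3) : null;
        System.out.println(findLeader(Arrays.asList("b1", "b2"), "my_replicated_topic", 0, fake));
    }
}
```

On leader rebalancing, the same loop would be re-run whenever a fetch fails with a NotLeaderForPartition-style error, so the consumer follows the leader as it moves between brokers.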

Dave White added 2 commits April 24, 2014 14:11
…ncorporating the configured river name in the location in zk where the offsets are stored.
  - Have the river discover the set of brokers that exist from zk
  - Discover the leader broker for a given topic/partition
  - Don't store the brokerUrl in the zk offset path
  - On startup keep trying to discover/connect to kafka until there is a registered broker
  - On startup when no offsets are stored in zk, set the initial offset based on the 'startFromNewestOffset' setting
@whitedr (Author) commented May 3, 2014

FYI - My latest commit in my fork makes the river more resilient in a clustered Kafka setup. Here is the set of changes it introduces:

  • Have the river discover the set of brokers that exist from zk
  • Discover the leader broker for a given topic/partition
  • Don't store the brokerUrl in the zk offset path
  • On startup keep trying to discover/connect to kafka until there is a registered broker
  • On startup when no offsets are stored in zk, set the initial offset based on the 'startFromNewestOffset' setting

So essentially, the broker_host and broker_port settings go away with this change and are dynamically (re)discovered for the specified topic/partition. This is essential when running Kafka in a clustered setup, as leadership for a given topic/partition can change and the river should be able to deal with these changes on the fly.
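The "keep trying to discover/connect to kafka until there is a registered broker" behavior described above amounts to a bounded retry loop around the zk broker lookup. A minimal sketch, where discoverBroker is a hypothetical stand-in for reading the registered broker list from ZooKeeper:

```java
import java.util.function.Supplier;

// Hypothetical sketch of the startup behavior described above:
// keep polling for a registered broker, backing off between checks.
// discoverBroker stands in for reading /brokers/ids from ZooKeeper.
public class BrokerWaitSketch {
    static String waitForBroker(Supplier<String> discoverBroker,
                                int maxAttempts, long sleepMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String broker = discoverBroker.get(); // null until a broker registers in zk
            if (broker != null) {
                return broker;
            }
            Thread.sleep(sleepMillis); // back off before re-checking zk
        }
        throw new IllegalStateException("no kafka broker registered after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake discovery: a broker appears on the third check.
        int[] calls = {0};
        Supplier<String> fake = () -> (++calls[0] >= 3) ? "broker-1:9092" : null;
        System.out.println(waitForBroker(fake, 10, 1));
    }
}
```

The actual river presumably blocks indefinitely rather than capping attempts; the bound here just keeps the sketch testable.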

@jplock commented Jul 9, 2014

+1

@santthosh

+1

@mfirry commented Nov 10, 2014

+1

7 participants