KeyError Creating SimpleProducer #113

Closed
herriojr opened this issue Jan 28, 2014 · 16 comments · Fixed by #174

@herriojr

This is an issue I've been experiencing while running the master branch. Basically, the first time I access an uncreated topic, I get a KeyError, whereas the second time, I can access it just fine.

import kafka
client = kafka.client.KafkaClient(address, port)
producer = kafka.producer.SimpleProducer(client, "test") # Fails first time called, succeeds thereafter

Traceback (most recent call last):
  File "/home/vagrant/.pycharm_helpers/pydev/pydevd.py", line 1532, in <module>
    debugger.run(setup['file'], None, None)
  File "/home/vagrant/.pycharm_helpers/pydev/pydevd.py", line 1143, in run
    pydev_imports.execfile(file, globals, locals) #execute the script
  File "/projects/scratch/init.py", line 27, in <module>
    main()
  File "/home/vagrant/.virtualenvs/app/local/lib/python2.7/site-packages/kafka/producer.py", line 191, in __init__
    self.next_partition = cycle(client.topic_partitions[topic])
KeyError: 'test'

@rdiomar
Collaborator

rdiomar commented Jan 30, 2014

Not sure why, but when it's a new topic, kafka doesn't return the partition info in response to the first metadata request. As a workaround, you can do what ensure_topic_creation() in test_integration.py does before creating your producer. We should fix this though.
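
A minimal sketch of that kind of workaround, for illustration only. It assumes the client exposes a `load_metadata_for_topics()` method and the `topic_partitions` dict seen in the traceback above; `wait_for_topic`, its parameters, and the timeout values are hypothetical names and not part of kafka-python. It is not a copy of the helper in test_integration.py.

```python
import time


def wait_for_topic(client, topic, timeout=30.0, interval=1.0):
    """Poll for metadata until `topic` has partitions, or raise on timeout.

    Assumption: `client` has `load_metadata_for_topics(topic)` and a
    `topic_partitions` dict mapping topic name -> list of partition ids.
    """
    deadline = time.time() + timeout
    while True:
        # Re-request metadata; a new topic may be missing on the first try.
        client.load_metadata_for_topics(topic)
        if client.topic_partitions.get(topic):
            return
        if time.time() >= deadline:
            raise RuntimeError(
                "no partition metadata for topic %r after %.1fs" % (topic, timeout))
        time.sleep(interval)
```

Calling something like `wait_for_topic(client, "test")` before constructing the producer would avoid the first-call KeyError, provided the broker auto-creates the topic.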

@dpkp
Owner

dpkp commented Jan 31, 2014

The latest master (after merging pull request #111) changes the behavior so that the Producer init no longer takes a topic. But this error could still show up on the first message send to a new topic. Can you test?

@dpkp
Owner

dpkp commented Feb 8, 2014

My testing suggests that when topic auto-creation is enabled on the kafka server (auto.create.topics.enable=true), it will create the topic but return a LeaderNotAvailable error while it does the initial setup asynchronously. This means that the message produce has to wait until the topic is fully set up w/ a broker leader. kafka-python should probably check for this error and optionally retry (w/ resyncing leader metadata?)
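
A rough sketch of what such a check-and-retry could look like. Everything here is hypothetical: `LeaderNotAvailableError` is a local stand-in for whatever error the client would surface, and `send_with_retry`, `send`, and `refresh_metadata` are illustrative names, not kafka-python API.

```python
import time


class LeaderNotAvailableError(Exception):
    """Local stand-in for the broker's LeaderNotAvailable error (hypothetical)."""


def send_with_retry(send, refresh_metadata, retries=5, backoff=0.5):
    """Call send(); on LeaderNotAvailableError, resync metadata and retry.

    `send` and `refresh_metadata` are zero-argument callables supplied by
    the caller. The last failure is re-raised once retries are exhausted.
    """
    for attempt in range(retries + 1):
        try:
            return send()
        except LeaderNotAvailableError:
            if attempt == retries:
                raise
            # Give the broker time to elect a leader, then resync metadata.
            time.sleep(backoff)
            refresh_metadata()
```

In practice `send` might wrap a call such as `producer.send_messages(topic, msg)` and `refresh_metadata` a metadata reload on the client, but how the error is actually raised depends on the client version.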

@dpkp dpkp mentioned this issue Feb 9, 2014
@hmahmood

Any updates on this?

@mumrah mumrah added this to the Publish to PyPI milestone Feb 25, 2014
@frgtn
Contributor

frgtn commented Feb 25, 2014

As of 8811298 this still breaks with KeyError:

Traceback (most recent call last):
  File "test_113.py", line 21, in <module>
    'The message')
  File "<some_prefix>/lib/python2.7/site-packages/kafka/producer.py", line 204, in send_messages
    partition = self._next_partition(topic)
  File "<some_prefix>/lib/python2.7/site-packages/kafka/producer.py", line 200, in _next_partition
    self.partition_cycles[topic] = cycle(self.client.topic_partitions[topic])
KeyError: 'test_2'

@dpkp
Owner

dpkp commented Mar 22, 2014

merged #109 -- please retry w/ master trunk

@imduffy15

Still seeing this issue on master.

Any updates?

@wizzat
Collaborator

wizzat commented Apr 9, 2014

I can confirm the issue happening, and I've looked into it enough to know why. Right now I am working on beefing up the test suite.

On a scale of 1-10, how much of a blocker is this for you?

-Mark

@imduffy15

I'm working around it for the moment by creating the topics manually.

Let's go with 4 - Annoying.

@Korijn

Korijn commented May 20, 2014

I'm getting exactly this error constantly, not just on the first try. Not sure why yet.

Traceback (most recent call last):
  File "C:.........\logging\handlers\kafkahandler.py", line 42, in emit
    self.producer.send_messages(self.topic, msg)
  File "C:.........\VirtualEnv\lib\site-packages\kafka\producer.py", line 204, in send_messages
    partition = self._next_partition(topic)
  File "C:.........\VirtualEnv\lib\site-packages\kafka\producer.py", line 200, in _next_partition
    self.partition_cycles[topic] = cycle(self.client.topic_partitions[topic])
KeyError: '.........'

@wizzat
Collaborator

wizzat commented May 20, 2014

Do you have a topic without ISR or without a leader? Can you print kafka-list-topics output?

-Mark

@Korijn

Korijn commented May 20, 2014

I feel silly. It was a configuration issue. The kafka-server-start.sh and zookeeper-server-start.sh scripts that come with the packaged install from the Ubuntu repos were not exactly suitable for use with supervisord. After correcting the setup, all problems were solved. I can post the scripts if you're interested.

@dpkp
Owner

dpkp commented May 21, 2014

Nonetheless, there's an assumption in the code that we'll always get topic metadata back from the server. We need to handle this more gracefully.

https://github.com/mumrah/kafka-python/blob/master/kafka/producer.py#L219
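
One way the partition cycling could fail more gracefully, as a sketch under stated assumptions: it only relies on the `client.topic_partitions` dict shown in the tracebacks above. `PartitionCycler` and `TopicMetadataUnavailable` are hypothetical names for illustration, not the fix that was eventually merged.

```python
from itertools import cycle


class TopicMetadataUnavailable(Exception):
    """Hypothetical error: the client holds no partition metadata for a topic."""


class PartitionCycler(object):
    """Round-robin over a topic's partitions, guarding against missing metadata."""

    def __init__(self, client):
        self.client = client
        self.partition_cycles = {}

    def next_partition(self, topic):
        if topic not in self.partition_cycles:
            # .get() avoids the bare KeyError when the topic is unknown.
            partitions = self.client.topic_partitions.get(topic)
            if not partitions:
                raise TopicMetadataUnavailable(
                    "no partition metadata for topic %r; check that the topic "
                    "exists and has a leader" % topic)
            self.partition_cycles[topic] = cycle(partitions)
        return next(self.partition_cycles[topic])
```

The point of the dedicated exception is that the message names the likely cause (missing topic or leader) instead of surfacing a KeyError from deep inside the producer.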

@Korijn

Korijn commented May 21, 2014

Here's an idea: make it possible to work with kafka-python without metadata through a configuration option (or something similar). In the case of missing metadata, raise an exception that indicates that metadata can't be retrieved, and mention in the exception message that you can configure kafka-python not to use metadata. That would have pointed me in the proper direction immediately (figuring out why metadata can't be retrieved), instead of aimlessly searching for a workaround to some issue with kafka-python. Just a thought :)

Thanks for the help, guys.

@dpkp
Owner

dpkp commented May 21, 2014

Metadata is necessary to know which broker the producer must connect to for a particular topic-partition. kafka-python doesn't currently expose a super low level producer that requires users to manage leadership discovery / topic metadata. But even if it did, you'd still be forced to deal with metadata. It's a pretty core part of the kafka design.

Nonetheless, this is clearly a bug in how kafka-python handles partition cycles when a topic does not exist (or at least the broker has no metadata available).
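
To illustrate why the metadata cannot simply be skipped, here is a toy sketch of the lookup a producer needs before it can send anything: mapping a (topic, partition) pair to the broker that currently leads it. The dict layout, the broker addresses, and `leader_for` are invented for illustration and are not kafka-python's internal structures.

```python
# Toy metadata table: (topic, partition) -> (host, port) of the leader broker.
# All values below are made up for the example.
leaders = {
    ("test", 0): ("broker1.example", 9092),
    ("test", 1): ("broker2.example", 9092),
}


def leader_for(leaders, topic, partition):
    """Return the (host, port) leading a topic-partition, or raise LookupError."""
    try:
        return leaders[(topic, partition)]
    except KeyError:
        raise LookupError("no leader known for %s[%d]" % (topic, partition))
```

Without an entry in a table like this, the producer has no address to connect to, which is why a missing topic has to be reported rather than worked around.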

@Korijn

Korijn commented May 21, 2014

I guess you're right about that.

wizzat added a commit to wizzat/kafka-python that referenced this issue May 22, 2014
Adds ensure_topic_exists to KafkaClient, redirects test case to use
that.  Fixes dpkp#113 and fixes dpkp#150.
@dpkp dpkp closed this as completed in #174 Aug 11, 2014

9 participants