
Add Hadoop Counter to avoid timeouts #14

Open
locked-fg opened this issue Sep 1, 2014 · 2 comments

@locked-fg commented Sep 1, 2014
I recently had a problem where I was reading a lot of rows from an HBase table and filtering out the majority of them in the first steps of my Scalding job. As a result, the Hadoop counters didn't change and the job timed out after 10 minutes (the default mapred.task.timeout).

Would it be possible to add a counter that counts the lines read (or every hundred lines read) and publishes the value as a Hadoop counter, to avoid timing out?
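For reference, a minimal sketch of the requested behaviour using Scalding's Stat API (available in the Scalding versions of that era; exact signatures may differ across releases). The counter group/name ("SpyGlass"/"rows_read"), the argument names, and the TextLine source are placeholders; the real job would read via SpyGlass's HBaseSource instead:

```scala
import com.twitter.scalding._

class FilteringJob(args: Args) extends Job(args) {
  // Placeholder group/counter names; they show up under this group
  // in the Hadoop job UI.
  val rowsRead = Stat("rows_read", "SpyGlass")

  // TextLine stands in for the real HBase input. Bumping a counter on
  // every input row reports progress to the framework, so the task is
  // not killed by mapred.task.timeout even when the filter drops every
  // row and no output records (or output counters) move.
  TypedPipe.from(TextLine(args("input")))
    .filter { row =>
      rowsRead.incBy(1L)
      row.contains(args("pattern"))
    }
    .write(TypedTsv[String](args("output")))
}
```

In a long-running job the counter could instead be incremented every N rows to cut down on counter traffic, which matches the "every hundred lines read" suggestion above.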

@crajah (Member) commented Sep 1, 2014

Hi Franz,

I don't see why not. If you have an implementation, check it in and I'll pull it.

Cheers,
Chandan

@locked-fg (Author)

Hi Chandan,

I already tried it, but currently I get exceptions when using the self-compiled SpyGlass version. I'll try to fix it later this week, but at the moment I'm a bit stuck.
