
Add Hadoop Counter to avoid timeouts #14

Open
locked-fg opened this issue Sep 1, 2014 · 2 comments

@locked-fg commented Sep 1, 2014
I recently had a problem where I was reading a lot of rows from an HBase table and filtering out the majority of them in the first steps of my Scalding job. As a result, the Hadoop counters didn't change and the job timed out after 10 minutes (the default mapred.task.timeout).

Would it be possible to add a counter that counts the lines read (or every hundred lines read) and publishes the value as a Hadoop counter, to avoid timing out?
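For reference, a minimal sketch of the requested behaviour using Scalding's Stat API (available in the Scalding versions of that era; exact signatures may differ across releases). The counter group/name ("SpyGlass"/"rows_read"), the argument names, and the TextLine source are placeholders; the real job would read via SpyGlass's HBaseSource instead:

```scala
import com.twitter.scalding._

class FilteringJob(args: Args) extends Job(args) {
  // Placeholder group/counter names; they show up under this group
  // in the Hadoop job UI.
  val rowsRead = Stat("rows_read", "SpyGlass")

  // TextLine stands in for the real HBase input. Bumping a counter on
  // every input row reports progress to the framework, so the task is
  // not killed by mapred.task.timeout even when the filter drops every
  // row and no output records (or output counters) move.
  TypedPipe.from(TextLine(args("input")))
    .filter { row =>
      rowsRead.incBy(1L)
      row.contains(args("pattern"))
    }
    .write(TypedTsv[String](args("output")))
}
```

In a long-running job the counter could instead be incremented every N rows to cut down on counter traffic, which matches the "every hundred lines read" suggestion above.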

@crajah (Member) commented Sep 1, 2014

Hi Franz,

I don't see why not. If you have an implementation, check it in and I'll pull it.

Cheers,
Chandan

@locked-fg (Author)

Hi Chandan,

I already tried it, but currently I get exceptions when using the self-compiled SpyGlass version. I'll try to fix it later this week, but at the moment I'm a bit stuck.
