Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow processors to log and spew objects larger than 64K #55

Merged
merged 2 commits into from
May 23, 2017

Conversation

akariv
Copy link
Member

@akariv akariv commented May 23, 2017

This pull request handles 2 issues:

  • Logging lines larger than 64KB
  • Spewing data larger than 64KB

The main problem is that the manager uses the asyncio framework which imposes hard limits on line sizes (as it's an async framework). Processors themselves don't have that problem as they work synchronously.

The 1st issue is handled by detecting such cases and showing a friendly message to the user (the original log line is discarded).
The 2nd issue is resolved by adding a 'sink' processor to every pipeline which makes sure that no large objects are passed to the manager.

(Also added a few missing tests which were clouded by a bad gitignore rule)

try:
line = await out.readline()
except ValueError:
logging.error('Received a too long log line (>64KB), truncated')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the log line being truncated and re-outputted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's being discarded altogether and the error log is shown instead.

I think that log lines should be used for human readable text - limiting log lines at 64K is quite reasonable, no?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - it's fine, but the error message suggests it'll be truncated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got you. Changing to 'discarded'.

@akariv
Copy link
Member Author

akariv commented May 23, 2017

@OriHoch would you mind taking a look too?

@OriHoch
Copy link
Contributor

OriHoch commented May 23, 2017

looks good

from what I understand this doesn't resolve my use where we really needed more then 64kb of output

@akariv
Copy link
Member Author

akariv commented May 23, 2017

Why do you think so?
The tests are passing so it means you can actually yield any size you want now.

@akariv akariv merged commit d263f03 into master May 23, 2017
@akariv akariv deleted the fix/64k-limit branch May 23, 2017 14:50
@OriHoch
Copy link
Contributor

OriHoch commented May 24, 2017

I'll test with my use-case: Beit-Hatfutsot/mojp-dbs-pipelines#4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants