Integration Amazon EMR #81

Open
igorgatis opened this issue Nov 5, 2013 · 0 comments

Sounds like all that's needed is a new backend that talks to the S3 file system and controls EMR jobflows (via the boto API).

Essential features:

  • Read input from and write output to S3.
  • Create new jobflow or reuse existing one.
  • Options to specify the number of instances and their types (e.g. m1.medium).
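
To make the essential features concrete, here is a rough sketch of the option handling such a backend might need. Everything named here (`EmrOptions`, `split_s3_uri`, `needs_new_jobflow`) is hypothetical and not part of this project or of boto; the actual jobflow calls through boto are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: %r" % uri)
    return parsed.netloc, parsed.path.lstrip("/")


@dataclass
class EmrOptions:
    """Hypothetical option bundle for the proposed EMR backend."""
    input_uri: str                    # where the job reads its input (S3)
    output_uri: str                   # where the job writes its output (S3)
    num_instances: int = 1            # number of instances in the jobflow
    instance_type: str = "m1.medium"  # instance type, as in the example above
    jobflow_id: Optional[str] = None  # set this to reuse an existing jobflow

    def validate(self) -> None:
        # Both ends of the job must live on S3.
        split_s3_uri(self.input_uri)
        split_s3_uri(self.output_uri)
        if self.num_instances < 1:
            raise ValueError("num_instances must be >= 1")

    def needs_new_jobflow(self) -> bool:
        # No jobflow ID given means the backend has to create one.
        return self.jobflow_id is None
```

For example, `EmrOptions("s3://b/in", "s3://b/out", num_instances=3)` would ask for a new three-instance jobflow, while passing `jobflow_id` would signal that an existing one should be reused.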

Nice to have:

  • Automatic upload of local input files to S3.
  • Change the number of worker instances.
  • Support for spot instances.
  • Resource estimator for future runs (e.g. try with a sample, figure how long it will take for the full thing).
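
The resource estimator could start out as a simple linear extrapolation from a sample run. The sketch below assumes runtime grows proportionally with input size plus a fixed startup cost, which is only a first approximation; the function name and parameters are made up for illustration.

```python
def estimate_full_runtime(sample_bytes, sample_seconds, full_bytes,
                          fixed_overhead_seconds=0.0):
    """Extrapolate the runtime of a full run from a run on a sample.

    Assumes (hypothetically) that runtime is a fixed overhead, such as
    cluster startup, plus a cost that scales linearly with input size.
    """
    if sample_bytes <= 0:
        raise ValueError("sample_bytes must be positive")
    if sample_seconds < fixed_overhead_seconds:
        raise ValueError("sample run was shorter than the fixed overhead")
    # Time spent per byte of input, once the overhead is subtracted.
    seconds_per_byte = (sample_seconds - fixed_overhead_seconds) / sample_bytes
    return fixed_overhead_seconds + seconds_per_byte * full_bytes
```

With a 100-byte sample that took 70 seconds, of which 20 seconds were startup, a 1000-byte input would be estimated at 520 seconds.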