Finalize shell mapper when file node functionality is ready (0/1) #50
Comments
Can I ask for a more detailed description? I do not understand what to do here.
This task calls for adding File/Archive support to the Shell mapper.
The last one is a tricky part and applies to every template where we use
This task is blocked until we resolve #243.
Post-mortem on what was done in trying to solve this issue:
On 18-19.06.2019 Tomek and I (Szymon) worked on adding file/archive support to the Shell mapper.
Approach 1
First of all, the way we run a shell command is using the
We looked at the API and found the following flag:
Using it we tried setting the appropriate configuration properties to pass the files / archives, based on this SO answer.
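For illustration only, here is a minimal sketch of what building such properties could look like. The property names `mapreduce.job.cache.files` and `mapreduce.job.cache.archives` are standard Hadoop distributed-cache settings, but whether they are the ones from the linked answer is an assumption. The helper name `build_cache_properties` and the bucket paths are hypothetical and do not come from this project.

```python
from typing import Dict, List


def build_cache_properties(files: List[str], archives: List[str]) -> Dict[str, str]:
    """Build a properties dict listing files and archives for the job.

    Hypothetical helper: the property names are an assumption, not taken
    from this thread. Each value is a comma-separated list of URIs.
    """
    properties: Dict[str, str] = {}
    if files:
        properties["mapreduce.job.cache.files"] = ",".join(files)
    if archives:
        properties["mapreduce.job.cache.archives"] = ",".join(archives)
    return properties


# Example with made-up paths.
props = build_cache_properties(
    files=["gs://example-bucket/data/lookup.txt"],
    archives=["gs://example-bucket/libs/tools.zip"],
)
```

A dict of this shape could then be passed wherever the job submission accepts extra properties. Empty inputs produce no entry, so a job without files or archives is left unchanged.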
Unfortunately, it didn't work. The file
Approach 2
Knowing that we have a Pig mapper which produces a DAG with a
We found that a script.pig
The file/archive functionality in a Pig script is handled by modifying the script and adding a few
We ran this on Dataproc and there are a few observations:
Additional actions
When printing
It is removed after the job has completed, so it cannot be inspected. We placed an
Conclusions
We didn't manage to find a way to add file/archive functionality to the Shell mapper. Moreover, it seems that it doesn't work correctly for the Pig mapper either. We've decided to abandon this problem for now and return to it later.