Append HADOOP_CONF_DIR to the tools CLASSPATH execution cmd #1308

Merged
merged 6 commits into from
Aug 22, 2024

Collaborator

@amahussein amahussein commented Aug 21, 2024

Signed-off-by: Ahmed Hussein [email protected]

Fixes #1253
Fixes #1283
Fixes #1302
Fixes #1303

This change includes the following:

  • The wrapper gets Hadoop's configuration directory from environment variables. The first valid directory is added to the java cmd CLASSPATH. Candidate directories are checked in the following order:

      1. `HADOOP_CONF_DIR`
      2. `HADOOP_HOME/conf`
      3. `HADOOP_HOME/etc/hadoop`
    
  • This PR also enforces a URI for the `--output-folder` argument passed to the java cmd. This prevents the core tools from writing the output folder to remote storage when the HDFS configuration defines a default FileSystem.
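
To illustrate the `--output-folder` point: once a Hadoop configuration with a default FileSystem is on the classpath, a bare path such as `/tmp/out` may be resolved against that FileSystem rather than the local disk. A minimal sketch of forcing an explicit `file://` URI is shown below. The helper name `to_local_uri` is hypothetical and is not taken from the PR; the sketch also assumes POSIX-style paths.

```python
from pathlib import Path
from urllib.parse import urlparse


def to_local_uri(output_folder: str) -> str:
    """Hypothetical helper: return output_folder as an explicit file:// URI."""
    parsed = urlparse(output_folder)
    if parsed.scheme == 'file':
        # Already an explicit local URI, keep it as-is.
        return output_folder
    if parsed.scheme:
        # Any other scheme (hdfs://, s3a://, ...) points at remote storage.
        raise ValueError(f'Output folder must be a local directory: {output_folder}')
    # No scheme: make the path absolute, then prefix it with file://
    return Path(output_folder).expanduser().resolve().as_uri()
```

With an explicit scheme, the java side no longer has to guess which FileSystem a scheme-less path belongs to.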

Signed-off-by: Ahmed Hussein <[email protected]>

Fixes NVIDIA#1253
Fixes NVIDIA#1302

This change includes the following:

- The python wrapper pulls the hadoop configuration directory from the `$HADOOP_CONF_DIR` env var. If the latter is not defined, the wrapper tries `$HADOOP_HOME/etc/hadoop`.
- If the `hadoop_conf_dir` is defined, it is appended to the java
  CLASSPATH iff it is a valid local directory path.
- If none of the above applies, the classpath is left unchanged.
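
The lookup and append steps described above could be sketched roughly as follows, using the three-candidate order from the PR description. This is not the code merged in the PR: the function names `find_hadoop_conf_dir` and `build_classpath` are hypothetical, and passing the environment as a mapping is an assumption made only to keep the sketch easy to test.

```python
import os
from typing import List, Mapping, Optional


def find_hadoop_conf_dir(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Hypothetical helper: return the first candidate that is a local directory."""
    candidates: List[str] = []
    if env.get('HADOOP_CONF_DIR'):
        candidates.append(env['HADOOP_CONF_DIR'])
    hadoop_home = env.get('HADOOP_HOME')
    if hadoop_home:
        candidates.append(os.path.join(hadoop_home, 'conf'))
        candidates.append(os.path.join(hadoop_home, 'etc', 'hadoop'))
    for candidate in candidates:
        if os.path.isdir(candidate):
            return candidate
    return None


def build_classpath(jar_file: str, env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper: tools jar first, then the Hadoop conf dir if one exists."""
    deps = [jar_file]
    conf_dir = find_hadoop_conf_dir(env)
    if conf_dir is not None:
        deps.append(conf_dir)
    # os.pathsep is ':' on POSIX and ';' on Windows, matching the java classpath separator.
    return os.pathsep.join(deps)
```

When no candidate directory exists, `build_classpath` returns only the jar path, which matches the "left unchanged" case in the last bullet.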
@amahussein amahussein added bug Something isn't working user_tools Scope the wrapper module running CSP, QualX, and reports (python) labels Aug 21, 2024
@amahussein amahussein self-assigned this Aug 21, 2024
Collaborator

@tgravescs tgravescs left a comment

How is this fixing #1253? Just because we pick up conf now?

user_tools/src/spark_rapids_pytools/rapids/rapids_job.py (outdated, resolved)
user_tools/src/spark_rapids_tools/utils/util.py (outdated, resolved)
tgravescs
tgravescs previously approved these changes Aug 22, 2024
@amahussein
Collaborator Author

How is this fixing #1253? Just because we pick up conf now?

yes.

I added a fix for #1303 as well. It should be considered a quick hack, since the same problem still exists for any other input/output file passed as an argument.
Follow-up tasks are listed in #1304.

except Exception as ex:  # pylint: disable=broad-except
    self.logger.error('Failed in processing output arguments. Output_folder must be a local directory')
    raise ex
# self.output_folder = FSUtil.get_abs_path(self.output_folder)
Collaborator

nit: can be removed

deps_arr = [self.prop_container.get_jar_file()]
hadoop_cp = self._get_hadoop_classpath()
# append hadoop conf dir if any
if hadoop_cp is not None:
Collaborator

Should this throw an exception when hadoop conf dir was not found but event logs are in hdfs?

Signed-off-by: Ahmed Hussein <[email protected]>
Signed-off-by: Ahmed Hussein <[email protected]>
Collaborator

@parthosa parthosa left a comment

Thanks @amahussein.

@amahussein amahussein merged commit 72f7e57 into NVIDIA:dev Aug 22, 2024
14 checks passed
@amahussein amahussein deleted the rapids-tools-1253 branch August 22, 2024 19:49