Handle SparkRapidsBuildInfoEvent in GPU event logs #1203

Merged
cindyyuanjiang merged 18 commits into NVIDIA:dev from spark-rapids-tools-995 on Aug 22, 2024

Conversation

cindyyuanjiang
Collaborator

@cindyyuanjiang cindyyuanjiang commented Jul 19, 2024

Fixes #995

Changes

  • Handles the new SparkRapidsBuildInfoEvent added in Spark-Rapids
  • Writes the SparkRapidsBuildInfoEvent contents to the file spark_rapids_build_info.json under rapids_4_spark_profile/application_xxxxxx/

Testing

  1. Test with GPU event log with SparkRapidsBuildInfoEvent

spark_rapids profiling -v --eventlogs <my_gpu_log> --tools_jar <my_tools_jar>

spark_rapids_build_info.json
[ {
  "sparkRapidsBuildInfo" : {
    "url" : "https://github.com/NVIDIA/spark-rapids.git",
    "branch" : "HEAD",
    "revision" : "f932a7802bbf31b6205358d1abd7c7b49c8bea3c",
    "version" : "24.06.0",
    "date" : "2024-06-13T19:48:28Z",
    "cudf_version" : "24.06.0",
    "user" : "root"
  },
  "sparkRapidsJniBuildInfo" : {
    "url" : "https://github.com/NVIDIA/spark-rapids-jni.git",
    "branch" : "HEAD",
    "gpu_architectures" : "70;75;80;86;90",
    "revision" : "e9c92a5339437ce0cd72bc384084bd7ee45b37f9",
    "version" : "24.06.0",
    "date" : "2024-06-08T01:21:57Z",
    "user" : "root"
  },
  "cudfBuildInfo" : {
    "url" : "https://github.com/rapidsai/cudf.git",
    "branch" : "HEAD",
    "gpu_architectures" : "70;75;80;86;90",
    "revision" : "7c706cc4004d5feaae92544b3b29a00c64f7ed86",
    "version" : "24.06.0",
    "date" : "2024-06-08T01:21:55Z",
    "user" : "root"
  },
  "sparkRapidsPrivateBuildInfo" : {
    "url" : "https://gitlab-master.nvidia.com/nvspark/spark-rapids-private.git",
    "branch" : "HEAD",
    "revision" : "755b4dd03c753cacb7d141f3b3c8ff9f83888b69",
    "version" : "24.06.0",
    "date" : "2024-06-08T11:44:03Z",
    "user" : "root"
  }
} ]
  2. Test with event log without SparkRapidsBuildInfoEvent

spark_rapids profiling -v --eventlogs <my_gpu_log> --tools_jar <my_tools_jar>

spark_rapids_build_info.json
[ {
  "sparkRapidsBuildInfo" : { },
  "sparkRapidsJniBuildInfo" : { },
  "cudfBuildInfo" : { },
  "sparkRapidsPrivateBuildInfo" : { }
} ]
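For anyone consuming this file downstream, here is a minimal Python sketch of telling the two cases above apart. It assumes only the JSON layout shown in the samples; the helper names (`summarize_build_info`, `has_build_info`) are hypothetical and not part of the tools.

```python
import json

# Component keys as they appear in the sample output above.
COMPONENTS = (
    "sparkRapidsBuildInfo",
    "sparkRapidsJniBuildInfo",
    "cudfBuildInfo",
    "sparkRapidsPrivateBuildInfo",
)


def summarize_build_info(text):
    """Map each component to its "version", or None when its object is empty."""
    records = json.loads(text)
    entry = records[0] if records else {}
    return {name: entry.get(name, {}).get("version") for name in COMPONENTS}


def has_build_info(text):
    """True if at least one component reported a version."""
    return any(v is not None for v in summarize_build_info(text).values())
```

With the second sample (all objects empty), `has_build_info` returns False, which is one way to detect an event log that predates the event.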

Follow up

#1296 to fix the hardcoded SparkRapidsBuildInfoEvent

Collaborator

@tgravescs tgravescs left a comment

Can you update the description to include what the solution is here and how the user sees this information?

Collaborator

@amahussein amahussein left a comment

Thanks @cindyyuanjiang
I believe we want to keep the runtimeReport intact.

  • The RAPIDS plugin information is per-app, because it is extracted from the eventlog. Therefore, we expect it to be in the Profiler output of the individual application.

  • We need to maintain backward compatibility. If we profile an older GPU eventlog, we should still build the RAPIDS information. We can use a common object that gets populated either from the event handler or from parsing the Spark properties. Eventually, support for the legacy method will be dropped once all customers are using the most recent RAPIDS versions.

  • The information needs to be passed to the output and to the autotuner as well.
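To make the "common object" idea concrete, here is a rough Python sketch (the tools core is Scala, so this is illustration only). Everything in it is an assumption: the `RapidsBuildInfo` class, the `resolve_build_info` function, the shape of the event dict, and the idea that the legacy path can recover a version from a jar name such as rapids-4-spark_2.12-24.06.0.jar found in a Spark property value.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Mapping, Optional

# Assumed jar naming, e.g. rapids-4-spark_2.12-24.06.0.jar (illustrative only).
JAR_VERSION = re.compile(r"rapids-4-spark_[\d.]+-(\d+\.\d+\.\d+)")


@dataclass
class RapidsBuildInfo:
    """Hypothetical common object shared by the output writer and the autotuner."""
    spark_rapids: Dict[str, str] = field(default_factory=dict)
    source: str = "none"  # "event", "properties", or "none"


def resolve_build_info(event: Optional[Mapping],
                       spark_properties: Mapping[str, str]) -> RapidsBuildInfo:
    # Preferred path: the event handler saw a SparkRapidsBuildInfoEvent.
    if event and event.get("sparkRapidsBuildInfo"):
        return RapidsBuildInfo(dict(event["sparkRapidsBuildInfo"]), "event")
    # Legacy path: recover what we can from the Spark property values.
    for value in spark_properties.values():
        match = JAR_VERSION.search(value)
        if match:
            return RapidsBuildInfo({"version": match.group(1)}, "properties")
    # Neither source had anything; callers get an empty object.
    return RapidsBuildInfo()
```

Keeping a `source` field would let the legacy branch be removed later without changing what the output writer or the autotuner consume.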

@amahussein amahussein added the core_tools Scope the core module (scala) label Jul 19, 2024
@cindyyuanjiang cindyyuanjiang changed the title Handle SparkRapidsBuildInfoEvent in GPU event logs WIP: Handle SparkRapidsBuildInfoEvent in GPU event logs Jul 19, 2024
@cindyyuanjiang cindyyuanjiang marked this pull request as ready for review August 14, 2024 00:44
@cindyyuanjiang cindyyuanjiang changed the title WIP: Handle SparkRapidsBuildInfoEvent in GPU event logs Handle SparkRapidsBuildInfoEvent in GPU event logs Aug 14, 2024
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
Signed-off-by: cindyyuanjiang <[email protected]>
tgravescs
tgravescs previously approved these changes Aug 19, 2024
Collaborator

@tgravescs tgravescs left a comment

overall looks good, it would be nice to have a test to verify that it outputs the file and has valid data when the event is present. Maybe that they are empty when the event isn't present in older ones.

Signed-off-by: cindyyuanjiang <[email protected]>
parthosa
parthosa previously approved these changes Aug 19, 2024
Collaborator

@parthosa parthosa left a comment

Thanks @cindyyuanjiang. LGTME. Similar comments as Tom.

@cindyyuanjiang
Collaborator Author

overall looks good, it would be nice to have a test to verify that it outputs the file and has valid data when the event is present. Maybe that they are empty when the event isn't present in older ones.

Thanks @tgravescs! I added two test cases to verify spark_rapids_build_info.json output when SparkRapidsBuildInfoEvent is present in the event log and when it is not. PTAL.
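The shape of those two checks can be sketched in Python as follows. This is not the test code added in the PR: `write_build_info` is a hypothetical stand-in for the profiler's writer, and only the file name and JSON layout are taken from the description above.

```python
import json
import tempfile
from pathlib import Path

KEYS = (
    "sparkRapidsBuildInfo",
    "sparkRapidsJniBuildInfo",
    "cudfBuildInfo",
    "sparkRapidsPrivateBuildInfo",
)


def write_build_info(app_dir, event):
    """Hypothetical stand-in for the profiler's writer; not the real implementation."""
    event = event or {}
    out = Path(app_dir) / "spark_rapids_build_info.json"
    out.write_text(json.dumps([{key: event.get(key, {}) for key in KEYS}], indent=2))
    return out


def check_output(event):
    """Write the file to a temp dir and return the single parsed record."""
    with tempfile.TemporaryDirectory() as tmp:
        out = write_build_info(tmp, event)
        assert out.name == "spark_rapids_build_info.json"
        records = json.loads(out.read_text())
    assert len(records) == 1 and set(records[0]) == set(KEYS)
    return records[0]


def test_event_present():
    # Event present: the values from the event should appear in the file.
    record = check_output({"sparkRapidsBuildInfo": {"version": "24.06.0"}})
    assert record["sparkRapidsBuildInfo"]["version"] == "24.06.0"


def test_event_absent():
    # Event absent (older event log): every component object should be empty.
    record = check_output(None)
    assert all(value == {} for value in record.values())
```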

Signed-off-by: cindyyuanjiang <[email protected]>
@cindyyuanjiang cindyyuanjiang dismissed stale reviews from parthosa and tgravescs via 9370675 August 20, 2024 00:57
Signed-off-by: cindyyuanjiang <[email protected]>
Collaborator

@tgravescs tgravescs left a comment

thanks Cindy!

Collaborator

@parthosa parthosa left a comment

Collaborator

@amahussein amahussein left a comment

LGTME
Thanks @cindyyuanjiang

@cindyyuanjiang cindyyuanjiang merged commit 2a88516 into NVIDIA:dev Aug 22, 2024
14 checks passed
@cindyyuanjiang cindyyuanjiang deleted the spark-rapids-tools-995 branch August 22, 2024 18:30
Labels
core_tools Scope the core module (scala) feature request New feature or request
Development

Successfully merging this pull request may close these issues.

[FEA] Add implementation to handle SparkRapidsBuildInfoEvent in GPU eventlogs
4 participants