Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling FileNotFound exception in AutoTuner #1065

Merged
merged 9 commits into from
Jun 6, 2024

Conversation

bilalbari
Copy link
Collaborator

@bilalbari bilalbari commented Jun 4, 2024

Fixes #978

This fix handles the AutoTuner exception when an input file passed as workerInfo arguments does not exist.

Changes -

  1. Catching the FileNotFound exception and adding a warning for it.
  2. As well as appending a tuner comment which conveys this
  3. Minor change to handle NonFatal exception while building AutoTuner from props

@bilalbari bilalbari requested a review from amahussein June 4, 2024 22:38
@amahussein amahussein added bug Something isn't working core_tools Scope the core module (scala) labels Jun 5, 2024
@amahussein amahussein changed the title Changes for handling FileNotFound exception and correcting the test s… Handling FileNotFound exception in AutoTuner Jun 5, 2024
Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Bilal. Added a few comments

case _: FileNotFoundException =>
logWarning("No yaml found for input path")
None
case e: Exception => throw e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this case exception

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@@ -245,7 +245,6 @@ class AutoTunerSuite extends FunSuite with BeforeAndAfterEach with Logging {
| path to the Spark RAPIDS plugin jar.
| If the Spark RAPIDS jar is being bundled with your Spark
| distribution, this step is not needed.
|- java.io.FileNotFoundException: File non-existing.yaml does not exist
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still useful to have a comment that the input path was not found.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bilalbari . Just a small comment to update

val autoT = new AutoTuner(clusterPropsOpt.getOrElse(new ClusterProperties()),
singleAppProvider, platform, driverInfoProvider)
if(clusterPropsOpt.isEmpty) {
autoT.appendComment(s"workerInfo file not found at $workerInfoFilePath. " +
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here to explain why we are making this check.
This makes the code more readable

// In case the workerInfo input path was incorrect, the AutoTuner needs to add a comment that the recommendations were generated by assuming default worker properties.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amahussein . This has been updated.

@amahussein amahussein requested a review from tgravescs June 5, 2024 16:28
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bilalbari
Lets wait for @tgravescs feedback on the AutoTuner's change

@@ -1215,7 +1216,15 @@ object AutoTuner extends Logging {
val reader = new BufferedReader(new InputStreamReader(fsIs))
val fileContent = Stream.continually(reader.readLine()).takeWhile(_ != null).mkString("\n")
loadClusterPropertiesFromContent(fileContent)
} finally {
}
catch {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, these should go on previous line
" } catch {"
"} finally {"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

catch {
// In case of missing file for cluster properties, default properties are used.
// Hence, catching and logging as a warning
case _: FileNotFoundException =>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming there is a chance something else bad happens here, maybe a misformatted file or issue getting the file FileSystem.get returns IOException for instance. its probably safer to catch nonFatal errors here and log and return None

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! in this method we can capture IOException which is more generic.
The caller method catches the NoFatal(e) to cover other sort of exceptions that we are not aware of

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tgravescs @amahussein . This has been corrected

driverInfoProvider)
val autoT = new AutoTuner(clusterPropsOpt.getOrElse(new ClusterProperties()),
singleAppProvider, platform, driverInfoProvider)
if(clusterPropsOpt.isEmpty) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit add a space after if

"if (cluster"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Copy link
Collaborator

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bilalbari
LGTME

@amahussein amahussein merged commit bc705cb into NVIDIA:dev Jun 6, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working core_tools Scope the core module (scala)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Qualification tool throws error in AutoTuner when it shouldn't
3 participants