Include exception message in errorMessage for failed task #16286

Merged on Apr 18, 2024 (5 commits)

YongGang
Contributor

Description

The following is an example of an auto-compaction subtask failure as it appears in the task Status field of the Druid console.
This PR includes the actual exception message in the task status output, so the failure can be understood without having to check the Overlord logs.

{
  "id": "partial_range_index_generate...",
  "groupId": "coordinator-issued_compact...",
  "type": "partial_range_index_generate",
  "createdTime": "2024-04-15T16:41:58.300Z",
  "queueInsertionTime": "1970-01-01T00:00:00.000Z",
  "statusCode": "FAILED",
  "status": "FAILED",
  "runnerStatusCode": "WAITING",
  "duration": -1,
  "location": {
    "host": null,
    "port": -1,
    "tlsPort": -1
  },
  "dataSource": "data_source",
  "errorMsg": "Failed while waiting for the task to be ready to run. See overlord logs for more details."
}

Key changed/added classes in this PR
  • TaskQueue.java: adds the exception message to errorMessage

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

kfaraz commented Apr 16, 2024

@YongGang, TaskQueueTest is failing.

-         + "See overlord logs for more details.";
+     errorMessage = StringUtils.format(
+         "Encountered error[%s] while waiting for task to be ready. See Overlord logs for more details.",
+         e.getMessage()

Contributor

It would also be a good idea to trim this message if it exceeds a certain length.

Contributor Author

Thanks Kashif, updated the code.

Contributor

We should chop only the contents of e.getMessage(). Chopping off the part "while waiting for task to be ready. See Overlord logs for more details." would be less user-friendly.

Contributor

Suggested change
- e.getMessage()
+ StringUtils.chop(e.getMessage(), 100)

Contributor Author

Updated
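
The outcome of this thread can be sketched in plain Java roughly as follows. This is an illustration under assumptions, not the code merged into TaskQueue.java: the class ErrorMessageSketch, the method buildErrorMessage, and the local chop helper are hypothetical stand-ins written with JDK calls only so the snippet is self-contained, and the 100-character limit is simply the value from the suggestion above.

```java
public class ErrorMessageSketch
{
  // Limit taken from the review suggestion above; the merged value may differ.
  private static final int MAX_EXCEPTION_MESSAGE_LENGTH = 100;

  // Hypothetical stand-in for a chop-style helper: keeps at most maxLength chars.
  static String chop(String s, int maxLength)
  {
    if (s == null || s.length() <= maxLength) {
      return s;
    }
    return s.substring(0, maxLength);
  }

  // Only the exception message is truncated; the fixed guidance text stays intact.
  static String buildErrorMessage(Exception e)
  {
    return String.format(
        "Encountered error[%s] while waiting for task to be ready. See Overlord logs for more details.",
        chop(e.getMessage(), MAX_EXCEPTION_MESSAGE_LENGTH)
    );
  }

  public static void main(String[] args)
  {
    // Build an overly long exception message to show the truncation.
    StringBuilder longMessage = new StringBuilder();
    for (int i = 0; i < 500; i++) {
      longMessage.append('x');
    }
    System.out.println(buildErrorMessage(new RuntimeException(longMessage.toString())));
  }
}
```

Truncating before formatting, rather than chopping the finished string, is what keeps the "See Overlord logs for more details." hint visible no matter how long the underlying exception message is.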

@@ -314,7 +315,7 @@ public boolean isReady(TaskActionClient taskActionClient)
     Assert.assertEquals(TaskState.FAILED, statusOptional.get().getStatusCode());
     Assert.assertNotNull(statusOptional.get().getErrorMsg());
     Assert.assertTrue(
-        statusOptional.get().getErrorMsg().startsWith("Failed while waiting for the task to be ready to run")
+        statusOptional.get().getErrorMsg().contains(exceptionMsg)

Contributor

Nit: It is better to verify the whole error message rather than only checking with contains.
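
A small plain-Java sketch of why the whole-message check is stricter. The class WholeMessageCheckSketch and its methods are hypothetical and do not come from TaskQueueTest; the message format is copied from the diff earlier in this conversation, and the sample strings are made up for illustration.

```java
public class WholeMessageCheckSketch
{
  // Expected message, assuming the format string shown in the diff above.
  static String expectedErrorMessage(String exceptionMsg)
  {
    return String.format(
        "Encountered error[%s] while waiting for task to be ready. See Overlord logs for more details.",
        exceptionMsg
    );
  }

  // Loose check: passes as long as the exception text appears anywhere.
  static boolean passesContainsCheck(String actual, String exceptionMsg)
  {
    return actual.contains(exceptionMsg);
  }

  // Strict check: the entire message must match the expected text.
  static boolean passesWholeMessageCheck(String actual, String exceptionMsg)
  {
    return expectedErrorMessage(exceptionMsg).equals(actual);
  }

  public static void main(String[] args)
  {
    String exceptionMsg = "sample failure"; // made-up exception text
    // A message with the wrong surrounding wording that still embeds the exception text.
    String regressed = "Unexpected wording: " + exceptionMsg;

    System.out.println(passesContainsCheck(regressed, exceptionMsg));
    System.out.println(passesWholeMessageCheck(regressed, exceptionMsg));
  }
}
```

In a JUnit test the strict variant would typically be an equality assertion against the fully formatted expected string, so a change to the surrounding wording is caught as well.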

kfaraz merged commit 6974498 into apache:master on Apr 18, 2024
85 checks passed
adarshsanjeev added this to the 30.0.0 milestone on May 6, 2024