This repository has been archived by the owner on Jun 6, 2024. It is now read-only.

Add more task debug info in jobInfo api response [rest-server] #4667

Merged
merged 10 commits into from
Jul 7, 2020
Changes from 7 commits
71 changes: 71 additions & 0 deletions src/rest-server/docs/swagger.yaml
@@ -1971,10 +1971,16 @@ components:
type: string
reaction:
type: string
reason:
type: string
repro:
type: array
items:
type: string
solution:
type: array
items:
type: string
appExitDiagnostics:
type: string
nullable: true
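As a rough illustration of how a client might use the added `reason`, `repro` and `solution` fields, the sketch below formats an exit spec object into a readable message. The helper name `formatExitSpec` and all sample values are hypothetical; only the field names are taken from the schema in this hunk.

```javascript
// Hypothetical helper: turn an exit spec object into a human-readable message.
// Only the field names (phrase, reason, repro, solution) come from the schema.
function formatExitSpec(spec) {
  if (!spec) {
    return 'No exit spec available';
  }
  const lines = [`${spec.phrase}: ${spec.reason || 'unknown reason'}`];
  // repro and solution are arrays of strings; treat a missing array as empty.
  for (const step of spec.repro || []) {
    lines.push(`  repro: ${step}`);
  }
  for (const step of spec.solution || []) {
    lines.push(`  solution: ${step}`);
  }
  return lines.join('\n');
}

// Sample values are made up for illustration.
const sampleSpec = {
  code: 1,
  phrase: 'SampleFailure',
  reason: 'Container exited abnormally',
  repro: ['Run a command that exits with 1'],
  solution: ['Check the container log'],
};
console.log(formatExitSpec(sampleSpec));
```

Treating a missing array as empty keeps the helper usable with responses from servers that predate these fields.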
@@ -2065,6 +2071,9 @@ components:
type: string
nullable: true
description: ip of the task container
containerNodeName:
type: string
description: node name of task container
containerPorts:
type: object
description: ports of the task container
@@ -2079,6 +2088,68 @@ components:
type: integer
nullable: true
description: exit code the task container
containerExitSpec:
type: object
nullable: true
description: container exit spec
properties:
code:
type: integer
phrase:
type: string
issuer:
type: string
causer:
type: string
type:
type: string
stage:
type: string
behavior:
type: string
reaction:
type: string
reason:
type: string
repro:
type: array
items:
type: string
solution:
type: array
items:
type: string
containerExitDiagnostics:
type: string
nullable: true
description: container exit diagnostics
retries:
type: integer
accountableRetries:
type: integer
createdTime:
type: integer
description: >-
task created time, in number of milliseconds since the Unix
Epoch.
completedTime:
type: integer
description: >-
task completion time, in number of milliseconds since the Unix
Epoch.
currentAttemptLaunchedTime:
type: integer
description: >-
the last attempt launched time, in number of milliseconds since the Unix
Epoch.
Review comment (Member), on "the last attempt": current attempt launched time
currentAttemptCompletedTime:
type: integer
description: >-
the last attempt completion time, in number of milliseconds since the Unix
Epoch.
hived:
type: object
nullable: true
required:
- taskRoleStatus
- taskStatuses
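The timestamp fields added in this hunk are documented as milliseconds since the Unix Epoch, so a consumer can derive durations by subtraction. The following is a minimal sketch under that assumption; `currentAttemptDurationMs` and the sample task object are hypothetical and not part of the PR.

```javascript
// Hypothetical helper: duration of the current attempt in milliseconds.
// Returns null when either timestamp is missing, not a finite number,
// or when the completion time precedes the launch time.
function currentAttemptDurationMs(task) {
  const start = task.currentAttemptLaunchedTime;
  const end = task.currentAttemptCompletedTime;
  if (!Number.isFinite(start) || !Number.isFinite(end) || end < start) {
    return null;
  }
  return end - start;
}

// Sample values are made up for illustration.
const sampleTask = {
  currentAttemptLaunchedTime: 1594080000000,
  currentAttemptCompletedTime: 1594080090000,
};
console.log(currentAttemptDurationMs(sampleTask)); // 90000
```

The guard against `end < start` also covers a completion time of 0, which a server could plausibly report for an attempt that has not finished yet.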
11 changes: 11 additions & 0 deletions src/rest-server/src/models/v2/job/k8s.js
@@ -213,6 +213,8 @@ const convertTaskDetail = async (taskStatus, ports, logPathPrefix) => {
const containerGpus = null;

const completionStatus = taskStatus.attemptStatus.completionStatus;
const diagnostics = completionStatus ? completionStatus.diagnostics : null;
const exitDiagnostics = generateExitDiagnostics(diagnostics);
return {
taskIndex: taskStatus.index,
taskState: convertState(
@@ -222,10 +224,19 @@ const convertTaskDetail = async (taskStatus, ports, logPathPrefix) => {
),
containerId: taskStatus.attemptStatus.podUID,
containerIp: taskStatus.attemptStatus.podHostIP,
containerNodeName: taskStatus.attemptStatus.podNodeName,
containerPorts,
containerGpus,
containerLog: `http://${taskStatus.attemptStatus.podHostIP}:${process.env.LOG_MANAGER_PORT}/log-manager/tail/${logPathPrefix}/${taskStatus.attemptStatus.podUID}/`,
containerExitCode: completionStatus ? completionStatus.code : null,
containerExitSpec: completionStatus ? generateExitSpec(completionStatus.code) : generateExitSpec(null),
containerExitDiagnostics: exitDiagnostics ? exitDiagnostics.diagnosticsSummary : null,
retries: taskStatus.retryPolicyStatus.totalRetriedCount,
accountableRetries: taskStatus.retryPolicyStatus.accountableRetriedCount,
createdTime: new Date(taskStatus.startTime).getTime(),
completedTime: new Date(taskStatus.completionTime).getTime(),
currentAttemptLaunchedTime: new Date(taskStatus.attemptStatus.runTime || taskStatus.attemptStatus.startTime).getTime(),
currentAttemptCompletedTime: new Date(taskStatus.attemptStatus.completionTime).getTime(),
...launcherConfig.enabledHived && {
hived: {
affinityGroupName,
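One detail worth noting about the `new Date(...).getTime()` conversions in this hunk: in standard JavaScript, `new Date(null).getTime()` evaluates to `0` and `new Date(undefined).getTime()` evaluates to `NaN`. If a timestamp such as `completionTime` can be absent for a task that is still running (an assumption; the diff does not say), the response would then carry `0` or `NaN` rather than an explicit null. The sketch below shows one possible guard; `toEpochMs` is a hypothetical helper, not something this PR defines.

```javascript
// Hypothetical helper: convert a timestamp value to milliseconds since the
// Unix Epoch, returning null instead of 0 or NaN when the input is missing
// or cannot be parsed as a date.
function toEpochMs(value) {
  if (value === null || value === undefined) {
    return null;
  }
  const ms = new Date(value).getTime();
  return Number.isNaN(ms) ? null : ms;
}

console.log(new Date(null).getTime()); // 0
console.log(toEpochMs(null)); // null
console.log(toEpochMs('2020-07-07T00:00:00Z')); // 1594080000000
```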
@@ -96,7 +96,7 @@ const submitJob = jobConfig => {
const user = cookies.get('user');
loading.showLoading();
$.ajax({
- url: `${webportalConfig.restServerUri}/api/v1/jobs/${user}~${jobConfig.jobName}`,
+ url: `${webportalConfig.restServerUri}/api/v2/jobs/${user}~${jobConfig.jobName}`,
data: JSON.stringify(jobConfig),
headers: {
Authorization: `Bearer ${token}`,
@@ -205,7 +205,7 @@ $(document).ready(() => {
if (type != null && username != null && jobName != null) {
const url =
username === ''
- ? `${webportalConfig.restServerUri}/api/v1/jobs/${jobName}/config`
+ ? `${webportalConfig.restServerUri}/api/v2/jobs/${jobName}/config`
: `${webportalConfig.restServerUri}/api/v2/jobs/${username}~${jobName}/config`;
$.ajax({
url: url,
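After this change both branches of the ternary point at the v2 API, so the URL construction could be expressed in one place. The sketch below mirrors the logic in the hunk; `jobConfigUrl` is a hypothetical helper and the host in the usage lines is a placeholder, neither of which appears in the PR.

```javascript
// Hypothetical helper mirroring the ternary above: build the v2 job config
// URL, using "username~jobName" as the job id unless the username is empty.
function jobConfigUrl(restServerUri, username, jobName) {
  const jobId = username === '' ? jobName : `${username}~${jobName}`;
  return `${restServerUri}/api/v2/jobs/${jobId}/config`;
}

// Placeholder host, for illustration only.
console.log(jobConfigUrl('http://example.invalid', 'alice', 'demo'));
console.log(jobConfigUrl('http://example.invalid', '', 'demo'));
```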