-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(portal): 将文件直接提交为sbatch命令 #891
Conversation
🦋 Changeset detectedLatest commit: e4c1bdf The changes in this PR will be included in the next version bump. This PR includes changesets to release 17 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #891 +/- ##
==========================================
- Coverage 68.89% 68.67% -0.23%
==========================================
Files 132 132
Lines 3926 3946 +20
Branches 523 528 +5
==========================================
+ Hits 2705 2710 +5
- Misses 1123 1138 +15
Partials 98 98
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个功能可以不增加新的调度器接口,而完全由SCOW完全使用已有的调度器接口实现吗?
.changeset/cold-moons-bathe.md
Outdated
"@scow/grpc-api": patch | ||
--- | ||
|
||
新增 submitFileAsJob 接口,连接 slurm 调度器,直接提交文件 sbatch 执行 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
修改说法,后端不一定是slurm调度器。改成“直接把文件作为作业提交调度器执行"
.changeset/proud-seahorses-repeat.md
Outdated
"@scow/mis-web": patch | ||
--- | ||
|
||
新增文件管理下提交任意文件直接作为作业文本 sbatch 执行的功能 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
注意说法,”直接作为作业文本提交调度器执行“
@@ -5,7 +5,7 @@ | |||
"main": "build/index.js", | |||
"private": true, | |||
"scripts": { | |||
"generate": "rimraf generated && buf generate --template buf.gen.yaml https://github.com/PKUHPC/scow-scheduler-adapter-interface.git", | |||
"generate": "rimraf generated && buf generate --template buf.gen.yaml https://github.com/PKUHPC/scow-scheduler-adapter-interface.git#branch=add_job_submit_file_as_job_proto", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SCOW主线代码里不能出现这种使用调度器接口仓库的非master分支的情况。这个PR应该等新的调度器接口发布后才合并。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
好的,这个是打算全部review、测试完没有问题再把scow-slurm-interface那边合并到主分支,然后再修改scow这边generate的调度器适配器仓库
因为原有调度器接口是在提交作业时,获得所有提交作业的参数,然后调度器自己执行类似下方的脚本拼接
所以如果想使用旧的接口,需要scow逐句解析出任意文件的对应各个参数,跟杨老师一起商量了一下最后新增了调度器的接口 |
changeset文字已修改,请确认 @ddadaal |
OK,那可以保留这个API。如果功能测试没有问题了,去接口的仓库提交PR,先把接口合并后再来合并这个。 |
好的,那请 @lyl-available 先测试,我在测试的时候先更新一下测试用的调度器适配器 |
测试通过 |
@ddadaal |
@ddadaal 已更新scow-scheduler-adapter-interface版本号至v1.2.0 |
const reply = await asyncClientCall(client.job, "submitScriptAsJob", { | ||
userId, script, | ||
}).catch((e) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const errors = parseErrorDetails(ex.metadata); | ||
// 如果接口不存在 | ||
if (errors[0] && errors[0].$type === "google.rpc.ErrorInfo" && errors[0].reason === "UNIMPLEMENTED") { | ||
scheduleApiVersion = { major: 1, minor: 1, patch: 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
接口不存在默认是1.0.0,因为1.1.0是要求实现这个接口的。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改为1.0.0
* @param comparedVersion | ||
*/ | ||
export async function checkSchedulerApiVersion(client: SchedulerAdapterClient, | ||
comparedVersion: ApiVersion): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个函数可以通用非常好,但是还是设计复杂了,目前只需要确认获取到的版本大于等于要求的版本即可。第二个参数可以命名为minVersion,更贴切
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改,请确认
scheduleApiVersion = null; | ||
logger.warn( | ||
"The scheduler API version can not be confirmed. Some functionalities may not operate as expected."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
如果甚至都获取不到版本,报错。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改为报错。
scheduleApiVersion = { major: 1, minor: 1, patch: 0 }; | ||
// 如果接口服务不存在 | ||
} else if ((e as any).code === status.UNIMPLEMENTED) { | ||
scheduleApiVersion = { major: 1, minor: 1, patch: 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
接口服务不存在是指VersionService不存在吗?那也是认为是1.0.0。不需要区分
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
是的。已修改为服务或接口不存在都认为版本为1.0.0,请确认
} | ||
|
||
if (scheduleApiVersion) { | ||
const compareResult = compareSemVersion(comparedVersion, scheduleApiVersion); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
现在只要判断版本是否>=就可以了。以后需要的话以后再说吧
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改为在本函数下直接判断调度器api版本是否大于等于最低要求版本,只比较major和minor不再比较patch版本,请确认
// /** | ||
// * 判断当前集群下的API版本与调度器API版本 | ||
// * @param client | ||
// * @returns | ||
// */ | ||
// export async function checkScowSchedulerApiVersion(client: SchedulerAdapterClient): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
没太看懂这个函数是在做什么……SCOW只需要管要调用的集群的调度器版本,不需要管SCOW自己的版本
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已删除注释掉的代码部分
做了什么
需求
在文件管理中增加对所有文件对象增加”提交“按钮直接sbatch执行
暂时只考虑提交的文本作为作业直接执行
执行后也应该展示在正在运行的作业和已结束的作业中
提交文件大小限制为1M
实现
此pr完成上述需求
增加文件是否为文本文件的判断
增加此操作在审计系统中的操作日志
提交失败时
操作日志追加
调度器适配器接口仓库分支暂时变更为临时分支,测试结束后合并调度器适配器接口分支进主分支,修改复原SCOW中调度器适配器仓库