refactor(mis,portal): refactor the SCOW backend to integrate with the scheduler adapter interface #632
🦋 Changeset detected
Latest commit: 912eff9
The changes in this PR will be included in the next version bump. This PR includes changesets to release 18 packages.
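Correctness cannot be tested yet; this is posted mainly for an early look.
Syncing more than a thousand jobs at once to the management system for charging had a problem (now fixed).
The fetchjob optimization has passed testing.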
      em.persist(pricedJob);
      await em.flush();

      pricedJobs.push(pricedJob);
    } catch (error) {
      logger.warn("invalid job. cluster: %s, jobId: %s, error: %s", job.cluster, job.jobId, error);
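To skip problematic jobs, flush is attempted for each job individually; if it fails (for example, because a field value overflows), that job is skipped.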
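The per-job flush described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `Job` and `UnitOfWork` shapes and the `saveJobsSkippingInvalid` name are hypothetical stand-ins for the real job entity and the ORM entity manager (`em`) seen in the diff.

```typescript
// Hypothetical minimal job shape; the real entity has many more fields.
interface Job {
  cluster: string;
  jobId: number;
}

// Hypothetical stand-in for the entity manager used in the diff (em.persist / em.flush).
interface UnitOfWork<T> {
  persist(entity: T): void;
  flush(): Promise<void>;
}

// Persist and flush each job on its own, so that a single bad record
// (e.g. a field value that overflows its column) is logged and skipped
// instead of aborting the whole batch.
async function saveJobsSkippingInvalid<T extends Job>(
  uow: UnitOfWork<T>,
  jobs: T[],
  warn: (message: string) => void,
): Promise<T[]> {
  const saved: T[] = [];
  for (const job of jobs) {
    try {
      uow.persist(job);
      await uow.flush();
      saved.push(job);
    } catch (error) {
      warn(`invalid job. cluster: ${job.cluster}, jobId: ${job.jobId}, error: ${String(error)}`);
    }
  }
  return saved;
}
```

The trade-off is one database round trip per job instead of one per batch, in exchange for isolating failures to the individual record. How a failed entity is discarded from the real entity manager's pending state is outside the scope of this sketch.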
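1. Deploy the scheduler adapter
First, make sure the corresponding scheduler adapter is deployed on your cluster, and note the address and port used to reach it.
To deploy the adapter, refer to the documentation: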
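2. Modify the SCOW configuration files
First, make sure you are using the latest SCOW image (check the imageTag field in install.yaml). Then, in the scow-deployment folder used to deploy SCOW, modify the configuration files:
First, modify the cluster configuration file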
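The main changes are that the slurm configuration item is removed and loginNodes becomes a standalone configuration item. A new adapterUrl configuration item is added to specify the adapter address.
Modify the management system configuration file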
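The db entry under the fetchJobs configuration item has been removed: the source job information database is no longer used, and job information is synced through the adapter.
3. The source job information database is no longer used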
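Once the adapter is deployed and in use, the export-jobs project no longer needs to be deployed; syncing job information is handled by the adapter.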
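The configuration changes in step 2 can be illustrated with a small sketch. Everything here is an assumption made for illustration: the field layout (in particular loginNodes previously living under slurm), the `migrateClusterConfig` and `dropFetchJobsDb` helper names, and the adapter address used in the usage note are hypothetical, and the real configuration files are YAML rather than TypeScript objects.

```typescript
// Assumed pre-migration cluster config: loginNodes nested under a slurm section.
interface OldClusterConfig {
  displayName: string;
  slurm: { loginNodes: string[]; [key: string]: unknown };
}

// Assumed post-migration cluster config: no slurm section,
// loginNodes at the top level, plus the adapter address.
interface NewClusterConfig {
  displayName: string;
  adapterUrl: string;
  loginNodes: string[];
}

// Drop the slurm section, lift loginNodes to the top level, and add adapterUrl.
function migrateClusterConfig(old: OldClusterConfig, adapterUrl: string): NewClusterConfig {
  return {
    displayName: old.displayName,
    adapterUrl,
    loginNodes: [...old.slurm.loginNodes],
  };
}

// Remove the db entry from fetchJobs in the management system config,
// leaving every other setting untouched.
function dropFetchJobsDb<T extends { fetchJobs?: Record<string, unknown> }>(config: T): T {
  if (!config.fetchJobs) {
    return config;
  }
  const fetchJobs: Record<string, unknown> = { ...config.fetchJobs };
  delete fetchJobs["db"];
  return { ...config, fetchJobs };
}
```

For example, `migrateClusterConfig({ displayName: "hpc01", slurm: { loginNodes: ["login01"] } }, "adapter-host:port")` yields an object with `displayName`, `adapterUrl`, and a top-level `loginNodes`, and no `slurm` key; the address string is a placeholder to be replaced with the adapter address and port obtained in step 1.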