[Bug] [Dependent] The date rules of the dependent node are ambiguous. #15129
Good catch! Would you like to submit a PR to fix it? @reele

Yeah, but I believe that the solution to this problem still needs further discussion. This issue in version Similar issues have occurred before between versions
…iguous. (#15289)
* [Fix-15129][Dependent] Fix the ambiguity in date rules for dependent node.
* [fix #15129] Revert ddl
* restore findLastProcessInterval
* update: mvn spotless:apply

Co-authored-by: 李乐 <[email protected]>
Co-authored-by: xiangzihao <[email protected]>
Search before asking
What happened
The dependency logic of the dependent node in the current implementation involves two types of datetime: the workflow's `scheduleTime` and the actual `startTime`.
However, there is a logical issue: the `dependent` node uses `processInstance.scheduleTime` as the date base to match `startTime`.

In version `3.2.0`, the matching logic of the `dependent` node is:

1. Find the dependent task's `ProcessInstance` by calling `findLastProcessInterval`:
   a. Query the `ProcessInstance` by calling `queryLastSchedulerProcessInterval` with the condition `schedule_time >= #{startTime} and schedule_time <= #{endTime}`.
   b. Query the `ProcessInstance` by calling `queryLastManualProcessInterval` with the condition `start_time >= #{startTime} and start_time <= #{endTime}`.
   Reference: dolphinscheduler/dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/utils/DependentExecute.java, lines 115 to 116 in e648d6d
2. Match the dependent tasks by calling `findValidTaskListByProcessId`.
   Reference: dolphinscheduler/dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/utils/DependentExecute.java, lines 158 to 166 in e648d6d

In the `dev` version, the matching logic of the `dependent` node is:

1. Find the `TaskInstance` by calling `queryLastTaskInstanceIntervalByTaskCode` with the condition `start_time >= #{startTime} and start_time <= #{endTime}`.
   Reference: dolphinscheduler/dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/utils/DependentExecute.java, lines 254 to 255 in d675d32
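
To make the difference between the two range conditions concrete, here is a minimal Python sketch. It is not DolphinScheduler code: the `Instance` class, the helper names, and the dates are hypothetical, and only the two comparisons are taken from the query conditions quoted above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Instance:
    """Hypothetical stand-in for a ProcessInstance or TaskInstance row."""
    name: str
    schedule_time: datetime  # business date the run was scheduled for
    start_time: datetime     # wall-clock time at which the run actually started


def by_schedule_time(rows, start, end):
    # Mirrors: schedule_time >= #{startTime} and schedule_time <= #{endTime}
    return [r for r in rows if start <= r.schedule_time <= end]


def by_start_time(rows, start, end):
    # Mirrors: start_time >= #{startTime} and start_time <= #{endTime}
    return [r for r in rows if start <= r.start_time <= end]


# A run for business date 2023-11-01 that only started early on 2023-11-02.
late_run = Instance("T1", datetime(2023, 11, 1), datetime(2023, 11, 2, 1, 0))
window = (datetime(2023, 11, 1), datetime(2023, 11, 1, 23, 59, 59))

by_schedule = by_schedule_time([late_run], *window)
by_start = by_start_time([late_run], *window)
print(len(by_schedule))  # 1: the run is found by its business date
print(len(by_start))     # 0: the run is missed by its actual start time
```

The same run is found or missed depending only on which column the interval is compared against, which is the ambiguity this issue is about.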
In my scenario (banking industry, data warehouse), the 'business date' or 'data date' is particularly important, because for data processing the dependency hinges on 'which day's data has been processed', not on 'which day the processing ran'. The consumers of the data include regulatory reporting, reports, and many management systems, all of which are highly sensitive to the business time of the data. For this reason, professional scheduling tools in the banking industry mainly consider the business date of the data rather than the execution time of tasks.
Furthermore, there are many banking-related business systems: direct business systems, management systems, third-party data sources, and third-party custody systems. The data as of the end of last month may not be available until the 2nd or 3rd of the following month; for example, posting and adjustment in the financial system can be delayed for several days. This situation is very common and routine. The data warehouse I am responsible for currently has more than 40 upstream systems or data sources, and at least 5 of them may be delayed. As a result, hundreds of data processing tasks may be postponed due to upstream delays.
Therefore, in an environment that is highly sensitive to the business date and where delays are possible, versions `3.2.0` and `dev` have the following problems:

1. For version `3.2.0`, suppose the task `T1` that the `dependent` node (today) depends on has not yet been executed today. If I then run a complement (backfill) operation on `T1` for a date three days ago, this triggers the `queryLastManualProcessInterval` logic of version `3.2.0` described above, which matches on `start_time`, and the dependency check unexpectedly succeeds.
2. For the `dev` version, suppose the upstream data delivery is delayed by one day, so that the depended-on task `T1` only completes on the second day. The `dependent` node and its downstream nodes will then never succeed, because no new task will ever have a `startTime` that falls on the previous day.

Currently, I maintain a local version that removes the `dependent` node's use of `startTime` and relies solely on `scheduleTime`.

What you expected to happen
Firstly, is it reasonable to match the `startTime` of a task against the `scheduleTime` of the workflow?

Secondly, if there are indeed different requirements, can an option be added to specify whether the dependent node checks the `scheduleTime` or the actual `startTime`?

How to reproduce
Please refer to the above content.
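
As a rough aid to reproduction, the two scenarios described in this issue can be modeled in a few lines of Python. This is a sketch under stated assumptions, not DolphinScheduler code: the `Run` class, the helpers, and the dates are hypothetical, and the way version `3.2.0` combines its two queries is simplified to "a hit from either query satisfies the dependency".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Run:
    """Hypothetical record of one run of T1 (not a DolphinScheduler class)."""
    schedule_time: datetime  # business date the run is for
    start_time: datetime     # when the run actually started


def day(y, m, d):
    """Return the closed interval covering one calendar day."""
    return datetime(y, m, d, 0, 0, 0), datetime(y, m, d, 23, 59, 59)


def matches_schedule_time(runs, interval):
    start, end = interval
    return any(start <= r.schedule_time <= end for r in runs)


def matches_start_time(runs, interval):
    start, end = interval
    return any(start <= r.start_time <= end for r in runs)


# Scenario 1 (3.2.0): on 2023-11-10, T1 has not run for today, but a
# complement run for business date 2023-11-07 was started today.
backfill = [Run(datetime(2023, 11, 7), datetime(2023, 11, 10, 9, 0))]
today = day(2023, 11, 10)
# Assumption: a hit from either query is enough to satisfy the dependency.
s1_rule_320 = matches_schedule_time(backfill, today) or matches_start_time(backfill, today)
s1_schedule_only = matches_schedule_time(backfill, today)
print(s1_rule_320)       # True: unexpected success
print(s1_schedule_only)  # False: T1 has no run for today's business date

# Scenario 2 (dev): the run for business date 2023-11-09 is delayed and
# only starts on 2023-11-10.
delayed = [Run(datetime(2023, 11, 9), datetime(2023, 11, 10, 2, 0))]
business_day = day(2023, 11, 9)
s2_rule_dev = matches_start_time(delayed, business_day)
s2_schedule_only = matches_schedule_time(delayed, business_day)
print(s2_rule_dev)       # False: the dependency can never succeed
print(s2_schedule_only)  # True: the delayed run is found by business date
```

In this model, matching on `schedule_time` alone gives the expected answer in both scenarios, which corresponds to the local change described above.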
Anything else
No response
Version
dev
Are you willing to submit PR?
Code of Conduct