[FEATURE] Re-run execution from Step i #192
@jugdemon This feature request needs a bit more thinking:

- Execution reruns: I guess the parameters should be allowed to be overwritten, correct? If yes, and if we store the project path on the executions as suggested in #194.
- Download outputs

@caviri What do you think? Does this approach make sense to you?
I think a rerun should use the same component versions but may use different parameters. I don't see why you couldn't change brains, but at this point it might not be necessary. Yes, that would be perfect: skip the prep (if there are no version changes) and be able to change parameters for components after step i. That is, if you start at step 3, you cannot change parameters for steps 1 and 2. Does that make sense? UI-wise, maybe make it a flag in a column? There can only be one flag; all components after the flag may change parameters, and components before the flag may not. Let me know what you think. Yes, zip file downloads in the overview are perfect.
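A minimal sketch of the "one flag in a column" rule described above, assuming a hypothetical `Step` record rather than ODTP's actual data model: parameters may only change for steps at or after the rerun point.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    parameters: dict


def validate_rerun(original: list[Step], rerun: list[Step], start_index: int) -> None:
    """Reject a rerun that changes parameters of steps before the flagged step."""
    for i, (old, new) in enumerate(zip(original, rerun)):
        if i < start_index and old.parameters != new.parameters:
            raise ValueError(
                f"Step {i} ({old.name}) precedes the rerun point; "
                "its parameters cannot change"
            )

# Example: a rerun starting at step 2 may change step 2's parameters,
# but validate_rerun(..., start_index=2) rejects changes to steps 0 and 1.
```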
I think this feature makes sense and it needs to be implemented in the CLI. As a workaround I was running partial execution workflows from a previous output folder, but this is far from ideal and not compatible with the GUI. I'd like to drop here my ideas on the different cases:

A) Rerun an already run execution from a midpoint. Here, IMO, components, versions, and parameters should stay the same. If an execution requires different inputs, parameters, or versions, then a new execution should be added under the same digital twin with another ID and label. For this I think it would be necessary to add a flag to the steps document to reflect which steps have already been executed. When defining the rerun, only the affected steps would get the flag reset.

B) Create a new execution based on a previous execution. It can be useful to copy a previous execution in order to edit only a few components and parameters. This option could be added to the GUI with a "Duplicate" button. In this case it can happen that the initial part of the pipeline has already been run in previous executions. For this, one idea is to let the user manually decide whether an execution step can be related to a previous one, with a dropdown listing previous steps. The only caveat I see here @sabinem is that, when running this with a local folder, this will duplicate the content for the two executions, unless we implement specific input and output paths in each step rather than a general project path.

Finally, regarding the S3 downloads, I think it's reasonable to add a download button to the GUI for the compressed output.
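A rough sketch of the flag idea in case A, under the assumption that each step is a plain document with an `executed` flag (the document shape is illustrative, not ODTP's actual schema): defining a rerun from a midpoint resets the flag on the affected steps so only those run again.

```python
def mark_rerun(steps: list[dict], start_index: int) -> list[dict]:
    """Reset the executed flag for every step from start_index onward."""
    return [
        {**step, "executed": step["executed"] and i < start_index}
        for i, step in enumerate(steps)
    ]


steps = [
    {"name": "download-data", "executed": True},
    {"name": "preprocess", "executed": True},
    {"name": "simulate", "executed": True},
]
# Rerun from step 1: step 0 keeps its outputs, steps 1 and 2 run again.
print(mark_rerun(steps, 1))
```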
@caviri These are great points and I think they are mostly compatible. I think on the ODTWS we actually had a conceptual framework for project / workflow / components / execution. Logically, to me it would make sense that one project holds multiple executions, that there is a "not yet run" workflow to start with, and that each component has its own local folder. An execution can then be strung together by using the workflow as a basis. For each component, it can either be re-run (adding a new execution output in the component's local folder) or re-use the old execution's output (with the caveat that all steps preceding the old execution cannot be changed and must be used as-is). In that sense, we can get a tree of related executions (which would be fun to visualise but probably not within the scope of this project ...). My question is: is it realistic to get workflows (or a semblance thereof) implemented in the time left on the project? I think the duplicate button is a no-brainer that helps even if we have to re-run the whole thing. Even if we do not have workflows, I think component-based storage of executions could be useful preparation for the next iteration of ODTP. It would also facilitate using those partial executions as a starting point based on the input of older folders, like your little hack does. I think that hack could run under the hood of the GUI until we have a more stable workflow implemented.
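A small sketch of how the per-component re-run / re-use choice could form the tree of related executions mentioned above; all names here are hypothetical, not ODTP's actual model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Execution:
    id: str
    parent: Execution | None = None
    # component name -> "reused" (output taken from parent) or "rerun"
    components: dict[str, str] = field(default_factory=dict)


def ancestry(execution: Execution) -> list[str]:
    """Walk upwards to list the chain of related executions."""
    chain = []
    node = execution
    while node is not None:
        chain.append(node.id)
        node = node.parent
    return chain


base = Execution("exec-1", components={"download": "rerun", "simulate": "rerun"})
child = Execution("exec-2", parent=base,
                  components={"download": "reused", "simulate": "rerun"})
print(ancestry(child))  # ['exec-2', 'exec-1']
```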
@jugdemon, @caviri rethinking this, can we do the following?

I think adding workflows is more doable and cleaner than any other approach, since it is generally not good for executions themselves to be rerun. What do you both think about this?
That makes absolute sense and it was also the direction I was thinking in. From a user's point of view, if they can execute the workflow multiple times, it doesn't matter that it produces several executions. It would just be good if the executions didn't always need a new project. Right now, you need to specify a new project each time you execute, which is especially hampering if your twin doesn't run on the first go (we ended up having corsica + random numbers as project paths, which is neither efficient nor comprehensible when you revisit). I think the best approach is to store all the executions inside the project folder. In the UI of completed executions, the executions could also be grouped by project; that way different projects can peacefully co-exist without cluttering the user experience.
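A minimal sketch of that layout, assuming a plain `executions/` subfolder inside the project directory (not ODTP's actual structure): executions are numbered automatically, so re-running never requires a new project path.

```python
from pathlib import Path


def next_execution_dir(project: Path) -> Path:
    """Create project/executions/execution-<n> with the next free number."""
    executions = project / "executions"
    executions.mkdir(parents=True, exist_ok=True)
    existing = [int(p.name.split("-")[-1])
                for p in executions.glob("execution-*") if p.is_dir()]
    n = max(existing, default=0) + 1
    run_dir = executions / f"execution-{n}"
    run_dir.mkdir()
    return run_dir

# e.g. next_execution_dir(Path("corsica")) -> corsica/executions/execution-1,
# then execution-2 on the next run, with no new project path needed.
```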
Hello @sabinem, I agree that's a good solution. However, I won't have free time to work on this before the end of the project, as I would need to use my focus days (1 per month) and spare time from other projects.
Description:
We currently run executions, and we would prefer to be able to rerun them (shorthand for delete and run) so that we don't need to create a new project for each execution, especially during development. The other issue is that, for instance, the first step is downloading the data. Once it is in S3, it is sort of silly to redo it every time. We would like to create an execution that starts with the first step already completed. This was initially a goal of ODTP. I am not sure whether this is possible at the moment at the CLI level, but it is not exposed on the Dashboard. This would save tremendous time. It would also allow us to start after the execution has finished and only inspect the results without big trouble.
In this context, it would also be great to download the output of the full execution from the UI, and also the outputs of each step.
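As a rough sketch of that download feature, the GUI could bundle everything under an execution's S3 prefix into a zip using boto3 and the standard library; the bucket name and key prefix below are placeholders.

```python
import io
import zipfile

import boto3


def zip_execution_outputs(bucket: str, prefix: str) -> bytes:
    """Bundle every S3 object under prefix into an in-memory zip archive."""
    s3 = boto3.client("s3")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get("Contents", []):
                body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"].read()
                archive.writestr(obj["Key"], body)
    return buffer.getvalue()

# e.g. served by the GUI as a download for one execution's output prefix,
# or reused per step by passing that step's narrower prefix.
```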
It could also be meaningful to have executions within projects, so that we don't need new projects for each execution but instead have the executions automatically labelled and numbered inside a project.
Importance Level
(High)
It was a planned core feature of ODTP; it might be available in the CLI, but it is not on the Dashboard. We have Dashboard-only users, so it would be great if we could let them resume a slow execution from a specific step instead of always starting from the beginning.
Origin
See description / importance.
User Impact
Users can reuse the computation of slow but stable steps, such as downloading datasets, instead of having to redo it every single time. It frees up time and makes the twins faster and more interactive.
Mockups or Diagrams
Add an interface to completed executions that lets us revisit steps and rerun them, possibly with different parameters.
Affected Components (examples: components, modules, … )
Identification of specific parts of the project that the feature or feedback pertains to. This could be ODTP modules or ODTP components.
Technical Requirements (if possible, otherwise completed by SDSC)
Detailed technical specifications or requirements needed to implement the feature. This could include algorithms, data structures, APIs, or third-party services.
Related Documents/Links:
References to any related documentation, user stories, tickets, or external resources that provide additional context.
Dependencies (if possible, otherwise completed by SDSC):
Identification of any other features, systems, or processes that the proposed feature depends on or interacts with. This can be considered a “ready if” field and it will define what’s needed to have in order to start the development.
Acceptance criteria:
Specific criteria or metrics for evaluating the success or effectiveness of the feature once implemented.