Review CI system #2867

jonmaddock · 2023-08-21T16:01:05Z

Now on github, we should review the CI system and think about any future tasks. This relates to #2862 .

timothy-nunn · 2023-08-30T16:08:33Z

PROCESS CI Plan

Introduction

The PROCESS CI system contains several core components that work together to ensure comprehensive testing and tracking of the PROCESS systems code.

Building a Docker container to run tests
Code quality checks
Building PROCESS artefacts
Testing PROCESS (unit, integration, regression)
Tracking PROCESS MFiles
Building documentation

PROCESS regression testing consists of two jobs: a no-tolerance job, and a 5% tolerance job. The no-tolerance job will fail if any of the checked quantities in the MFile are not exactly identical. The 5% job will only fail if any of the checked quantities change by more than 5%. Any new model or model change is expected to change the optima which we converge to, and thus a lot of quantities can change as the result of a change. Generally, a reviewer will look at the change in iteration variables' final values to validate these differences.

History

On GitLab, PROCESS had a less-than-ideal solution to changing test artefacts: each main pipeline would make a commit back to main that overwrite the MFile test assets with the results from main. Branches that were made off of main then benchmarked their changes against the MFile on main at the time the branch was created. This workflow was not implemented into GitHub actions due to the hacky nature.

Aim

To provide a framework that maintains accountability for a branches' changes. That is, the regression test differences should only reflect the changes made on a feature branch.

The proposal

flowchart TB

    subgraph "Feature 1"
        commit_f1_ba(("Commit BA"))
        commit_f1_bb(("Commit BB"))

        commit_f1_ba --- commit_f1_bb
    end

    subgraph main
        commit_a((("Commit A")))
        commit_b((("Commit B")))
        commit_c((("Commit C")))
        commit_d((("Commit D")))

        commit_a --- commit_b
        commit_b --- commit_c
        commit_c --- commit_d
    end

    commit_b --- commit_f1_ba

    subgraph "Feature 2"
        commit_f2_ca(("Commit CA"))
        commit_f2_cb(("Commit CB"))

        commit_f2_ca --- commit_f2_cb
    end

    commit_c --- commit_f2_ca

    subgraph legend
        legend_circle((" ")) ~~~ |A commit with no tracking data| legend_dcircle
        legend_dcircle(((" "))) ~~~ |A commit with tracking data| null:::hidden

        classDef hidden display: none;
    end

The above graph models a typical Git setup of the PROCESS repository. Commits A-D are features which have been merged into the main branch. Two modelers have created feature branches that come off of main at different commits. Feature 1 needs to be tested against the changes it makes to the code with respect to the state of the code at Commit B. Feature 2 needs to be tested against the changes it makes to the code with respect to the state of the code at Commit C.

The proposal is that each commit (merge commit) onto main will be tracked. Tracking means we will keep the data (MFile) from each main pipeline in some external data repository. If we want to run the regression tests on our feature branch, for example Commit CB:

We recurse back through the Git history until we find the latest commit in our history that has tracking data, Commit C.
We download the tracking data for Commit C and compare our result of running PROCESS against the downloaded data.

Benefits:

Data is managed separately from the core code.
Eliminates a CI system commiting to itself.
Maintain accountability for each branch's changes.
Will enable integration of 'the tracker' and testing.

Drawbacks and considerations:

Testing will require internet.
Won't work if main's history is ever changed.
MFile's could be cached locally to avoid constantly downloading data (which could be slow).
What happens if some of the regression tests do not converge on main?

Actions

Immediate:

Begin tracking MFiles in some form to ensure we have a history if required.
Document our proposed changes.

Short term:

Create a v3.0 release of PROCESS.
Implement the above proposal likely using GitHub as the data repository.

Medium term:

Use a proper data repository to store tracked data.
Integrate the tracker and testing to use one source of data.

timothy-nunn · 2023-09-07T14:11:11Z

New issues created to complete the short and medium term features

jonmaddock added this to the Open source release milestone Aug 21, 2023

jonmaddock assigned timothy-nunn Aug 21, 2023

timothy-nunn mentioned this issue Sep 6, 2023

2863 review contributing guide #2892

Merged

jonmaddock mentioned this issue Sep 7, 2023

Improve testing strategy #2903

Open

5 tasks

timothy-nunn mentioned this issue Sep 7, 2023

Convert tests to use tracked MFiles #2915

Closed

timothy-nunn closed this as completed Sep 7, 2023

timothy-nunn mentioned this issue Sep 7, 2023

Review regression testing strategy #2862

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review CI system #2867

Review CI system #2867

jonmaddock commented Aug 21, 2023

timothy-nunn commented Aug 30, 2023 •

edited

Loading

timothy-nunn commented Sep 7, 2023

Review CI system #2867

Review CI system #2867

Comments

jonmaddock commented Aug 21, 2023

timothy-nunn commented Aug 30, 2023 • edited Loading

PROCESS CI Plan

Introduction

History

Aim

The proposal

Actions

timothy-nunn commented Sep 7, 2023

timothy-nunn commented Aug 30, 2023 •

edited

Loading