
Convert tests to use tracked MFiles #2915

Closed
timothy-nunn opened this issue Sep 7, 2023 · 2 comments · Fixed by #3089

This issue implements the workflow for testing described in #2867. #2870 introduced the tracking of MFiles for each merge commit onto main.

This issue implements the recursive search for regression test assets and downloads them for use in testing.

A consideration when implementing this feature, which may be deferred to another issue, is the caching of test assets.

timothy-nunn commented Feb 14, 2024

Copy of the plan:

PROCESS CI Plan

Introduction

The PROCESS CI system contains several core components that work together to provide comprehensive testing and tracking of the PROCESS systems code:

  • Building a Docker container to run tests
  • Code quality checks
  • Building PROCESS artefacts
  • Testing PROCESS (unit, integration, regression)
  • Tracking PROCESS MFiles
  • Building documentation

PROCESS regression testing consists of two jobs: a no-tolerance job and a 5% tolerance job. The no-tolerance job fails if any of the checked quantities in the MFile differ at all from the reference values; the 5% job fails only if a checked quantity changes by more than 5%. Any new model or model change is expected to shift the optimum we converge to, so many quantities can change as the result of a single change. Generally, a reviewer will look at the changes in the iteration variables' final values to validate these differences.
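
As a rough illustration (not the actual test code), both jobs can be thought of as the same comparison run with different tolerances. Here `reference` and `candidate` are hypothetical dictionaries of the checked MFile quantities:

```python
# Hedged sketch: flag checked quantities whose relative change exceeds a tolerance.
# tolerance=0.0 models the no-tolerance job; tolerance=0.05 models the 5% job.
def regression_failures(
    reference: dict[str, float], candidate: dict[str, float], tolerance: float
) -> list[str]:
    """Return the names of quantities that fail the comparison."""
    failures = []
    for name, ref_value in reference.items():
        if name not in candidate:
            failures.append(name)  # quantity disappeared from the new MFile
            continue
        new_value = candidate[name]
        if ref_value == 0.0:
            changed = new_value != 0.0
        else:
            changed = abs(new_value - ref_value) / abs(ref_value) > tolerance
        if changed:
            failures.append(name)
    return failures
```

With `tolerance=0.0` any numerical difference fails, matching the "exactly identical" behaviour; with `tolerance=0.05` only changes above 5% fail.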

History

On GitLab, PROCESS had a less-than-ideal solution to changing test artefacts: each main pipeline would make a commit back to main that overwrote the MFile test assets with the results from main. Branches made off of main then benchmarked their changes against the MFiles on main at the time the branch was created. This workflow was not carried over to GitHub Actions because of its hacky nature.

Aim

To provide a framework that maintains accountability for a branch's changes. That is, the regression test differences should only reflect the changes made on the feature branch.

The proposal

flowchart TB

    subgraph "Feature 1"
        commit_f1_ba(("Commit BA"))
        commit_f1_bb(("Commit BB"))

        commit_f1_ba --- commit_f1_bb
    end

    subgraph main
        commit_a((("Commit A")))
        commit_b((("Commit B")))
        commit_c((("Commit C")))
        commit_d((("Commit D")))

        commit_a --- commit_b
        commit_b --- commit_c
        commit_c --- commit_d
    end

    commit_b --- commit_f1_ba

    subgraph "Feature 2"
        commit_f2_ca(("Commit CA"))
        commit_f2_cb(("Commit CB"))

        commit_f2_ca --- commit_f2_cb
    end

    commit_c --- commit_f2_ca

    subgraph legend
        legend_circle((" ")) ~~~ |A commit with no tracking data| legend_dcircle
        legend_dcircle(((" "))) ~~~ |A commit with tracking data| null:::hidden
        classDef hidden display: none;
    end

The above graph models a typical Git setup of the PROCESS repository. Commits A-D are features which have been merged into the main branch. Two modelers have created feature branches that come off of main at different commits. Feature 1 needs to be tested against the changes it makes to the code with respect to the state of the code at Commit B. Feature 2 needs to be tested against the changes it makes to the code with respect to the state of the code at Commit C.

The proposal is that each commit (merge commit) onto main will be tracked. Tracking means we keep the data (MFile) from each main pipeline in some external data repository. If we want to run the regression tests on our feature branch, for example at Commit CB (a sketch of this lookup follows the list below):

  1. We recurse back through the Git history until we find the latest commit in our history that has tracking data, Commit C.
  2. We download the tracking data for Commit C and compare our result of running PROCESS against the downloaded data.
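
A minimal sketch of that lookup, assuming a hypothetical URL scheme for the external data repository keyed by commit SHA (the real repository and layout are still to be decided):

```python
# Hedged sketch: walk back through the Git history from HEAD and return the
# tracking data of the first ancestor that has an MFile in the data repository.
import subprocess
import urllib.error
import urllib.request

# Hypothetical URL scheme for the external data repository.
ASSET_URL = "https://example.com/process-tracking/{sha}/MFILE.DAT"


def ancestors(ref: str = "HEAD") -> list[str]:
    """Commit SHAs reachable from `ref`, most recent first."""
    result = subprocess.run(
        ["git", "rev-list", ref], capture_output=True, text=True, check=True
    )
    return result.stdout.split()


def fetch_reference_mfile(ref: str = "HEAD") -> tuple[str, bytes]:
    """Return (sha, mfile_contents) for the nearest tracked ancestor of `ref`."""
    for sha in ancestors(ref):
        try:
            with urllib.request.urlopen(ASSET_URL.format(sha=sha)) as response:
                return sha, response.read()  # found tracking data
        except urllib.error.HTTPError:
            continue  # this commit was never tracked; keep walking back
    raise RuntimeError("no tracked ancestor found in the Git history")
```

Run from Commit CB, `fetch_reference_mfile()` would skip CB and CA (never tracked) and return the data for Commit C.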

Benefits:

  • Data is managed separately from the core code.
  • Eliminates the CI system committing to itself.
  • Maintains accountability for each branch's changes.
  • Enables integration of 'the tracker' and testing.

Drawbacks and considerations:

  • Testing will require an internet connection.
  • Won't work if main's history is ever rewritten.
  • MFiles could be cached locally to avoid repeatedly downloading data (which could be slow); see the caching sketch after this list.
  • What happens if some of the regression tests do not converge on main?
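
On the caching point above, a minimal sketch of a local cache keyed by commit SHA (the cache location and the `download` callable are hypothetical):

```python
# Hedged sketch: keep downloaded reference MFiles on disk so repeated test runs
# against the same tracked commit do not hit the network again.
from pathlib import Path
from typing import Callable

CACHE_DIR = Path.home() / ".cache" / "process-regression-mfiles"  # hypothetical


def cached_mfile(sha: str, download: Callable[[str], bytes]) -> bytes:
    """Return the reference MFile for `sha`, downloading only on a cache miss."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path = CACHE_DIR / f"{sha}_MFILE.DAT"
    if not path.exists():
        path.write_bytes(download(sha))  # e.g. the fetch sketched earlier
    return path.read_bytes()
```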

Actions

Immediate:

  • Begin tracking MFiles in some form to ensure we have a history if required.
  • Document our proposed changes.

Short term:

  • Create a v3.0 release of PROCESS.
  • Implement the above proposal, likely using GitHub as the data repository.

Medium term:

  • Use a proper data repository to store tracked data.
  • Integrate the tracker and testing to use one source of data.

timothy-nunn commented Feb 21, 2024

A running list of questions I have asked myself, and answered:

What about a non-linear history while a branch is being developed?
The non-linear parts of the branch will not have been on main, so they won't have any test assets. The most recent tracked commit on main will always be reached first by the recursive algorithm (in practice it will actually be iterative, but it is conceptually recursive).
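
A small, hedged illustration of that claim (assuming the remote is named `origin`): only the ancestors a branch shares with main can carry tracking data, so however tangled the branch-only part of the history is, the walk lands on the newest of those shared commits.

```python
# Hedged illustration: commits unique to a feature branch are never tracked, so the
# walk sketched earlier can only stop on an ancestor that is also part of main.
import subprocess


def rev_list(ref: str) -> list[str]:
    """Commit SHAs reachable from `ref`, most recent first."""
    out = subprocess.run(
        ["git", "rev-list", ref], capture_output=True, text=True, check=True
    )
    return out.stdout.split()


on_main = set(rev_list("origin/main"))
# The first ancestor of HEAD that is also on main is the newest commit that could
# carry tracking data, regardless of the shape of the branch-only history.
candidate = next(sha for sha in rev_list("HEAD") if sha in on_main)
print(f"Newest ancestor that could carry tracking data: {candidate}")
```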

What about people wanting to run tests who haven't cloned our repository?
Who will do that? Running tests should be done on a development install of PROCESS, which will almost always have required git to install, so it's a safe bet that everyone wanting to run the tests will:

  1. Have git installed.
  2. Be inside of the PROCESS git repository when running the tests.

What about people with no internet?
They can't test PROCESS, but most PROCESS developers should have internet access, and anyone without it for a sustained period probably has bigger issues than whether or not PROCESS works.
