All chapters: Black Box Model comments #261

Closed
14 tasks done
lmadavies opened this issue Nov 13, 2024 · 9 comments · Fixed by #285

Comments

@lmadavies
Collaborator

lmadavies commented Nov 13, 2024

These comments on the sections related to black box models were made on the version that was live on Friday 8 November 2024. The sections covered are outlined below.

Definitions

  • AI: Could explicitly say this includes models which fall under generative AI. In particular, this covers both pre-trained models which are being used and models built in-house.
  • On AI I would also note that not all AI models are black box models (see point below).
  • Agree with the definition of a black box model: these are models which are not succinctly explainable using the inner workings of the model. I would argue, however, that there is a difference between many machine learning models and black box models. A random forest is explainable; even with many trees it is arguably as explainable as a model with 1,000 lines of code. Need to define ML more precisely.
    E.g. Machine Learning uses algorithms to learn from patterns in data without needing to programme explicit business rules. Some models are white box models and some are black box models. It is a subset of artificial intelligence.

Proportionality: 3.4 Artificial intelligence and business risk
No comments. Proportionality remains the same consideration. AI models are naturally a more complex analytical process/technique so will require more structure around the QA process, but this is in line with other complex models.

  • You may want to note that the Generative AI Framework for HMG has QA as a principle of use: "Principle 10: You use these principles alongside your organisation’s policies and have the right assurance in place"

Analytical Life Cycle: 5.2.5 Maintenance and continuous review

  • I would argue that all these are important for black box models, with the addition that there be processes in place to assess data drift and the model output quality between deployments
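One way to make the data drift check above concrete is a Population Stability Index (PSI) comparison between the data a model was built on and the data it receives after deployment. The sketch below is illustrative only: the function name, the ten equal-width bins and any alert threshold a team might choose are assumptions, not something the guidance prescribes.

```python
import math


def population_stability_index(reference, current, n_bins=10):
    """PSI between a reference sample and a current sample of one numeric input.

    Hypothetical helper: bins are equal-width over the reference range.
    """
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if every reference value is identical.
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the end bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    training_inputs = [float(x) for x in range(100)]
    live_inputs = [x + 50.0 for x in training_inputs]
    print(population_stability_index(training_inputs, live_inputs))
```

A score of zero means the two samples fall into the bins in identical proportions; larger scores indicate a larger shift. The level at which a shift triggers review would need to be agreed for each model.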

Engagement and Scoping: 6.6 Black box models and the engagement and scoping stage
No comments. It highlights the areas where ML/AI models require additional scrutiny and offers a link to a document to assist with this.

Design Decisions

  • Line Where relevant this should include a review of the data used in any pre-trained models you are using. I would also highlight the need to make an assessment of the quality of the data at the design stage. AI/ML models "learn" based on the input data, so this can have a huge impact on the validity of the model (including bias) and on the types of model which are appropriate.
  • Line You may want to be more formal here, consider the appropriate data training, validation and testing strategy for your model. This is even the case for pre-trained models where you want to test which is the best version for your question. You still need to make sure your 'design' phase is separate from your 'outputs' phase.
  • Carrying on from the above, I suggest splitting this into two points: 1) Considering the training data and an 80/10/10 split. 2) Considering appropriate validation methods for the models, such as calculating similarity to labelled images or ground truths for generative AI.
  • Line More broadly consider your maintenance and ongoing assurance of the model if this is being used as part of regular decision making. This needs to consider the likely inputs and how these may drift over time. I assume here we are not thinking about the maintenance of the infrastructure around a deployed ML/AI model which will also require consideration.
  • Link the above to the maintenance section
  • Line Possibly they will need to consider the overall governance of the model (depending on the use case). This could be through the relevant ethics committee, but if one doesn’t exist or this is outside its remit then additional governance needs to be put in place.
  • It may also be necessary to set up a peer review committee to discuss the techniques and design (in addition to the ethics considerations).
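The 80/10/10 split suggested above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the fixed seed are assumptions, and a real project might need a stratified or time-based split instead of a simple shuffle.

```python
import random


def split_80_10_10(records, seed=0):
    """Shuffle records reproducibly and split into train, validation and test sets."""
    shuffled = list(records)
    # A seeded generator keeps the split reproducible for later assurance checks.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10
    n_val = n // 10
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    # The test set takes whatever remains, so no record is dropped.
    test = shuffled[n_train + n_val:]
    return train, validation, test


if __name__ == "__main__":
    train, validation, test = split_80_10_10(range(100))
    print(len(train), len(validation), len(test))
```

Keeping the test set untouched until the design is fixed is what separates the 'design' phase from the 'outputs' phase mentioned above.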

Analysis

  • Line Do you mean in the production environment?

Delivery and Communication: Black box models and the delivery, communication and sign-off stage

  • Line Perhaps peer review to ensure the latest guidance and assurance methodology is taken into account.
@lmadavies lmadavies changed the title Review: Black Box Model comments Black Box Model comments Nov 13, 2024
@lmadavies lmadavies changed the title Black Box Model comments Review: Black Box Model comments Nov 13, 2024
@Hurstharrier
Collaborator

Comments from Andrew Duncan from the Turing Centre
A few days late, but I’ve finished going through the book.

General comments:

  1. It might help to have a very explicit taxonomy of the models you have in mind. For example, would you consider a simple data-driven statistical model to be an AI/ML model, e.g. something you would write as a Structural Equation Model or a Generalised Linear model? I wouldn’t consider most of these to be black-box, but they do require data.
  2. Is the book focused on the commissioning of a static piece of analysis? Do the models developed have any longevity within these organisations, or are they single-use? In the former case, one may need a more product-focused viewpoint, e.g. planning where it is going to run (cloud, local machine, etc) and what resources are needed to enable this.
  3. Are all your models built in isolation? Do you combine / compose models? In the latter case, are there any standards / best-practices for interoperability which you might encourage?

Specific Comments:

• Section 2: “Artificial Intelligence models (including Machine Learning) are the most common type of black box models used today. Other forms of black box models may arise in future.”

It's not just about the nature of the model, it’s also about the provenance. For example, it’s quite common for an organisation to purchase a piece of commercial modelling software where the inner workings are proprietary and protected IP, in which case it is black box. I would argue this is a far more common existing setting for black box models (unless you’re developing all your models in-house). Additionally, as models grow in complexity, there will come a point where they just have to be considered black-box, simply because the inner workings are too complex and multi-faceted.

• Figure 3-1: Version control is certainly important for maintaining longevity of a software base, but is it really relevant to assurance? I would assume this would be mandated for all software engineering activities.

• Section 6.6: Concerns about ethics, etc would pertain to any data-driven model (not just AI) -- or are we using AI as a catch-all for all data-driven models (including white-box/ grey-box models) – see previous comment.

• Section 7.6: For data-driven black-box models, it’s also important that the Analyst produces an estimate of how long they expect the trained model to remain within validity thresholds (e.g. due to concept drift), and thus come up with a plan for retraining / updating etc. This needs to be accompanied by a plan of how additional / refreshed data will be brought in to refresh the model and what resource (compute) is required.
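As a rough illustration of the Section 7.6 point, the sketch below fits a straight line through a model's score at regular review periods and extrapolates to the point where it would cross a validity threshold. The function name, the example scores and the assumption that performance decays linearly are all hypothetical; a real estimate would need to reflect how the model and its data actually behave.

```python
def periods_until_threshold(scores, threshold):
    """Periods after the last observation until a fitted linear trend reaches threshold.

    Returns None when the fitted trend is flat or improving.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    # Ordinary least-squares slope and intercept with periods 0, 1, 2, ...
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        return None
    crossing = (threshold - intercept) / slope
    # Zero means the trend has already reached the threshold.
    return max(crossing - (n - 1), 0.0)


if __name__ == "__main__":
    monthly_accuracy = [0.90, 0.88, 0.86, 0.84]  # illustrative values only
    print(periods_until_threshold(monthly_accuracy, threshold=0.80))
```

An estimate like this could feed the retraining plan by indicating when refreshed data and compute are likely to be needed.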

• Section 8.3.1: For data-driven models, especially black-box, most V&V methods are statistical: i.e. evaluating validity over a sample of the dataset.
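To illustrate what statistical validation over a sample can look like, the sketch below reports accuracy on a held-out sample together with a Wilson score confidence interval, so the uncertainty from the sample size is visible. The function name and the example counts are assumptions made for illustration; the appropriate metric and interval would depend on the model in question.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Illustrative numbers: 90 correct predictions in a held-out sample of 100.
    lower, upper = wilson_interval(90, 100)
    print(f"accuracy 0.90, interval {lower:.3f} to {upper:.3f}")
```

The same observed accuracy on a larger sample gives a narrower interval, which is one way to judge whether a validation sample is large enough for the decision being supported.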

@irisoren-sg irisoren-sg changed the title Review: Black Box Model comments All chapters: Black Box Model comments Nov 25, 2024
@irisoren-sg
Collaborator

@lmadavies @Hurstharrier The branch that I created for these changes is now a fair bit behind main, so it would be good to create a new branch from main for these edits to minimise conflicts. Have either of you started to work on these edits on a local branch, or can you wait until we have committed the outstanding PRs (which should hopefully be done tomorrow)? It would be best to create the new branch only when you are ready to start work on the edits, so that we keep the new branch as synced up with main as possible.

@lmadavies
Collaborator Author

@irisoren-sg I haven't started adding in the comments as I didn't think I had the permissions set up but may do now. Happy to wait until you have made the outstanding PRs before starting on this. You can either start a new branch or rebase the current one, shouldn't make too much difference.

@irisoren-sg
Collaborator

@lmadavies I have deleted the branch and will start a new one once we have merged the big outstanding commits. I would prefer to minimise the chance of getting rebase conflict resolution problems. I'll let you know when the branch is ready

@irisoren-sg
Collaborator

@lmadavies
Collaborator Author

Initial updates to the pages have been made - I would like to do a final review before merging.

I have also tried to address the comments from Turing. The ones I have not included currently are:

• Figure 3-1: Version control is certainly important for maintaining longevity of a software base, but is it really relevant to assurance? I would assume this would be mandated for all software engineering activities.

• Section 6.6: Concerns about ethics, etc would pertain to any data-driven model (not just AI) -- or are we using AI as a catch-all for all data-driven models (including white-box/ grey-box models) – see previous comment.

• Section 8.3.1: For data-driven models, especially black-box, most V&V methods are statistical: i.e. evaluating validity over a sample of the dataset.

@Hurstharrier
Collaborator

@lmadavies I have reviewed the additions. I have made one minor change. On the other points

@valentine-scroll our editor will take a look too as she is reviewing the whole document and it makes sense to include this branch

On the bits you haven't changed.

Version control - strictly speaking he is right but in my experience many of us don't treat models as pieces of software that require maintenance etc. Therefore, I think this is a useful reminder.

On ethics, etc. It seems to me that ethics concerns are more of an issue with AI - just read the papers! Happy to include a reference to ethics for all analysis but I think we also need to make the point for AI.

On data driven models - I agree but am not sure what to add or amend. Am open to suggestions.

@lmadavies
Collaborator Author

Version control - I think we do want to include version control in the "types of assurance" figure. Andrew may be specifically thinking about git version control, which is a core software engineering activity, but the book is meant to cover all types of model (including non-coding models), so I think the meaning is broader here and relates to assurance. Personally, I would argue that every piece of analysis has a level of version control (minimum: when it was produced and whether this is a dev, pre-QA or live version) but won't add anything else at this stage as this goes slightly beyond AI/black box.

For the maintenance point, I have added a few points about this already. Agree that it is an important point to remember specifically for AI models.

On ethics - You have a reference to ethics in the Analytical Plan section on the design page. I will add something into section 6.3 Assurance activities in the engagement and scoping stage to mention ethics again.

On data driven models - Agree not sure what to amend or add here.

@lmadavies lmadavies linked a pull request Dec 16, 2024 that will close this issue
2 tasks
@Hurstharrier
Collaborator

@lmadavies
VC - I agree with your views but don't think we need to go any further - thanks for adding the maintenance point
Ethics - good idea for 6.3
Data driven models - just leave that be. I think we could tie ourselves in knots if we do much more
