Notes on documentation from C100 migration experience #222

WillTaylor22 · 2022-08-23T12:19:23Z

I'm a senior web developer who has used PaaS extensively, and I have used AWS and Azure (though not every component involved in this migration).

Using only the docs, I found the overall structure of the SDS platform difficult to get my head around, and the config files were not easy to put together or debug.

If I were going to improve the documentation, with the goal of making it self-service for non-DevOps senior developers, the key changes would be better descriptions of the objects that the platform actually uses, plus comprehensive documentation on the configuration files.

What I'd add

Objects to describe (in no particular order - more detail on what I would describe for each object below):

Database
Key Vault
Redis
Application services
Cron Jobs
Charts
Helm
Flux
WAF (Firewall)
Container Registry
Kubernetes Cluster
Virtual Networks
Front Door
Ingress
Public DNS
Private DNS
Identities
Kubernetes CLI
Pipeline
Jenkins
GitHub & GitHub Teams
Azure AD
Slack
TLS certificates
Azure Application Gateway (Load balancer)

For every object:

What is it and why do we have it?
When is it applicable?
Do I create my own one (e.g. a database) or is there a central one that I connect to (e.g. the flux service)?
Link to any external documentation
How do I get read and write access?
If I must create it or connect to it, how do I do so?
If this object needs a config file:
- Systematically, what does every single attribute I can add to the config file do behind the scenes? Links to any relevant external documentation.
- What are the allowed values or naming conventions for each attribute, and why? These must either be present in the external documentation or explicitly listed in this documentation. There shouldn't be any unexplained comments such as "also, add 'foo: bar' at the end".
- If this config file inherits from a parent config file, chart, etc, where is that parent defined? Line by line (or referencing external docs where needed), how does the parent work?
Does this object do anything periodically? Does it do anything when triggered by another component?
How do I monitor it?
- If monitored by CLI, what commands can I use (link to external documentation if helpful). What commands are commonly used?
- If monitored by a dashboard, what actions (or scripts) can I run in the dashboard? What is commonly used?
How do I debug it when it goes wrong?
Common issues / troubleshooting.
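
As a rough illustration of the checklist above, every object page could start from the same skeleton so that no question gets skipped. This is only a sketch: the helper name `page_skeleton`, the shortened heading wording, and the Markdown output are my own assumptions, not anything the SDS platform or its docs tooling provides.

```python
# Hypothetical helper: builds an empty documentation page for one platform object.
# The headings paraphrase the "For every object" checklist in this issue.
CHECKLIST = [
    "What is it and why do we have it?",
    "When is it applicable?",
    "Do I create my own or connect to a central one?",
    "External documentation",
    "How do I get read and write access?",
    "How do I create it or connect to it?",
    "Config file reference",
    "Periodic and triggered behaviour",
    "How do I monitor it?",
    "How do I debug it when it goes wrong?",
    "Common issues / troubleshooting",
]


def page_skeleton(object_name: str) -> str:
    """Return a Markdown page with one empty section per checklist item."""
    lines = [f"# {object_name}", ""]
    for heading in CHECKLIST:
        # Each section starts as a TODO so gaps are visible in review.
        lines += [f"## {heading}", "", "TODO", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(page_skeleton("Key Vault"))
```

Running it for each entry in the object list (Database, Key Vault, Redis, and so on) would give one stub page per object, which could then be filled in and linked from the glossary.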

Additional sections

1. BAU processes:

Manual QA testing - page written for QAs on how to access
Deployment of new code to production - page written for junior developers who don't need to understand the overall infrastructure
ITHC - what it actually is, how to set it up, how to give an external pen tester access
Demo environment - what it is, how to give external users access
Monitoring the service - page for junior developers who don't need to understand the overall infrastructure, but should be able to manage pods through the Kubernetes CLI
Staging - page for non-technical people on why it exists and how to access it
[ In proposed restructuring below, I'd add any existing BAU process pages under this sub-heading ]

Possible extension - full documentation restructuring

If it were totally up to me, I would completely restructure the documentation. The existing structure (see the nav on the left side) is, in my opinion at least, confusing. Other devs have told me that they don't know where to start or what is relevant.

One possible way to structure would be:

Introduction (one page) - scope of SDS, links to team pages, links to projects, links to shared repositories, links to external services documentation.
Full infrastructure diagram with glossary below (one page). Link to full page for each object in the glossary
Objects (one comprehensive page per object, as described above)
BAU Process guides - guides for non-platform-users (described above)

I'd scrap everything else, e.g.:

https://hmcts.github.io/ways-of-working/principles/
https://hmcts.github.io/ways-of-working/standards/
https://hmcts.github.io/ways-of-working/standards/db-schema-change.html#making-a-database-schema-change
https://hmcts.github.io/ways-of-working/standards/apis.html
https://hmcts.github.io/ways-of-working/standards/pull-requests.html#who-should-review
etc.

Perhaps there are other users who do need these, but I didn't during the migration, and I'd do what I could to thin the documentation down.

hmcts-platform-operations mentioned this issue Jun 19, 2024

Create self-service documentation(62) hmcts/roadmap-platform-operations#384

Open
