Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise CONTRIBUTING.md #34

Merged
merged 3 commits into from
Apr 25, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
345 changes: 231 additions & 114 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -1,130 +1,247 @@
# Contributing guidelines

- [Contributing guidelines](#contributing-guidelines)
- [Coding standards](#coding-standards)
- [Version control](#version-control)
- [Git guidelines](#git-guidelines)


## Coding standards

We intend to use a properly configured linter for each supported language. Run a linter before committing your code. Some general coding principles we want to adhere to (inspired by [Tristan's coding standards Confluence page](https://confluence.galois.com/display/~tristan/Haskell+Coding+Standards)):

* Explicit is better than implicit
* Code should be written to be read, and not to make writing more convenient; assume that the person reading the code is you in a year
* Comments are critical, and should prioritize explaining why rather than what
* Similarly, good commit messages are required; good commit messages explain why a change was made (including links to issues where appropriate or reference to observed incorrect behaviors that may inform others who see similar failures) (more on this in [Git guidelines](#git-guidelines))
* Strive to make error states unrepresentable (but not at the expense of clarity)
* Advanced development tools are great, but should not be required to develop a project
* Libraries should not call exit or produce console output (unless initiating a truly mandatory crash); libraries should not have fatal crashes
* Prefer library-first development (the functionality of any program should be available as a library)
* A clean version control history is important (e.g., to support bisecting and code understanding), but extensive history rewriting is not important (more on this in [Git guidelines](#git-guidelines))

VERSE project specific guidelines:
* Use *snake_case* when possible for naming files and folders. The only exceptions exist:
* the top-level files, such as `README.md`, `CONTRIBUTING` and such.
* `Dockerfile*` is capitalized, per Docker [naming conventions](https://stackoverflow.com/a/63995752)
* `Makefile*` is capitalized, per GNU Make [naming conventions](https://www.gnu.org/software/make/manual/make.html#Makefile-Names)
* Include [submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) at the point of use (e.g. `./foo/bar/my_submodule`), rather than in the root directory (`./my_submodule`).

## Version control

The VERSE project uses Git for revision control. Currently, we are
using [Github][] for hosting and code review.

The VERSE project and its client does not demand that commits or MRs
are cryptographically signed. You are welcome to setup signatures and
follow a workflow that preserves signatures if you wish, but it is not
mandatory.

We are following [Trunk Based Development](https://trunkbaseddevelopment.com/) (TBD) methodology.
The development workflow is as follows:

1. In the Github project repository, create an issue describing the change
you're planning to make.
2. Push "Create a branch" in the newly created issue.
Github will create a topic branch named `XX-issue-name` where `XX` is the issue number.

Checkout the branch
and a Pull Request (PR), all linked to the
issue. Make sure that the PR is marked as draft, untilizing `DRAFT` prefix.
You can perform these steps manually but Github makes it easier.
3. Locally, in a terminal opened in the project repo clone, pull updates
from the remote and switch to the branch created in step #2.
Or use IDE of your choice to accomplish the same.
4. Develop and test the change locally. If applicable, make sure to add
unit and integration tests for the change. Push your commits frequently.
Once you have some commits in your branch, go to the Github project repository,
and open a new *DRAFT* pull request. If not done automatically, in the PR description
link to the issue it is closing using `closes/resolves/fixes` [keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue)
5. If you are working with another developer on
the same topic branch, use `git pull --rebase` to rebase your local
branch against the remote topic branch (resolving any conflicts
that arise) before pushing, which will keep history clean on the
topic branch. A few notes:
* avoid merge conflicts as much as possible, as they are difficult to resolve. Communicate your intent to your colleagues ahead of time, and limit the scope of your changes as much as possible
* if a merge conflict arises (e.g. against the `main` branch), and is non-trivial to resolve, contact the author of the changes and work together to ensure a proper resolve
6. Push changes to the remote origin. This will trigger CI pipeline execution.
If the tests pass and you're done (i.e. the PR fully addresses the issue it is linked to), mark the PR as *READY FOR REVIEW*
7. Typically, at least one _other_ person must review any changes to
the parent branch, and approve it using the Github PR interface. A
_reviewer_ should check that all necessary comments are addressed. The _reviewer_ should either:
* **Request changes** for blocking/breaking issues and tricky fixes that require re-reviewing
* **Approve** for suggestions/opinions that you trust the code author to consider and address as they see fit. Approving without comments or only a simple *LGTM* is acceptable
* **Comment** for giving early feedback on a longer review
* For example, a reviewer marks a PR as *Approved* even though they added a couple of commments about the PR – the author can address it if they agree, but the PR can be merged as is
8. If a reviewer deems some comments must be addressed, please ensure
that discussions salient to the changes are captured in the merge
request and in the associated issues.
9. Note that *force-pushes are dangerous*, so make sure that you
know that no one else has pushed changes to the branch that are not
in the history of your local branch. If others on the team are
pulling and testing it locally, they will need to fix up their
local branches with `git checkout <yourbranch>`, `git fetch`, and
`git reset --hard origin/<yourbranch>`. For more details, please
see [The Dark Side of the Force Push][] and [--force considered
harmful; understanding git's --force-with-lease][].
10. Once it has been approved, PR author merges the PR (in Github push the `Merge pull request`
button in the PR UI). Github will execute
CI/CD pipeline and complete the merge.
11. If, for some reason the source branch was not automatically deleted, the reviewer that merges the branch should manually delete
the branch after the merge.
12. Do not forget to occasionally clean up local branches that have
been merged! Doing so on a weekly cadence means that you do not
become overwhelmed with branches.
Learn how the VERSE team develops the toolchain and how you can contribute.

[The Dark Side of the Force Push]: http://willi.am/blog/2014/08/12/the-dark-side-of-the-force-push/
[--force considered harmful; understanding git's --force-with-lease]: https://developer.atlassian.com/blog/2015/04/force-with-lease/
[Magit]: https://magit.vc/
[GitKraken]: https://www.gitkraken.com/
[Github]: https://github.com/
## Start here

Before you dive in, [set up your development environment](#setting-up-your-development-environment) to automatically enforce
our style guidelines and linting rules. These avoid needless nitpicking in code
reviews.

We use [this GitHub project](https://github.com/orgs/GaloisInc/projects/23) to
track and coordinate development.

- All work starts by creating an issue.
- The issue is done when the work is done, tested, reviewed, and merged.

Read [contributing code](#contributing-code) for a step-by-step guide and an
explanation of why we do things this way.

Finally, read through these best practices guidelines.

- [VERSE-specific guidelines](#verse-specific-guidelines)
- [Coding principles](#coding-principles)
- [Git guidelines](#git-guidelines)
- (external) [A wonderful, general overview of best practices](https://gitlab-ext.galois.com/program-analysis/guidance/-/blob/master/BestPractices.org) ([permalink](https://gitlab-ext.galois.com/program-analysis/guidance/-/blob/78d2d6229f027579384c39a76cf28026b03279a7/BestPractices.org))

Wondering if the juice is worth the squeeze? Jump to [Why this process?](#why-this-process)

## Setting up your development environment

We currently don't have any specific environment setup, but will add information
here as we progress in the project.

## Contributing code

We are following [Trunk Based Development](https://trunkbaseddevelopment.com/)
methodology. Here's the development workflow.

1. [Create an issue](#create-an-issue)
1. [Create a branch](#create-a-branch)
1. [Write your code](#write-your-code)
1. [Create a PR and get it reviewed](#create-a-pr-and-get-it-reviewed)
1. [Merge your PR](#merge-your-pr)
1. [Delete your branch](#delete-your-branch)

### Create an issue

Create a GitHub issue describing the change you're planning to make.

- Assign it to yourself.
- Add a label for your organization (Galois, UCam, etc.)
- Add labels for the type of work (IDE, CN, documentation, etc.)
thatplguy marked this conversation as resolved.
Show resolved Hide resolved

If you create the issue from the repo on GitHub, there are _bug_ and _feature_
issue templates to guide you through the process. Creating issues from the
project board unfortunately cannot use the templates.

Creating an issue automatically adds it to the project board, which helps us
thatplguy marked this conversation as resolved.
Show resolved Hide resolved
track and prioritize ongoing work across different teams and organizations.
Tying issues to development work (and PRs) also makes it easier to answer questions
in the future, like "What went into the release we sent to the user study last
year?" or "What exactly did we fix from the red team's report?"

### Create a branch

Create a topic branch named `XX-issue-name` where XX is the issue number.

Naming branches this way makes it clear which branch corresponds to which issue
and makes it easier to keep the repo clear of unused or outdated branches. An
easy way to create a branch following this convention is to click the _"create a
branch"_ button on the issue page and then check out the newly created branch
locally.

### Write your code

Here are some things to consider as you develop on your branch:

- Structure your changes into small commits that each address a specific task.
See [What goes in a commit?](#what-goes-in-a-commit)

- Write unit and integration tests.

- Rebase regularly from the main branch (`git pull -r origin main`).

- Push regularly to your remote branch. This saves your work and runs CI tests.

When you're done...

### Create a PR and get it reviewed

When you are ready to have your changes reviewed, create a PR on GitHub.

- Summarize your change.
- Tag one or more reviewers.
- Link to the issue it addresses using `closes/resolves/fixes #XX`, where XX is the issue number.
thatplguy marked this conversation as resolved.
Show resolved Hide resolved

The PR template will guide this process.

At least one reviewer needs to approve your PR. The _reviewer_ should either:

- **Request changes** for blocking/breaking issues and tricky fixes that require re-reviewing.
- **Approve** for suggestions/opinions that you trust the code author to consider and address as they see fit. Approving without comments or a simple _LGTM_ is acceptable.
- **Comment** for giving early feedback on a longer review.

For example, a reviewer marks a PR as _Approved_ even though they added a couple
of commments about the PR – the author can address (or ignore) them as they see
fit, and then merge without another round of reviewing.

Consider reflecting significant change requests or discussion points back into
the related issue(s) as appropriate.

### Merge your PR

Before merging, see [What goes in a commit?](#what-goes-in-a-commit) for
guidelines on merging your branch commit history into `main`.

### Delete your branch

GitHub usually does this for you. Consider updating your local state with

```
git checkout main
git pull
git branch -d your-branch-name
```

Git will warn you if you try to delete a branch that hasn't been merged.

## VERSE-specific guidelines

- Use language-appropriate naming conventions for naming files and folders, and default to _snake_case_ when in doubt. The only exceptions are:
- the top-level files, such as `README.md`, `CONTRIBUTING` and such.
- `Dockerfile*` is capitalized, per Docker [naming conventions](https://stackoverflow.com/a/63995752)
- `Makefile*` is capitalized, per GNU Make [naming conventions](https://www.gnu.org/software/make/manual/make.html#Makefile-Names)
- Include [submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules) at the point of use (e.g. `./foo/bar/my_submodule`), rather than in the root directory (`./my_submodule`).

## Coding principles

- Explicit is better than implicit
- Code should be written to be read, and not to make writing more convenient; assume that the person reading the code is you in a year
- Comments are critical, and should prioritize explaining why rather than what
- Similarly, good commit messages are required; good commit messages explain why a change was made (including links to issues where appropriate or reference to observed incorrect behaviors that may inform others who see similar failures) (more on this in [Git guidelines](#git-guidelines))
- Advanced development tools are great, but should not be required to develop a project
- Libraries should not call exit or produce console output (unless initiating a truly mandatory crash); libraries should not have fatal crashes
- Prefer library-first development (the functionality of any program should be available as a library)
- A clean version control history is important (e.g., to support bisecting and code understanding), but extensive history rewriting is not important (more on this in [Git guidelines](#git-guidelines))

## Git guidelines

- [General guidelines](#general-guidelines)
- [What goes in a commit?](#what-goes-in-a-commit)
- [Why support `git bisect`?](#why-support-git-bisect)

### General guidelines

- Do not commit directly to `main`.
- To support bisecting, do not merge WIP commits that break the build.
On topic branches, squash commits as needed before merging, but only
to reduce excessive small commits; the development history of topic
branches should be preserved as much as is reasonable. Use your
best judgement. Ask a git expert for advise if you are stuck more
than 10 minutes.
- Write short, useful commit messages with a consistent style. Follow
these [seven rules][], with the amendment that on this project, we
have adopted the convention of ending the subject line with a
period. Galois has excellent resources about commit messages, please consult
[What goes into a commit](https://confluence.galois.com/pages/viewpage.action?pageId=82346420)

- Do not merge WIP commits that break the build (required for
[`git bisect`](#why-support-git-bisect)).

- Write short, useful commit messages with a consistent style (see [What goes in
a commit?](#what-goes-in-a-commit)).

- Keep your topic branches small to facilitate review.

- Before merging someone else's PR, make sure other reviewers'
comments are resolved, and that the MP author considers the PR ready
to merge.

- For security-sensitive code, ensure your changes have received an
in-depth review, preferably from multiple reviewers. Please consult [Code reviews page](https://confluence.galois.com/display/EN/Code+Reviews) for more details.
- Configure Git so that your commits are [signed][].
- Whenever possible, use automation to avoid committing errors or
noise (e.g., extraneous whitespace). Use linters, automatic code
formatters, test runners, and other static analysis tools.
Configure your editor to use them, and when feasible, integrate them
into the upstream continuous integration checks.

- Consider using `git pull -r` to fetch remote changes. This will rebase local
changes on top of remote changes, which keeps history more linear and will keep
local changes instead of discarding them in the case that someone force pushed
to the remote.

- Avoid `git push --force` to shared branches. If you must, prefer `git push
--force-with-lease --force-if-includes`. See [The Dark Side of the Force Push][]
and [--force considered harmful; understanding git's --force-with-lease][].

- [Optional] Configure Git so that your commits are [signed][].

[The Dark Side of the Force Push]: http://willi.am/blog/2014/08/12/the-dark-side-of-the-force-push/
[--force considered harmful; understanding git's --force-with-lease]: https://developer.atlassian.com/blog/2015/04/force-with-lease/
[seven rules]: https://chris.beams.io/posts/git-commit/#seven-rules
[signed]: https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work

### What goes in a commit?

Follow these [seven rules][] for writing commit messages.

Commits should be minimal self-contained changes, reflected in the commit
message. This is an art we aspire to. See [this
article](https://confluence.galois.com/pages/viewpage.action?pageId=82346420)
for an in-depth discussion.

For example, consider these sets of commit messages.

- `Add logging util to capture timing`
- `Add new core algorithm, disabled by default`
- `Add flag to use new core algorithm`

is preferable to

- `Add new core algorithm with flag to enable`

is preferable to

- `First attempt at core algorithm`
- `Fix logging`
- `Refactor logging fix`
- `Fix up algorithm`
- `WIP`
- `Finish alg`
- `Add flag`

Small commits that each address a single, self-contained issue are easy to review, easy to trace in `git log`, and make `git bisect` much more effective.

### Why support `git bisect`?

`git bisect <known-good-commit> <known-bad-commit>` is a debugging utility to find which commit first introduced a bug. Using a binary search, it will iteratively check out a commit between a known-good and known-bad commit; you will then run tests to determine if the bug is present and mark the commit "good" or "bad". Eventually, it will point you to the first bad commit.

This only works if every commit is well formed. Suppose you have an example that triggers a bug, and you use `git bisect` to build and test each commit on that example. Consider the following series of commits.

1. `Partially implements XX (broken)`
2. `Finishes implementing XX (fixed)`

When `git bisect` checks out (1), you will be unable to build the project or determine if the bug is present.

The guidelines in [What goes in a commit?](#what-goes-in-a-commit) support `git bisect` by encouraging each commit to be small and self contained.

## Why this process?

We are coordinating development across a variety of teams at different organizations around the world, and the project is expected to run for several years. Despite the long duration, our timeline is tight. Hence, we need to balance several concerns:

- **Visibility**. We need to understand what work has been planned, started, completed, or blocked, and who is doing it, in order to prioritize our next steps as things slip or circumstances change. We use issues and the project board for this.

- **Traceability**. When code changes, we need to understand why. This sometimes
lives in commit messages, PR messages, issues, or a developer's head. In this
project, we're standardizing on issues, which is why all work starts with an
issue. Commit messages still give context for specific changes (details
[here](#what-goes-in-a-commit)); multiple commits may support a single issue.

- **Development best practices.** Software engineering research and personal experience both suggest that coherent contribution guidelines ease development while raising the level of assurance. See [here](https://gitlab-ext.galois.com/program-analysis/guidance/-/blob/78d2d6229f027579384c39a76cf28026b03279a7/BestPractices.org) for motivation and details of specific best practices, some of which are adopted in this document.
thatplguy marked this conversation as resolved.
Show resolved Hide resolved

- **Developer ergonomics**. Development should focus on building neat stuff, not fighting a process. We don't want to add boilerplate for no reason.