Skip to content

Commit

Permalink
docs: Better documentation for developers (#427)
Browse files Browse the repository at this point in the history
Closes #375.

### Summary of Changes

* Improved developer documentation about tests
* Added guidelines about copying objects (rather than modifying them
in-place)
* Added code-style guidelines
* Added code review guidelines

---------

Co-authored-by: Lars Reimann <[email protected]>
  • Loading branch information
zzril and lars-reimann authored Jul 10, 2023
1 parent bc73a6c commit 2c54ec0
Show file tree
Hide file tree
Showing 3 changed files with 220 additions and 24 deletions.
72 changes: 72 additions & 0 deletions docs/development/code_review.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
# Code review

This document describes
general guidelines for developers
when reviewing a pull request (PR).

## Verify that the PR solves the addressed issue

### Understand the issue

* Read the issue that was addressed by the PR and make sure you understand it
* Read any discussions that occurred on the issue page

### Check the PR

* Check the PR for __completeness__: Was every point from the issue covered?
* Check for __correctness__: Were the problems described in the issue solved in the intended way?
* (For PRs that introduce new features:) Check the design - does it comply with our [project guidelines][guidelines-general]?
* Check for potential bugs - edge cases, exceptions that may be raised in functions that were called etc.
* Check the code style: Is it readable? Do variables have sensible names? Is it consistent to existing code? Is there a better / more elegant way to do certain things (e.g. [f-string](https://docs.python.org/3/tutorial/inputoutput.html#tut-f-strings) instead of manual string concatenation)?
* Check any warnings reported by your IDE

## Check the tests

* Run the tests locally
* Check if there are any warnings that are not caught by the test suite
* Make sure the test cases are complete - do they cover all edge cases (e.g. empty tables)?
* Make sure they comply to our [project guidelines][guidelines-tests] - are the tests parametrized and do all testcases have descriptive IDs?

## Verify that the PR does not break existing code

This is largely covered by the automated tests.
However, you should always:

* Make sure all tests actually ran through (pytest, linter, code coverage)
* Make sure that the branch is up-to-date with the `main` branch, so you actually test the behaviour that will result once the feature branch is merged

## Check the PR format

* Check that the PR title starts with a fitting [type](https://github.com/Safe-DS/.github/blob/main/.github/CONTRIBUTING.md#types) (e.g. `feat`, `fix`, `docs`, ...)
* Check that the changes introduced by the PR are documented in the PR message

## Requesting changes

If you found any issues with the reviewed PR,
navigate to the `Files changed` tab in the PR page
and click on the respective line
to add your comments.

For more details on specific review features,
check out the [documentation on GitHub][github-review].

## Finishing the review

When done, finish your review
by navigating to the `Files changed` tab
and clicking the green `Review changes` button
on the top right.

If you found no issues,
select the `Approve` option
to approve the changes made by the PR.
If you like, you can add a "LGTM" meme
from [lgtm.party](https://lgtm.party/),
preferably one with a cat image.

If you found any problems,
select `Request Changes` instead.

[guidelines-general]: project_guidelines.md
[guidelines-tests]: project_guidelines.md#tests
[github-review]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
# Guidelines
# Project Guidelines

This document describes general guidelines for the Safe-DS Python Library. In the **DO**/**DON'T** examples below we either show _client code_ to describe the code users should/shouldn't have to write, or _library code_ to describe the code we, as library developers, need to write to achieve readable client code. We'll continuously update this document as we find new categories of usability issues.

Expand Down Expand Up @@ -56,6 +56,43 @@ Some flag parameters drastically alter the semantics of a function. This can lea
table.drop("name", axis="columns")
```

### Return copies of objects

Modifying objects in-place
can lead to surprising behaviour
and hard-to-find bugs.
Methods shall never change
the object they're called on
or any of their parameters.

!!! success "**DO** (library code):"

```py
result = self._data.copy()
result.columns = self._schema.column_names
result[new_column.name] = new_column._data
return Table._from_pandas_dataframe(result)
```

!!! failure "**DON'T** (library code):"

```py
self._data.add(new_column, axis='column')
```

The corresponding docstring should explicitly state
that a method returns a copy:

!!! success "**DO** (library code):"

```py
"""
Return a new table with the given column added as the last column.
The original table is not modified.
...
"""
```

### Avoid uncommon abbreviations

Write full words rather than abbreviations. The increased verbosity is offset by better readability, better functioning auto-completion, and a reduced need to consult the documentation when writing code. Common abbreviations like CSV or HTML are fine though, since they rarely require explanation.
Expand Down Expand Up @@ -205,27 +242,6 @@ The user should not have to deal with exceptions that are defined in the wrapper
return pd.read_csv(path) # May raise a pd.ParserError
```

### Sort entries in `__all__` lists alphabetically
The entries in the `__all__` list in `__init__.py` files should be sorted alphabetically. This helps reduce the likelihood of merge conflicts when new entries are introduced on different branches.
!!! success "**DO** (library code):"

```py
__all__ = [
"ColumnSizeError",
"DuplicateColumnNameError",
"MissingValuesColumnError"
]
```
!!! failure "**DON'T** (library code):"

```py
__all__ = [
"MissingValuesColumnError",
"ColumnSizeError",
"DuplicateColumnNameError"
]
```

### Group API elements by task

Packages should correspond to a specific task like classification or imputation. This eases discovery and makes it easy to switch between different solutions for the same task.
Expand Down Expand Up @@ -370,7 +386,114 @@ Examples

## Tests

If a function contains more code than just the getting or setting of a value, automated test should be added to the [`tests`][tests-folder] folder. The file structure in the tests folder should mirror the file structure of the [`src`][src-folder] folder.
We aim for 100% line coverage,
so automated tests should be added
for any new function.

### File structure

Tests belong in the [`tests`][tests-folder] folder.
The file structure in the tests folder
should mirror the file structure
of the [`src`][src-folder] folder.

### Naming

Names of test functions
shall start with `test_should_`
followed by a description
of the expected behaviour,
e.g. `test_should_add_column`.

!!! success "**DO** (library code):"

```py
def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None:
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
...
```

!!! failure "**DON'T** (library code):"

```py
def test_value_error(self, number_of_trees) -> None:
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
...
```

### Parametrization

Tests should be parametrized
using `@pytest.mark.parametrize`,
even if there is only a single test case.
This makes it easier
to add new test cases in the future.
Test cases should be given
descriptive IDs.

!!! success "**DO** (library code):"

```py
@pytest.mark.parametrize("number_of_trees", [0, -1], ids=["zero", "negative"])
def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None:
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
RandomForest(number_of_trees=number_of_trees)
```

!!! failure "**DON'T** (library code):"

```py
def test_should_raise_if_less_than_0(self, number_of_trees) -> None:
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
RandomForest(number_of_trees=-1)

def test_should_raise_if_equal_to_0(self, number_of_trees) -> None:
with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."):
RandomForest(number_of_trees=0)
```

## Code style

### Consistency

If there is more than one way
to solve a particular task,
check how it has been solved
at other places in the codebase
and stick to that solution.

### Sort exported classes in `__init__.py`

Classes defined in a module
that other classes shall be able to import
must be defined
in a list named `__all__`
in the module's `__init__.py` file.
This list should be sorted alphabetically,
to reduce the likelihood of merge conflicts
when adding new classes to it.

!!! success "**DO** (library code):"

```py
__all__ = [
"Column",
"Row",
"Table",
"TaggedTable",
]
```

!!! failure "**DON'T** (library code):"

```py
__all__ = [
"Table",
"TaggedTable",
"Column",
"Row",
]
```

[src-folder]: https://github.com/Safe-DS/Stdlib/tree/main/src
[tests-folder]: https://github.com/Safe-DS/Stdlib/tree/main/tests
Expand Down
3 changes: 2 additions & 1 deletion mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,9 @@ nav:
- Glossary: glossary.md
- Development:
- Environment: development/environment.md
- Guidelines: development/guidelines.md
- Project Guidelines: development/project_guidelines.md
- Contributing: https://github.com/Safe-DS/Stdlib/contribute
- Code Review Guide: development/code_review.md

# Configuration of MkDocs & Material for MkDocs --------------------------------

Expand Down

0 comments on commit 2c54ec0

Please sign in to comment.