diff --git a/docs/development/code_review.md b/docs/development/code_review.md new file mode 100644 index 000000000..075ac0f27 --- /dev/null +++ b/docs/development/code_review.md @@ -0,0 +1,72 @@ +# Code review + +This document describes +general guidelines for developers +when reviewing a pull request (PR). + +## Verify that the PR solves the addressed issue + +### Understand the issue + +* Read the issue that was addressed by the PR and make sure you understand it +* Read any discussions that occurred on the issue page + +### Check the PR + +* Check the PR for __completeness__: Was every point from the issue covered? +* Check for __correctness__: Were the problems described in the issue solved in the intended way? +* (For PRs that introduce new features:) Check the design - does it comply with our [project guidelines][guidelines-general]? +* Check for potential bugs - edge cases, exceptions that may be raised in functions that were called etc. +* Check the code style: Is it readable? Do variables have sensible names? Is it consistent to existing code? Is there a better / more elegant way to do certain things (e.g. [f-string](https://docs.python.org/3/tutorial/inputoutput.html#tut-f-strings) instead of manual string concatenation)? +* Check any warnings reported by your IDE + +## Check the tests + +* Run the tests locally +* Check if there are any warnings that are not caught by the test suite +* Make sure the test cases are complete - do they cover all edge cases (e.g. empty tables)? +* Make sure they comply to our [project guidelines][guidelines-tests] - are the tests parametrized and do all testcases have descriptive IDs? + +## Verify that the PR does not break existing code + +This is largely covered by the automated tests. +However, you should always: + +* Make sure all tests actually ran through (pytest, linter, code coverage) +* Make sure that the branch is up-to-date with the `main` branch, so you actually test the behaviour that will result once the feature branch is merged + +## Check the PR format + +* Check that the PR title starts with a fitting [type](https://github.com/Safe-DS/.github/blob/main/.github/CONTRIBUTING.md#types) (e.g. `feat`, `fix`, `docs`, ...) +* Check that the changes introduced by the PR are documented in the PR message + +## Requesting changes + +If you found any issues with the reviewed PR, +navigate to the `Files changed` tab in the PR page +and click on the respective line +to add your comments. + +For more details on specific review features, +check out the [documentation on GitHub][github-review]. + +## Finishing the review + +When done, finish your review +by navigating to the `Files changed` tab +and clicking the green `Review changes` button +on the top right. + +If you found no issues, +select the `Approve` option +to approve the changes made by the PR. +If you like, you can add a "LGTM" meme +from [lgtm.party](https://lgtm.party/), +preferably one with a cat image. + +If you found any problems, +select `Request Changes` instead. + +[guidelines-general]: project_guidelines.md +[guidelines-tests]: project_guidelines.md#tests +[github-review]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests diff --git a/docs/development/guidelines.md b/docs/development/project_guidelines.md similarity index 76% rename from docs/development/guidelines.md rename to docs/development/project_guidelines.md index 95cf16d57..22cfcbb87 100644 --- a/docs/development/guidelines.md +++ b/docs/development/project_guidelines.md @@ -1,4 +1,4 @@ -# Guidelines +# Project Guidelines This document describes general guidelines for the Safe-DS Python Library. In the **DO**/**DON'T** examples below we either show _client code_ to describe the code users should/shouldn't have to write, or _library code_ to describe the code we, as library developers, need to write to achieve readable client code. We'll continuously update this document as we find new categories of usability issues. @@ -56,6 +56,43 @@ Some flag parameters drastically alter the semantics of a function. This can lea table.drop("name", axis="columns") ``` +### Return copies of objects + +Modifying objects in-place +can lead to surprising behaviour +and hard-to-find bugs. +Methods shall never change +the object they're called on +or any of their parameters. + +!!! success "**DO** (library code):" + + ```py + result = self._data.copy() + result.columns = self._schema.column_names + result[new_column.name] = new_column._data + return Table._from_pandas_dataframe(result) + ``` + +!!! failure "**DON'T** (library code):" + + ```py + self._data.add(new_column, axis='column') + ``` + +The corresponding docstring should explicitly state +that a method returns a copy: + +!!! success "**DO** (library code):" + + ```py + """ + Return a new table with the given column added as the last column. + The original table is not modified. + ... + """ + ``` + ### Avoid uncommon abbreviations Write full words rather than abbreviations. The increased verbosity is offset by better readability, better functioning auto-completion, and a reduced need to consult the documentation when writing code. Common abbreviations like CSV or HTML are fine though, since they rarely require explanation. @@ -205,27 +242,6 @@ The user should not have to deal with exceptions that are defined in the wrapper return pd.read_csv(path) # May raise a pd.ParserError ``` -### Sort entries in `__all__` lists alphabetically -The entries in the `__all__` list in `__init__.py` files should be sorted alphabetically. This helps reduce the likelihood of merge conflicts when new entries are introduced on different branches. -!!! success "**DO** (library code):" - - ```py - __all__ = [ - "ColumnSizeError", - "DuplicateColumnNameError", - "MissingValuesColumnError" - ] - ``` -!!! failure "**DON'T** (library code):" - - ```py - __all__ = [ - "MissingValuesColumnError", - "ColumnSizeError", - "DuplicateColumnNameError" - ] - ``` - ### Group API elements by task Packages should correspond to a specific task like classification or imputation. This eases discovery and makes it easy to switch between different solutions for the same task. @@ -370,7 +386,114 @@ Examples ## Tests -If a function contains more code than just the getting or setting of a value, automated test should be added to the [`tests`][tests-folder] folder. The file structure in the tests folder should mirror the file structure of the [`src`][src-folder] folder. +We aim for 100% line coverage, +so automated tests should be added +for any new function. + +### File structure + +Tests belong in the [`tests`][tests-folder] folder. +The file structure in the tests folder +should mirror the file structure +of the [`src`][src-folder] folder. + +### Naming + +Names of test functions +shall start with `test_should_` +followed by a description +of the expected behaviour, +e.g. `test_should_add_column`. + +!!! success "**DO** (library code):" + + ```py + def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None: + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): + ... + ``` + +!!! failure "**DON'T** (library code):" + + ```py + def test_value_error(self, number_of_trees) -> None: + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): + ... + ``` + +### Parametrization + +Tests should be parametrized +using `@pytest.mark.parametrize`, +even if there is only a single test case. +This makes it easier +to add new test cases in the future. +Test cases should be given +descriptive IDs. + +!!! success "**DO** (library code):" + + ```py + @pytest.mark.parametrize("number_of_trees", [0, -1], ids=["zero", "negative"]) + def test_should_raise_if_less_than_or_equal_to_0(self, number_of_trees) -> None: + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): + RandomForest(number_of_trees=number_of_trees) + ``` + +!!! failure "**DON'T** (library code):" + + ```py + def test_should_raise_if_less_than_0(self, number_of_trees) -> None: + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): + RandomForest(number_of_trees=-1) + + def test_should_raise_if_equal_to_0(self, number_of_trees) -> None: + with pytest.raises(ValueError, match="The parameter 'number_of_trees' has to be greater than 0."): + RandomForest(number_of_trees=0) + ``` + +## Code style + +### Consistency + +If there is more than one way +to solve a particular task, +check how it has been solved +at other places in the codebase +and stick to that solution. + +### Sort exported classes in `__init__.py` + +Classes defined in a module +that other classes shall be able to import +must be defined +in a list named `__all__` +in the module's `__init__.py` file. +This list should be sorted alphabetically, +to reduce the likelihood of merge conflicts +when adding new classes to it. + +!!! success "**DO** (library code):" + + ```py + __all__ = [ + "Column", + "Row", + "Table", + "TaggedTable", + ] + ``` + +!!! failure "**DON'T** (library code):" + + ```py + __all__ = [ + "Table", + "TaggedTable", + "Column", + "Row", + ] + ``` [src-folder]: https://github.com/Safe-DS/Stdlib/tree/main/src [tests-folder]: https://github.com/Safe-DS/Stdlib/tree/main/tests diff --git a/mkdocs.yml b/mkdocs.yml index 2014e01d0..ba93e6dd6 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -15,8 +15,9 @@ nav: - Glossary: glossary.md - Development: - Environment: development/environment.md - - Guidelines: development/guidelines.md + - Project Guidelines: development/project_guidelines.md - Contributing: https://github.com/Safe-DS/Stdlib/contribute + - Code Review Guide: development/code_review.md # Configuration of MkDocs & Material for MkDocs --------------------------------