[PROPOSAL] Always use pre-commit to run Python linters #9309

vyasr · 2021-09-24T21:55:37Z

cuDF Python currently has a number of useful linters configured to perform static checks of code style and quality. We currently maintain version pinnings for our linters in two places: 1) in the .pre-commit-config.yaml file at the repo root, and 2) in the conda recipes in conda/environments/. This duplication can cause problems because minor changes in linter rules across different versions can change the preferred format of code, costing developers time especially if they are unfamiliar with the linters.

To avoid these issues, I'm proposing the following:

Remove linters from our conda environments and maintain them only in the .pre-commit-config.yaml file.
Modify CI style checks in ci/checks/style.sh to run pre-commit rather than invoking linters manually.
Going forward, assume all developers running hooks invoke them via pre-commit run HOOK rather than directly invoking HOOK on the command line. Developers who manually call hooks are responsible for dealing with version incompatibilities. Note that this does not require developers to actually install Git hooks because manual pre-commit invocations as above will work even without installing the pre-commit Git hooks.

This change should not affect other RAPIDS projects. We will only be removing linters from cudf's conda environments, but other packages are free to continue installing them. In the long run, having more RAPIDS projects switch to using pre-commit will allow different projects to pin different versions of linters so that upgrading a single hook does not require RAPIDS-wide changes (cf. #8215). @rapidsai/ops does this seem feasible to you? @charlesbluca I'm sure you gave this some thought when getting isort synced across RAPIDS, what do you think? CC @shwina @bdice @quasiben (@dantegd and @BradReesWork may be interested in following this thread in case cuML or cuGraph want to go this route as well in the long term).

The text was updated successfully, but these errors were encountered:

charlesbluca · 2021-09-24T22:48:23Z

Would definitely be in favor of making this change for a variety of reasons:

at least for isort, it was a chore to get consistent behavior between isort ... and pre-commit run isort ..., and this would eliminate that work
this would make it easier to bump linter versions in the future, as it wouldn't be a necessity to sync linter version up with gpuCI and consequently all other RAPIDS projects using that linter (see Bump isort version to 5.6 integration#286)
clang-format requires a separate script to be ran on gpuCI; when this script is not updated to match up with additions to the codebase, we end up with code that fails our style guidelines (for example, Example to build custom application and link to libcudf #7671 added a CPP directory that is not being linted by gpuCI because it isn't in this list)

bdice · 2021-09-24T22:55:14Z

Also, there is an ongoing effort to make updated builds of clang-format that can be pulled from PyPI and used as a pre-commit hook repository. See issue on clang-format wheel repo.

vyasr · 2021-09-24T23:33:40Z

Let's push clang-format discussions to a future issue so that we don't confuse this thread. The following are not part of this issue (although they are things we may come back to at a future date, I just don't want to get off topic here):

Linters for components other than cuDF Python. To my knowledge this is just clang-format at present, but also includes any Java.
Switching from our custom run-clang-format script to a raw clang-format call.
Using pre-commit.ci to automatically fixup PRs

This issue is exclusively to discuss how we want to move forward with running Python linters both on developers' local machines and on gpuCI to make it as easy as possible for these to conform.

ajschmidt8 · 2021-09-27T15:50:30Z

From an OPs perspective, I don't see any issues here. This seems like a worthwhile effort if it will help ensure that CI checks and local checks are in sync. Especially if it prevents all the RAPIDS repositories from having to use the same linting versions. If all the other libraries adopt this methodology, then I think we could also potentially remove some of the tools from the integration repo environments (i.e. rapids-build-env) which would be nice.

shwina · 2021-09-27T15:54:49Z

+1 that it would be nice to do this across all of RAPIDS. However, let's limit the initial work to just cuDF Python, to make sure that it works and works well, for CI as well as for developers.

bdice · 2021-09-28T14:04:36Z

@trxcllnt Do you have thoughts on this? If we move this forward, I would like to make the lint-cudf-cpp / lint-cudf-python commands in rapids-compose call pre-commit, for consistency with CI linting.

trxcllnt · 2021-09-29T20:14:10Z

@bdice I support anything that allows us to delete lint.sh.

Switch CI to use pre-commit to run linters rather than invoking them directly. This centralizes linter version management, simplifies our need to clean up the output of linters for CI, and allows cudf to handle linters independently of the gpuCI containers. Resolves #9309. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Jordan Jacobelli (https://github.com/Ethyling) - Bradley Dice (https://github.com/bdice) - Ashwin Srinath (https://github.com/shwina) URL: #9412

This change adds pre-commit hook to run on each commit, that will automatically run code linters and not allow you commit changes if the linters fail. This is similar to how cudf uses pre-commit, as described in rapidsai/cudf#9309 . The pre-commit checks will also be run in CI as part of the check style script. This change also adds several new linters that weren't been run in CI before: * pydocstyle * mypy * black * isort * cmake-format * cmake-lint Of the new linters - Mypy caught an issue where the test_comms.py would always be skipped (`python/raft-dask/raft_dask/test/test_comms.py:23: error: Module "raft_dask" has no attribute "Comms”` ), and Pydocstyle caught some minor formatting inconsistencies in the python docstrings. black/isort ensure consistent code formatting for the python code here.

This change adds pre-commit hook to run on each commit, that will automatically run code linters and not allow you commit changes if the linters fail. This is similar to how cudf uses pre-commit, as described in rapidsai/cudf#9309 . The pre-commit checks will also be run in CI as part of the check style script. This change also adds several new linters that weren't been run in CI before: * pydocstyle * mypy * black * isort * cmake-format * cmake-lint Of these new linters - mypy caught an issue where the test_comms.py would always be skipped (`python/raft-dask/raft_dask/test/test_comms.py:23: error: Module "raft_dask" has no attribute "Comms"` ), and Pydocstyle caught some minor formatting inconsistencies in the python docstrings. black/isort ensure consistent code formatting for the python raft code

This change adds a pre-commit hook to run on each commit, that will automatically run code linters and not allow you to commit changes if the linters fail. This is similar to how cudf uses pre-commit, as described in rapidsai/cudf#9309 . The pre-commit checks will also be run in CI as part of the check style script. This change also adds several new linters that weren't being run in CI before: * pydocstyle * mypy * black * isort * cmake-format * cmake-lint Of these new linters - mypy caught an issue where the test_comms.py would always be skipped (`python/raft-dask/raft_dask/test/test_comms.py:23: error: Module "raft_dask" has no attribute "Comms"` ), and Pydocstyle caught some minor formatting inconsistencies in the python docstrings. black/isort ensure consistent code formatting for the python raft code Authors: - Ben Frederickson (https://github.com/benfred) Approvers: - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) - Ray Douglass (https://github.com/raydouglass) - Bradley Dice (https://github.com/bdice) URL: #965

vyasr added proposal Change current process or code gpuCI Python Affects Python cuDF API. labels Sep 24, 2021

vyasr self-assigned this Sep 24, 2021

This was referenced Oct 7, 2021

[FEA] Make CMake style check used in CI easy to run locally #9396

Closed

Use pre-commit for CI #9412

Merged

rapids-bot bot closed this as completed in #9412 Oct 12, 2021

benfred mentioned this issue Oct 28, 2022

Run linters using pre-commit rapidsai/raft#965

Merged

vyasr mentioned this issue Feb 17, 2023

CI: Remove specification of manual stage for check_style.sh script. rapidsai/raft#1283

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PROPOSAL] Always use pre-commit to run Python linters #9309

[PROPOSAL] Always use pre-commit to run Python linters #9309

vyasr commented Sep 24, 2021

charlesbluca commented Sep 24, 2021

bdice commented Sep 24, 2021 •

edited

Loading

vyasr commented Sep 24, 2021

ajschmidt8 commented Sep 27, 2021

shwina commented Sep 27, 2021

bdice commented Sep 28, 2021

trxcllnt commented Sep 29, 2021

[PROPOSAL] Always use pre-commit to run Python linters #9309

[PROPOSAL] Always use pre-commit to run Python linters #9309

Comments

vyasr commented Sep 24, 2021

charlesbluca commented Sep 24, 2021

bdice commented Sep 24, 2021 • edited Loading

vyasr commented Sep 24, 2021

ajschmidt8 commented Sep 27, 2021

shwina commented Sep 27, 2021

bdice commented Sep 28, 2021

trxcllnt commented Sep 29, 2021

bdice commented Sep 24, 2021 •

edited

Loading