Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pylint checker for copyright notices #4564

Closed
vtomole opened this issue Oct 11, 2021 · 10 comments
Closed

pylint checker for copyright notices #4564

vtomole opened this issue Oct 11, 2021 · 10 comments
Assignees
Labels
area/checks area/pylint good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. kind/health For CI/testing/release process/refactoring/technical debt items no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you.

Comments

@vtomole
Copy link
Collaborator

vtomole commented Oct 11, 2021

Description of the issue
Maybe pylint could automatically check for copyright notices instead of reviewers needed to keep an eye out. Inspired by #456.

Look into https://stackoverflow.com/questions/51886592/can-pylint-check-for-a-static-comment-copyright-notice-at-the-top-of-all-docum

@vtomole vtomole added good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you. kind/health For CI/testing/release process/refactoring/technical debt items area/checks area/pylint labels Oct 11, 2021
@yjt98765
Copy link
Contributor

A 'raw checker' should be able to check it. Can you assign the task to me?

@vtomole
Copy link
Collaborator Author

vtomole commented Oct 12, 2021

@yjt98765 Assigned.

@ronnie-llamado
Copy link

Here's a tool I've seen used in other projects for a similar purpose: https://github.com/fsfe/reuse-tool

@vtomole
Copy link
Collaborator Author

vtomole commented Oct 12, 2021

Thanks for the recommendation @ronnie-llamado . We prefer to leverage tools we already have like pylint for these cases because Cirq contains too many dependencies as it is.

@yjt98765
Copy link
Contributor

@yjt98765 Assigned.

Thank you!

As shown in this example, we need to assign an id (a letter + 4 digits) to the message. Since no custom checkers have been used in this project, we should come up with a rule for this, especially for the numbers (e.g., they are sequential numbers with prefix 00 or 99). The letter is one of C, W, E, F, and R, standing for Convention, Warning, Error, Fatal, and Refactoring. I guess the proper one for the copyright notice would be E as it has to be there. What do you think?

@yjt98765
Copy link
Contributor

Thanks for the recommendation @ronnie-llamado . We prefer to leverage tools we already have like pylint for these cases because Cirq contains too many dependencies as it is.

It seems that the tool can be applied to many types of files, while pylint can only check python files. In this project, Some (if not all) of the bash scripts, TypeScript files, and Jupyter Notebooks also contain a copyright notice.

However, if only one type of copyright is adopted here, applying that tool seems like overkill. It complicates the checking process and increases external dependency.

@yjt98765
Copy link
Contributor

yjt98765 commented Oct 13, 2021

@vtomole I have developed the checker and tested it locally. Before creating a pull request, I have several questions:

  1. There are several versions of copyright notices existing in the project. First, the year ranges from 2018 to 2021 in the first line. Second, there are several variations regarding the 7th line:
  • #http://www.apache.org/licenses/LICENSE-2.0
  • # http://www.apache.org/licenses/LICENSE-2.0
  • # https://www.apache.org/licenses/LICENSE-2.0 (note the difference between http and https)
    I guess all of them are fine. Therefore, I implemented the checking with Regex matching instead of String comparison. Should we allow the copyright notices to be slightly different, or do you prefer a sticker rule?
  1. I have run check/pylint, and more than 300 files are reported as not having the correct copyright notice. Nearly half of them are testing files and 17 are __init__.py. Should we skip these two types of files or require every python file to have the copyright notice? I recommend checking all the files because the logic is clear, and it is easy to maintain the workflow. If some particular file does not need a copyright notice, it can disable the message in the file using # pylint: disable=wrong-copyright-notice.

@vtomole
Copy link
Collaborator Author

vtomole commented Oct 13, 2021

Therefore, I implemented the checking with Regex matching instead of String comparison. Should we allow the copyright notices to be slightly different, or do you prefer a sticker rule?

Let's change all the http links to # https://www.apache.org/licenses/LICENSE-2.0 but keep the copyright years ranging as it that specifies when the file was created and add copyright notices to all the files.

@vtomole
Copy link
Collaborator Author

vtomole commented Oct 13, 2021

guess the proper one for the copyright notice would be E as it has to be there. What do you think?

https://softwareengineering.stackexchange.com/a/19653/322605 says that it's not legally required that every file have a copyright notice (note: the poster is not a lawyer.) Keeping this in mind, I think the more appropriate emitted error would be [R]efactor for a "good practice" metric violation.

CirqBot pushed a commit that referenced this issue Nov 1, 2021
This PR implements a pylint checker that automatically checks the copyright notice at the beginning of a python file. This task is proposed in [Issue 4564](#4564).

The PR consists of the following parts:
1. The implemented pylint checker (dev_tools/pylint_copyright_checker.py) and its test file (dev_tools/pylint_copyright_checker_test.py). The checker reports a message with id "R0001" if there is no copyright notice or the notice does not conform to the standard one. The letter 'R' represents refactoring. The checking can be disabled in the file with pragma `# pylint: disable=wrong-copyright-notice`
2. Modification of pylint-related files (check/pylint, check/pylint-changed-files, dev_tools/conf/.pylintrc) to integrate the new checker into the workflow.
3. Python files that failed in this check. Some copyright notices were wrong (e.g. cirq-core/cirq/circuits/circuit_dag.py and cirq-core/cirq/ops/qubit_order_test.py), and they are corrected. Some files did not have a copyright notice. In this case, the "disable" pragma is added. Note that not all the files need to have a copyright notice, [probably](https://softwareengineering.stackexchange.com/a/19653/322605). After the modification, these files can pass pylint checking again.
mpharrigan pushed a commit to mpharrigan/Cirq that referenced this issue Nov 1, 2021
This PR implements a pylint checker that automatically checks the copyright notice at the beginning of a python file. This task is proposed in [Issue 4564](quantumlib#4564).

The PR consists of the following parts:
1. The implemented pylint checker (dev_tools/pylint_copyright_checker.py) and its test file (dev_tools/pylint_copyright_checker_test.py). The checker reports a message with id "R0001" if there is no copyright notice or the notice does not conform to the standard one. The letter 'R' represents refactoring. The checking can be disabled in the file with pragma `# pylint: disable=wrong-copyright-notice`
2. Modification of pylint-related files (check/pylint, check/pylint-changed-files, dev_tools/conf/.pylintrc) to integrate the new checker into the workflow.
3. Python files that failed in this check. Some copyright notices were wrong (e.g. cirq-core/cirq/circuits/circuit_dag.py and cirq-core/cirq/ops/qubit_order_test.py), and they are corrected. Some files did not have a copyright notice. In this case, the "disable" pragma is added. Note that not all the files need to have a copyright notice, [probably](https://softwareengineering.stackexchange.com/a/19653/322605). After the modification, these files can pass pylint checking again.
@MichaelBroughton
Copy link
Collaborator

This functionality is in now.

rht pushed a commit to rht/Cirq that referenced this issue May 1, 2023
This PR implements a pylint checker that automatically checks the copyright notice at the beginning of a python file. This task is proposed in [Issue 4564](quantumlib#4564).

The PR consists of the following parts:
1. The implemented pylint checker (dev_tools/pylint_copyright_checker.py) and its test file (dev_tools/pylint_copyright_checker_test.py). The checker reports a message with id "R0001" if there is no copyright notice or the notice does not conform to the standard one. The letter 'R' represents refactoring. The checking can be disabled in the file with pragma `# pylint: disable=wrong-copyright-notice`
2. Modification of pylint-related files (check/pylint, check/pylint-changed-files, dev_tools/conf/.pylintrc) to integrate the new checker into the workflow.
3. Python files that failed in this check. Some copyright notices were wrong (e.g. cirq-core/cirq/circuits/circuit_dag.py and cirq-core/cirq/ops/qubit_order_test.py), and they are corrected. Some files did not have a copyright notice. In this case, the "disable" pragma is added. Note that not all the files need to have a copyright notice, [probably](https://softwareengineering.stackexchange.com/a/19653/322605). After the modification, these files can pass pylint checking again.
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this issue Oct 31, 2024
This PR implements a pylint checker that automatically checks the copyright notice at the beginning of a python file. This task is proposed in [Issue 4564](quantumlib#4564).

The PR consists of the following parts:
1. The implemented pylint checker (dev_tools/pylint_copyright_checker.py) and its test file (dev_tools/pylint_copyright_checker_test.py). The checker reports a message with id "R0001" if there is no copyright notice or the notice does not conform to the standard one. The letter 'R' represents refactoring. The checking can be disabled in the file with pragma `# pylint: disable=wrong-copyright-notice`
2. Modification of pylint-related files (check/pylint, check/pylint-changed-files, dev_tools/conf/.pylintrc) to integrate the new checker into the workflow.
3. Python files that failed in this check. Some copyright notices were wrong (e.g. cirq-core/cirq/circuits/circuit_dag.py and cirq-core/cirq/ops/qubit_order_test.py), and they are corrected. Some files did not have a copyright notice. In this case, the "disable" pragma is added. Note that not all the files need to have a copyright notice, [probably](https://softwareengineering.stackexchange.com/a/19653/322605). After the modification, these files can pass pylint checking again.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/checks area/pylint good first issue This issue can be resolved by someone who is not familiar with the codebase. A good starting issue. kind/health For CI/testing/release process/refactoring/technical debt items no QC knowledge needed Want to contribute to Cirq, but don't know quantum computing? This issue is for you.
Projects
None yet
Development

No branches or pull requests

4 participants