-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pre-commit hook for common misspelling check in files #18964
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely love that!
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
This PR adds codespell to the pre-commit hooks. This will specifically help us a bit in resolving sphinx errors. From the project page: It does not check for word membership in a complete dictionary, but instead looks for a set of common misspellings. Therefore it should catch errors like "adn", but it will not catch "adnasdfasdf". This also means it shouldn't generate false-positives when you use a niche term it doesn't know about. This means the sphinx errors are not solved completely.
ffe82ad
to
7a48a14
Compare
@@ -1214,6 +1220,7 @@ sharded | |||
shellcheck | |||
shellcmd | |||
shm | |||
sie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is weird
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's from testing release doc about using gpg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well done 👏
self.log.info('Poking for blob: %s\nin wasb://%s', self.blob_name, self.container_name) | ||
self.log.info('Poking for blob: %s\n in wasb://%s', self.blob_name, self.container_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is not right.
It will start a new line with space.
There are numbers of this cases in the PR.
Coding lines like this should be excluded. \nin
is valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a fix for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how to fix this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can split the strings into two strings in two log entries @HaloKo4 - that will actually even better reflect thte idea of this. I personally believe using /n on logs is a bad idea because it does not play well with the way how prefix logs works. If you have separate log entries without the /n they will be much nicer laid out (both lines will 'start' in the same place:
Date time log level: Line 1
Date time log level: LIne 2
Instead of (as we have now)
Date time log level: Line 1
LIne 2
The current approach is also potentially (low level) opening for so called 'log injection' vulnerability and it is considered as bad practice to have \n in the log lines because of that.
I think this is a great opportunity for you to improve those logs in general.
Can we.count on your help here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this doesn't fix the problem.
The next person who will do /nsomething
in a log string will be hit by this pre-commit and might change it to space just like was done on this PR. If two lines is what you want then there is need to be another pre commit that prevent using /n in logging. isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to add it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw yeah - this is exactly how other people do it. Usually when someone fixes thing like that, also adds pre-commit to prevent it in the future - so yeah we are thinking in the same direction. It would be great to fix it in all places and prevent it in the future (probably 20 or 30 pre-commits of ours came to being in this way)
…flow#18964) This PR adds codespell to the pre-commit hooks. This will specifically help us a bit in resolving sphinx errors. From the project page: It does not check for word membership in a complete dictionary, but instead looks for a set of common misspellings. Therefore it should catch errors like "adn", but it will not catch "adnasdfasdf". This also means it shouldn't generate false-positives when you use a niche term it doesn't know about. This means the sphinx errors are not solved completely. Partial Commit Extracted From: https://github.com/apache/airflow
This PR adds codespell to the pre-commit hooks. This will specifically help
us a bit in resolving sphinx errors.
From the project page:
It does not check for word membership in a complete dictionary, but instead looks for a set of common misspellings.
Therefore it should catch errors like "adn", but it will not catch "adnasdfasdf".
This also means it shouldn't generate false-positives when you use a niche term it doesn't know about.
This means the sphinx errors are not solved completely.
Happy to hear your thoughts.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.