-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(checkquery): tool for checking SPARQL query files (resolves #48) #53
feat(checkquery): tool for checking SPARQL query files (resolves #48) #53
Conversation
Tool can check all, changed or single SPARQL query files
Thank you for the pull request!The Scribe team will do our best to address your contribution as soon as we can. The following is a checklist for maintainers to make sure this process goes as well as possible. Feel free to address the points below yourself in further commits if you realize that actions are needed :) If you're not already a member of our public Matrix community, please consider joining! We'd suggest using Element as your Matrix client, and definitely join the General and Data rooms once you're in. It'd be great to have you! Maintainer checklist
|
src/scribe_data/checkquery.py
Outdated
try: | ||
context.setQuery(query.load(limit)) | ||
result = context.query().convert() | ||
return result if result else [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When the context is configured to return the result as JSON then the convert
method will always return a dictionary.
Furthermore, the context.query().convert()
call can be replaced by context.queryAndConvert()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess for a lof of the sparqlwrapper methods we have camelCase, so maybe making this change makes sense? I'm fine either way though, and thank you for doing the research into this!
def load(self, limit: int) -> str: | ||
"""Load the SPARQL query from 'path' into a string. | ||
|
||
Args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss how we want docstrings to work in terms of eventual documentation, m-charlton :) I added a point to the dev sync just now that we should look into readthedocs or something for that. In that case we should standardize how we're doing these. I personally have no preferece. If memory serves me I went with the NumPy style, but then readability wise I like yours more! Just a question of whether they'd translate to a documentation web interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have changed so that the text begins on a new line. I'm using the autoDocstring VS code plugin for docstrings for no other reason than it was there.
Quite prepared to change to a common format as consistency is good.
According to this stackoverflow answer as long as the plugin is configured correctly then sphinx will generate HTML documentation.
src/scribe_data/checkquery.py
Outdated
|
||
|
||
def changed_queries() -> Optional[list[QueryFile]]: | ||
"""Find all the SPARQL queries that have changed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so cool! And also adds so much value to this process! Being able to run all of them on the off week to make sure that we can split them if need be, plus using changed_queries()
whenever there's an edit via a PR!
query_file (str): the file to validate. | ||
|
||
Returns: | ||
Path: the validated file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd change this to fpath (str): ...
. To me file_path
is also a bit more in line with the variable/function names. I can make these edits though :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/scribe_data/checkquery.py
Outdated
return fpath | ||
|
||
|
||
def check_positive_number(value: str, err_msg: str) -> int: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit for the function name: I'd call it check_positive_int
given the functionality and to make it more explicit :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Makes more sense.
Args: | ||
limit (str): the LIMIT to be validated. | ||
|
||
Raises: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really nice. With the issue that we do for reworking the doc strings let's for sure also add in a Raises
for files that need it! I added this to the sync notes.
group.add_argument( | ||
"-c", | ||
"--changed", | ||
action="store_true", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so cool. Really I'm so happy to see how this can be done! Really fascinating to see this in action!
"--endpoint", | ||
type=str, | ||
default="https://query.wikidata.org/sparql", | ||
help="URL of the SPARQL endpoint", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even including that we could check other endpoints is so great! As stated, I don't think we'll need to worry about Lexemes moving to a Wikibase anytime soon, but if and when it happens we just switch this and we're good to go!
( | ||
[ | ||
("/root", ("src",), ("README.txt",)), | ||
("/root/src", (), ("spam.sh", "eggs.py")), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😋
tests/load/test_checkquery.py
Outdated
], | ||
) | ||
def test_main_mutex_opts(args): | ||
"""some options cannot be used together""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit on the comment: capital S and a period at the end. I think for doctrings it also makes sense to have them always like:
"""
Docstring starting on the next line between the quotes.
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Changes made to all docstrings.
Code wise all looks good, @m-charlton 😊 Happy to send along the minor changes I discussed above. I'll do a functionality check later today and we'll close this out! 🚀 |
@andrewtavis thanks for the comments. Will make changes and submit as a PR |
Thank you, @m-charlton! Happy to bring this in later today :) |
Ah and your questions, @m-charlton:
|
@@ -29,7 +31,8 @@ | |||
|
|||
@dataclass(repr=False, frozen=True) | |||
class QueryFile: | |||
"""Holds a reference to a file containing a SPARQL query.""" | |||
""" | |||
Holds a reference to a file containing a SPARQL query.""" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tripple quote should be on newline. Slipped through final review.
Final tripple quote should be on new line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested it all out and it's working great, @m-charlton 😊 Thanks so much for bringing this quality of work to Scribe-Data. Really is exactly what we need :) :)
I'm a bit confused by the CI fail for ERROR: No matching distribution found for tensorflow>=2.5.1
. Wrote a note about it in the dev sync, but also I can check this on my end.
Thanks and looking forward to the next steps here!
Tool can check all, changed or single SPARQL query files
Contributor checklist
Description
Adds new
checkquery.py
tool for checking all/changed/single SPARQL query files. Run./checkquer.py -h
for usage instructions.Limitations
--limit
argument with say--ping
argument. No harm but, does not make sense.Testing
Unit tests result in about 70% code coverage. Ad-hoc testing for all modes carried out.
Particularly interested in feedback regarding how failed/passing tests are reported.
Questions
Is
checkquery.py
in the correct location?Resolves issue