Create workflow to check query forms #450

Closed
2 tasks done
andrewtavis opened this issue Oct 21, 2024 · 28 comments · Fixed by #507
Labels
-priority- High priority feature New feature or request

Comments

@andrewtavis
Member

andrewtavis commented Oct 21, 2024

Terms

Description

This issue is to document work that I've been doing to create a check for form identifiers within all Scribe-Data queries. The check goes through and derives the properties of all forms and creates a new metadata file for lexeme forms from which naming criteria are derived.

Contribution

I'm working on this. At the time of writing, I still need to add checks that all forms are actually returned in the query, that each returned form is unique, and that returned forms are ordered in the same way as in the metadata file. After that, a minor check that the query docstring is correct would also be good.

From there I need to go through and rename all the forms in the queries that have issues :)

@andrewtavis andrewtavis added feature New feature or request -priority- High priority labels Oct 21, 2024
@andrewtavis andrewtavis self-assigned this Oct 21, 2024
@andrewtavis andrewtavis moved this from Todo to In Progress in Scribe Board Oct 21, 2024
@DeleMike
Contributor

DeleMike commented Oct 21, 2024

Hello @andrewtavis 👋🏾, well done for #451 !!
I'm happy to assist you on this issue. Is there any task I can help with?

@andrewtavis
Member Author

Thanks for your offer to help, @DeleMike :) Maybe I can solve the current issues with the queries later tonight, and from there we can make some issues for the extra features?

@DeleMike
Contributor

Sure! I will ping you later, then!

andrewtavis added a commit that referenced this issue Oct 21, 2024
#450 Script and workflow created for query form check
@andrewtavis
Member Author

andrewtavis commented Oct 22, 2024

Ok @DeleMike, with #451 in we now need the following:

  • Add in a check for whether all forms are actually returned in the query (i.e. are there optional selections that are not brought up to the select statement)
  • Add a check that each returned form is unique (never ?plural ?plural)
  • Add a check that the returned forms are ordered in the same way as the metadata file and that their optional statements match (this is as simple as checking the top against the labels for the generated form labels)
  • General testing and error handling

CC @axif0 who's also looking for Python work to do :) Do you all want to plan out the work here?
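
As a rough illustration of the first check in the list above, here is a minimal sketch that finds forms bound in OPTIONAL blocks but never brought up to the SELECT statement. The function names, the regular expressions, and the sample query are assumptions made for this example; they are not taken from check_query_forms.py.

```python
import re


def select_variables(query: str) -> list[str]:
    """Return the variable names listed between SELECT and WHERE, in order."""
    match = re.search(
        r"SELECT\s+(.*?)\s+WHERE", query, flags=re.DOTALL | re.IGNORECASE
    )
    if match is None:
        return []
    return re.findall(r"\?(\w+)", match.group(1))


def optional_form_variables(query: str) -> list[str]:
    """Return the variables bound via ontolex:representation, in order."""
    return re.findall(r"ontolex:representation\s+\?(\w+)", query)


def forms_not_returned(query: str) -> list[str]:
    """Return forms that are defined in the query body but missing from SELECT."""
    selected = set(select_variables(query))
    return [name for name in optional_form_variables(query) if name not in selected]


# Hypothetical query used only for illustration: ?plural is defined but not selected.
SAMPLE_QUERY = """
SELECT
    ?lexeme
    ?singular

WHERE {
    OPTIONAL {
        ?lexeme ontolex:lexicalForm ?singularForm .
        ?singularForm ontolex:representation ?singular ;
            wikibase:grammaticalFeature wd:Q110786 .
    }

    OPTIONAL {
        ?lexeme ontolex:lexicalForm ?pluralForm .
        ?pluralForm ontolex:representation ?plural ;
            wikibase:grammaticalFeature wd:Q146786 .
    }
}
"""

print(forms_not_returned(SAMPLE_QUERY))
```

Since the check only reads the query text, it needs no network calls. A regex-based parse like this is fragile for queries with nested expressions in SELECT, so a real check would likely need more careful parsing.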

@DeleMike
Contributor

Thanks @andrewtavis!

We will look into it. @axif0 how about we break this into parts?


Some questions:
For these checks, how do we wanna verify outputs? Are we gonna make network calls for all the checks?

Also, I believe these checks will be part of our local tests, yes?

@andrewtavis
Member Author

Honestly let's keep this to expanding the lexeme_form_metadata.json file. Yes it's more work for us, but doing network calls for all of this each time a PR commit is made would be way too much usage of a common database :)

And ya we really should do a single Python or shell script that runs all of the checks we've been working on locally and add a note on that into the testing section of the contributing guide 😊
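
One possible shape for such a single entry point is sketched below. The `run_checks` name, the idea that each check returns a list of problem messages, and the placeholder checks are all assumptions for illustration, not the project's actual layout.

```python
from typing import Callable


def run_checks(checks: dict[str, Callable[[], list[str]]]) -> int:
    """Run every check, print a short report, and return a process exit code."""
    failed = 0
    for name, check in checks.items():
        problems = check()  # each check returns a list of problem messages
        if problems:
            failed += 1
            print(f"FAIL {name}")
            for problem in problems:
                print(f"  - {problem}")
        else:
            print(f"PASS {name}")
    # A non-zero exit code lets a CI workflow fail when any check fails.
    return 1 if failed else 0


# Hypothetical placeholder checks; real ones would inspect the query files.
example_checks = {
    "query forms": lambda: [],
    "docstrings": lambda: ["example.sparql: docstring does not match"],
}

exit_code = run_checks(example_checks)
```

A script built this way could end with `raise SystemExit(exit_code)` so the same file works locally and in a workflow.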

@axif0
Collaborator

axif0 commented Oct 23, 2024

I'm done with "Add a check that each returned form is unique (never ?plural ?plural)" and am working on "Add in a check for whether all forms are actually returned in the query (i.e. are there optional selections that are not brought up to the select statement)".

@andrewtavis
Member Author

This would be great, @axif0! This can form the basis of some of the other work here. So as with getting all the form texts, we should also parse and get all forms that are being returned from the query. This first check will be to make sure that each is unique, and from there we can work towards making sure that what's being returned matches what's in the forms 😊
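
A minimal sketch of that uniqueness check might look like the following. The function name and the regex-based parsing are assumptions for this example and may differ from what was actually implemented.

```python
import re
from collections import Counter


def duplicate_select_variables(query: str) -> list[str]:
    """Return variables that appear more than once between SELECT and WHERE."""
    match = re.search(
        r"SELECT\s+(.*?)\s+WHERE", query, flags=re.DOTALL | re.IGNORECASE
    )
    variables = re.findall(r"\?(\w+)", match.group(1)) if match else []
    counts = Counter(variables)
    return [name for name, count in counts.items() if count > 1]


# Hypothetical query that repeats ?plural, as in the "never ?plural ?plural" case.
QUERY_WITH_DUPLICATE = "SELECT ?lexeme ?plural ?plural WHERE { }"

print(duplicate_select_variables(QUERY_WITH_DUPLICATE))  # prints ['plural']
```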

@DeleMike
Contributor

DeleMike commented Oct 23, 2024

[OFF TOPIC]
Hey everyone, I'm currently not strong enough. I fell ill.

But hoping to come back soon and resolve this issue! They said I need rest.

EDIT: I had to drop this update so that it doesn't seem I am silent.

@andrewtavis
Member Author

No stress, @DeleMike! Please take care and feel better soon!

@OmarAI2003
Contributor

[OFF TOPIC] Hey everyone, I'm currently not strong enough. I fell ill.

But hoping to come back soon and resolve this issue! They said I need rest.

EDIT: I had to drop this update so that it doesn't seem I am silent.

Thank you for the update.

I hope you feel better soon! If there’s anything I can do to assist with this issue while you’re away, please let me know, and I’m happy to coordinate with @andrewtavis and @axif0 if needed.

Take care!

@KesharwaniArpita
Contributor

Get well soon @DeleMike :)

Hi @andrewtavis, @DeleMike , @axif0 and @OmarAI2003 I also want to help here. Should we all do one task each?

@axif0 axif0 mentioned this issue Oct 23, 2024
4 tasks
@andrewtavis
Member Author

Ok @axif0 finished the first two :) @OmarAI2003, do you want to do the edit to this process that checks to make sure that the order of the returned forms is the same as how they appear below in the query? And @KesharwaniArpita, can you check the functions and make sure that they have the needed checks included?
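
To make the ordering check concrete, here is a small sketch that compares the order of forms in SELECT with the order in which they are bound further down in the query. The helper names and regexes are assumptions for illustration only.

```python
import re


def select_variables(query: str) -> list[str]:
    """Return the variable names listed between SELECT and WHERE, in order."""
    match = re.search(
        r"SELECT\s+(.*?)\s+WHERE", query, flags=re.DOTALL | re.IGNORECASE
    )
    return re.findall(r"\?(\w+)", match.group(1)) if match else []


def optional_form_variables(query: str) -> list[str]:
    """Return the variables bound via ontolex:representation, in order."""
    return re.findall(r"ontolex:representation\s+\?(\w+)", query)


def select_order_matches(query: str) -> bool:
    """True if forms appear in SELECT in the same order as they are defined.

    Non-form variables in SELECT (for example ?lexeme) are ignored. A form
    that is defined but never selected also makes this return False.
    """
    defined = optional_form_variables(query)
    defined_set = set(defined)
    selected = [name for name in select_variables(query) if name in defined_set]
    return selected == defined


# Hypothetical one-line queries used only for illustration.
IN_ORDER = (
    "SELECT ?lexeme ?singular ?plural WHERE { "
    "OPTIONAL { ?singularForm ontolex:representation ?singular . } "
    "OPTIONAL { ?pluralForm ontolex:representation ?plural . } }"
)
OUT_OF_ORDER = IN_ORDER.replace("?singular ?plural WHERE", "?plural ?singular WHERE")

print(select_order_matches(IN_ORDER), select_order_matches(OUT_OF_ORDER))
```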

@KesharwaniArpita
Contributor

Sure @andrewtavis

@OmarAI2003
Contributor

Sure, thanks, Andrew.

@KesharwaniArpita
Contributor

And @KesharwaniArpita, can you check the functions and make sure that they have the needed checks included?

@andrewtavis I am a little confused about which functions we are talking about. Are they the ones in check_query_forms.py, or the files for the Scribe-Data functions such as list, total, etc.?

@andrewtavis
Member Author

Would be the functions in check_query_forms.py, @KesharwaniArpita :)

@DeleMike
Contributor

DeleMike commented Oct 26, 2024

Thank you so much everyone! ✨
This means a lot!

Feel a bit better :)

@andrewtavis
Member Author

Great to hear, @DeleMike!

@andrewtavis
Member Author

So last thing here is to check that the actual QIDs that make up the form after wikibase:grammaticalFeature are in order, right? :) Thanks all so much for the amazing work here!

@OmarAI2003
Contributor

So last thing here is to check that the actual QIDs that make up the form after wikibase:grammaticalFeature are in order, right? :) Thanks all so much for the amazing work here!

Yeah, and the ordering of the labels in SELECT should be based on the lexeme forms JSON.

@KesharwaniArpita
Contributor

Yess!!

@andrewtavis
Member Author

@OmarAI2003, do we still need to check that the forms are ordered correctly in the OPTIONAL selections, or was that included in #503?

@OmarAI2003
Contributor

No @andrewtavis, we still need to check for that, but it will be easy since the functions extract_form_qids and extract_form_rep_label you defined can be utilized effectively for this, and I can work on it too.

@andrewtavis
Member Author

Awesome, @OmarAI2003 :) Looking forward and thanks for the continued efforts!

@OmarAI2003
Contributor

@OmarAI2003, do we still need to check that the forms are ordered correctly in the OPTIONAL selections, or was that included in #503?

Just to make sure before I start working on the final part to close this issue: my understanding is that we need to ensure the order of the QIDs matches the order of the labels in the OPTIONAL statements. For example:

OPTIONAL {
    ?lexeme ontolex:lexicalForm ?masculineIndicativePastForm .
    ?masculineIndicativePastForm ontolex:representation ?masculineIndicativePast ;
        wikibase:grammaticalFeature wd:Q682111, wd:Q1994301, wd:Q499327 .
}

we need to check that the first QID (Q682111) corresponds to the first label, and so on. Is this what you meant in this comment?

@andrewtavis
Member Author

Exactly, @OmarAI2003 :) Thanks for checking! Really looking forward to finalizing this :)

@andrewtavis
Member Author

Thanks all for the efforts here! We're all closed up now, and the work here will make the growth of the query base so much more sustainable 😊 Appreciate the dedication and collaboration from all involved :)

5 participants