Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add schema for embed API requests #117

Merged
merged 15 commits into from
Jan 17, 2024

Conversation

bjester
Copy link
Member

@bjester bjester commented Nov 2, 2023

Summary

  • Defines a JSON schema for the structure of recommendations API requests that will be received by TorchServe
  • Fields level, parent, and has_content will be generated in preprocessing in the TorchServe handler
  • TODO: unit tests to ensure schema does what we expect

Reference

Closes learningequality/studio#4392

@bjester bjester requested a review from jamalex November 2, 2023 18:09
@bjester
Copy link
Member Author

bjester commented Nov 2, 2023

@jamalex :: @akolson and I worked on a JSON schema to nail down the format of the data sent to TorchServe for topic recommendations. We followed your example in the example pipeline code. Does it look sufficient?

Other questions:

  • Do we need has_content to represent whether a topic has non-topic resources? Or is that only representative of whether it has subtopics?
  • What is category and do we need it?

Copy link
Member

@jamalex jamalex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some clarifying comments in here -- the main thing is that the current schema seems designed to send the entire tree at once, whereas we only need the target topic(s) and their ancestors.

Fields level, parent, and has_content will be generated in preprocessing in the TorchServe handler

Level and has_content aren't important for embedding. The parent/ancestors are needed, though.

spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
@akolson akolson requested a review from jamalex January 10, 2024 21:34
Copy link
Member

@jamalex jamalex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great changes -- getting closer! I had a few high level points/questions (such as this being the schema for calling an embedding vs recommender endpoint, and needing to support embedding of content resources in addition to topics). And then some smaller notes. Happy to hop on a call to chat through any of my notes that may be confusing! Thanks.

spec/recommendations-v1.json Outdated Show resolved Hide resolved
@@ -0,0 +1,77 @@
{
"$id": "/schemas/recommendations_request",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering about naming: is this endpoint for recommendations, or embeddings? (I know there was discussion of having endpoints for making actual recommendations, as well -- but then we'd probably want to add a few additional parameters in addition to what's here, e.g. around desired number of recommended items to return, etc. So it may be best to have this named as an embedding endpoint schema?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand you correctly, I think "$id": "/schemas/recommendations_request", here is a URI for the schema to refer to elements of the schema from inside the same document or from external JSON documents. However, your comment on endpoint naming still stands and will be put into consideration for the endpoints we intend to implement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thus said, this schema is for embedding purposes only. Serve responds with embeddings that we hope to store and make comparisons against later in studio. I will change the URI to embed_request for better clarity.

spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
spec/recommendations-v1.json Outdated Show resolved Hide resolved
@akolson
Copy link
Member

akolson commented Jan 12, 2024

Hi @jamalex I was able to incorporate the feedback. It was pretty clear so need for the call. We should now be more closer to the final request body 🤞. Please let me know incase there is anything that may require an update. @bjester, a few files have been renamed as noted by Jamie in his comment. Also, any feedback is welcome. Thanks

@akolson akolson requested a review from jamalex January 12, 2024 15:38
@bjester bjester marked this pull request as ready for review January 12, 2024 15:53
@akolson akolson changed the title Add schema for recommendations API requests Add schema for embed API requests Jan 15, 2024
js/EmbedRequest.js Outdated Show resolved Hide resolved
Copy link
Member

@jamalex jamalex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me -- thanks! I left one small comment that may be good to address if you have a chance, but not a blocker.

@akolson akolson merged commit 006c130 into learningequality:main Jan 17, 2024
10 checks passed
@akolson akolson mentioned this pull request Jan 25, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement schema for embed API requests
3 participants