Add additional logging around search_controller's ES query building #689
Comments
Fixed by WordPress/openverse-api#777 |
Re-opening for consideration, since some of the logging in WordPress/openverse-api#777 had to be reverted while triaging the API stability. Does it make sense to try to reintroduce some of the logging before revisiting the search controller refactor? |
Agreed, Zack, that we should add the logging back to the search controller. |
@WordPress/openverse-api Does this issue still make sense to implement as it reads today? I'm wondering if we still find value in this particular approach or if we'd want to do something more like labelling specific queries and making it possible to read specific query type times and such. Given we've already identified the most time-intensive parts of the search controller (dead link filtering and very deep pagination), should we add logging around that kind of thing specifically rather than more general logs? |
I think we can lower this issue's priority and look into Kibana, which should have everything built in to see what is happening inside Elasticsearch. The additional logging around dead link filtering and pagination sounds good to have in the short term as well. |
Can you elaborate on what you mean by this? I'm not familiar with using Kibana in this way. Is there documentation I could read about it? Is Kibana equipped to do meta-analytics of our ES cluster? |
@krysal @zackkrida @obulat Do y'all think this issue is still necessary? The original conceit of it is no longer valid: we pretty accurately understand why big queries get sent to Elasticsearch: dead links. The methods mentioned in the issue description have all been well documented and unit tested at this point as well, so I think we understand them. If there are other motivations for adding process logging to the query building then we should update the issue description to accurately reflect the new motivations. Otherwise, we could close the issue. At the very least it doesn't seem like it should be high priority anymore 🤔 |
I looked at this a few days ago and was thinking the same, forgot to comment due to other priorities. The logs are there, and I don't think we need more here at the moment. Thanks for confirming Sara! |
Problem
Currently we only log the query itself, but not the process that led to the query being constructed. We are seeing queries get sent with sizes of up to 1800 but have no way of telling why 1800 was chosen for the size.
Description
Add logging to the _get_query_slice method. Log at each branch and be sure to log the variables that lead to each branch as well. The idea is to log as much as is necessary to understand how the method is working in production to produce the from and size results that are being sent to ES.
In addition, log some basic search facts like the term and the serializer data.
Be sure to introduce a "trace" variable so that we can easily follow a particular search request. This can just be a uuidv4 generated at the top of the search method and passed around and added to the logs.
Implementation
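To make the request above concrete, here is a minimal sketch of branch-level logging with a trace ID. It is not the actual search controller code: the function signatures, the logger name, and the DEAD_LINK_RATIO over-fetch rule are all assumptions invented for illustration. Only the idea of a uuidv4 generated at the top of search and passed into the slice calculation comes from the description.

```python
import logging
import uuid

# Hypothetical logger name; the real module may use a different one.
logger = logging.getLogger("search_controller")

# Hypothetical over-fetch ratio, used here only to create a branch worth logging.
DEAD_LINK_RATIO = 0.5


def get_query_slice(page_size, page, filter_dead, trace):
    """Return (start, end) for the ES query, logging the inputs and the branch taken."""
    logger.info(
        "trace=%s get_query_slice page_size=%s page=%s filter_dead=%s",
        trace, page_size, page, filter_dead,
    )
    start = (page - 1) * page_size
    if filter_dead:
        # Assumed rule: fetch extra results to make up for dead links removed later.
        end = start + int(page_size / DEAD_LINK_RATIO)
        logger.info("trace=%s branch=filter_dead ratio=%s", trace, DEAD_LINK_RATIO)
    else:
        end = start + page_size
        logger.info("trace=%s branch=no_filter", trace)
    logger.info("trace=%s result from=%s size=%s", trace, start, end - start)
    return start, end


def search(term, page_size=20, page=1, filter_dead=True):
    """Generate a trace ID once per request and pass it to everything that logs."""
    trace = str(uuid.uuid4())
    logger.info("trace=%s search term=%r", trace, term)
    start, end = get_query_slice(page_size, page, filter_dead, trace)
    # Stand-in for the real ES query; only the slice parameters are returned.
    return {"trace": trace, "from": start, "size": end - start}
```

With every line carrying the same trace value, filtering the logs on one trace shows which inputs and which branch produced a given from and size pair.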