
Add additional logging around search_controller's ES query building #689

Closed · 1 task
sarayourfriend opened this issue Jun 28, 2022 · 8 comments
Labels
💻 aspect: code (Concerns the software code in the repository)
🧰 goal: internal improvement (Improvement that benefits maintainers, not users)
🟧 priority: high (Stalls work on the project or its dependents)
🧱 stack: api (Related to the Django API)

Comments

@sarayourfriend
Collaborator

Problem

Currently we only log the query itself, not the process that led to it being constructed. We are seeing queries sent with sizes of up to 1800 but have no way of telling why 1800 was chosen as the size.

Description

Add logging to the _get_query_slice method. Log at each branch, and be sure to log the variables that lead to each branch as well. The idea is to log as much as is necessary to understand how the method works in production to produce the from and size values that are sent to ES.

In addition, log some basic search facts like the term and the serializer data.

Be sure to introduce a "trace" variable so that we can easily follow a particular search request. This can just be a uuidv4 generated at the top of the search method, passed through the helpers, and included in each log line.
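A minimal sketch of what this could look like, assuming a simplified _get_query_slice signature, a plain dict of validated serializer data, and a made-up over-fetch factor (the real search controller code will differ):

```python
import logging
import uuid

logger = logging.getLogger(__name__)


def _get_query_slice(page_size, page, filter_dead, trace_id):
    """Return the (start, end) slice to request from Elasticsearch."""
    start = page_size * (page - 1)
    if filter_dead:
        # Over-fetch so dead links can be filtered out downstream;
        # the factor of 3 is a placeholder, not the real heuristic.
        end = start + page_size * 3
        branch = "filter_dead"
    else:
        end = start + page_size
        branch = "no_filter"
    logger.info(
        "get_query_slice branch=%s trace=%s page=%s page_size=%s start=%s end=%s",
        branch, trace_id, page, page_size, start, end,
    )
    return start, end


def search(search_params):
    # One trace id per request so every related log line can be correlated.
    trace_id = str(uuid.uuid4())
    logger.info(
        "search trace=%s term=%s serializer_data=%s",
        trace_id, search_params.get("q"), search_params,
    )
    return _get_query_slice(
        page_size=search_params.get("page_size", 20),
        page=search_params.get("page", 1),
        filter_dead=search_params.get("filter_dead", True),
        trace_id=trace_id,
    )
```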

Implementation

  • 🙋 I would be interested in implementing this feature.
@sarayourfriend sarayourfriend added the 🟧 priority: high, 💻 aspect: code, and 🧰 goal: internal improvement labels Jun 28, 2022
@sarayourfriend sarayourfriend self-assigned this Jun 29, 2022
@sarayourfriend
Collaborator Author

Fixed by WordPress/openverse-api#777

@zackkrida
Member

Re-opening for consideration, since some of the logging in WordPress/openverse-api#777 had to be reverted while triaging API stability. Does it make sense to try to reintroduce some of that logging before revisiting the search controller refactor?

@sarayourfriend sarayourfriend removed their assignment Aug 19, 2022
@sarayourfriend
Collaborator Author

Agreed, Zack, that we should add the logging back to the search controller.

@sarayourfriend
Collaborator Author

@WordPress/openverse-api Does this issue still make sense to implement as it reads today? I'm wondering if we still find value in this particular approach or if we'd want to do something more like labelling specific queries and making it possible to measure timings for specific query types. Given we've already identified the most time-intensive parts of the search controller (dead link filtering and very deep pagination), should we add logging around those specifically rather than more general logs? A rough sketch of what that might look like follows.
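Purely illustrative, with hypothetical names not taken from the actual codebase:

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def timed(phase, trace_id):
    """Log how long a labelled phase of query handling takes."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed_ms = (time.monotonic() - start) * 1000
        logger.info(
            "query_timing phase=%s trace=%s elapsed_ms=%.1f",
            phase, trace_id, elapsed_ms,
        )


# Hypothetical usage around the known hot spots:
#
#     with timed("dead_link_filtering", trace_id):
#         results = filter_dead_links(results)
#
#     with timed("deep_pagination_fetch", trace_id):
#         response = search.execute()
```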

@krysal
Member

krysal commented Nov 22, 2022

I think we can lower this issue's priority and look into Kibana, which should have everything built in to see what is happening inside Elasticsearch. The additional logging around dead link filtering and pagination sounds good to have in the short term as well.

@sarayourfriend
Collaborator Author

> which should have everything built in to see what is happening inside Elasticsearch

Can you elaborate on what you mean by this? I'm not familiar with using Kibana in this way. Is there documentation I could read about it? Is Kibana equipped to do meta-analytics of our ES cluster?

@obulat obulat transferred this issue from WordPress/openverse-api Feb 22, 2023
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Openverse Backlog Feb 23, 2023
@krysal krysal self-assigned this Feb 28, 2023
@obulat obulat moved this from 📋 Backlog to 📅 To do in Openverse Backlog Mar 7, 2023
@sarayourfriend
Collaborator Author

@krysal @zackkrida @obulat Do y'all think this issue is still necessary? The original conceit of it is no longer valid: we pretty accurately understand why big queries get sent to Elasticsearch: dead links.

The methods mentioned in the issue description have all been well documented and unit tested at this point as well, so I think we understand them. If there are other motivations for adding process logging to the query building then we should update the issue description to accurately reflect the new motivations. Otherwise, we could close the issue. At the very least it doesn't seem like it should be high priority anymore 🤔

@krysal
Member

krysal commented Mar 16, 2023

I looked at this a few days ago and was thinking the same, but forgot to comment due to other priorities. The logs are there, and I don't think we need more here at the moment. Thanks for confirming, Sara!

@krysal krysal closed this as not planned Mar 16, 2023
@github-project-automation github-project-automation bot moved this from 📅 To do to ✅ Done in Openverse Backlog Mar 16, 2023
@AetherUnbound AetherUnbound added the 🧱 stack: api label and removed the 🧱 stack: backend label May 15, 2023