Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If a query is successfully processed via Buddy log it in the original form #2235

Closed
8 tasks done
donhardman opened this issue May 28, 2024 · 23 comments
Closed
8 tasks done
Assignees
Labels

Comments

@donhardman
Copy link
Contributor

donhardman commented May 28, 2024

Proposal:

We are currently logging all queries, both those with errors and those without. However, when a query is proxied to the Buddy and the Buddy processes it successfully, we do not log those queries at all. We only log when an error is returned.

We should log all successfully processed queries that were originally erroneous but were proxied to the Buddy. If the Buddy returns an OK result for a query that initially had an error, we should log the ORIGINAL query as successful since it was processed without error with the help of the Buddy.

Checklist:

To be completed by the assignee. Check off tasks that have been completed or are not applicable.

  • Task estimated
  • Specification created, reviewed and approved
  • Implementation completed
  • Tests developed
  • Documentation updated
  • Documentation proofread
  • Changelog updated
  • OpenAPI YAML updated and issue created to rebuild clients
@sanikolaev sanikolaev changed the title Improved Buddy logging If a query is successfully processed via Buddy log it in the original form May 30, 2024
@tomatolog
Copy link
Contributor

I need concrete example what queries logs wrong as search, update, replace, system handles errors differently. SphinxQL and HTTP JSON handler also has different error processing.

For example for search query it is hard to do for SphinxQL handler as:

  • query got processed by handler that logs error then up the stack daemon takes decision send reply to client or send request to buddy. And error already got put into query log for that query.
  • if query got fixed by buddy and good result returned there is no internal info about fixed query such as timings \ io and cpu counters \ total counters that is why it is not clear what to log in this case.

@tomatolog tomatolog assigned donhardman and unassigned tomatolog Jun 19, 2024
@donhardman
Copy link
Contributor Author

For queries that will be passed to the Buddy, we can use the following, which is valid and should be handled by Buddy:

SELECT * FROM table WHERE match('test') OPTION fuzzy=1, layouts=''

We need to track timings and all necessary information for logging to maintain a consistent format on the daemon side.

@donhardman donhardman assigned tomatolog and unassigned donhardman Jun 21, 2024
@tomatolog
Copy link
Contributor

daemon does not have any timings \ agent timings \ io and cpu counters \ total counters when got reply from buddy. If these should be put into log to query looks like original buddy should collect these metrics and provide them into daemon.

Another case is that daemon passes all the failed requests into buddy but query_log has only search requests if daemon will log all source strings after buddy handles these well query_log will have inserts \ updates \ bulk \ call keywords and so on. Buddy should reply into daemon with the handled request type to filter out requests and log only search requests.

The only daemon has is the source query string (SphinxQL or JSON) and it can log it as is, ie if the query_log_format option is plain daemon still logs the source SphinxQL or JSON string that will look unusual.

@tomatolog tomatolog assigned donhardman and unassigned tomatolog Jun 26, 2024
@sanikolaev sanikolaev assigned tomatolog and unassigned donhardman Jun 26, 2024
@tomatolog
Copy link
Contributor

after the discussion with the @sanikolaev it worth to change response from Buddy to Manticore Search to include the new object query_log that will be a flag for daemon to store the current request string as is into query log and also include the following metrics into that object:

  • total_found
  • time

These could be get by buddy by the SHOW META query for every SELECT it need to log by the daemon. If the SHOW META output has cpu_time agents_cpu_time these also could be provided and logged by the daemon to make sure the query log entry is similar to Query-logging

{
  "type": "sql response",
  "message": [
    {
      "total": 6,
      "error": "",
      "warning": ""
    }
  ],
  "error_code": 0,
  "version": 2,
"query_log" : {
"total_found": 1000,
"time": 0.01,
"cpu_time":18.004,
"agents_cpu_time":0.00
}
}

Without the new flag \ object daemon has no information is it the search query and should be log into query.log or the bulk \ insert \ ALTER or some system query and should be skipped

@tomatolog tomatolog assigned donhardman and unassigned tomatolog Jun 26, 2024
@donhardman
Copy link
Contributor Author

As discussed before, instead of using SHOW META all the time, it seems to be a better idea to add meta information to EACH response, in case it's requested with Buddy at least. This may contain a "meta" field with all information. In that case, when Buddy performs multiple queries to solve a request, we do not need to add "SHOW META" manually all the time. Instead, we can modify the core to aggregate the meta responses, accumulate them, and return them to the daemon.

@donhardman
Copy link
Contributor Author

We should process and try to log it with daemon.

We can validate it by using the following MRE:

mysql> create table test(name text) min_infix_len='1';
mysql> select * from test where match('hello') option fuzzy = 1;

And we will get response from buddy as follows:

[BUDDY] [671a23e68c8717.40395458] response data: {"version":3,"type":"sql response","message":[{"total":0,"error":"","warning":"","columns":[{"id":{"type":"long long"}},{"name":{"type":"string"}}]}],"meta":{"total":"0","total_found":"0","total_relation":"eq","time":"0.008"},"error_code":200}

Let's proceed with logging such queries. In case we have no meta info it will be null.

@tomatolog
Copy link
Contributor

merged at 210b06c and now if SphinxQL request to buddy passed well it logs into query log

@tomatolog tomatolog reopened this Nov 5, 2024
@tomatolog
Copy link
Contributor

not sure if @PavelShilin89 should add test for this case or not

@tomatolog tomatolog assigned donhardman and unassigned tomatolog Nov 5, 2024
@sanikolaev
Copy link
Collaborator

not sure if @PavelShilin89 should add test for this case or not

Absolutely, let's add a CLT test which tests that various queries processed through Buddy are logged properly.

@sanikolaev
Copy link
Collaborator

@donhardman as we discussed, the remaining issue is that the daemon currently logs all queries processed through Buddy. Instead, we only want to log SELECT queries. Let's handle this by introducing a new log flag.

@donhardman
Copy link
Contributor Author

@tomatolog
Copy link
Contributor

wrong commit reference

@tomatolog tomatolog reopened this Nov 12, 2024
@sanikolaev
Copy link
Collaborator

@donhardman
Copy link
Contributor Author

Let's check the meta property, and when we find a total value in it, we can confirm this is a select operation and log it.

@donhardman donhardman assigned tomatolog and unassigned donhardman Nov 14, 2024
donhardman pushed a commit that referenced this issue Nov 14, 2024
donhardman pushed a commit that referenced this issue Nov 14, 2024
…lines at the query log for error report and buddy report; fixed #2235
donhardman pushed a commit that referenced this issue Nov 14, 2024
donhardman pushed a commit that referenced this issue Nov 14, 2024
donhardman pushed a commit that referenced this issue Nov 14, 2024
…lines at the query log for error report and buddy report; fixed #2235
donhardman pushed a commit that referenced this issue Nov 14, 2024
@tomatolog
Copy link
Contributor

should be fixed at 9222c03 now daemon does not log into query.log if the buddy reply has no meta.total

@sanikolaev
Copy link
Collaborator

should be fixed at 9222c03 now daemon does not log into query.log if the buddy reply has no meta.total

Looks good to me.

Reopening since there are non-checked items in the checklist.

Assigning to @PavelShilin89 to prepare tests. These should include SQL/JSON select queries that are processed via Buddy and sent via MySQL/HTTP. The queries should be logged according to the documentation here: https://manual.manticoresearch.com/Logging/Query_logging#Query-logging

@sanikolaev sanikolaev reopened this Nov 20, 2024
@sanikolaev sanikolaev added the rel::upcoming Upcoming release label Nov 20, 2024
PavelShilin89 pushed a commit that referenced this issue Nov 24, 2024
* Bump buddy version to: 3.0.1 24111207 5d89277

* Bump buddy version to: 3.0.1 24111217 6707485

* Bump buddy version to: 3.0.1 24111305 f599a87

* Bump buddy version to: 3.0.1 24111407 3861c6b

* fixed buddy reply routing at daemon for /cli endpoint (#2734)

* fixed buddy reply routing at daemon for /cli endpoint; fixed #2269

* implemented jieba_user_dict_path (#2735)

implemented jieba_user_dict_path

* added log of the SphinxQL query that successfully processed by buddy into the query log; fixed #2235

* manual: Update Adding_documents_to_a_real-time_table.md

* fixed query wrong length at error report at the query log; fixed new lines at the query log for error report and buddy report; fixed #2235

* Update regex for the new version of fluentbit (#2738)

Co-authored-by: Pavel_Shilin <[email protected]>

* manual: Update Query_logging.md

Added that queries processed through Buddy are not logged in plain mode.

* manual: minor changes

* fix crash when add 1-st stored column

That is very edge case when you alter a table adding stored column when
there were no stored columns before.

Test provided.

That is related to #2708

* fixed JSON SI-related filter transforms vs small index

* added collation support to Expr_JsonFieldIn_c

* added log of the HTTP request that successfully processed with buddy into the query log; fixed #2235

* manual: WSL installation

* fixed multiple stopwords from the lemmatize_xxx_all tokenizers increase hitpos of the following tokens; fixed #2704; added case to test 219

* fixed nsis builder script to use the same MANTICORE_DATE as zip package has

* Test/drop sharded table (#2651)

* Creating files for the test

* Update test-drop-sharding-table.rec

* Moved test-drop-sharding-table

* Deleted buddy_path = manticore-executor /workdir/src/main.php

* Added new sharded tables

* Added new outputs

* Fixed test-drop-sharding-table.rec

* Added new cases for table deletion

---------

Co-authored-by: Pavel_Shilin <[email protected]>

* fixed mysql dll for indexer

* set session meta for the buddy replies; fixed #2235

* added uuid_short select list expression function; fixed #2752; added cases to test 125

* manual: uuid_short()

* Test/create table with data if not exists (#2760)

* Update test-create-table-like-existing-table-with-data.rec

* Added cases with different syntax of IF NOT EXISTS support

---------

Co-authored-by: Pavel_Shilin <[email protected]>

* New fuzzy search and autocomplete cases have been added

---------

Co-authored-by: githubmanticore <[email protected]>
Co-authored-by: Stas <[email protected]>
Co-authored-by: Ilya Kuznetsov <[email protected]>
Co-authored-by: Sergey Nikolaev <[email protected]>
Co-authored-by: Pavel_Shilin <[email protected]>
Co-authored-by: Aleksey N. Vinogradov <[email protected]>
@PavelShilin89
Copy link
Contributor

Tests have already been developed previously covering these cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants