Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLiteException: [SQLITE_TOOBIG] When searching for ♾ symbol #1428

Closed
5 tasks done
DieselTech opened this issue Feb 13, 2024 · 13 comments
Closed
5 tasks done

SQLiteException: [SQLITE_TOOBIG] When searching for ♾ symbol #1428

DieselTech opened this issue Feb 13, 2024 · 13 comments
Labels
bug Something isn't working released

Comments

@DieselTech
Copy link

Steps to reproduce

Search for ♾symbol in komga search bar

Expected behavior

Issues that have the emoji in them should show up

Actual behavior

A 3MB error log is produced. I've linked to it in the discord channel.

https://discord.com/channels/678794935368941569/678795177393127424/1206719317782306897

Logs

https://discord.com/channels/678794935368941569/678795177393127424/1206719445632819250

Komga version

1.10.3

Operating system

Linux

Installation method

Docker

Other details

Looks like this might be similar to #940

Acknowledgements

  • I have searched the existing issues (open AND closed) and this is a new ticket, NOT a duplicate or related to another open issue.
  • I have written a short but informative title.
  • I have checked the FAQ.
  • I have updated the app to the latest version.
  • I will fill out all of the requested information in this form.
@gotson
Copy link
Owner

gotson commented Feb 14, 2024

How many books have this ♾ symbol in their name ?

@gotson
Copy link
Owner

gotson commented Feb 14, 2024

Looks like the lucene query returns 97602 results for that symbol, is that accurate ?

@DieselTech
Copy link
Author

How many books have this ♾ symbol in their name ?

That was the question I was trying to answer by searching. Off the top of my head I think like 3 have it.

@gotson
Copy link
Owner

gotson commented Feb 14, 2024

How many books have this ♾ symbol in their name ?

That was the question I was trying to answer by searching. Off the top of my head I think like 3 have it.

would you be able to check in the DB directly maybe ?

Anyhow there's an optimization to be done in the code to limit the intermediate lucene search result to the same limit as what's being requested in the API.

@DieselTech
Copy link
Author

Do you have any hints as to what table to look at for this? With there being so few entries that might have this, I'm not sure where to start looking.

@gotson
Copy link
Owner

gotson commented Feb 15, 2024

select *
from BOOK_METADATA
where TITLE like '%♾%';

should do the trick

@DieselTech
Copy link
Author

select *
from BOOK_METADATA
where TITLE like '%♾%';

should do the trick

Returned 0 results. I know there is a at least 1 that has it. However I was able to reproduce the same behavior with ½.

Database returns 1 result, while komga throws the SQLITE_TOOBIG error and seems to return every single series that exists.

image

@gotson
Copy link
Owner

gotson commented Feb 15, 2024

However I was able to reproduce the same behavior with ½.

This is different. The Lucene parser does not register ½ as a searchable character, and the resulting query is equivalent to "everything", while with it does search for that character specifically.

@DieselTech
Copy link
Author

I'll need to go through my library another way then and find the issue that has the ♾ in it. Normally komga is how I search for stuff like this. Once I find it I'll bring it up here.

@gotson
Copy link
Owner

gotson commented Feb 15, 2024

For context, here is how the API works:

  1. if a search term is provided, the lucene index is queried with that search term. The number of results is not limited, so it may return many results.
  2. the sqlite database is queried with all the other conditions (none in the case of the search bar search), and also limited to the IDs of entities returned in step 1
  3. in order to keep the sort order from the lucene search, and since sqlite cannot do that easily, the sql query has to perform a massive if/else statement with an int ordering built from the lucene search result. This is what generates a query too big for sqlite.

What i can't understand is why the lucene query returns so many results in your case when using , it should not. For ½ it's expected though.

The way it's done is not the best admittedly, but is the only way to ensure there's enough results when searching with both a search term and some conditions.

Fixing this would require splitting the search API in 2, with one endpoint exclusively for full text search via lucene, and one endpoint allowing for multiple conditions, but no lucene search, only a classic SQL 'like' search.

@gotson gotson added bug Something isn't working and removed triage labels Feb 15, 2024
@gotson gotson changed the title SQLiteException: [SQLITE_TOOBIG] When searching for ♾symbol SQLiteException: [SQLITE_TOOBIG] When searching for ♾ symbol Feb 15, 2024
@gotson
Copy link
Owner

gotson commented Feb 22, 2024

i was thinking of capping the search results to 1000 at max. That would prevent the SQLITE_TOOBIG error, and i don't see much side effects.

If your search returns more than 1000 results, it's probably not a good criteria for searching, right?

@DieselTech
Copy link
Author

If your search returns more than 1000 results, it's probably not a good criteria for searching, right?

That sounds good to me. Having a cap also makes sense so you don't try to return the entire catalog.

I wrote a few scripts to look for the ♾symbol but couldn't find it. It might not of been in the filename, but something that was included in the metadata for an issue.

@gotson gotson closed this as completed in d34d4a5 Mar 8, 2024
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 8, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 9, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 9, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 10, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 11, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 11, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
pull bot added a commit to kaiserbh/komga that referenced this issue Mar 11, 2024
* deps(webui): bump sanitize-html from 2.11.0 to 2.12.1 in /komga-webui

Bumps [sanitize-html](https://github.com/apostrophecms/sanitize-html) from 2.11.0 to 2.12.1.
- [Changelog](https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md)
- [Commits](apostrophecms/sanitize-html@2.11.0...2.12.1)

---
updated-dependencies:
- dependency-name: sanitize-html
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>

* fix(api): cap search results to avoid SQLITE_TOOBIG

Closes: gotson#1428

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Gauthier Roebroeck <[email protected]>
Copy link
Contributor

🎉 This issue has been resolved in 1.10.4 (Release Notes)

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working released
Projects
None yet
Development

No branches or pull requests

2 participants