Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPML export fixes #1439

Merged
merged 4 commits into from
Jun 15, 2023
Merged

OPML export fixes #1439

merged 4 commits into from
Jun 15, 2023

Conversation

jtojnar
Copy link
Member

@jtojnar jtojnar commented Jun 15, 2023

No description provided.

We can just use variables.

Also remove unused `$key` variables from foreach.
@netlify
Copy link

netlify bot commented Jun 15, 2023

Deploy Preview for selfoss canceled.

Name Link
🔨 Latest commit f2fcaa8
🔍 Latest deploy log https://app.netlify.com/sites/selfoss/deploys/648a6518465c0e00082eda7d

@jtojnar jtojnar added this to the 2.20 milestone Jun 15, 2023
@jtojnar jtojnar added the bug label Jun 15, 2023
PHP will turn `string` array keys to `int` when they look like an `int`.
This can result in unexpected runtime errors with `strict_types`.
For example, this happened when tag name was passed to `writeAttributeNS()` during OPML export.

And in the unlikely case  all tags looks like numbers, `json_encode` would serialize the tag object as an array.

Unfortunately, not even PHPStan can save us currently: phpstan/phpstan#6847

Let’s avoid this madness by wrapping array with a custom class that behaves reasonably.
Case-insensitive matching in `hasTag` would prevent tag from getting a colour
when a tag with a different casing already existed.
That would break OPML export, which assumes all tags have colours.

Turns out MySQL used case-insensitive matching in `hasTag`.
This had already been fixed once in 00c7442
but the issue got re-introduced in 5af5737.

Let’s make MySQL daos use binary collation to make it ignore case in line with other database backends.
This will not do Unicode normalization of tag names but neither did `utf8mb4_general_ci`.
(`utf8mb4_general_ci` does not actually implement Unicode Collation Algorithm properly because it does not do normalization:
<https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-sets.html#charset-unicode-sets-uca>)

PostgreSQL is already case-sensitive in both `hasTag` and `csvRowMatches`.

SQLite is case-sensitive in `hasTag` but case-**insensitive** in `csvRowMatches`.
But that is not as problematic, since differently-cased tags will always be shown in the user interface.
And in the rare case user has multiple such tags, viewing one of them will show entries from all of them – again, no inaccessible data.
Making `csvRowMatches` case-sensitive on SQLite would require executing a `PRAGMA` for every connection,
using `GLOB` (which would require escaping using multiple `REPLACE` calls),
or specifying custom `REGEX` function (with associated cost of calling PHP function for every row).
https://www.sqlite.org/lang_expr.html#the_like_glob_regexp_match_and_extract_operators
https://www.sqlite.org/pragma.html#pragma_case_sensitive_like
That is probably not worth it, especially if we will likely convert `tags` column to an association table.

Additionally, let’s disable escape character in `LIKE` expression to prevent tag names containing a backslash from being matched.
(similar to `NO_BACKSLASH_ESCAPES` mode)
https://dev.mysql.com/doc/refman/5.7/en/string-comparison-functions.html#operator_like
https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_no_backslash_escapes
It probably fine to allow unescaped percent signs and underscores since those will match themselves.
`LIKE` operator in SQLite does not do escaping by default and we do not use `LIKE` with PostgreSQL.
When user specified a source tag with a same name but a different letter casing as another existing tag,
the MySQL backend would think the tag already exists and not create a color for it.
This would break, for example, OPML export, which expects every tag to have a color associated with it.

The bug has been fixed in the previous commit but there can still be such erroneously created tags from before.
Let’s go through all tags mentioned in the `source` table and ensure the `tags` table is up to date.
If a tag with a different casing and a color exists, let’s consider it the canonical casing and update the source tag to that.

Fixes: #1438
@jtojnar jtojnar merged commit f2fcaa8 into master Jun 15, 2023
@jtojnar jtojnar deleted the opml-export-fixes branch June 15, 2023 01:14
@jtojnar jtojnar mentioned this pull request Jun 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant