Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AddEnglishSpecifier() producing illegal HTML #3293

Merged
merged 9 commits into from
Jan 27, 2023

Conversation

brianebeling
Copy link
Member

@brianebeling brianebeling commented Dec 16, 2022

image

https://www.coronawarn.app/de/blog/2020-12-16-corona-warn-app-version-1-9/
for example

It fixes it by not replacing the string, if its outside an HTML Element. This can and maybe will have side-effects and needs to be reviewed carefuly. Maybe we can integrate it with Cypress?


Internal Tracking ID: EXPOSUREAPP-14644

Copy link
Member

@larswmh larswmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To improve testing, I would recommend to add the previously removed strings from the english-texts.json (you can see the activity on that file here).

To be specific, we've had the following problems in the past:

The fix for this function should be sufficient to re-add all initially defined English strings without no further problematic occurences. From my review, this PR solves both cases mentioned above.

The problem shown in the opening comment has also been fixed. However, one problem remains. Affected English texts within <title> tags are now being replaced, causing illegal HTML. See the screenshots below.

chrome_t3t1SPhfNO
image

@brianebeling
Copy link
Member Author

Ah, good catch. I will try to include it. Its really not that easy to write efficient Regex queries that are correct and catch all exceptions, but its the easiest fix that doesn't sacrifice functionality. My previous expression wouldn't even finish, it took that long.

@brianebeling
Copy link
Member Author

I will also add documentation for the expression as it is getting progressively harder to understand.

@MikeMcC399
Copy link
Contributor

@brianebeling

https://www.coronawarn.app/de/blog/2020-12-16-corona-warn-app-version-1-9/ still shows bad HTML at the top of the page.

Are you going to continue with this PR?

@brianebeling
Copy link
Member Author

Brian-J. Ebeling

coronawarn.app/de/blog/2020-12-16-corona-warn-app-version-1-9 still shows bad HTML at the top of the page.

Are you going to continue with this PR?

Hey Mike, we made some progress and further discussed the technical side of things. Ultimately it would be nice to traverse the DOM or a tree like structure representing the HTML instead of trying to find a Regular Expression that matches HTML and replaces it. I just got back from the vacation but I won't be able to work on this in the very near future.

We need to decide how to continue with this internally, I will get back to you.

@thorbenkuro
Copy link
Contributor

thorbenkuro commented Jan 25, 2023

@brianebeling @MikeMcC399
The invalid HTML issue as seen on https://www.coronawarn.app/de/blog/2020-12-16-corona-warn-app-version-1-9/ should now be fixed with my recent changes.

Initially I fixed the regex to exclude the <title>-tag properly, which resulted in very long build times, because I used .* inside the expression. To fix the long build times, I reworked the logic. This way the <head>-section is not scanned at all by the regex and each file is processed only once. This way I could greatly improve the performance.

Right now GitHub Actions seems to be unavailable, according to the error message of the build pipeline. I will trigger the checks again as soon as possible.

@thorbenkuro thorbenkuro marked this pull request as ready for review January 25, 2023 11:37
@thorbenkuro thorbenkuro requested a review from a team January 25, 2023 11:37
@thorbenkuro thorbenkuro self-assigned this Jan 25, 2023
@thorbenkuro
Copy link
Contributor

@larswmh I added the previously removed strings to english-texts.json as requested.

@brianebeling brianebeling requested a review from larswmh January 26, 2023 09:05
Copy link
Member Author

@brianebeling brianebeling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, no obvious errors found. LGTM! (Cant review my own PR)

Copy link
Member

@larswmh larswmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, thanks for the fix @thorbenkuro and @brianebeling

@dsarkar dsarkar merged commit eeb2c20 into master Jan 27, 2023
@dsarkar dsarkar deleted the fix-add-english-specifier branch January 27, 2023 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants