Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/whitespace and new line chars #1351

Merged
merged 4 commits into from
Jun 4, 2024

Conversation

alexanderleegs
Copy link
Contributor

@alexanderleegs alexanderleegs commented Apr 29, 2024

This PR fixes issues we have with trailing whitespace and unicode line breaks. See this issue, this issue and this issue for more details.

Concretely, this PR creates a new util method to trim whitespace and replace unicode new line characters with spaces, which is used to parse user provided input for all page/folder/settings create/update endpoints. Note that this means existing folders with a trailing space continue to have their new pages affected until the folder is renamed - this is expected and should be resolved manually as they come up.

Tests

  • Copy a unicode character for new line (e.g. from here)
  • Create a new folder with the unicode character in it + trailing and preceding spaces
  • Verify that the created folder does not have the unicode character or trailing spaces
  • Repeat for pages, media, and special pages e.g. homepage

@alexanderleegs alexanderleegs requested a review from a team April 29, 2024 01:50
Copy link
Contributor

@kishore03109 kishore03109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

implementation of recursiveTrimAndReplaceLineBreaks is incomplete, suggest to use something like

const newLineRegex = /\r?\n|\r/
"hi \n there \x0C \x0D ".split(newLineRegex).map((str) => str.trim()).join("") // returns 'hithere'

@@ -2,6 +2,26 @@ import yaml from "yaml"

import { sanitizer } from "@services/utilServices/Sanitizer"

const LINE_BREAK_REGEX = "[\r\n\x0B\x0C\u0085\u2028\u2029]"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than trying to reimplement white space character check, i suggest only caring about new line characters and let String.prototype.trim() handle spaces.

Currently, in the attempt of implementing both, the check for spaces is not as complete as the one implemented for the function call trim.

Do we care about any space here? ie hi there should be hithere?

Copy link
Contributor Author

@alexanderleegs alexanderleegs May 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm the new line regex is really just to catch the new line characters, and not to replace other whitespace - as far as I can tell these are the specific characters which can be used for new lines. The in between spaces should be preserved

return obj.map((item) => recursiveTrimAndReplaceLineBreaks(item))
}
if (typeof obj === "object" && obj !== null) {
const newObj: any = {}
Copy link
Contributor

@kishore03109 kishore03109 May 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to use some sort of recursive types and avoid the use of any? maybe thr typechecking / recursive types?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh i re-requested a review by accident, just ping me on slack once ready

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 1e212c8!

@kishore03109 kishore03109 self-requested a review May 21, 2024 01:02
@seaerchin seaerchin requested a review from a team June 4, 2024 10:18
@seaerchin seaerchin enabled auto-merge (squash) June 4, 2024 10:19
@seaerchin seaerchin merged commit ed77606 into develop Jun 4, 2024
11 of 12 checks passed
@seaerchin seaerchin deleted the fix/whitespace-and-new-line-chars branch June 4, 2024 10:23
@alexanderleegs alexanderleegs mentioned this pull request Jun 13, 2024
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants