Unexpected Behavior: checkbox_choices misses non-english alphabet characters #462

Closed
pwildenhain opened this issue Jan 24, 2023 · 2 comments · Fixed by #463

Comments

@pwildenhain
Contributor

pwildenhain commented Jan 24, 2023

Describe the behavior: Please provide a clear and concise description of the scenario and the behavior. Be careful not to include tokens, PHI (protected health information), or other information that should not be public!

Certain non-english characters (such as ä) are missed by checkbox_choices()

site_selections <- "1, Hospital A | 2, Hospitäl B | 3, Hospital C"
REDCapR::checkbox_choices(site_selections)
#>   id      label
#> 1  1 Hospital A
#> 2  3 Hospital C

Created on 2023-01-24 by the reprex package (v2.0.1)

Expected behavior: A clear and concise description of what you expected to happen.

checkbox_choices() should be able to handle these characters, especially considering that REDCap is an international phenomenon

Suggested Fix:

Add non-english characters to the regex search pattern in checkbox_choices():

pattern_checkboxes <- "(?<=\\A| \\| )(?<id>\\d{1,}), (?<label>[\x21-\x7B\x7D-\x7E ]{1,})(?= \\| |\\Z)"

Specifically this section: \x21-\x7B\x7D-\x7E. Instead of ending the search at character x7E (~), I suggest ending at xAD (¡), though I think a case could easily be made that there is no harm in extending all the way to xFE (■).

See here for complete list of character codes: https://www.codetable.net/
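A minimal sketch of the failure and of one possible widening, using base R only. `extract_choices()` is a hypothetical helper written for illustration, not REDCapR's implementation, and the "anything except the pipe delimiter" class is an assumption that may differ from whatever a PR ends up using. Note that in Unicode ä is U+00E4, so a code-point range would have to reach at least that far; excluding only the delimiter sidesteps choosing an upper bound.

```r
# Hypothetical helper (not part of REDCapR): builds the issue's pattern around
# a caller-supplied character class for the label, then splits each match.
extract_choices <- function(choices, label_class) {
  pattern <- paste0(
    "(?<=\\A| \\| )(?<id>\\d{1,}), (?<label>[", label_class, "]{1,})(?= \\| |\\Z)"
  )
  # Full matches look like "1, Hospital A"
  hits <- regmatches(choices, gregexpr(pattern, choices, perl = TRUE))[[1]]
  data.frame(
    id    = sub(",.*$", "", hits),
    label = sub("^[0-9]+, ", "", hits),
    stringsAsFactors = FALSE
  )
}

# "\u00e4" is ä; the escape keeps this script independent of file encoding.
site_selections <- "1, Hospital A | 2, Hospit\u00e4l B | 3, Hospital C"

ascii_only   <- "\\x21-\\x7B\\x7D-\\x7E "  # the class quoted in this issue
any_but_pipe <- "^|"                       # assumption: exclude only the delimiter

extract_choices(site_selections, ascii_only)$id    # "1" "3" (choice 2 is dropped)
extract_choices(site_selections, any_but_pipe)$id  # "1" "2" "3"
```

The first call reproduces the reprex above: the label for choice 2 stops matching at ä, the lookahead for ` | ` fails, and the whole choice is skipped rather than truncated.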

As you can tell, I've already researched this thoroughly, and I'm happy to submit a PR + test cases since I ❤️ the checkbox_choices() function.

Desktop (please complete the following information):

  • OS: macOS Monterey 12.6.1
  • REDCap version: N/A
  • REDCapR Version: 1.1.0
@wibeasley
Member

wibeasley commented Jan 24, 2023

A PR would be great. Thanks @pwildenhain.

I've recognized for a while that non-ASCII characters are a weakness/limitation of REDCapR. I rarely encounter them in my projects, and I want to accommodate those who do. But many of my attempts to recruit people who use them haven't gone far (e.g., #290, #296, #354).

If anyone would like to be involved, please tell me.

@pwildenhain
Contributor Author

Geez radio silence haha. I bet they have to use REDCap for like one project, build a workaround, and then move on 😂

People like you and I -- we were born into REDCap, molded by it.

I'll submit a PR 💪

@wibeasley wibeasley self-assigned this Jan 25, 2023
@wibeasley wibeasley added the nonascii accommodate non-ascii character label Jan 25, 2023