Broken unicode workaround #18

Open
lukas2511 opened this issue Aug 10, 2023 · 2 comments

@lukas2511

At FrOSCon we had some issues with tickets containing Unicode characters such as German quotation marks and emojis.
The postencoding worker simply exited inside the XMLin function, and the affected tickets were stuck in the postencoding state.

Afaik this was the first FrOSCon using a Frab version with real Unicode support, so we had never hit this exact case before. It's not entirely clear whether that is the cause or whether something else is going on.

While trying to figure out what was happening, I wasn't able to reproduce the issue once I had extracted the XML in any way, so I basically knew that parsing from written files worked fine. As a quick and dirty workaround I simply wrote the incoming XML to a file and passed that file to the XMLin function, which worked perfectly. This is not a good solution, but it worked, and I'm posting the patch here in case anybody else runs into the same problem and needs a quick workaround.

diff --git a/lib/CRS/Executor.pm b/lib/CRS/Executor.pm
index 7a699a8..05c9438 100644
--- a/lib/CRS/Executor.pm
+++ b/lib/CRS/Executor.pm
@@ -127,8 +127,16 @@ sub load_job {
     my $jobfile = shift;
     die 'You need to supply a job!' unless $jobfile;

+    my @cset = ('0' ..'9', 'A' .. 'F');
+    my $tstr = join '' => map $cset[rand @cset], 1 .. 8;
+    my $tmpfile = "/tmp/fnord-" . $tstr . ".xml";
+
+    open(my $fh, '>:utf8', $tmpfile) or die "Cannot write $tmpfile: $!";
+    print $fh $jobfile;
+    close $fh;
+
     my $job = XMLin(
-        $jobfile,
+        $tmpfile,
         ForceArray => [
             'option',
             'task',
@@ -137,6 +145,9 @@ sub load_job {
         ],
         KeyAttr => ['id'],
     );
+
+    unlink($tmpfile);
+
     return $job;
 }
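
For anyone who wants to try the same idea without a temp file: writing a string through a `:utf8` layer encodes it to UTF-8 bytes, and `Encode::encode` should produce those bytes in memory. The sketch below only compares the two byte sequences. The sample XML is made up for illustration, and it is an assumption (not confirmed in this thread) that the incoming XML is a decoded character string and that this is what trips up the parser.

```perl
use strict;
use warnings;
use Encode qw(encode);
use File::Temp ();

# Hypothetical sample job XML as a decoded character string:
# German quotation marks (U+201E, U+201C) and an emoji (U+1F4A9).
my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>}
        . qq{<job><title>\x{201E}Hallo\x{201C} \x{1F4A9}</title></job>};

# In-memory equivalent of printing through a ':utf8' layer.
my $bytes = encode('UTF-8', $xml);

# What the temp-file workaround puts on disk, for comparison.
my $tmp = File::Temp->new(SUFFIX => '.xml');   # deleted when $tmp is destroyed
binmode($tmp, ':utf8');
print {$tmp} $xml;
close($tmp) or die "Cannot close " . $tmp->filename . ": $!";

# Read the file back without any decoding layer.
open(my $in, '<:raw', $tmp->filename)
    or die "Cannot read " . $tmp->filename . ": $!";
my $on_disk = do { local $/; <$in> };
close($in);

print $on_disk eq $bytes ? "identical bytes\n" : "bytes differ\n";
```

If the bytes match, passing `$bytes` to XMLin as an XML string might behave like the temp-file version, but that is untested here.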
@a-tze
Collaborator

a-tze commented Aug 15, 2023

@lukas2511 Do you know which Perl/library versions were used, or the Linux distro/release? Or do you have a ticket number? The jobfile XML is retrieved from the tracker as-is and written to a file. This could be some problem in Unicode normalization/c14n. German quotes were present before, and I think we also tested a pile of poo in a title at some point.
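
To check the normalization theory on an affected title, something like the following could help. It is a minimal sketch using the core `Unicode::Normalize` module; `is_nfc` is a hypothetical helper and the sample string is made up, not taken from a real ticket.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Hypothetical helper: true if the string is already in NFC form.
sub is_nfc {
    my ($s) = @_;
    return NFC($s) eq $s;
}

my $precomposed = "\x{00FC}";           # "u with diaeresis" as a single code point
my $decomposed  = NFD($precomposed);    # "u" followed by U+0308 combining diaeresis

printf "precomposed: %d code point(s), NFC: %s\n",
    length($precomposed), is_nfc($precomposed) ? 'yes' : 'no';
printf "decomposed:  %d code point(s), NFC: %s\n",
    length($decomposed), is_nfc($decomposed) ? 'yes' : 'no';
```

Note that the quotation marks U+201E/U+201C and the emoji U+1F4A9 have no canonical decomposition, so normalization would only change other characters in the same text, such as umlauts.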

@lukas2511
Author

> @lukas2511 Do you know which Perl/library versions were used, or the Linux distro/release? Or do you have a ticket number? The jobfile XML is retrieved from the tracker as-is and written to a file. This could be some problem in Unicode normalization/c14n. German quotes were present before, and I think we also tested a pile of poo in a title at some point.

The system was an up-to-date Debian bullseye with Perl v5.32.1. The problem happened with tons of tickets, e.g. 2916.

The same system was used last year without any issues, and there were also some talks with German-style quotation marks in their descriptions. I'm not sure whether anything about the tracker or the crs scripts changed in the meantime; the only real difference I know of is the encoding changes on the frab database.
