Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: fix xml_escape(), again #2811

Merged
merged 1 commit into from
Nov 14, 2018
Merged

lib: fix xml_escape(), again #2811

merged 1 commit into from
Nov 14, 2018

Conversation

JuhaSointusalo
Copy link
Contributor

Commit ff0b8d0 (fix CDATA escaping) introduced two bugs to xml_escape():

  • it eats ']' characters that are not part of ']]>'
  • it eats the next character after ']]>'

Fixes #2809

Commit ff0b8d0 (fix CDATA escaping) introduced two bugs to xml_escape():

- it eats ']' characters that are not part of ']]>'
- it eats the next character after ']]>'

Fixes #2809
@JuhaSointusalo
Copy link
Contributor Author

Test code:

// cl /MDd /W4 /EHsc test-xml_escape.cpp /I \dev\boinc\src\lib /I /dev/boinc/src/win_build /link /libpath:\dev\boinc\src\win_build\Build\x64\Debug libboinc.lib shell32.lib advapi32.lib
#include <iostream>
#include <string>
#include "parse.h"

using namespace std;

void test(string str) {
    char out[2000] = "";
    cout << "test: '" << str << "'\n";
    xml_escape(str.c_str(), out, sizeof(out));
    cout << "escaped: '" << out << "'\n";
    xml_unescape(out);
    cout << "unescaped: '" << out << "'\n";
    cout << "test==unescaped: " << (str == out) << "\n\n";
}

int main() {
    test("");
    test("abc");
    test("abc<def");
    test("abc&def");
    test("abc]]>def");
    test("<");
    test("&");
    test("]]>");
    test("]");
    test("abc]def");
    test("abc]]]>def");
    test("abc&lt;def");
    test("abc&&def");
    test("abc]]>]]>def");
    string str = "\x9\xa\xd";
    for (int i = 32; i < 256; ++i) {
        str = str + (char)i;
    }
    test(str);
    test("abc]\30]\30>def");
    return 0;
}

Results:

test: ''
escaped: ''
unescaped: ''
test==unescaped: 1

test: 'abc'
escaped: 'abc'
unescaped: 'abc'
test==unescaped: 1

test: 'abc<def'
escaped: 'abc&lt;def'
unescaped: 'abc<def'
test==unescaped: 1

test: 'abc&def'
escaped: 'abc&amp;def'
unescaped: 'abc&def'
test==unescaped: 1

test: 'abc]]>def'
escaped: 'abc]]&gt;def'
unescaped: 'abc]]>def'
test==unescaped: 1

test: '<'
escaped: '&lt;'
unescaped: '<'
test==unescaped: 1

test: '&'
escaped: '&amp;'
unescaped: '&'
test==unescaped: 1

test: ']]>'
escaped: ']]&gt;'
unescaped: ']]>'
test==unescaped: 1

test: ']'
escaped: ']'
unescaped: ']'
test==unescaped: 1

test: 'abc]def'
escaped: 'abc]def'
unescaped: 'abc]def'
test==unescaped: 1

test: 'abc]]]>def'
escaped: 'abc]]]&gt;def'
unescaped: 'abc]]]>def'
test==unescaped: 1

test: 'abc&lt;def'
escaped: 'abc&amp;lt;def'
unescaped: 'abc&lt;def'
test==unescaped: 1

test: 'abc&&def'
escaped: 'abc&amp;&amp;def'
unescaped: 'abc&&def'
test==unescaped: 1

test: 'abc]]>]]>def'
escaped: 'abc]]&gt;]]&gt;def'
unescaped: 'abc]]>]]>def'
test==unescaped: 1

test: '	

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~�ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜø£Ø׃áíóúñѪº¿®¬½¼¡«»░▒▓│┤ÁÂÀ©╣║╗╝¢¥┐└┴┬├─┼ãÃ╚╔╩╦╠═╬¤ðÐÊËÈıÍÎÏ┘┌█▄¦Ì▀ÓßÔÒõÕµþÞÚÛÙýݯ´­±‗¾¶§÷¸°¨·¹³²■ '
escaped: '&#9;&#10;&#13; !"#$%&amp;'()*+,-./0123456789:;&lt;=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~�&#128;&#129;&#130;&#131;&#132;&#133;&#134;&#135;&#136;&#137;&#138;&#139;&#140;&#141;&#142;&#143;&#144;&#145;&#146;&#147;&#148;&#149;&#150;&#151;&#152;&#153;&#154;&#155;&#156;&#157;&#158;&#159;&#160;&#161;&#162;&#163;&#164;&#165;&#166;&#167;&#168;&#169;&#170;&#171;&#172;&#173;&#174;&#175;&#176;&#177;&#178;&#179;&#180;&#181;&#182;&#183;&#184;&#185;&#186;&#187;&#188;&#189;&#190;&#191;&#192;&#193;&#194;&#195;&#196;&#197;&#198;&#199;&#200;&#201;&#202;&#203;&#204;&#205;&#206;&#207;&#208;&#209;&#210;&#211;&#212;&#213;&#214;&#215;&#216;&#217;&#218;&#219;&#220;&#221;&#222;&#223;&#224;&#225;&#226;&#227;&#228;&#229;&#230;&#231;&#232;&#233;&#234;&#235;&#236;&#237;&#238;&#239;&#240;&#241;&#242;&#243;&#244;&#245;&#246;&#247;&#248;&#249;&#250;&#251;&#252;&#253;&#254;&#255;'
unescaped: '	

 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~�ÇüéâäàåçêëèïîìÄÅÉæÆôöòûùÿÖÜø£Ø׃áíóúñѪº¿®¬½¼¡«»░▒▓│┤ÁÂÀ©╣║╗╝¢¥┐└┴┬├─┼ãÃ╚╔╩╦╠═╬¤ðÐÊËÈıÍÎÏ┘┌█▄¦Ì▀ÓßÔÒõÕµþÞÚÛÙýݯ´­±‗¾¶§÷¸°¨·¹³²■ '
test==unescaped: 1

test: 'abc]�]�>def'
escaped: 'abc]]>def'
unescaped: 'abc]]>def'
test==unescaped: 0

@JuhaSointusalo
Copy link
Contributor Author

The code still has (at least :/) one bug. For some reason xml_escape() eats characters less than 32. If it's given a string like ]\xf]\xf> it munches the \xfs and leaves ]]> behind, putting us back to where we started from.

I think there's two ways to fix it. Either escape not ]]> but all > characters or don't eat characters. The first one is allowed by XML standard. The second one would require figuring out why such a low level function gets to decide which characters are good and which are bad. Both would require full testing of entire BOINC and I'm not too keen on doing that right now.

Quite frankly, I'm beginning to think that the next time someone finds a bug in BOINC's XML code it's time to start porting the code to some XML library.

@lfield
Copy link
Contributor

lfield commented Nov 13, 2018

Looks ok to me. I would suggest that if there is any other issue we back out these changes, keep the original issue for this release and then move to an XML library afterwards.

@lfield lfield merged commit 7e9d41b into BOINC:master Nov 14, 2018
lfield added a commit that referenced this pull request Nov 14, 2018
lib: fix xml_escape(), again
(cherry picked from commit 7e9d41b)
@JuhaSointusalo JuhaSointusalo deleted the lib-fix-xml_escape-again branch April 11, 2019 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stats export with ] in them break after #2784
2 participants