From 8ed9c341b93f464b551e9f685a700418ff7e11c0 Mon Sep 17 00:00:00 2001 From: Simon Tatham Date: Fri, 31 May 2024 15:45:32 +0100 Subject: [PATCH] Support reg updates full of "--------" with too few chars. I also heard recently of a syntax for AArch64 core register updates that just reads "R X0 --------", which our parser rejects: it's OK with '-' replacing a hex digit in principle, but for a 64-bit register, it expected 16 of them, and here it only saw 8. But we already have a special case in which "R X0 00000000" is treated the same as "R X0 0000000000000000", because the Tarmac producer (understandably!) thought zero was too boring a value to write out in full. So it seems reasonable to do the same for the all-'-' case as well as the all-'0' case. I'm not sure we're treating this _right_, though. From context in the trace it looked as if the register really was being updated, and it's just that the trace generator was unable to find the new value while producing that line. A later read from the register returned a value that wasn't what had been in it before that update, and which looked sensible. So perhaps a better treatment of these lines would be to reset the register contents to 'unknown', like at the very start of a trace. But (a) this would need a lot more upheaval in the code, because RegisterEvent currently has no way to specify that; (b) I don't know what's the right basis to draw the distinction between -------- meaning "value is left unchanged" (as in the Q0 examples already in our test collection) and -------- meaning "value has changed but I'm not telling you what to". So there doesn't seem much point in putting a lot of effort into (a) until we have an answer to (b). I'll wait for more data. --- lib/parser.cpp | 7 +++++++ tests/parsertest.ref | 2 ++ tests/parsertest.txt | 12 ++++++++++++ 3 files changed, 21 insertions(+) diff --git a/lib/parser.cpp b/lib/parser.cpp index 55bd20e..134a20b 100644 --- a/lib/parser.cpp +++ b/lib/parser.cpp @@ -808,6 +808,13 @@ class TarmacLineParserImpl { contents.append(hex_digits_expected - contents.size(), '0'); break; + } else if (tok.iseol() && + contents.find_first_not_of('-', data_start_pos) == + string::npos) { + // Similar special case if all the digits are '-'. + contents.append(hex_digits_expected - contents.size(), + '-'); + break; } if (!tok.isregvalue()) parse_error(tok, _("expected register contents")); diff --git a/tests/parsertest.ref b/tests/parsertest.ref index c86e2c3..60d900c 100644 --- a/tests/parsertest.ref +++ b/tests/parsertest.ref @@ -282,3 +282,5 @@ Parse warning: unsupported system operation 'AT' * RegisterEvent time=8677000000 reg=x0 offset=0 bytes=01:23:45:67:89:ab:cd:ef --- Tarmac line: R SP_EL2 fedcba98 76543210 * RegisterEvent time=8677000000 reg=xsp offset=0 bytes=fe:dc:ba:98:76:54:32:10 +--- Tarmac line: R X0 -------- -------- +--- Tarmac line: R X1 -------- diff --git a/tests/parsertest.txt b/tests/parsertest.txt index bd2bac0..fd59f94 100644 --- a/tests/parsertest.txt +++ b/tests/parsertest.txt @@ -279,3 +279,15 @@ R SP 0123456789abcdef # data for a 64-bit register update can be split by a space. R X0 01234567 89abcdef R SP_EL2 fedcba98 76543210 + +# From the same Tarmac generator: sometimes core register updates list +# their contents as -------- (perhaps because the Tarmac generator +# can't retrieve the value for some reason?). And sometimes they don't +# list the full 64 bits of the value. We interpret this as a +# non-update (because the same syntax in other contexts means some of +# the register is left alone, e.g. examples above that update only +# half of Q0), which might not be the best interpretation, but we +# should at least not reject the input that doesn't have enough - +# signs to fill X1. +R X0 -------- -------- +R X1 --------