Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify: use one form of serialization used by fuzzer and ir/verilog simulation #1645

Open
hzeller opened this issue Oct 3, 2024 · 1 comment
Assignees

Comments

@hzeller
Copy link
Member

hzeller commented Oct 3, 2024

This is something to simplify.

Currently, there are two types of serializations used to represent test vectors to simulation

  • eval_proc_main, eval_ir_main and simulate_module_main use function data serialized as strings (in the args file),and similar for channel data ('channel values file'). These is essentially an ad-hoc serialization format.
  • The fuzzer uses its own serialization as protocol buffer (defined in sample.proto), which it uses to serialize the relevant information as textproto inside the crasher.

The fuzzer runner then reads these, and converts into the other serialization format needed for the evaluators. This works,but it makes it harder to add additional information (I am currently looking into adding valid holdoff fuzzing), as that then has to be added to the proto, but also to the ad-hoc serialization.

The proto describing the input data should be broken out of the fuzzer context, made to be a serialization of test vectors (like before, but proto lives somewhere differently), then used in the fuzzer but as well in the evaluators/simulators.

@hzeller hzeller self-assigned this Oct 3, 2024
copybara-service bot pushed a commit that referenced this issue Oct 4, 2024
Context: #1645

Goal is to have a single representation of test vectors presented to
various evaluation and simulation backends (eval_ir, eval_proc and
verilog simulate). Currently they use some ad-hoc serialization,
that has to be parsed and copied to and from proto buffers.
The fuzzer alreay has some good way of representing these
as proto buffer, so this is the starting point.

No functional change or change in serialization format in this
change, just moving out test vector representations used in fuzzer
out into testing directory.

This step:

  * (this change:) Move representation of test vectors from the
    fuzzer context into separate file in xls/tests/. Protos are
    kept as-is for now to allow for compatibility.

Upcoming changes roughly:

  * allow the various evaluation programs to _also_ read this
    protocol buffer (alongside compatibility for their original
    --args_file, --args, --inputs_for_all_channels,
    --channel_values_file)
  * have the fuzzer invoke these tools with the text proto file
    with test vector inputs (and not anymore --args/--channel_values).
    Allows to remove the ad-hoc serialization, as we can use the
    plain proto directly.
  * Find possibly other places (if any) that use the 'old' flags and
    convert them to use the new textproto serialization.
  * Remove old flags and their ad-hoc deserialization of these
    inputs.
  * Now: ready for changes in the proto buffer (Next step
    here is actually to encode valid holdoffs).
    Note: Any change that modify the structure might need a
     fresh re-encode of all the crasher files, which are the
     only places where these are at rest currently.

PiperOrigin-RevId: 682358771
copybara-service bot pushed a commit that referenced this issue Oct 24, 2024
Issues: #1645
PiperOrigin-RevId: 689429389
copybara-service bot pushed a commit that referenced this issue Oct 24, 2024
Add new --testvector_proto flag that allows to read a
testvector::SampleInputsProto textproto.
(in preparation of deprecating
--channel_values_file, --args_file.)

Issues: #1645
PiperOrigin-RevId: 689435990
copybara-service bot pushed a commit that referenced this issue Oct 25, 2024
Issues: #1645
PiperOrigin-RevId: 689846135
copybara-service bot pushed a commit that referenced this issue Nov 8, 2024
copybara-service bot pushed a commit that referenced this issue Nov 8, 2024
There are various serializations to represent sample inputs: for
functions, it is args.txt, for proc it is some other ad-hoc serialization,
and internal to the fuzzer datastructure it is serialized as
testvector::SampleInputsProto.

The tools have been prepared to take a single protobuffer. This CL
writes the protobuffer (as testvector.pbtxt) but still alongside
the 'old' serialization formats.

Part of the refactoring to universally use testvector::SampleInputsProto
(follow-up steps: use this format exclusivey and remove old
serialization formats).

Issues: #1645
PiperOrigin-RevId: 694671946
copybara-service bot pushed a commit that referenced this issue Nov 9, 2024
Instead of the various ad-hoc serializations and different flags
such as --args_file/--input_file and --channel_values, use testvector proto
universally.

Issues: #1645
PiperOrigin-RevId: 694676298
copybara-service bot pushed a commit that referenced this issue Nov 9, 2024
These are now all covered by testvector.textproto.

Issues: #1645
PiperOrigin-RevId: 694679325
@hzeller
Copy link
Member Author

hzeller commented Nov 12, 2024

This is now simplified in a way that all simulation inputs take a --testvector_textproto which is of type xls::testvector::SampleInputsProto. The --args/--args_file/--channel_file flags are still supported for now, but they are not used anymore by the fuzzer, so possibly clean to remove.

Nexs steps

  • remove the legacy flags to keep the user interface simple and less confusing (however, it might complicate just passing some args 'by hand' if that is desired)
  • Consolidate the inputs in xls::testvector::SampleInputsProto with xls::ProcChannelValuesProto to just have one proto representing the data; will require changing the existing crasher*.x as they contain a SampleInputsProto

copybara-service bot pushed a commit that referenced this issue Nov 15, 2024
This allows to add holdoff-data to the crasher input and
get it all the way to arriave at simulate_module_main.
A sample is included, which is just a copy of an existing
crasher with some valid-holdoffs added.

Make xls::Sample's primary constructor accept a testvector, but
also provide access to legacy args/channel_names data, mostly
used in unit tests now.

The separation of args() and channel_names() was confusing and
with holdoff data, there would be another piece of data that had
to be round-tripped to proto buffers and back. So this needs
to be one data structure, ideally directly backed by a proto
buffer.
For now, this will be the testvector proto (which then also allows
to be directly serialized and is already used in the reproducible
crasher files), but will probably need to be refined with better
serialization of Values (that are right now semicolon-separated
text representations) in follow-up changes.

Left to do: get the fuzzer to actually emit this holdoff data.

Issues: #1645
PiperOrigin-RevId: 696688803
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant