Replies: 5 comments 2 replies
-
Thanks, Erling, for getting the discussion started. I've thought quite a bit on the topic of formally verifying LF, and here is an ongoing attempt using SMT solvers. The progress is saved on the smt-gen branch (PR #794). In this case, the LF program is augmented with logical specifications and the An SMT-based verifier for temporal safety properties Summary: The idea is that we want to exhaustively check whether an LF program can produce traces (with each element being a reaction invocation) that violate certain linear temporal properties. To do this, we formulate an SMT problem with an uninitialized bounded trace and rules for how reactions are triggered. When we give the SMT problem to the SMT solver (Z3, in this case), it will try to find an instantiation of the trace that satisfies the rules but violates the safety property. Properties of interest: safety properties (counter-examples have finite lengths.) Modeling the execution of an LF program: a transition system with states being the LF variables (including state variables, values assigned to input/output ports, values assigned to actions), and transitions being the invocation of a reaction. Specification: linear temporal logic Verification tactics: bounded model checking (BMC) & induction Open questions:
Weaknesses:
|
Beta Was this translation helpful? Give feedback.
-
Update from meeting June 1:
|
Beta Was this translation helpful? Give feedback.
-
I absolutely agree that we will need a standard way of testing LF applications. For usability, It would be great if this would be somehow integrated with the LF language and tooling (for instance like you can run Rust tests with cargo). I think this also intersects with another old discussion: packages. @revol-xut is currently working on a prototype for a simple package manager for lf, and we should probably consider testing in its design. I also would like to mention that @jhaye is currently looking into some formal tools for Rust. I think that these could help in both formal verification of the Rust runtime and formal verification of reaction bodies or even entire LF programs. We will discuss this in the semantics meeting once @jhaye collected a bit more knowledge about the tools. |
Beta Was this translation helpful? Give feedback.
-
@cmnrd @lhstrh
I hope I can get to explore this project further during my stay here |
Beta Was this translation helpful? Give feedback.
-
Thanks for adding these notes, @erlingrj, and reviving this discussion. As @cmnrd mentioned in this thread, it might make sense to discuss this when we sit down to talk about packaging. This has been high on @revol-xut's list, and I think would make sense to carve out some meeting time for this while @cmnrd is visiting so we can construct a roadmap. |
Beta Was this translation helpful? Give feedback.
-
LFtb - A testbench framework for Lingua Franca programs
I have been thinking a bit about how a Lingua Franca program can be tested and verified and the result was this idea about making a verification framework.
I believe that LF actually can simplify the verification of real-time systems as it is very easy to create temporal fault scenarios.
I mentioned this to Marten who told me that there had already been some discussion on this topic, I would like to reboot that discussion.
Summary
Testing and verification are crucial when building safety-critical and time-sensitive systems.
I believe that LF enables a level of system verification that could be very hard to achieve using an ad-hoc framework.
The core feature of LF is the two timelines and their interaction. By controlling the physical time we can easily test the response of the system to logical time lagging behind physical time and trigger the deadline handlers.
We also should support applying test signals on the Input Ports and inspecting/asserting signals on the Output Ports.
Ideally also inspecting/asserting the internal state of the Reactor and also which mode it is in.
Some of these are easy to implement as a LF program with a TestBench Reactor which instantiates the Reactor we wish to test (often called the DUT (device under test)).
However, some require access to the runtimes.
These abilities could be gathered into a Verification framework which lets the user efficiently build a testbench and verify their LF program.
Verification
Verification can be split into two parts.
Formal verification
Formal verification deals with building mathematical proofs about the system (or rather the model of the system).
I believe it has been discussed already to use temporal logic and model checking to prove safety and liveness properties of the Reactor graph.
Formal verification is useful and important, but it is, AFAIK, not that widespread because programmers normally don't know the required math.
Functional Verification
This is also referred to as Testing or Unit Testing or just Verification.
In Functional Verification, we execute or simulate the system for various scenarios to build confidence in it.
Functional Verification is probably used in all real-time software.
To functionally verify a LF program you could
Verification of temporal behavior
My experience is that it is hard to properly test a systems reactions to temporal faults when using ad-hoc time management. Since you have built your own time management (with timestamping etc) you probably have to build your own testing infrastructure to create different fault scenarios.
Consider the following Reactor Graph which represents a quite normal sensor fusion application, a Kalman Filter combining IMU and GNSS measurements.
Testing that each individual Reaction does what you expect it to do is easy.
Also applying a dataset of IMU and GNSS messages to verify that the Kalman Filter converges to some known ground truth is also quite easy.
This is also easy to test with other middlewares because it is essentially just unit testing.
What is harder is to verify all the different fault handlers. What if there is no GNSS message in 5 sec. What if there is a dropped IMU message etc.
What makes it hard to verify is that these are temporal faults and there is no portable way of specifying those scenarios because you have implemented your own ad-hoc time management.
LF can really simplify this because of this explicit interaction of logical and physical time.
By specifying both the logical and the physical timestamp of an event you can generate any temporal fault scenario.
Currently, I believe there is no clean and simple solution to reliably generate different temporal fault scenarios in LF.
You could create a LF program that instantiates the DUT and sleeps and blocks the program for some time to trigger the deadline violations, but we can do better.
It would be great to know how temporal fault scenarios are tested in the industry, if someone has any references please shout out.
Coverage
Is a measure of how much of the system has been tested. It is typically summarized in a percent. Types of coverages include:
In LF-tb we could operate with another measure which is "reaction coverage" which measures the percentage of reactions that have been triggered during the testing.
This would also include Deadline handlers. It would give nice feedback to the user where they have holes. We also want the "normal" coverage measures, but that can be achieved by bundling tools like GCOV with LF-tb
Requirements
What exactly does LF-tb look like?
This is really the big question and so far an open question that I would love some feedback on. We could either
My gut feeling is telling me that the latter might be a better approach. This Python library would auto-generate the LF program which wraps the DUT and translates input signals to Timers, Reactions, State and output signal assertions to Reactions which asserts the value and timestamp of its trigger. I think cocotb could be an inspiration for such a project. The tracing mechanism could potentially be used also.
The big unknowns, which I think require modifications to the run-times are
... and of course, we have all the unknown unknowns.
I would love some feedback on this topic.
Beta Was this translation helpful? Give feedback.
All reactions