-
Notifications
You must be signed in to change notification settings - Fork 51
(Better) feedback on actuation / set of actuators #560
Comments
I can see that we possibly need to support two type of calls, possibly for the same signal:
The reason is that some clients may want to follow up on (long-term) result, others not. So that is the first decision, do we need to support "both types"? Then I assume that we want the databroker to be relatively stupid. I assume the databroker must know if there is a "smart provider", i.e. if databroker actually can expect an explicit response. So maybe a "smart provider" needs:
I.e. databroker needs to keep the gRPC call open, until either it get a positive/negative response from the "smart provider", or the default or signal-specific timeout occurs. If no "smart provider" has registered for the signal maybe the databroker shall report something like ACCEPTED to indicate that the set request was accepted, but no result on if it will succeed will be provided. Here I assume that the Databroker needs to be able to keep multiple requests for the same actuator active in parallel. I.e. it is up to the "smart provider" to decide what shall happen, if it will perform the two operations after each other, or if it will reject the second request or interrupt the first request and give the second priority. |
Good points raised above. I do agree that one main goal should be that the solution should not "overcomplicate" things for neither providers or consumers. That is actually a main point of introducing this as it would enable consumers to act on failures to set actuators (try again or propagate this error). I also agree that it could make sense to differentiate between when an actuation is considered complete. Both in terms of what a consumer is interested in knowing, and in terms of what providers are able to communicate (i.e. they may not be able to monitor the progress past initiating the actuation). Arguably, if a provider is not able monitor the progress, the fallback could just be to monitor the actual value by subscribing to it (using databroker) while retaining the responsibility to decide when/if timeouts should occur.
I do not agree that databroker should ever return a positive response if there is no provider available for a particular actuator. That behaviour is currently causing unnecessary complexity in both providers and consumers. If there are no providers available when a client is trying to set an actuator, it should be informed of this in order to decide what to do next (try again or propagate the error for example). And if there is no provider available when a "set actuator" request is issued, providers connecting at a later point should not be triggered by these "pending" requests as there is no way to know if they are still valid. A minimum requirement for providers would thus be that they inform databroker of their existence in order for databroker to forward actuation requests to them. I believe it would be beneficial if they where also required to acknowledge when they have received such a request and "initiated" the actuation, as this is something that should be possible even for the most limited providers. This would benefit the consumer side of the API in that they can rely on being informed if there are any errors initiating the actuation. |
Makes sense to me: A response by databroker "I hear you, but currently no provider registered for that path" makes sense. However would the value be dropped, our would it be stored as "target" anyway, in case a provider comes around later (maybe it was not yet started, crashed or was otherwise "indisposed" and ,might act on it (via timestamp it could decide to act on it or not, but that would be up to a provider). Basically databroker being maximally dumb/generic", telling the consumer "hey currently no provider upon set", but still telling any late-joined provider (or actually even a "consumer" just reading/subscribing a target state), "jup, last known desired target state for that signal was 'true',I got that last Thursday". |
I think dropping the value would absolutely be the best option (given the complexities introduces by opting to "cache" it). Firstly, that would mean users can reason about "set actuator" as a request / response concept. And this (in my experience) is what most users expect when trying to set an actuator. If the request would be "cached" indefinitely, the actuation could happen at any point in the future, which could easily lead to unintended consequences. Silly example: If I press the "unlock the car" button in my parking lot and it doesn't work, I probably don't want it to open randomly when I'm no longer there. The user sending the initial request is best suited to decide whether it makes sense to retry that request at a later point. Secondly, storing the target value introduces complexity in the providers if they need to figure out if the request is current or if it's just some historic request (as you are hinting at in your description). The providers probably don't have access to the information needed to know whether it's desirable to try again. It also introduces a form of complexity in consumers, as they don't know if an existing target value means that any actuation is ongoing. If there is already a target value present, should the client set it (to the same value) again in order to trigger providers waiting for an up to date request? I don't really see anything gained by adding this complexity / ability. One thing that could be valuable though, is to introduce a way for a consumers to subscribe to the status of actuators (i.e. availability of providers), to use if they want to retry when a provider comes online. As I said, the client/consumer is probably best suited to decide if they want to try again, as they were the one triggering the request in the first place (and thus have access to the conditions under which it should be triggered). |
Adding to this, this could lead to another thing we could add. Here databroker could give some clients a higher priority and tell it to the providers. Maybe complicates thing and maybe not in scope of databroker itslef (for now) but as Erik mentioned priorities this came to my mind. It's more a comment than a suggestion.
I think this would be really good as it could be combined with feedbacking some properties like how fast the expected response time might be. I think @SebastianSchildt linked those issues too but mentioning again :) |
I came across this issue during today's meeting and would like to contribute some different/similar perspectives as an external observer and a novice in this context. Firstly, I find also the current response provided by the databroker to be quite confusing. Considering that there will be providers responsible for determining whether a Set operation succeeded or not on the actuator, I agree that it would be beneficial to start by adding more semantics to the Set response. One possible approach would be to introduce an additional field "ProviderResponse" (containing values such as NewValue, UpdateInProgress, Error) to the existing Set response provided by the databroker (which currently includes values like Ok, Error, No Access, etc.). This would provide the consumers with a meaningful response regarding the ongoing process on the other side, enabling them to decide whether to retry the set operation. I consider it is important to empower the consumer to take deterministic and robust actions depending on the Set response information, by this the uncertainty would be reduced if a value was set correctly I guess. Furthermore, if the consumer wants to receive constant updates from a signal, that's precisely what the "subscribe" method is intended for. This would eliminate the need for additional overhead to the "set" method such as blocking calls, waiting, timeouts, etc. Additionally, if we need to know the status of a provider, the "subscribe" method could also include the response status of the provider, in case there is no value to be transmitted to the consumer. From what I understand, the databroker simply interconnects components. Therefore, I agree with having a comprehensive error semantics in place, rather than introducing more methods and complexity to the databroker(like values caching, etc). |
Here is an initial proposal. I have introduced a new enum called ValueUpdateStatus in the DataEntry message. The purpose of this new field is to provide to the response messages (GetResponse and SubscribeResponse) with a status indicator. By doing so, we gain valuable insights into the state of the system following the execution of a 'Set' operation. The ValueUpdateStatus enum mainly looks like this:
One of the benefits of introducing this new field in DataEntry is that the Database structure we have in the broker contains a hashmap that actually holds all the DataEntry objects along with their corresponding new values.
This means that, for example, if some update operation fails, we can retrieve and update the update status on the database broker. Consequently, if a client wants to know the status of its operations, the broker will return the last status stored, providing the client with control over its performed operation. API protobuf would like this: I have also been considering creating an additional protocol diagram, similar to the TCP protocol or the ONVIF protocol for cameras. This diagram would help users understand the correct process to update values on actuators and check whether the operations have been successfully performed. |
@rafaeling and I had a quick discussion. A quick summary:
I think we need to prototype something and then discuss and have a look at it. Therefore we should keep it simple for now. I think |
While being able to subscribe to the status of actuators would be a useful feature, I don't think it's really addressing the main issue at hand here. In order to even have this feature, something along the lines of what @SebastianSchildt showed in the table above would be needed, i.e. at a minimum: having an open GRPC connection (stream) between databroker <--> provider. Without this, there is no way for databroker to know whether a provider is available, and it would not be possible to correctly maintain a correct state of the actuator status. Secondly, with the current proposal, I think it would become unnecessary complex for clients to determine whether a "set actuator" request was successful. (I agree with the observation that there should probably be two different definitions of when a "set actuator" should be considered complete, but lets consider the most simple case first: It is considered complete / successful once the actuator provider has acknowledged the command to actuate, which would typically be done just after the actuation is initialized) If I understand the proposed usage of
In pseudo code something like (ignoring synchronization : set_actuator_target_called = false
# start the subscription in a separate thread
for message in client.subscribe(Window.Front.Left.Position, FIELD_ACTUATOR_STATUS):
if not set_actuator_target_called:
continue # Not caused by us yet
elif message.actuator_status == UPDATE_SUCCESS or message.actuator_status == UPDATE_IN_PROGRESS:
result = SUCCESS
else:
result = FAILURE
# in the original thread, set actuator target
client.set(Window.Front.Left.Position, FIELD_ACTUATOR_TARGET, 20)
set_actuator_target_called = true # possible race condition
# wait (somehow, probably involving threads / condition vars) for result to be populated
print(result) I could be missing something, but I would not categorize this as being easy to use. Having the "set actuator" call block until the actuation has been acknowledged by the provider (or databroker responding with an error in case of no provider) seems like a much simpler API. result = client.actuate(Window.Front.Left.Position, 20)
# ^ blocks until actuator has acknowledged actuation start (or failure)
print(result) and perhaps (at some point) even have a complementing function for when the client wants to wait until the actuation has fully completed. result = client.actuate_to_completion(Window.Front.Left.Position, 20)
# ^ blocks until actuator has reached the desired state (or failed)
print(result) Perhaps these "convenience functions" could be implemented in the python library, client side. But it would still be an unnecessary complex implementation, which would need to be done for every language library wanting this. I think it would be a lot smarter to have this implemented once databroker side, exposed as a ready to use GRPC function. The expression
comes to mind, when looking at solving this using only I do understand that this was done in order to keep the API "simple". But at some point, trying to fit everything into a "too simple" model will end up making it more complex. PS Let me just repeat that having actuator status as a field (i.e. metadata) that can be subscribed to can still be a useful feature that is not incompatible with having dedicated |
I just want to make sure we are on the same page (in terms of understanding the current behaviour of the API).
When I'm talking about "caching", I'm describing the current behaviour of The current way to "set actuator" is to use the generic This doesn't match the concept of "set actuator" which is fundamentally a momentary request that would only make sense at the time it is made (in general). This conceptual mismatch is something I think we should address going forward. Personally, I'm convinced that the best way to address it is by adding a dedicated function for this concept ( |
We are about to archive this repo soon. If you consider this issue as important please file a new issue at one of the new Kuksa repos at https://github.com/eclipse-kuksa |
This should serve as a discussion/starting point of designing better support for actuators in the API
For most use cases involving actuation, it might be good if optionally you can get some feedback.
Current situation: You can write target values. You don't know, whether that works. You DO get feedback in terms of "Unknown datapoint" or "No Access" but a successful set only indicates, databroker set and accepted it sucessfully in its in-memory database. Whether the intended change is happening in the vehicle is unknown on the API level.
You maybe observe current_value in your app, and decide with some custom logic whether the actuate was successful or not.
The functionality missing is an easy to use way of knowing whether something failed (and / or finished). As this is dependent on the southbound, only the provider can be the source of this truth. With he current design, where providers and consumers are completely asynchronous and after a consumer did a set, there is no "contract" if/when and when a provider might pick it up (currently it does so via subscribing target_value), and thus there is also no way to give any kind of useful error information back to consumers.
The problem is also hard to solve "generic" because things like timeouts are very use case dependant
We should agree on API extension/modifications to the kuksa.val.v1 interface. If at all possible this should be a backwards compatible extension
In terms of return codes/feedback, we might need something like
My initial feeling is, that the set API is ok, it might just have some new errors defined.
For providers we likely need a slightly different concept, one solutions might be, they are opening a GRPC bidi stream, "claiming" to provide a certain datapoint and then getting request form databroker
Rough sketch
On that not, it is super important that we do not "overcomplicate" things for writers of providers (and consumers). It is a matter of design "how diligent" a provider is checking results (depending on how it is connected to the vehicle system, it's own methods may be limited, or in more "classic" use cases/control loops where you send a more low level actuator 30 times a second, you would not care on a individual level anyway. So it should be easy (in terms of lines of code/complexity) for an actuation provider to "Just return OK"
We also might want to think about what our recommendation is for actuations that take longer (e.g. multiple seconds). Will we always block the set? Or just return OK when actuation is in progress? Or is this up to the provider implementer to decide? Do we then have recommendation towards provider "providers" how long these "should" take to answer?
The text was updated successfully, but these errors were encountered: