`source_id` should default to having a non-empty value #93

cboulay · 2024-12-10T20:33:15Z

Having an empty source_id is undesirable. We want to improve the naive experience by enabling auto-reconnect and also not causing the inlet to crash when the outlet is closed -- which seems to happen when source_id="".

I propose that we change to source_id: typing.Optional[str] = None and, when it is None we generate a new source_id that is built from a hash of name, type, channel_count, nominal_srate, and channel_format.

I would like to include the hostname in the hash to prevent collisions. However, while a manually-created StreamInfo object is most likely going to be used to initialize a StreamOutlet where grabbing the hostname is easy, it could also be used to initialize a StreamInlet when the hostname is unknown.

Note: A valid StreamInlet can be created from a manually-created StreamInfo, rather than one returned by a stream-resolution function, if the name, type, channel_count, and channel_format are all known.

Instead, when creating the outlet, we can examine the source_id and if it matches perfectly with the expected hash then we further add the hostname to the hash.

The text was updated successfully, but these errors were encountered:

chkothe · 2024-12-10T21:29:49Z

I'd suggest giving this proposal some more time for refinement. There's a foundational assumption baked into LSL that, if there's a source ID set (eg in an inlet) and it matches some stream on the network then the streams are certified (by whoever set the id) to be one and the same stream. If we always set it to a value, then use cases where two instances of some data source are running that were resolved with whatever query the user chose to identify them (e.g., most recently created, some other desc/ property) then the moment one of them gets lost, the inlet can silently flip over to the other source without the user knowing, and so they can't be sure they're not silently getting the wrong data from that point onward.

Obviously it's not great that one can get a LostError, but then that's meant to give the application an opportunity to deal with the case that the source went away and the source didn't provide an (actually) unique identifier to ensure its recovery (if one wants to have a program that will survive even that, at the risk of getting the wrong data, the application could always manually re-resolve the stream and eg print a warning etc), or in many cases it may be necessary for the user to step in to handle it.

One thing one could potentially do is have a config value that always assigns a (pseudo-)unique id from a set of fields, so that power users who have data sources that they know are unique in their setup but which don't set id's (and are thus not recoverable) can still have a robust setup (I suppose in a future incarnation one could also make that more fine-grained than a true/false, e.g., it could be a dictionary that allows the power users to do surgery on the ids of their streams, like say forcibly overriding the id for a stream named BioSemi to '123' even though that one may currently set a different id that may have other uniqueness characteristics, say if the app developer used a fresh random number each time).

edit: though actually I misread this as a git issue on liblsl rather than pylsl: with the latter one could potentially make it explicit enough to the user in documentation or logging etc that they don't fall into that trap, like always logging that a putative default of source_id=True caused a pseudo-unique id being generated and scolding them for not setting it to either None (if they really don't want one because they can't guarantee uniqueness) or a string of their own choosing...

cboulay · 2024-12-10T22:09:38Z

make it explicit enough to the user in documentation or logging etc

Gah, I forgot to update the docstring in the PR!

cboulay · 2024-12-10T22:31:34Z

actually I misread this as a git issue on liblsl rather than pylsl

I don't think this is appropriate on liblsl. Developers using the C / C++ API are somewhat forced to reckon with this concept.

In Python-land however, there are many more users blindly using source_id="" which IMO is a worse experience than having a non-empty default that is mostly unique.

Power users with multiple streams with the same name-type-channelcount-srate-format can use source_id="" to get the original behaviour back or, preferably, set the source_id to something truly uniquely identifying in their setup.

chkothe · 2024-12-12T18:51:07Z

Agreed that in Python one can have a developer UX that strikes maybe a better middle ground between convenience and strictness, and I saw in your PR that you're printing in that case. I guess that's enough of a nag to get users to choose one by themselves (or nulling it) rather than enduring that showing up in their console...

cboulay changed the title ~~source_id should default to having a legitimate value~~ source_id should default to having a non-empty value Dec 10, 2024

cboulay linked a pull request Dec 10, 2024 that will close this issue

source_id defaults to a non empty value #94

Merged

cboulay mentioned this issue Dec 10, 2024

source_id defaults to a non empty value #94

Merged

cboulay closed this as completed in #94 Dec 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`source_id` should default to having a non-empty value #93

`source_id` should default to having a non-empty value #93

cboulay commented Dec 10, 2024

chkothe commented Dec 10, 2024 •

edited

Loading

cboulay commented Dec 10, 2024

cboulay commented Dec 10, 2024 •

edited

Loading

chkothe commented Dec 12, 2024

source_id should default to having a non-empty value #93

source_id should default to having a non-empty value #93

Comments

cboulay commented Dec 10, 2024

chkothe commented Dec 10, 2024 • edited Loading

cboulay commented Dec 10, 2024

cboulay commented Dec 10, 2024 • edited Loading

chkothe commented Dec 12, 2024

`source_id` should default to having a non-empty value #93

`source_id` should default to having a non-empty value #93

chkothe commented Dec 10, 2024 •

edited

Loading

cboulay commented Dec 10, 2024 •

edited

Loading