
replaceable events... maybe to change them to "send latest only" by default #1036

Open
mleku opened this issue Feb 10, 2024 · 40 comments

@mleku

mleku commented Feb 10, 2024

in most dynamic systems, let alone distributed systems, events that delete items from the state tend to be racy, very racy, and for users, losing their profile, relay list, or follow list is pretty damn irritating

my suggestion is to change the filter in NIP-01 to add a flag that defaults to false and, when present, enables the retrieval of a series of prior states of replaceable events

this doesn't change the logic of existing apps, but when the new filter field is seen the relay can send old states as well; relays can then store old states and default to sending only the latest

this would be a big benefit to users enabling them to reassert prior states of their replaceable events

hell, it might make it possible to make most events replaceable and let you view the history if the relay stores it and you ask for it

i will add this feature to my relay in the near future and it won't trigger unless the client asks for it. this is an easy change that makes a big difference to user experience: add the flag, keep the old versions instead of deleting them, and return only the newest version of a "replaceable event" by default
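something along these lines is what i have in mind; the `versions` field name is just a placeholder, not a spec, and the assumption is that relays ignore filter fields they don't recognize, so existing behaviour stays the same:

```json
["REQ", "profile-history", {
  "kinds": [0],
  "authors": ["<pubkey>"],
  "versions": true,
  "limit": 20
}]
```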

@fiatjaf
Member

fiatjaf commented Feb 10, 2024

Not a bad idea.

But I wonder if it isn't better to just have "archival" relays that return multiple versions of these events without needing any flag.

This doesn't add any new requirements in comparison to your idea since clients would already have to know what relays would be storing multiple versions of replaceable events anyway.

So, if nostr.wine, for example, wants to offer such a feature, it could expose the archival relay under nostr.wine/archival or archival.nostr.wine.

@pablof7z
Member

You can also just request with a since that is lower than the current state, a bit more work for the client but requires absolutely no changes, the only difference is that relays that want to do this would just not delete replaced events.

These relays could announce this feature with NIP-66 to make the feature discoverable within the protocol
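For example, something like this (timestamp and limit purely illustrative), sent to a relay that keeps replaced events, would return every stored version newer than the given since rather than just the latest:

```json
["REQ", "profile-versions", {
  "kinds": [0],
  "authors": ["<pubkey>"],
  "since": 1690000000,
  "limit": 50
}]
```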

@mleku
Author

mleku commented Feb 11, 2024

i have been thinking about it a bit more, and after changing my profile and then exporting the database, lo and behold, yes, it returns two versions

most request filters for this specify they only want one copy anyway, so what i'm doing today is changing the results code slightly so it just delivers them by default in reverse chronological order, newest first

it actually simplifies the implementation substantially. what started me thinking in this direction was removing the search-and-delete code, and then last night, as i was watching people talk about losing their kind 0 events, it dawned on me that "replaceable" should not in fact delete anything; deleting events should be a matter for garbage collection and cache management

the reason i removed that code was the race condition i have been telling you about in eventstore/badger. removing it also unblocked the process temporarily, but it wasn't the solution because it still raced sometimes. now that the code is gone, all i have to do is add a datestamp sort to the events before the limit filtering runs, and this will also allow a higher limit value to return multiple versions. it may also be necessary for client compatibility to treat a missing limit in the filter as meaning 1, so the behaviour follows what clients expect

@mleku
Author

mleku commented Feb 11, 2024

> You can also just request with a since that is lower than the current state, a bit more work for the client but requires absolutely no changes, the only difference is that relays that want to do this would just not delete replaced events.
>
> These relays could announce this feature with NIP-66 to make the feature discoverable within the protocol

where is the NIP-66 draft?

@pablof7z
Member

> > You can also just request with a since that is lower than the current state, a bit more work for the client but requires absolutely no changes, the only difference is that relays that want to do this would just not delete replaced events.
> > These relays could announce this feature with NIP-66 to make the feature discoverable within the protocol
>
> where is the NIP-66 draft?

cc @dskvr

@jb55
Contributor

jb55 commented Feb 11, 2024

this is what @Semisol and i have been talking about for ages. it's a good idea. nostrdb already supports this since it doesn't actually "replace" events, it versions them.

@vitorpamplona
Collaborator

vitorpamplona commented Feb 11, 2024

Funny thing... Amethyst had to separate filters between replaceables and non-replaceables and manually add a limit:n for each replaceable we wanted to download, as a protection against receiving ALL past versions from non-compliant relays.

This is how we actually download the contact list after logging in:

Filter(
    kinds = listOf(ContactListEvent.KIND),
    authors = listOf(account.userProfile().pubkeyHex),
    limit = 1,
),

Most new relays, or relays that were quickly built from scratch, forget to implement the special case of sending just the last event for the replaceable kind ranges plus kind:0 and kind:3, and end up returning everything. This is particularly hard on replaceables that update very frequently, like Live Stream status events.

So, at this point, I am forced to already expect some relays to send past versions and always code accordingly.

@vitorpamplona
Collaborator

vitorpamplona commented Feb 11, 2024

Similar protections were also added for relays that are not compliant with ephemeral events (must add since: now()) and return thousands of stored NIP-46, NIP-47 calls.

In other words, compliance with since and limit is more common than compliance with specific behaviors per kind/kind range.
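For illustration (not the exact filter any client sends), a defensive subscription for NIP-46 responses might look like this, with since set to the current unix timestamp at the moment of subscribing:

```json
["REQ", "nip46-responses", {
  "kinds": [24133],
  "#p": ["<client pubkey>"],
  "since": 1707609600,
  "limit": 0
}]
```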

@alexgleason
Member

> Similar protections were also added for relays that are not compliant with ephemeral events (must add since: now()) and return thousands of stored NIP-46, NIP-47 calls.
>
> In other words, compliance with since and limit is more common than compliance with specific behaviors per kind/kind range.

In theory limit: 0 is supposed to prevent you from having to use since: now(). In practice, some relays spit events at you anyway.

@alexgleason
Member

Versioned events are a good idea. They would solve #349. But they can't be relied on, since the very high disk requirement would discourage people from doing it unconditionally.

@jb55
Contributor

jb55 commented Feb 11, 2024 via email

@alexgleason
Member

I could see keeping maybe the last 3 versions of each replaceable event. But man, those follow lists are brutal.

@mleku
Author

mleku commented Feb 11, 2024

> > Similar protections were also added for relays that are not compliant with ephemeral events (must add since: now()) and return thousands of stored NIP-46, NIP-47 calls.
> >
> > In other words, compliance with since and limit is more common than compliance with specific behaviors per kind/kind range.
>
> In theory limit: 0 is supposed to prevent you from having to use since: now(). In practice, some relays spit events at you anyway.

what is "limit":0 supposed to give you? i set mine to interpret that to mean 1 if it's replaceable or parameterized replaceable and max if it's anything else

> I could see keeping maybe the last 3 versions of each replaceable event. But man, those follow lists are brutal.

they are; sometimes they fill several screenfuls in my logs when they come through

you can't really avoid the size on the wire, but in the database you could compress these with a truncated hash and an index table

> Versioned events are a good idea. They would solve #349. But they can't be relied on, since the very high disk requirement would discourage people from doing it unconditionally.

events are inherently versioned by unique ID and timestamp, but different types of events probably should be pruned at some point... this is not a protocol issue though, it's an implementation question

@jb55
Contributor

jb55 commented Feb 11, 2024 via email

@vitorpamplona
Collaborator

vitorpamplona commented Feb 11, 2024

> what is "limit":0 supposed to give you?

limit:0 shouldn't return anything already stored/from the past. Only new events coming in live in the subscription should be sent to the client. Ephemeral connections use this a lot. If I send a "pay ln invoice" event, I want to observe the response to that event, which is going to be new, and not anything stored in the db.
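A sketch of that pattern for a NIP-47 payment (kind numbers per NIP-47; the referenced ids and pubkeys are placeholders): publish the pay request, then subscribe with limit: 0 so only the live response comes back and nothing stored is replayed:

```json
["REQ", "nwc-response", {
  "kinds": [23195],
  "authors": ["<wallet service pubkey>"],
  "#e": ["<id of the pay_invoice request event>"],
  "limit": 0
}]
```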

@mleku
Author

mleku commented Feb 12, 2024

> > what is "limit":0 supposed to give you?
>
> limit:0 shouldn't return anything already stored/from the past. Only new events coming in live in the subscription should be sent to the client. Ephemeral connections use this a lot. If I send a "pay ln invoice" event, I want to observe the response to that event, which is going to be new, and not anything stored in the db.

what about if the field is missing, is this different again?

@vitorpamplona
Collaborator

vitorpamplona commented Feb 12, 2024

> what about if the field is missing, is this different again?

Yep, then there is no limit. It should send everything you have that matches the filter.

@staab
Member

staab commented Feb 12, 2024

I'm fine with this change, but I still think switching to better primitives for editing lists would be preferable. If we're keeping all updates anyway, the argument against add/remove events becomes irrelevant, since you would have the same number of events, but smaller. Compatibility, as always, is the problem with that.

@pablof7z
Member

> I'm fine with this change, but I still think switching to better primitives for editing lists would be preferable. If we're keeping all updates anyway, the argument against add/remove events becomes irrelevant, since you would have the same number of events, but smaller. Compatibility, as always, is the problem with that.

Not really.

having to download and verify a bunch of events to compute the latest state is going to be more work than getting the state computed by the relay.

Relays can use deltas if they want to save storage, but there are huge benefits for a client in receiving a single event with the state.

Many nostr devs work on long-running/frequently accessed clients, where booting from no state is rare, but if a new app has to download hundreds of events to boot, that increases the friction for new players (although it introduces the opportunity for minute-long spinners 👀👀👀)

@staab
Member

staab commented Feb 12, 2024

Read optimizations can be supported by DVMs; race conditions having to do with writes can't be fixed without changing the event format.

@staab
Member

staab commented Feb 13, 2024

Was thinking about this last night: there are two things that can be done to reduce the number of events that have to be fetched to get up to date: snapshots and batch updates. For example, #875 does this:

> Admins MAY publish member lists using `kind 27`. This MAY be published as a normal event, or wrapped
> and sent to the group. An `op` tag indicates whether the listed pubkeys are being `add`ed to the group,
> `remove`d from the group, or whether the member list is being `set`. An `a` tag MUST be included
> pointing to the group definition event's address.

{
  "kind": 27,
  "content": "",
  "tags": [
    ["a", "35834:<admin pubkey>:<group name>"],
    ["op", "add"],
    ["p", "<pubkey 1>"],
    ["p", "<pubkey 2>"]
  ]
}

The reason I did this was to allow the group member list to scale past the maximum event size acceptable by relays, but it also allows an admin to set the list and add to it. The op isn't indexable, but a single-letter tag could be used, which would make it possible for clients to (1) go back to the most recent set and (2) fetch all changes since that point.
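A client syncing the member list could then do something roughly like this (illustrative filter only): fetch the ops for the group by its a tag, find the newest set locally, and apply any later add/remove ops on top of it:

```json
["REQ", "group-members", {
  "kinds": [27],
  "#a": ["35834:<admin pubkey>:<group name>"],
  "limit": 500
}]
```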

@vitorpamplona
Collaborator

vitorpamplona commented Feb 13, 2024

I suppose only ops that are created AFTER the a's latest created_at count, right? So, to get the full list, you get the latest a + all kind:27s since the a's created_at and then apply them.

How do you know if the relay you are using has the latest a? It could be in another relay with its own kind:27 changes.

@staab
Member

staab commented Feb 13, 2024

Right, exactly. And because of the nature of nostr you don't know. But as long as you don't publish a new set you won't break anything. You'll be working off an inaccurate list, but you can still accurately publish updates to it. An optional prev tag could be used to indicate a missing set op, but I don't really know when you would be confident enough to publish it and not also be willing to just create a new snapshot.

@pablof7z
Member

I described a very similar idea for collaboration to jeffg today, where a bunch of pubkeys are p-tagged (i.e. authorized) to modify a document, and you only need to REQ since the latest replaceable event's created_at to compute the state; the original document's current state serves as a checkpoint/canonical version (depending on the situation).

I think that's a cool idea, although I'm still concerned about how much it complicates the protocol if we use it for everything; isn't just the ability to easily roll back to a previous kind:{0,3,etc} enough?

@vitorpamplona
Collaborator

Delta events in basic kinds (0,3) certainly complicate things because we have to:

  1. map out the complete event set to rebuild the state.
  2. deal with multiple branches, with missing intermediary events, in separate relays/clients.

I don't think we should expect consistency where delta events are applied. If the lack of consistency is a dealbreaker, then we shouldn't use it.

Unbound lists are less consistent than the current replaceable event structure for lists but more consistent than the delta events (missing elements vs fully branching structures).

@staab
Member

staab commented Feb 13, 2024

If you want to reconcile conflicting list updates you have to expand them into operations and rebuild anyhow. And in either case you've got to handle missing events. The two are really no different in terms of what you can infer about eventual consistency. If we're worried about complexity, it seems to me unbounded lists won't help with that.

Maybe some back of the envelope math would help. Publishing all versions as a full list would result in a total of follows*follows/2 tags being published. Publishing a snapshot every 50 versions with add ops in between would result in follows*follows/100+(follows-follows/100) (or something like that)? So assuming a follow list that grew from 0 to 1000 pubkeys, in the first case you'd need to store 500k tags, but with ops you would need to store ~11k tags.
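Spelling that arithmetic out (assuming the list grows by one pubkey per version up to $n = 1000$):

$$\text{full lists: } \sum_{k=1}^{n} k = \frac{n(n+1)}{2} = 500{,}500 \approx 500\text{k tags}$$

$$\text{snapshot every 50 versions plus single adds: } \frac{n^2}{100} + \left(n - \frac{n}{100}\right) = 10{,}000 + 990 \approx 11\text{k tags}$$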

@vitorpamplona
Collaborator

> The two are really no different in terms of what you can infer about eventual consistency.

I don't think so: delta events need to deal with missing events (same as unbound), but on top of that they have to deal with changes to a that don't have an op event, resetting the ops. Or if the latest a itself is missing, then it resets to the wrong state. If new ops are made in the wrong state and then the old a comes back (the relay was unavailable for a few hours), things can get really messy. More messy than just missing events in an unbound list.

But yes, full lists are MUCH heavier to keep history.

@staab
Member

staab commented Feb 13, 2024

Sorry, I think I led you astray. a doesn't have anything to do with what we're talking about; it's just the group's address. So there's no linking of events at all. But you're right that if a set is created without the correct preceding set event, you'll mangle the state. The point is, if you're not 100% sure you should send set, don't; just send something that aligns with the user's intention (add/remove) instead. This results in a self-healing list, because it gives the user the ability to authoritatively say whether a key is in the state at any time without involving any of the other entries. Sending the whole list every time means that if you're not in sync, the most recent entry may actually be wrong and override a valid previous entry.

This is really just back to granularity; with kind 0s it's possible to update your website in one place and your lud16 elsewhere, and have the website get reset because the old value got set along with the lud16.

@mleku
Author

mleku commented Feb 15, 2024

gone off on a tangent a bit here but just wanted to comment that deltas are a bad idea for nostr because of the lack of consistency

the thing that is missing from replaceable events is a tag saying replaces event <id>

then it becomes a blockchain btw
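as a sketch (the tag name is made up, nothing specifies it), each new version would just point at the one it supersedes:

```json
{
  "kind": 0,
  "content": "{\"name\": \"mleku\"}",
  "tags": [
    ["replaces", "<id of the previous kind 0 event>"]
  ]
}
```

a relay or client can then walk the chain backwards for history, and a relay can safely prune anything older than whatever depth it wants to keep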

@jb55
Contributor

jb55 commented Feb 15, 2024 via email

@mleku
Author

mleku commented Feb 19, 2024

i'm just suggesting that the best solution for replaceable events is to have them refer to what they are replacing; then they can be pruned

deltas are out of the question for any data structure that does not have blockchain-style backreferences, and there are racing deletes going on... the first thing is to fix the races, because they can cause the nukening

it is trivial to trace a blockchain's history and this would be a very small and easy addition to implement

i may attempt to make a spec for doing it, after i finish my relay task and have a GUI toolkit to use to build a client, so i can test such a scheme

@arthurfranca
Contributor

Follow lists get wiped not because of replaceable events but because the list event wasn't found on any of what the client considers the user's write relays. No doubt it happens because some clients use kind:3 while others use kind:10002.

It would still happen for parameterized not-replaceable events.

For new use cases where it fits, use the 1000 <= n < 10000 kind range of regular events but also add a d tag to the event so it can be referenced with a tags. Now you get parameterized not-replaceable events.
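Sketching it (the kind number and d value are made up for illustration): a regular event in that range carrying a d tag can still be pointed at with an a tag like "1234:<pubkey>:bookmarks", but the relay never replaces or deletes older versions of it:

```json
{
  "kind": 1234,
  "content": "",
  "tags": [
    ["d", "bookmarks"],
    ["p", "<pubkey 1>"]
  ]
}
```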

@dskvr
Contributor

dskvr commented Mar 1, 2024

> > You can also just request with a since that is lower than the current state, a bit more work for the client but requires absolutely no changes, the only difference is that relays that want to do this would just not delete replaced events.
> > These relays could announce this feature with NIP-66 to make the feature discoverable within the protocol
>
> where is the NIP-66 draft?

I still opt for amending NIP-11 as the source of truth for relay specifications, which would be easy to transpose to NIP-66.

If you believe otherwise please comment on NIP-66 #230

@mleku
Author

mleku commented Mar 1, 2024

> > > You can also just request with a since that is lower than the current state, a bit more work for the client but requires absolutely no changes, the only difference is that relays that want to do this would just not delete replaced events.
> > > These relays could announce this feature with NIP-66 to make the feature discoverable within the protocol
> >
> > where is the NIP-66 draft?
>
> I still opt for amending NIP-11 as the source of truth for relay specifications, which would be easy to transpose to NIP-66.
>
> If you believe otherwise please comment on NIP-66 #230

where does this "source of truth" meme come from exactly? have you actually done any study into the matter of truth in distributed systems?

@dskvr
Contributor

dskvr commented Mar 2, 2024

Perhaps that was the wrong way to describe it; IMO this isn't really the best place for needless animosity, though. I was just signaling that I believe NIP-11 to be adequate for sharing this kind of information.

Reasoning

  1. It's an existing NIP with wide support
  2. It is simpler to implement for relay developers and operators
  3. It doesn't require a relay to be "hot" to sign its own events
  4. It is accessible outside of the protocol itself

@mleku
Author

mleku commented Mar 2, 2024

> Perhaps that was the wrong way to describe it; IMO this isn't really the best place for needless animosity, though. I was just signaling that I believe NIP-11 to be adequate for sharing this kind of information.
>
> Reasoning
>
>   1. It's an existing NIP with wide support
>   2. It is simpler to implement for relay developers and operators
>   3. It doesn't require a relay to be "hot" to sign its own events
>   4. It is accessible outside of the protocol itself

i like nip-11 also... it needs some more fields... and clients need to rely on this information a bit more. like, if the limitations say "auth_required": true and the client can't do auth, DON'T USE THE RELAY. you wouldn't believe how many times i see clients trying to make requests over and over and over, ignoring AUTH responses and CLOSED envelopes with the auth-required: machine-readable prefix

i don't like this new trendy expression because it is deceptive; in a distributed system servers can self-report all kinds of things that are meaningless until they are proven through correct responses... like, my relay has a series of NIP activations listed in it, but tbh i dunno if it actually correctly implements them all!

@vitorpamplona
Collaborator

vitorpamplona commented Mar 2, 2024

As a client, it's impossible to rely on NIP-11. The majority of relays never set or update their NIP-11 and when they do it's usually wrong/inconsistent with the behavior of the relay. I don't see that improving with NIP-66 unless we figure out a way to punish operators that don't pay attention to it.

However, NIP-66 has an opportunity to list relays without using DNS. Every relay should broadcast its IP addresses (and not DNS domains) to as many other relays as possible in a small event kind. Connections should happen to the IP. Clients register which PUBKEYS to connect to. The pubkey is mapped to a DNS kind and the client connects using the IP directly. So, something like NIP-66 is a better solution IMO.

But I think that is a discussion for another thread.

@mleku
Author

mleku commented Mar 4, 2024

right there you see why the idea of "source of truth" is so offensive @dskvr

none of this stuff is gonna get fixed unless some people get red faces being called out for their slackness

@vitorpamplona as for distributed relay p2p node lists, this would be pretty easy to create with a libp2p DHT; i don't know of any other cross-language system for creating a uniform DHT like this. on top of that, there is no protocol yet for how relays might propagate or poll each other for events their clients ask for but they don't have

IMO that's something like what nostrocket is aiming towards, but i am not sure it's going the right way about it either, and it's why i'm happy to be working on a relay that is designed to plug into multiple data sources... i think that this kind of extensible, modular interfacing is probably the way to do it: keep the relay out of any consensus but make it easy to connect relays to it

back to the topic though: i don't think the replaceable events specification is concurrency-safe. the number of times i personally have seen it utterly fail to work, not update, or totally lose lists due to client shenanigans... there need to be some firm specifications in NIP-01 that loudly nag implementers to consider this issue and account for it properly

perhaps someone needs to start up a certification service that evaluates the state of each relay and client implementation and publishes regular reports on nostr of the state of things

@vitorpamplona
Collaborator

vitorpamplona commented Mar 4, 2024

Send a PR for this @mleku. Is the new tag supposed to work for ephemeral events as well? We know operators who believe ephemeral events should not be deleted as well. So, maybe this is a way to fix it for them too.

However, if we want to keep Nostr truly simple, I favor a breaking change where relays treat replaceables and ephemerals as regular events (no more special behaviors for kind ranges) and clients are required to send a limit:1 or a since:now when asking for each type respectively. The expiration tag creates ephemerals. By default, the relay returns all versions it has, up to its limit.
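Under that model, the special behavior moves entirely into the client's filters; something like this (values illustrative) for the latest replaceable:

```json
["REQ", "latest-contacts", {"kinds": [3], "authors": ["<pubkey>"], "limit": 1}]
```

and, for ephemerals, a since pinned to the current timestamp so nothing stored is replayed:

```json
["REQ", "live-only", {"kinds": [24133], "#p": ["<pubkey>"], "since": 1709300000}]
```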

@alexgleason
Member

A big problem that versioned events solve in SQL is that soft deletes have much better performance than hard deletes.

How to soft-delete:

  • Add a deleted_at (integer) column to the events table.
  • Use UPDATE events SET deleted_at = {now} ... instead of DELETE ...
  • Query with SELECT * FROM events WHERE deleted_at IS NULL ... so soft-deleted rows are excluded

You can then schedule hard deletions after a certain point instead of every time a new replaceable event comes through.

Anecdotally, this is much faster. I don't have benchmarks to prove it, but it made my slow DELETE queries disappear. So in conclusion, event "versioning" functionality is desirable from both a user perspective and a technical perspective.
