Bech32 embeds #1078

Open
wants to merge 4 commits into
base: master

vitorpamplona
Collaborator

@vitorpamplona vitorpamplona commented Feb 23, 2024

Creates a nembed bech32 encoding to carry JSON-stringified events as Nostr URIs and facilitate the loading of those events by Clients without having to ping a relay.

This is particularly helpful for exporting existing/pre-signed health information payloads from corporate/private relays and sending them to users who might not have direct access to those relays. Health providers are used to including these payloads inside private communications with patients and adding information around them for better context.

A use case would be the following encrypted DM:

Hi Vitor, 

These are your new prescriptions for eyeglasses. This one is for distance. 
It's a small power but it can help when you are driving. 

nostr:nembed1...

The next one is for reading. This was adjusted to the distance of books. 
If you find yourself struggling to use a desktop computer, we can adjust 
those numbers for that need as well. 

nostr:nembed1...

Please double-check the sign of each number when ordering your glasses online. 

Here's your complete medical record with all the raw data from your visit today: 

nostr:nembed1...

Thanks for dropping by. Your business is very much appreciated. 

PS: I am not attached to the embed name. Suggestions are welcome.

@staab
Member

staab commented Feb 23, 2024

I like this. It could solve the ninvite problem in #1062, and it seems like it could be applied very broadly to limit the spread of events that shouldn't stand on their own or need to be private. The one question I have is how long are these embeds? If it's 50 lines of text, that breaks the human readable (or at least parseable) requirement of kind 1s. Also, encoding json in a tlv is silly, we should at least use the tlv to reduce the payload size.

@vitorpamplona
Collaborator Author

So an eyeglasses payload from an electronic medical record looks like this:

{
  "resourceType": "VisionPrescription",
  "status": "active",
  "created": "2014-06-15",
  "patient": {
    "reference": "Patient/example"
  },
  "dateWritten": "2014-06-15",
  "prescriber": {
    "reference": "Practitioner/example"
  },
  "lensSpecification": [
    {
      "eye": "right",
      "sphere": -2,
      "prism": [
        {
          "amount": 0.5,
          "base": "down"
        }
      ],
      "add": 2
    },
    {
      "eye": "left",
      "sphere": -1,
      "cylinder": -0.5,
      "axis": 180,
      "prism": [
        {
          "amount": 0.5,
          "base": "up"
        }
      ],
      "add": 2
    }
  ]
}

It is signed as a kind:82 (Medical Data) like this:

{
  "id": "73a5ace1c95de239ea355a745abfafee39d4dc613081a5c3dc25158c9385620d",
  "pubkey": "bcd4715cc34f98dce7b52fddaf1d826e5ce0263479b7e110a5bd3c3789486ca8",
  "created_at": 1708705561,
  "kind": 82,
  "tags": [],
  "content": "{\"resourceType\":\"VisionPrescription\",\"status\":\"active\",\"created\":\"2014-06-15\",\"patient\":{\"reference\":\"Patient/example\"},\"dateWritten\":\"2014-06-15\",\"prescriber\":{\"reference\":\"Practitioner/example\"},\"lensSpecification\":[{\"eye\":\"right\",\"sphere\":-2,\"prism\":[{\"amount\":0.5,\"base\":\"down\"}],\"add\":2},{\"eye\":\"left\",\"sphere\":-1,\"cylinder\":-0.5,\"axis\":180,\"prism\":[{\"amount\":0.5,\"base\":\"up\"}],\"add\":2}]}",
  "sig": "e87f7c360a1b01fc436b081037eb11b5e0c23714907427c5e124611a3a004d3c13841d965842766dd85a0a3eaa6c14e053764f67848e892f3a4776ce5046b63f"
}

which creates the following bech32:

nembed10v3xjepz8g3rwvmpx4skxef3vvun2er9xgenjetpxv6n2cfhxs6kzcnxv9nx2efn89jrgerrx
ccnxvpcx9sn2cenv33nydf3x5uxxwfn8q6nvv3svs3zcgnsw43xketeygazycnrvs6rwvf4vd3nxdrx8
yuxgcm9xa3r2vnxv3jxze33vsurydn9x43k2vpjxcengdeevgmk2vf3xpsn2cnyxd3nxdec8y6rsdnrv
yuzytpzvdex2ct5v4j97ct5ygarzdes8qmnqdf4xccjcgntd9hxgg368qezcgn5v9nhxg36tdwjcgnrd
ah8getwws3r5gnmts38yetnda6hycm923uhqe2uyga9cgjkd9ekjmmw2pex2umrwf5hqarfdah9cg3vt
s38xarpw36hxhpz8fwzyctrw35hve2uygk9cgnrwfjkzar9v3wzywjuygerqvf595crvtf3x4wzytzuy
fcxzarfv4h8ghpz8fa4cgnjv4nx2un9de3k2hpz8fwzy5rpw35k2mn59ajhsctdwpkx2hpz05k9cgnyv
96x24mjd968getwts3r5hpzxgcrzdpdxqmz6vf4ts3zchpzwpex2umrwf5kyetjts3r576uyfex2en9w
fjkucm9ts3r5hpz2pexzcm5d96xjmmwv4ez7etcv9khqmr9ts386tzuyfkx2mnn2dcx2cmfve5kxct5d
9hkuhpz8fdhkhpzv4uk2hpz8fwzyunfva58ghpz93wzyumsdpjhye2uygaz6v3vts38qunfwdk4cg36t
da4cgnpd4hh2mn5ts3r5vpwx5k9cgnzv9ek2hpz8fwzyer0wah9cgnat5k9cgnpv3j9cg36xf7jc76uy
fjhje2uyga9cgnvv4n8ghpz93wzyumsdpjhye2uygaz6vfvts3xx7tvd9hxgetjts3r5tfs9c6jchpzv
9uxju6uygarzwps93wzyurjd9ek6hpz8fdhkhpzv9kk7atww3wzyw3s9c6jchpzvfshxe2uyga9cgn4w
pwzyl2a93wzyctyv3wzyw3j04wh6g3vyfekjeez8g3x2wphvcmkxvekxpsnzc3sx9nxxdpnxe3rqwp3x
qenwetzxyckydt9xp3nyvehxy6rjvphxserwce4v5cnydpkxyckzvmpxqcrgepnvvcnxwp5x9jrjd348
q6rydekxejxgwp4vycxzvm9v9snvce3x3jnqdfnxumrge3kxuurgwr98qunye3nvy6rwdekvdjn2vp5x
e3rvvmxyf7slvlkjd
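
For readers who want to reproduce a string like the one above, here is a minimal Python sketch. It assumes the payload is simply the UTF-8 bytes of the compact JSON event, regrouped into 5-bit values and bech32-encoded under the `nembed` prefix, with no TLV framing; the PR text remains the authority on the exact format. The names `encode_nembed` and `decode_nembed` are made up for illustration, and the 90-character length limit from the original bech32 spec is deliberately not enforced, since these strings are far longer than that.

```python
import json

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def polymod(values):
    """Bech32 checksum polynomial over a list of 5-bit values."""
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ v
        for i, g in enumerate(GENERATOR):
            if (top >> i) & 1:
                chk ^= g
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def convertbits(data, frombits, tobits, pad):
    """Regroup `frombits`-wide integers into `tobits`-wide integers."""
    acc, bits, out = 0, 0, []
    maxv = (1 << tobits) - 1
    for value in data:
        acc = (acc << frombits) | value
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            out.append((acc >> bits) & maxv)
        acc &= (1 << bits) - 1  # keep only the bits not yet emitted
    if pad and bits:
        out.append((acc << (tobits - bits)) & maxv)
    elif not pad and (bits >= frombits or acc):
        raise ValueError("invalid padding")
    return out


def bech32_encode(hrp, payload):
    data = convertbits(payload, 8, 5, pad=True)
    check = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    data += [(check >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data)


def bech32_decode(text):
    hrp, _, rest = text.lower().rpartition("1")
    data = [CHARSET.index(c) for c in rest]  # ValueError on a bad character
    if not hrp or len(data) < 6 or polymod(hrp_expand(hrp) + data) != 1:
        raise ValueError("invalid bech32 string")
    return hrp, bytes(convertbits(data[:-6], 5, 8, pad=False))


def encode_nembed(event):
    """Hypothetical helper: compact JSON -> UTF-8 -> bech32 with hrp 'nembed'."""
    raw = json.dumps(event, separators=(",", ":"), ensure_ascii=False)
    return bech32_encode("nembed", raw.encode("utf-8"))


def decode_nembed(text):
    """Hypothetical helper: inverse of encode_nembed."""
    hrp, payload = bech32_decode(text)
    if hrp != "nembed":
        raise ValueError("not an nembed string")
    return json.loads(payload.decode("utf-8"))


# Placeholder event (not a validly signed one), only to exercise the round trip.
event = {"id": "00" * 32, "kind": 82, "tags": [], "content": "{\"status\":\"active\"}"}
uri = "nostr:" + encode_nembed(event)
assert decode_nembed(uri[len("nostr:"):]) == event
```

A real client would still have to verify the id and signature of the decoded event before trusting it, since nothing in the bech32 layer does that.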

Also, encoding json in a tlv is silly, we should at least use the tlv to reduce the payload size.

Agree. But I didn't want to make a whole new serialization scheme for events.

@vitorpamplona
Collaborator Author

vitorpamplona commented Feb 23, 2024

Funny thing... if we start using compression, we could gzip the JSON-stringified event and the final re-encoded bech32 becomes smaller than the original event.

The full JSON of the signed event is 802 chars
The bech32 encoding of the raw string is 1287 chars

The gzip-compressed signed event JSON string is 474 bytes
The bech32 encoding of the compressed bytes yields 770 chars

nembed1r79ssqqqqqqqqqqq34fvkmkmxqg0c9u7aktvhp6gly4s9frgp7c4ztrh2u8pz0cyf9hxxp877
7h228x6yzr3upq2kwen8wlyhjvu5enrd3tjgvrc9mkgqztvwj0rxzukjjwu0sj2eqs53qvgxefhzyqcn
0zvpjyg62vet6f0u4z6a2j43ppcyj8yg0tkvfjfykaguxfmurkp4zfy6c6naf9kysqmqxqt00ezndepx
7z9ndtgzzwggllzvsmfqmzwxe4w0j4u6jf5ax7aefxts87due560h8ndxffxykdm0tqcv579upundmt9
adm8kv6aa66ee7ns03rf4r76tl66ktde5uwzaqevurdy4rlexxtak7wkze3p44kt9pecjeafe24mdp5h
d6r9jyfz69c0f052400rkqlzjajn864arr4qfcaa4fmjedqytp60zkr0k50hvpjls4hhyxnlrvqh4dwz
ghdkmy7k306ujl82rmul5ajv8l6ezs24mw7c478uva30rl8ew7gnlgjk428qhr6l879hetw80zepmuhu
ph2h0ayplf864yxlww46rl69re3a669zp07hltxy8dl7d977s0kcaehehq93al656vgkmgsyraxstgkn
w5d20qypm0xd48vs3gq3q68cus4xu4pfxqyp7n4vsp5qscvmpa8d4399lzs43c55zqafe9kpkk5vv46y
emjmduq02gyv4wscm38e4pyxg2h87crymcwxvpzqvqqq9cs8mr
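
The size relationship in this comment can be estimated without a full bech32 implementation: each payload byte costs 8/5 of a character, plus the prefix, the `1` separator and a 6-character checksum. The sketch below assumes that plain layout (no TLV) and uses a placeholder event built with `hashlib`, not the real signed event, so its numbers will not match the ones quoted above. For reference, the formula gives 1297 characters for an 802-byte payload and 772 for 474 bytes, in the same range as the quoted figures; small differences may come from how the bytes were counted.

```python
import gzip
import hashlib
import json


def bech32_length(hrp: str, payload_bytes: int) -> int:
    """Length of a bech32 string: hrp + '1' + ceil(8n/5) data chars + 6 checksum chars."""
    data_chars = -(-payload_bytes * 8 // 5)  # ceiling division
    return len(hrp) + 1 + data_chars + 6


def placeholder_hex(seed: str, n_bytes: int) -> str:
    """Deterministic filler hex; stands in for a real id, pubkey or signature."""
    return hashlib.sha512(seed.encode()).hexdigest()[: n_bytes * 2]


# Abbreviated prescription, loosely following the payload shown earlier in the thread.
prescription = {
    "resourceType": "VisionPrescription",
    "status": "active",
    "lensSpecification": [
        {"eye": "right", "sphere": -2, "add": 2},
        {"eye": "left", "sphere": -1, "cylinder": -0.5, "axis": 180, "add": 2},
    ],
}

event = {
    "id": placeholder_hex("id", 32),
    "pubkey": placeholder_hex("pubkey", 32),
    "created_at": 1708705561,
    "kind": 82,
    "tags": [],
    "content": json.dumps(prescription, separators=(",", ":")),
    "sig": placeholder_hex("sig", 64),
}

raw = json.dumps(event, separators=(",", ":")).encode("utf-8")
packed = gzip.compress(raw)

print("raw JSON bytes:      ", len(raw))
print("bech32 of raw:       ", bech32_length("nembed", len(raw)))
print("gzip bytes:          ", len(packed))
print("bech32 of gzip bytes:", bech32_length("nembed", len(packed)))
```

Note that the hex fields (id, pubkey, sig) carry only 4 bits of information per character, so even a general-purpose compressor recovers a good part of that overhead.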

As a comparison, a regular nevent with relays in it is this big:

nevent1qqst8w0xhjf9ehy98tmgcmk7zaj6p4lxjwjf8z08s9trzmjg4yzdywcpzemhxue69uhhv6t5d
aezumn0wd68yvfwvdhk6qgawaehxw309ahx7um5wgkhqatz9emk2mrvdaexgetj9ehx2aqpz3mhxue69
uhhyetvv9ujumn0wd68ytnzvupzp78lz8r60568sd2a8dx3wnj6gume02gxaf92vx4fk67qv5kpagt65
gxsjx

@AsaiToshiya
Collaborator

PS: I am not attached to the embed name. Suggestions are welcome.

These are just ideas: nraw, njson, nstuff

@vitorpamplona
Collaborator Author

I just want to call attention to this embed spec since we are starting to seriously use it to transfer health data inside NIP-17 DMs.

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

Why not just shove the JSON in instead of encoding it to bech32? We can envelope it in some wrapper to make the purpose clear, like

this is something:

rawevent:{"kind":1283,"id":"f7be8b2f792a1a45a7d009ac45087e41b848aca48f88b9fbcc349d71b9d9ef9d","pubkey":"79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798","created_at":1727186773,"tags":[],"content":"blergh","sig":"28155f17e3597c313af9fcae5077bb2544ec8c87ad9aea6c60da2df9363097f5f0d9cece2ad8e5c92077002e9fae422218debfb0a48672d67eada33af5cd6757"}

bye.

@vitorpamplona
Collaborator Author

vitorpamplona commented Sep 24, 2024

It's just harder to parse given that the json can have multiple encodings, spaces and so on. Parsers would need to find/count the { and } brackets to do a good job.
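
To make the two parsing approaches concrete, here is a rough Python sketch of both. It is only an illustration: the `rawevent:` marker comes from the example above, the function names are invented, and the regex simply assumes that an nembed runs for as long as the characters stay inside the bech32 alphabet. For the inline JSON case it leans on `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index and reports where it ended, so the bracket counting is delegated to the JSON parser.

```python
import json
import re

# A bech32 token ends at the first character outside the bech32 alphabet.
NEMBED_RE = re.compile(r"nostr:(nembed1[qpzry9x8gf2tvdw0s3jn54khce6mua7l]+)")


def find_nembeds(text):
    """Return every nembed token found in a note's text."""
    return NEMBED_RE.findall(text)


def find_raw_events(text, marker="rawevent:"):
    """Return every JSON value that directly follows the (hypothetical) marker."""
    decoder = json.JSONDecoder()
    events, pos = [], 0
    while (start := text.find(marker, pos)) != -1:
        begin = start + len(marker)
        try:
            # raw_decode stops at the end of the first complete JSON value,
            # so braces inside strings do not confuse it.
            obj, end = decoder.raw_decode(text, begin)
        except json.JSONDecodeError:
            pos = begin  # not valid JSON here; keep scanning
            continue
        events.append(obj)
        pos = end
    return events


note = (
    "this is something:\n\n"
    'rawevent:{"kind":1283,"tags":[],"content":"a } inside a string"}\n\n'
    "bye. nostr:nembed1qpzry and more text"
)
print(find_raw_events(note))
print(find_nembeds(note))
```

Either way the client still needs a decoding step afterwards; the difference is mostly in how the token boundaries are found and how well the text survives copy and paste.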

Plus, the nembed is GZIP compressed, which actually saves A LOT of bytes, given how verbose JSONs are.

@vitorpamplona
Collaborator Author

vitorpamplona commented Sep 24, 2024

Plus 2: It allows us to build a nice URI for the event, which can be used directly inside QRs/NFC tags for instance.

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

Plus, the nembed is GZIP compressed

But then we make gzip a hard requirement? I don't like it. Also zstd is better.

It's just harder to parse given that the json can have multiple encodings, spaces and so on.

What if we ensured the full event was all in one line and nothing else?

@vitorpamplona
Collaborator Author

But then we make gzip a hard requirement?

We can separate it into another NIP. nembeds should be optional, for the folks that need them. It doesn't need to be mandatory.

zstd is better.

We can use zstd. I just went for GZIP to make sure all languages have easy access to a stable lib. And since most relays do offer gzip compression for communications, it seemed to be a natural fit.

What if we ensured the full event was all in one line and nothing else?

It doesn't make much sense. Our parsers already have to deal with large bech32 uris for ln, cashu, nevents, naddr... This to me would just be another line in that parser.

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

But then we make gzip a hard requirement?

We can separate it into another NIP. nembeds should be optional, for the folks that need them. It doesn't need to be mandatory.

I meant a hard requirement for implementing this NIP. It feels useful to have events mentioned in the context of texts, but first encoding them into a binary format, then compressing, then converting the resulting bytes into ascii feels like a very unnecessary sequence of steps.

Our parsers already have to deal with large bech32 uris for ln, cashu, nevents, naddr... This to me would just be another line in that parser.

LN and Cashu are not supported everywhere, and they don't really have to be. If you have an extra feature that turns LN invoices into a clickable widget in your client that is a nice extra, but LN invoices can be understood without that, they can be copy-pasted into an LN wallet and LN users know how to do that. Same for Cashu. The nevents and naddrs are more well-supported, but even when they're not they can still be copy-pasted into any other Nostr client.

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

Having bech32 entities inside events was an error already, but it was bound to happen as people copy-pasted stuff so it was good that we embraced it.

But this is unnecessary. It's better to follow the standard of kind 6 reposts.

@vitorpamplona
Collaborator Author

vitorpamplona commented Sep 24, 2024

The nevents and naddrs are more well-supported, but even when they're not they can still be copy-pasted into any other Nostr client.

That's the goal here too. If clients don't support embedding, people should be able to copy and paste the URI into the client that does support it. Copy pasting a json will be a mess (because of the spaces and so on...)

Having bech32 entities inside events was an error already

Why? I actually think we got it right with bech32 URIs. The URI format is very helpful for both parsers and people.

The only problem is that it should have been base64 instead. :)

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

How about we extend nevent with one field for timestamp, one for content, one for signature and any number of tag fields then?

@vitorpamplona
Collaborator Author

How about we extend nevent with one field for timestamp, one for content, one for signature and any number of tag fields then?

Without compression, it would be 2-3 times bigger. Otherwise, it works as well. Doing Array<Array<String>> inside the TLV could be tricky, though.

I am not sure what reusing the nevent is adding. I think people prefer to not overload things.

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

Why 3 times bigger? It would be smaller than the event JSON.

I am not sure what reusing the nevent is adding.

It provides an immediate fallback for clients that don't support it, they will try to load the event from a relay.

I think people prefer to not overload things.

We are already overloading bech32 when we could just reuse the JSON format...

Doing Array<Array<String>> inside the TLV could be tricky

Just Array<String>, the external array is handled already by having multiple tag TLV entries. Would be simple to define a binary format, like <number-of-items><length-of-first-item><first-item><length-of-second-item><second-item>.
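
A sketch of what that per-tag binary layout could look like in Python. The field widths are assumptions, since the comment above does not specify them: one byte for the number of items and a two-byte big-endian length before each UTF-8 item. `encode_tag` and `decode_tag` are illustrative names, and each encoded tag would become the value of one TLV entry.

```python
import struct


def encode_tag(items):
    """Serialize one tag (a list of strings) as <count><len><item><len><item>..."""
    if len(items) > 255:
        raise ValueError("too many items for a 1-byte count (assumed width)")
    out = bytearray([len(items)])
    for item in items:
        raw = item.encode("utf-8")
        # ">H" is an unsigned 16-bit big-endian length (assumed width);
        # struct.pack raises struct.error if the item is longer than 65535 bytes.
        out += struct.pack(">H", len(raw)) + raw
    return bytes(out)


def decode_tag(buf):
    """Inverse of encode_tag; rejects truncated or over-long input."""
    if not buf:
        raise ValueError("empty buffer")
    count, pos, items = buf[0], 1, []
    for _ in range(count):
        if pos + 2 > len(buf):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        if pos + length > len(buf):
            raise ValueError("truncated item")
        items.append(buf[pos:pos + length].decode("utf-8"))
        pos += length
    if pos != len(buf):
        raise ValueError("trailing bytes after last item")
    return items


tag = ["e", "73a5ace1c95de239ea355a745abfafee39d4dc613081a5c3dc25158c9385620d"]
assert decode_tag(encode_tag(tag)) == tag
```

One thing this makes visible: a 64-character hex id stored as a string still takes 64 bytes here, so the layout only saves the JSON punctuation unless known hex fields are also packed as raw bytes.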

@vitorpamplona
Collaborator Author

Why 3 times bigger? It would be smaller than the event JSON.

Tags and content don't change. You are still encoding strings in the TLV. And the strings are the biggest component of the event (especially for health payloads)

It provides an immediate fallback for clients that don't support it, they will try to load the event from a relay.

There won't be a relay to load it from. The event is embedded inside a GiftWrap, it's not available on a relay. The goal is exactly to avoid pinging a relay.

Would be simple to define a binary format, like <number-of-items><length-of-first-item><first-item><length-of-second-item><second-item>.

Yep, but why do all this work if the JSON already does it for you?

@fiatjaf
Member

fiatjaf commented Sep 24, 2024

Tags and content don't change. You are still encoding strings in the TLV. And the strings are the biggest component of the event

You're set on the need for compressing the event. I don't think that should be a requirement. Yes, compressing reduces the size of the event, doesn't mean we should do it all the time.

It makes no sense to compress just these embedded events and just this time, sounds very ad-hoc.

Also events are already compressed on the WebSocket layer if that is supported.

There won't be a relay to load it from.

In the specific use case you have in mind. But should we standardize this feature just for that?

The event is embedded inside a GiftWrap

Unrelated question, but why not put it in a relay?

Yep, but why do all this work if the JSON already does it for you?

Exactly, let's just embed the JSON. I'm glad you finally agree.

@mikedilger
Contributor

It seems to me that what Vitor is defining does not require standardization in a NIP, because these are his health records inside of GiftWraps, never on a relay. And there is no nostr client that cares about these prescriptions except for his.

@vitorpamplona
Collaborator Author

vitorpamplona commented Sep 24, 2024

You're set on the need for compressing the event.

I started this NIP without compression. People in this thread convinced me to reduce the size of the URI to make the event easier to view and to copy/paste.

Since I lived through both cases, I now do think there are significant savings in compressing the data before creating a URI.

It is just better. We might not want to do it, but it is undeniably better IMO.

All NIP-19 uris should've been compressed, IMO. But it's too late for that.

In the specific use case you have in mind. But should we standardize this feature just for that?

We can do embedding with everything. I don't think this is a niche feature. Quote posts, for instance, could embed the full event instead of just the IDs to make sure the inner events don't disappear when they inevitably get deleted from relays.

Unrelated question, but why not put it in a relay?

Certain events must stay so private that relays don't even know about them. That's why we created gift wraps.

@vitorpamplona
Collaborator Author

And there is no nostr client that cares about these prescriptions except for his.

Agree for now, but I am sure people will start to care when clients can't display DMs with prescriptions in them. So, I am making an effort to discuss this before that happens.

5 participants