[RFC] Magic URL Payload Format for Twitter (v1.0.0.0) #198

neruthes · 2019-09-20T09:26:09Z

Background

Twitter has a strict length restriction for tweets which we must find ways to bypass in order to publish armored payloads within one tweet.

Basic Idea

URLs on Twitter will be converted to t.co short links and only the length of the t.co short link is considered actually occupying characters in the tweet. We add a prefix before the actual payload to make it look like a real URL for Twitter, in order to take advantage of this feature.

Prefix

We maintain a list of popular websites (like Alexa top 100) and use a random prefix to avoid basic pattern detection.

Encoding

Because IETF and W3C specifications restrict which characters may appear in a URI, we use a slightly different payload encoding method.

For the part names used below, refer to the specification of the location object in the DOM API for JavaScript.

Prefix

The prefix part always looks like https://www.amazon.com/item/233.html.

It has a protocol, host, and path.

It does not have a user, password, search, or hash.

It may or may not have a port.

Prefix Randomization (PR)

Prefix Randomization is not mandatory but recommended. For now, it is ok to maintain a simple list of prefixes.

The prefix has a static part and a dynamic part.

Static part includes host.

Dynamic part includes:

Part Name	Value Range	Examples
`protocol`	...	`http`, `https`
`host`	domain names, ~~IPv4 addresses~~	`twitter.com`
`port`	implied, 8000, 8080, 9527-12315
`path`	`\/[\w-_\.]{0, 24}/`	`/_233_/`

Update: URIs whose host is an IPv4 address will not be converted to short links on Twitter.
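
The randomization described above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the `HOSTS` list, the helper names, and the even split between port choices are assumptions, and the path alphabet is a subset of the table's character class (it leaves out `.` to avoid segments such as `..`).

```typescript
// Hypothetical host list; the RFC only says "popular websites".
const HOSTS: string[] = ["www.amazon.com", "twitter.com", "google.com"];
// Subset of the path character class from the table (letters, digits, "_" and "-").
const PATH_CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-";

// Returns a random integer in [min, max], both ends inclusive.
function randomInt(min: number, max: number): number {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Builds a prefix of the form protocol://host[:port]/segment/
function randomPrefix(): string {
  const protocol = Math.random() < 0.5 ? "http" : "https";
  const host = HOSTS[randomInt(0, HOSTS.length - 1)];

  // Port: implied (omitted), 8000, 8080, or a value in 9527-12315.
  let port = "";
  const choice = randomInt(0, 3);
  if (choice === 1) port = ":8000";
  else if (choice === 2) port = ":8080";
  else if (choice === 3) port = ":" + randomInt(9527, 12315);

  // Path: a single segment of 0 to 24 characters wrapped in slashes, like /_233_/.
  let segment = "";
  const length = randomInt(0, 24);
  for (let i = 0; i < length; i++) {
    segment += PATH_CHARS.charAt(randomInt(0, PATH_CHARS.length - 1));
  }
  return `${protocol}://${host}${port}/${segment}/`;
}
```

Hosts are restricted to domain names here because, as noted in the update above, IPv4 hosts are not converted to short links.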

Base64 Alternative Characters

Standard Base64 uses +, /, and =. These characters need to be replaced to keep URI detection on Twitter reliable. We follow the URL-safe Base64 alphabet defined in RFC 4648.

From	To
+	-
/	_
=	=

Padding characters may optionally be discarded.
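
As a minimal sketch of the substitution table above, the two helpers below convert between a standard Base64 string and its URL-safe form. The function names and the `keepPadding` flag are illustrative assumptions, not part of the RFC.

```typescript
// Converts standard Base64 text to the URL-safe alphabet: "+" -> "-", "/" -> "_".
// Padding is dropped unless keepPadding is true, since the RFC makes it optional.
function toUrlSafeBase64(standard: string, keepPadding: boolean = false): string {
  const swapped = standard.replace(/\+/g, "-").replace(/\//g, "_");
  return keepPadding ? swapped : swapped.replace(/=+$/, "");
}

// Reverses the substitution and restores any padding that was discarded.
function fromUrlSafeBase64(urlSafe: string): string {
  const swapped = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
  const missing = (4 - (swapped.length % 4)) % 4;
  return swapped + "===".slice(0, missing);
}
```

For example, `toUrlSafeBase64("a+b/cQ==")` gives `a-b_cQ`, and `fromUrlSafeBase64("a-b_cQ")` restores `a+b/cQ==`.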

Header Token

We use %20 to mark the start of the payload sequence.

Footer Token

We use %40 to mark the end of the payload sequence.

Garbage Bytes (GB)

Random garbage bytes may be added after the footer token.

Garbage bytes are not mandatory but recommended.

Separation Token

We use . to mark the separation between two adjacent fields.
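
Putting the tokens together, a magic URL could be assembled and parsed roughly as below. This is a sketch under assumptions: the function names are made up for illustration, fields are assumed to be URL-safe Base64 strings (so they contain neither `.` nor `%`), and the prefix is assumed not to contain `%20`.

```typescript
const HEADER = "%20";   // marks the start of the payload sequence
const FOOTER = "%40";   // marks the end of the payload sequence
const SEPARATOR = ".";  // separates two adjacent fields

// prefix + header + field.field.field + footer + optional garbage bytes
function buildMagicUrl(prefix: string, fields: string[], garbage: string = ""): string {
  return prefix + HEADER + fields.join(SEPARATOR) + FOOTER + garbage;
}

// Returns the fields between the header and footer tokens, or null if either is missing.
// Anything after the footer (garbage bytes) is ignored.
function parseMagicUrl(url: string): string[] | null {
  const start = url.indexOf(HEADER);
  if (start === -1) return null;
  const end = url.indexOf(FOOTER, start + HEADER.length);
  if (end === -1) return null;
  return url.slice(start + HEADER.length, end).split(SEPARATOR);
}
```

With the example prefix from above, `buildMagicUrl("https://www.amazon.com/item/233.html", ["QUJD", "REVG"], "xyz")` yields `https://www.amazon.com/item/233.html%20QUJD.REVG%40xyz`, and `parseMagicUrl` recovers the two fields.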

Discussion wanted.


Tedko · 2019-09-20T09:35:24Z

I think this approach is too hacky. Considering that we will support images/files sooner or later, let's rather use IPFS or some other storage solution directly. It's important to keep our solutions elegant and free of any single point of failure. Imagine if Twitter decides to modify its URL policy: all of our users' posts would be lost permanently. I'm strongly against this approach. It's worse than BaseCJK/Emoji, or storing in IPFS and putting hash pointers in the tweet.


Tedko · 2019-09-20T09:36:22Z

Overall, we're building a protocol and a product. This hacky solution may help the product in the short term but will largely harm the protocol.


Misaka-0x447f · 2019-09-20T10:29:01Z

For the insider preview, it will be implemented like this:
https://google.com/%203/4|${data.ownersAESKeyEncrypted}|${data.iv}|${data.encryptedText}|${data.signature}%40
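
For reference, the template above can be read as a pair of encode/decode helpers like the sketch below. The interface and function names are hypothetical, the `3/4` segment is copied as an opaque literal from the template, and the field values are assumed to contain neither `|` nor `%`.

```typescript
// Field names taken from the template above; the interface itself is hypothetical.
interface PreviewPayload {
  ownersAESKeyEncrypted: string;
  iv: string;
  encryptedText: string;
  signature: string;
}

// Mirrors the template literal from the comment above.
function encodePreview(data: PreviewPayload): string {
  return `https://google.com/%203/4|${data.ownersAESKeyEncrypted}|${data.iv}|${data.encryptedText}|${data.signature}%40`;
}

// Extracts the four fields between "%203/4|" and "%40"; returns null when the shape does not match.
function decodePreview(url: string): PreviewPayload | null {
  const match = /%203\/4\|([^|%]*)\|([^|%]*)\|([^|%]*)\|([^|%]*)%40/.exec(url);
  if (match === null) return null;
  const [, ownersAESKeyEncrypted = "", iv = "", encryptedText = "", signature = ""] = match;
  return { ownersAESKeyEncrypted, iv, encryptedText, signature };
}
```

Note that this preview format separates fields with `|`, whereas the RFC text specifies `.` as the separation token.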

Tedko · 2019-09-30T07:12:52Z

So no other feedback?
I'm wondering whether we have other options besides this, like the IPFS approach I mentioned.

neruthes · 2019-09-30T15:54:28Z

If you want to delay to February.

neruthes · 2019-09-30T16:04:58Z

We may initiate a project to examine the feasibility of these solutions and arrange implementations accordingly. Also, I prefer not to put backward compatibility at risk, unless we revert the banner-removal commit and pretend it has always been an early-access beta version.

neruthes · 2019-09-30T16:15:12Z

These will require a lot of work, and I prefer putting these resources into features with greater priority, including the new dashboard, automatic recipient amending, and Misakanet.

Tedko · 2019-10-17T06:27:56Z

> These will require a lot of work, and I prefer putting these resources into features with greater priority, including the new dashboard, automatic recipient amending, and Misakanet.

@neruthes
Let's do this, to make Mask on Twitter available for users soon:

1. automatic recipient amending
2. 'Shared' recipient (?) for PR and growth hack
3. URL hack
4. base Emoji research
5. then some image/file decentralized storage hack? (such as IPFS I mentioned)

Items 3 to 5 need to be relatively fast since we don't want to get banned, etc.

neruthes · 2019-10-17T06:51:23Z

IPFS can be an option for fallback, with regard to our Principle of Saturation. I have no idea how much time it would take to build IPFS compatibility.

Base-Emoji may have difficulties. Keep watching #139.

And explain 'Shared' recipient (?) for PR and growth hack, please.

Artoria2e5 · 2019-10-17T14:22:55Z

Instead of encoding it as a path, it is also very viable to encode it as a #-fragment. This has the benefit of allowing native base64. For example:

https://prettier.io/playground/#N4Igxg9gdgLgprEAuEB6VACASgMQMIYAsAbIQBwA6UYANgIYDODGAQo3KQKpYAyGwVDBnQZODOBgAWMGAAcGSdAHMAljEkBXAEYA6SAFtUWgE4q6UGhABucBgHcIxgCYMj7UhuM0MKqA3h0TjpUAL4gADQgELIwKtAMyKB0xsYQdgAKyQgJKHRWECpOESAmdGAA1nAwAMqyZb5KyDDGGnCR0vo0AOqSarZ1YHDV2WoqVmoAnsjgTMW+4sYw6cZ0Svp0yABmdDTikQBWDAAeLCsVVdV0+nA8vnBbO3sgh0fVDTRwAIoaEPAPu20QHVjAtpjAJrJbGBTDFirJTLAuoV1MgyAAGSLwiDiLorWTTKBwGzGYoARx+8GW0RyIEYAFpCXAnEzisY4OSVGzlqt1v8nuJ9Co+YCGO8vhT7khtgDIjA6FokU4UUgAEyylYqGgNPAQfS8lBQaD3SIacQAFXlOWl4hCISAA

Another thing we can use for an alternative-base64 is the RFC 4648 base64url. The equal-signs can be discarded.

Tedko · 2019-10-19T11:48:27Z

@Artoria2e5 good idea.

neruthes · 2019-10-20T04:01:35Z

Updated with link to RFC 4648.

I would still recommend not relying on location.hash. It creates an extra dependency, although the extra risk may be small. Since we always strive to build stable software, it is reasonable to avoid this dependency even when it is small.

neruthes · 2019-10-20T04:04:30Z

It appears that this RFC has been open for long enough and a lot of improvements have been merged. I will move this RFC to become a current technical specification on Tuesday. After that, modifications will be difficult to make.

Tedko · 2019-10-20T06:43:38Z

> And explain 'Shared' recipient (?) for PR and growth hack, please.

@neruthes It is functionality similar to 'all Maskbook users can see', as we discussed before in the chat. This is mainly for growth. Imagine some KOL posting something encrypted.

Tedko · 2019-10-20T06:44:55Z

> IPFS can be an option for fallback, with regard to our Principle of Saturation. I have no idea how much time it would take to build IPFS compatibility.

@neruthes I will talk to our friends 👬 who share a similar vision and are using IPFS now. Will try to talk to Textile as well.

neruthes · 2019-10-20T06:52:06Z

> @neruthes It is functionality similar to 'all Maskbook users can see', as we discussed before in the chat. This is mainly for growth. Imagine some KOL posting something encrypted.

For this matter, we may amend UserGroup Abstraction Model (#12). It is not within the scope of RFC 198.

Tedko · 2019-10-20T06:53:40Z

> For this matter, we may amend UserGroup Abstraction Model (#12). It is not within the scope of RFC 198.

ACK

neruthes · 2019-10-20T06:55:46Z

> @neruthes I will talk to our friends 👬 who share a similar vision and are using IPFS now. Will try to talk to Textile as well.

Good to hear that. But I recommend moving the IPFS middleware to the next milestone as we all see the risk of introducing extra delay.

neruthes · 2019-11-16T19:39:57Z

The internal payload structure may be subject to refactoring as we design #329

neruthes · 2019-11-19T04:47:14Z

Ratified.

close #428

neruthes added the Help Wanted, Discussion: General, Component: Multiple Network, and Network: twitter.com labels Sep 20, 2019

neruthes assigned neruthes, yisiliu, Jack-Works, Artoria2e5, SunriseFox and Misaka-0x447f Sep 20, 2019

neruthes changed the title ~~[RFC] A Method to Bypass Tweet Length Restriction on Twitter (v1.0.0.0) [Draft]~~ [RFC] A Method to Mitigate Tweet Length Restriction on Twitter (v1.0.0.0) [Draft] Sep 20, 2019

Misaka-0x447f mentioned this issue Oct 29, 2019

twitter support, finalizing #286

Merged

5 tasks

Misaka-0x447f mentioned this issue Oct 31, 2019

social network debugging, stage LTE #350

Merged

12 tasks

neruthes added the RFC: Ratified label Nov 7, 2019

neruthes mentioned this issue Nov 11, 2019

[RFC] Maskbook Twitter payload specification #373

Closed

neruthes added the RFC: Draft label Nov 11, 2019

Misaka-0x447f removed their assignment Nov 12, 2019

neruthes mentioned this issue Nov 15, 2019

[RFC] Maskbook Post Binary Payload Format v1 #329

Closed

neruthes changed the title ~~[RFC] A Method to Mitigate Tweet Length Restriction on Twitter (v1.0.0.0) [Draft]~~ [RFC] Magic URL Payload Format for Twitter (v1.0.0.0) Nov 19, 2019

neruthes removed the RFC: Draft, Help Wanted, Component: Multiple Network, and Discussion: General labels Nov 19, 2019

neruthes closed this as completed Nov 19, 2019

neruthes mentioned this issue Nov 26, 2019

[Bug] Twitter payload format incorrect #428

Closed

guanbinrui added a commit that referenced this issue Nov 27, 2019

fix: make twitter payload adhere the spec #198

a388792

close #428

guanbinrui added a commit that referenced this issue Nov 27, 2019

fix: make twitter payload adhere the spec #198

01b58a4

close #428

guanbinrui added a commit that referenced this issue Nov 27, 2019

fix: make twitter payload adhere the spec #198

f2109a9

close #428

Jack-Works pushed a commit that referenced this issue Nov 27, 2019

fix: make twitter payload adhere the spec #198

934ff04

close #428

neruthes mentioned this issue Dec 8, 2019

Twitter only allows up to 4088 chars in one URL DimensionDev/Maskbook-Talks#12

Open

neruthes removed the Network: twitter.com label Dec 9, 2019

neruthes mentioned this issue Dec 12, 2019

[Demand] Import/Export OpenPGP Subkey and Use OpenPGP to Exchange Message. DimensionDev/Maskbook-Talks#14

Open

Jack-Works added the Kind: Protocol label Feb 10, 2020

neruthes mentioned this issue May 18, 2020

[Demand] Apply better UUID to the text part of Image-Based Payload feature #1084

Closed

neruthes mentioned this issue May 26, 2020

[Demand] Post Limbo payload "MagicURL" & Post Shell payload "BasicShell" revamp #1140

Closed
