Faster Tunnel crypto by re-implementing IPv8 crypto in "Rust" programming language #4567

synctext · 2019-06-14T10:29:58Z

This is a long-term issue for re-implementing our crypto tunnel core for raw speed, zero-copy protocol stack, and usage of work stealing thread pool. Everything in Rust. This means leaving our ideally-suitable-for-rapid-prototyping Python stack, making it less easy to modify and freezing the wire format and behavior.

This issue partly addresses #1 issue Tribler anonymous downloads are fast and secure.

This is an honor students project at TUDelft, complimentary to ongoing tunnel re-factoring and tweaking like: #4459.

Current code repo: https://github.com/ip-v8/rust-ipv8

jdonszelmann · 2019-06-14T16:07:59Z

@synctext could i be assigned too? thanks!

jdonszelmann · 2019-06-14T17:16:00Z

also, could the main repo link be edited to this? https://github.com/ip-v8/rust-ipv8

jdonszelmann · 2019-06-15T10:17:11Z

the other assignees were correct though :)

devos50 · 2019-06-15T11:07:14Z

also, could the main repo link be edited to this? https://github.com/ip-v8/rust-ipv8

I edited the link in the first post.

devos50 · 2019-06-15T13:06:32Z

Also, not entirely related to this ticket, but could you guys let me know when the message serialization method is working? I have an experiment where I test the market community under a high system load and it seems that one of the bottlenecks is IPv8 message serialization. If there is something faster, I can integrate it into this experiment and 1) check the speed improvement and 2) check your implementation for correctness. 👍

jdonszelmann · 2019-06-15T13:25:41Z

Yes of course. We use the Serde library for it which provides zero copy serialization, extendable to custom objects. In many cases we could basically use the default serialization strategy as python does this too with the struct library. We provide some "atoms" which correspond to python types which we guarantee to be serialized in a certain way. An example of this is the Varlen16 struct which guarantees to serialize to 2 bytes of length + data of max 65536 bytes. There is also a Varlen8, Varlen32 and Varlen64 (which arent strictly necessary as pyipv8 doesnt use them, but adding them was really easy and might prove useful). Similarly we have a Bits struct which serializes to an u8. etc. For these we implemented a custom serde serializer process. As long as you use these types in your structs instead of the builtin rust types you are guaranteed that python is able to interpret them. Serde obviously does all the hard work.

Now about the coupling to your python project. As we haven't even started to make the python FFI you would have to do this yourself. This won't be too easy and it also is what we are starting on next week. So to use it in the market community will be something you have to figure out on your own mostly. (or you wait a few more weeks)

However, verifying our serializer would be highly appreciated so go ahead! We already have a lot of testcases but more never hurts. More detailed explanations of the serializer can also be found in comments around it as we document our code pretty well in my opinion. We also generate a documentation page which can be found here.

jdonszelmann · 2019-06-15T13:29:44Z

Update about the project itself: we now run our tests on windows and mac too which might be useful to know. In the past only linux testing was performed

devos50 · 2019-06-15T13:42:29Z

@jonay2000 thanks for the update! It's not an urgent need but it would allow me to squeeze out more throughput under high load 👍

jdonszelmann · 2019-06-15T13:44:40Z

Yes, that would be great. Honestly, that's why we make rust-ipv8. To remove some bottlenecks in certain communities. We hope to have a working FFI to python around the end of august.

jdonszelmann · 2019-06-15T21:53:54Z

We have done some benchmarks of the system and they achieve some very promising results:

The deserializing and signature verifying of a packet takes 57 us (microseconds) per packet.
This packet was a relatively small packet which consisted of a header, BinMemberAuthenticationPayload, TimeDistributionPayload and an IntroductionRequestPayload (basically an introduction request). Real world packets will often be larger, and as the bottleneck right now is the verification process this will only improve speeds.

Taking this 57us time- taken as an average over about 500,000 runs, running on one thread of one core of a desktop cpu with a clockspeed of 3.5 Ghz - this means we can process around 3.5 megabytes a second per cpu core. (almost twice that with hyperthreading) . This all gives you a theoretical througput of 2.8 terabytes per day on just one core of one cpu.

Now there will be some overhead of the communities themselves - we know that - but even if you halve this speed this will be a great improvement of the system in place right now. However we don't think the impact of communties will be even close to this much as signature verification is very clearly the bottleneck of the system.

Another idea we have, which can be implemented in the future, is batch verification. At this moment the verification of one ED25519 signature takes 273364 instructions on a modern x86 cpu. When we wait with verification until multiple packets have come in one could theoretically half this number. Although we haven't looked into how we could do this, nor have we planned to do this any time soon, this could greatly improve speeds. (note: 64 ed255129 signatures in a batch takes 134000 instructions per signature checked)

We have even noticed that we are (marignally) faster than some quite optimized alternatives to our verification process. We expect this to be due to the link time optimization we do which does drastically increase compile time (double to triple) but yields a speed bonus of around 1.5% for crypto and 50% for serialization/deserialization. (though the ser/de is still way faster than the crypto)

More optimizing will be done and more benchmarks will be taken for sure.

P.S. note that we do true multithreading so all speeds stated will roughly scale with core count.

synctext · 2019-06-16T07:12:20Z

Great progress. Please consider integrating as early as possible with Python and our complete exe build process. That has been identified as the cardinal pain point. For instance, merely owning the udp socket and acting as a "Rust proxy" with the rest in Python. Can be parallel development track.

qstokkink · 2019-06-16T16:53:30Z

When we wait with verification until multiple packets have come in one could theoretically half this number.

I would advise against any buffer in a networking library. You should treat packets like hot potatoes: never hold on to them. In this case, it's better to be a bit less efficient.

Two examples: (1) overzealous use of buffers and batching led to packets having a propagation time of up to 10 seconds in Dispersy and (2) a bit more generally, the problem of bufferbloat.

devos50 · 2019-06-16T17:18:09Z

Also, the batching mechanism in Dispersy made it a nightmare for developers to debug tests, since it was very hard to see whether a message is/was in the buffer, is being processed or has been processed. It was also one of the reasons why these individual tests could take up to 10 seconds to complete.

I agree with @qstokkink here, we learned from buffering messages in Dispersy and it is not a mechanism I would like to see back, even if it (marginally) improves performance.

jdonszelmann · 2019-06-17T07:10:33Z

Alright that's very clear. No batch processing. Thanks for that feedback. We wouldn't have done it for months anyways but now we won't even research the idea.

ichorid · 2019-06-17T13:28:25Z

@jonay2000 , could you guys please measure the performance of processing Tunnel Community AES-GCM encrypted packets? That is where our real bottleneck is.

jdonszelmann · 2019-06-17T15:14:12Z

will do, probably done this week

jdonszelmann · 2019-06-19T11:33:41Z

@ichorid Does tribler/ipv8 use AES-128-GCM or AES-256-GCM?

ichorid · 2019-06-20T11:47:02Z

@jonay2000 , I guess it's 128. You better ask @egbertbouman about this to be sure.
However, you can produce tests for both. Or you can try out both and see which one fails to be decrypted by Tribler tunnels crypto 😉

egbertbouman · 2019-06-20T14:32:05Z

@ichorid You're right, it's AES-128-GCM.

jdonszelmann · 2019-06-20T15:12:38Z

Thank you both!

synctext · 2019-10-02T09:00:17Z

ToDo, schedule update meeting

devos50 · 2020-04-06T13:46:11Z

Due to inactivity on this issue, I will move it to backlog.

synctext · 2024-04-15T14:07:59Z

160 Mbit/sec download speed. Amazing work on Experimental release 😲 🚀 🎉

synctext added long-term type: BSc work labels Jun 14, 2019

synctext assigned dsluijk Jun 14, 2019

synctext mentioned this issue Jun 14, 2019

end-to-end anonymous seeding and download performance test #2548

Open

synctext assigned NULLx76 Jun 14, 2019

synctext assigned jdonszelmann and unassigned NULLx76 and dsluijk Jun 15, 2019

synctext mentioned this issue Oct 2, 2019

Universal communication using imperfect hardware #4827

Closed

ichorid added this to the V7.5: core refactoring milestone Nov 28, 2019

ichorid modified the milestones: V7.5: core refactoring, V7.6: Collective authoring Dec 23, 2019

devos50 unassigned jdonszelmann Apr 6, 2020

devos50 modified the milestones: V7.6: Faster, Quicker and more Responsive, Backlog Apr 6, 2020

drew2a added the component: tunnels label Jan 15, 2021

This was referenced Oct 21, 2021

Usability roadmap ticket #6480

Closed

Vadim's testament #6481

Closed

synctext mentioned this issue Nov 9, 2023

Anonymous download performance enhancements #7684

Closed

qstokkink assigned egbertbouman Nov 30, 2023

egbertbouman mentioned this issue Feb 19, 2024

Enable rust endpoint for anonymous downloads #7908

Merged

egbertbouman closed this as completed in #7908 Feb 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster Tunnel crypto by re-implementing IPv8 crypto in "Rust" programming language #4567

Faster Tunnel crypto by re-implementing IPv8 crypto in "Rust" programming language #4567

synctext commented Jun 14, 2019 •

edited by devos50

Loading

jdonszelmann commented Jun 14, 2019

jdonszelmann commented Jun 14, 2019

jdonszelmann commented Jun 15, 2019

devos50 commented Jun 15, 2019

devos50 commented Jun 15, 2019 •

edited

Loading

jdonszelmann commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019 •

edited

Loading

devos50 commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019 •

edited

Loading

synctext commented Jun 16, 2019

qstokkink commented Jun 16, 2019

devos50 commented Jun 16, 2019

jdonszelmann commented Jun 17, 2019

ichorid commented Jun 17, 2019 •

edited

Loading

jdonszelmann commented Jun 17, 2019

jdonszelmann commented Jun 19, 2019

ichorid commented Jun 20, 2019

egbertbouman commented Jun 20, 2019

jdonszelmann commented Jun 20, 2019

synctext commented Oct 2, 2019

devos50 commented Apr 6, 2020

synctext commented Apr 15, 2024 •

edited

Loading

Faster Tunnel crypto by re-implementing IPv8 crypto in "Rust" programming language #4567

Faster Tunnel crypto by re-implementing IPv8 crypto in "Rust" programming language #4567

Comments

synctext commented Jun 14, 2019 • edited by devos50 Loading

jdonszelmann commented Jun 14, 2019

jdonszelmann commented Jun 14, 2019

jdonszelmann commented Jun 15, 2019

devos50 commented Jun 15, 2019

devos50 commented Jun 15, 2019 • edited Loading

jdonszelmann commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019 • edited Loading

devos50 commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019

jdonszelmann commented Jun 15, 2019 • edited Loading

synctext commented Jun 16, 2019

qstokkink commented Jun 16, 2019

devos50 commented Jun 16, 2019

jdonszelmann commented Jun 17, 2019

ichorid commented Jun 17, 2019 • edited Loading

jdonszelmann commented Jun 17, 2019

jdonszelmann commented Jun 19, 2019

ichorid commented Jun 20, 2019

egbertbouman commented Jun 20, 2019

jdonszelmann commented Jun 20, 2019

synctext commented Oct 2, 2019

devos50 commented Apr 6, 2020

synctext commented Apr 15, 2024 • edited Loading

synctext commented Jun 14, 2019 •

edited by devos50

Loading

devos50 commented Jun 15, 2019 •

edited

Loading

jdonszelmann commented Jun 15, 2019 •

edited

Loading

jdonszelmann commented Jun 15, 2019 •

edited

Loading

ichorid commented Jun 17, 2019 •

edited

Loading

synctext commented Apr 15, 2024 •

edited

Loading