NEEDS VOLUNTEER: Communities ported to Protocol Buffers #2314

qstokkink · 2016-06-20T08:17:16Z

Related to #706, #2106.

Ported the AllChannel, Bartercast4, Channel, Demers and Template communities to Google's Protocol Buffers serialization. This implementation is backward compatible with the old wire format (roughly 2000 LOC will be thrown out when backward compatibility is removed).

Current status:

| TODO | Status |
| --- | --- |
| Windows dependencies | Waiting for verification of installation instructions. NEEDS VOLUNTEER |
| Code review | Requires one or more masochists to actually review 10k lines of code. NEEDS VOLUNTEER |

tribler-ci · 2016-06-20T08:17:17Z

Can one of the admins verify this patch?

lfdversluis · 2016-06-20T09:29:30Z

add to whitelist

whirm · 2016-06-20T09:57:28Z

add to whitelist

qstokkink · 2016-06-20T10:07:44Z

Oh right, the testing framework does not have the Protocol Buffers python package installed. What would be the proper place to add this?

whirm · 2016-06-20T12:12:19Z

@qstokkink debian/control needs to be updated. I'll update the Ansible scripts.

whirm · 2016-06-20T12:20:17Z

bbq has protobuf now

qstokkink · 2016-06-20T12:28:21Z

@whirm thanks!

whirm · 2016-06-27T13:22:41Z

Let me know when you need the allchannel experiment to be enabled. I disabled it for now, as the free cluster discovery stuff I made still isn't integrated into it.

qstokkink · 2016-06-27T14:20:15Z

@whirm
tl;dr: I think now would be a good time to enable it.

long answer:
All tests on the backward compatibility code pass (on Linux) and all but 6 tests for the pure new code pass (which -seem to- fail because AllChannel can't find anything in a live search using only the new protocol). At this point I will start writing my own tests to validate new behavior and interactions.

So, I think now would be a good time to see what kind of performance benefit would be gained from the new system. Or in the worst case: what kind of performance loss. Then it can be decided if this project is worth merging.

whirm · 2016-06-27T15:38:25Z

Oh, apparently I didn't disable it for dispersy; you just need the rest to pass for the experiment to be triggered.

synctext · 2016-06-27T16:46:46Z

Nice progress on this PR. Thousands of lines of code altered... wow.

qstokkink · 2016-07-02T18:38:29Z

Alright the results are in for the old protocol (with new interface):
https://jenkins.tribler.org/job/GH_Tribler_PR_tests_linux/2380/
And the new Protocol Buffers implementation (same interface), using the old create missing channel (Channel community) and create channel search (AllChannel community) functions:
https://jenkins.tribler.org/job/GH_Tribler_PR_tests_linux/2389/

For reference, I will now specify the infrastructure changes between the old and the new system, as they exist right now.

Old system

Incoming packets are identified by means of their metadata byte and then forwarded (conversion.py) to the appropriate handlers for their header information (authentication, distribution, etc.) and payload (payload.py). This requires community programmers to specify a conversion and a payload entry per message type.
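
A minimal sketch of that kind of per-message-type dispatch, for readers unfamiliar with the old layout. Every name here (`DECODERS`, `handle_packet`, the two decoders) is hypothetical and only stands in for the conversion and payload entries; this is not Dispersy's actual API.

```python
# Hypothetical sketch of dispatch keyed on a single identifying byte.
# None of these names come from Dispersy; they only illustrate the idea.

def decode_text(payload):
    return payload.decode("utf-8")

def decode_number(payload):
    return int.from_bytes(payload, "big")

# One entry per message type: every new message needs its own decoder,
# standing in for the per-message conversion and payload definitions.
DECODERS = {
    1: ("text-message", decode_text),
    2: ("number-message", decode_number),
}

def handle_packet(packet):
    """Identify a packet by its first byte and decode the remainder."""
    message_id = packet[0]
    if message_id not in DECODERS:
        raise KeyError("unknown message id %d" % message_id)
    name, decoder = DECODERS[message_id]
    return name, decoder(packet[1:])
```

With a single identifying byte, at most 256 message types can be told apart, which is in the same ballpark as the limit of around 200 definitions mentioned for the old system.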

New system

The new system expands on the old system's functionality by adding one more message (the basemsg). This message is handled in the same fashion as specified by the old system (with a basemsg conversion and payload specification). The BaseCommunity, however, handles the basemsg by forwarding its payload to other handlers. A side effect of this approach is that communities are no longer restricted to around 200 message definitions (instead the new limit is 2¹⁶⁰ over all communities). Furthermore, messages no longer require their own conversion and payload definitions; instead this is handled by a Protocol Buffers .proto file.
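
The envelope idea can be sketched roughly as follows. This is an illustration under loud assumptions: the 2¹⁶⁰ limit suggests a 160-bit identifier, but deriving it with SHA-1 from a message name is a guess, JSON merely stands in for the Protocol Buffers encoding, and `EnvelopeRouter` is a hypothetical name rather than the BaseCommunity's real interface.

```python
import hashlib
import json

ID_LENGTH = 20  # 20 bytes = 160 bits; assumed to match the 2^160 limit


def message_id(name):
    """Derive a 160-bit identifier from a message name (assumption: SHA-1)."""
    return hashlib.sha1(name.encode("utf-8")).digest()


class EnvelopeRouter(object):
    """Hypothetical stand-in for a community that unwraps one envelope type."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        self._handlers[message_id(name)] = handler

    def wrap(self, name, fields):
        # JSON stands in for the Protocol Buffers encoding of the payload.
        body = json.dumps(fields, sort_keys=True).encode("utf-8")
        return message_id(name) + body

    def on_envelope(self, data):
        # Split the identifier from the payload and forward the payload.
        identifier, body = data[:ID_LENGTH], data[ID_LENGTH:]
        handler = self._handlers[identifier]
        return handler(json.loads(body.decode("utf-8")))
```

Only the envelope itself needs a conversion entry in the old style; adding a message type becomes a matter of registering one more handler.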

Note that another party trick is required for this to function as expected. This is because in the old system handlers for a particular message are statically specified with their network traversal information (authentication, destination, distribution and resolution) in a Message object. To handle this, a basemsg needs to be capable of using dynamic traversals. For the community programmer, this means that instead of initializing a Message, a traversal is registered with the same parameters (excluding handlers).
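
As a rough picture of what registering a traversal could look like: the sketch below keeps the four settings named above in a lookup table keyed by message name. `TraversalRegistry`, `register_traversal` and the placeholder string values are all hypothetical; the real policies are Dispersy objects, not strings.

```python
from collections import namedtuple

# Field names follow the four traversal settings named in the text; the
# values used with this sketch are placeholder strings, not real policies.
Traversal = namedtuple(
    "Traversal", "authentication resolution distribution destination")


class TraversalRegistry(object):
    """Hypothetical table of traversal settings, looked up at send time."""

    def __init__(self):
        self._traversals = {}

    def register_traversal(self, name, authentication, resolution,
                           distribution, destination):
        # No handlers are stored here: only how the message travels.
        self._traversals[name] = Traversal(
            authentication, resolution, distribution, destination)

    def traversal_for(self, name):
        return self._traversals[name]
```

Because the settings are looked up per message name when sending, a single envelope message type can carry payloads that travel in different ways.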

Backward compatibility

Because a basemsg is just another message, it can be defined alongside the old messages. New communities (based on the BaseCommunity) can therefore always receive old messages: they can read from old communities, but will no longer write to them. However, to avoid immediately isolating nodes that run old communities, an implementation with a grace period has been made. Communities start by communicating using the old messages and switch to new messages once new messages have been detected. In other words, no-one will switch to the new messages unless someone switches to the new messages.
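
The grace period boils down to a one-way switch. A minimal sketch, assuming a per-community flag (`ProtocolSwitch` and its methods are hypothetical names, not taken from the PR):

```python
class ProtocolSwitch(object):
    """Speak the old format until a peer is seen using the new one."""

    def __init__(self):
        self.use_new_format = False

    def on_incoming(self, is_new_format):
        # Old messages are always accepted; the first new-format message
        # flips the switch, and it never flips back.
        if is_new_format:
            self.use_new_format = True

    def outgoing_format(self):
        return "new" if self.use_new_format else "old"
```

Under this scheme a network consisting only of such nodes stays on the old format indefinitely; something has to send the first new-format message.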

qstokkink · 2016-07-07T14:44:32Z

@devos50 How much of the communities are you already covering with the REST API tests? It seems you are already covering most (if not all) of the allchannel and channel communities.
This would put the focus of my tests more on the compatibility interactions.

devos50 · 2016-07-08T07:26:36Z

@qstokkink it's not that good, in the sense that there are many error handling methods left uncovered. However, the core functionality of most communities (search, all channel, channel, tunnel) is covered by our (GUI) tests.

I think we still need some stable and small unit tests to test the logic within these communities without being dependent on a running Tribler session but it's not really a top priority issue.

qstokkink · 2016-07-08T07:50:32Z

@devos50 It's really hard (read: ugly private-field-dependent code) to separate the Session from a community. For instance the AllChannel community will either make a database stub or a real handler, internally, depending on whether or not it is initialized with a session. To make matters worse, these stubs are not shared between communities. Furthermore, because of the coupling side effects of the real handlers, checking proper database calls is only really possible using a Session's created databases.

That said, using a very limited SessionStartupConfig it is possible to make some somewhat nimble unittests. What I did for my AllChannel unittests (local WIP, keep an eye out for it soon) was to generate the databases using a Session and then overwrite the stubs (private field access <- not cool) with the proper databases. This is the only way to get everything working with Dispersy's DebugNodes without getting Session singleton errors.

Since your unittests are initialized and checked more elegantly than mine, I would have dropped my approach in favor of yours. However, since there is still some merit to it apparently, I will continue work on the low-level unittests.

qstokkink · 2016-07-13T09:05:35Z

For anyone interested: the first results show an average 14% reduction in test completion time using Protocol Buffers. These results are based on the time between the AllChannel community creating a message on one node and handling it on another node (ergo mostly serialization and deserialization of messages).
The ChannelCommunity shows both reductions and increases in test completion time.

qstokkink · 2016-07-18T19:34:54Z

@whirm I created build instructions for (x64) Windows, which are not quite as elegant as I would have liked. I also see I triggered 5000 builds while wrestling with the ReStructuredText syntax.

whirm · 2016-07-19T10:38:40Z

doc/development/development_on_windows.rst

+Copy **and rename** this file to:
+:code:`%FULLSOURCELOCATION%\python\protobuf.lib`
+
+Finally run :code:`python %FULLSOURCELOCATION%\python\setup.py install --cpp_implementation` **with a 32-bit python version**.


We support both 32- and 64-bit builds of Tribler.

qstokkink · 2016-07-19

@whirm The most recent version has the 64-bit instructions.
EDIT: In the even newer version I made the difference explicit.

qstokkink · 2016-07-20T08:43:30Z

Could someone with a Mac try the following for me? @devos50 are you available?

Experimental/unverified Mac instructions:

Download the latest version from https://developers.google.com/protocol-buffers/docs/downloads
Execute make install in the top folder.
Execute python setup.py build --cpp_implementation in the \python folder.

Test:

Execute python -c "import google.protobuf.pyext._message": this should not generate any import errors.
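
The same check can be wrapped in a small helper that reports the result instead of raising. This is only a convenience sketch: `cpp_backend_available` is a hypothetical name, and the module path is the one from the test command above.

```python
import importlib


def cpp_backend_available(module_name="google.protobuf.pyext._message"):
    """Return True if the compiled protobuf extension module can be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing protobuf package and a missing C++ extension.
        return False
    return True
```

A False result here corresponds to the import error that the one-liner would otherwise print.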

devos50 · 2016-07-20T09:01:34Z

@qstokkink no need to compile it by yourself :)

brew install protobuf

MBP-van-Martijn:tribler martijndevos$ python -c "import google.protobuf.pyext._message"
MBP-van-Martijn:tribler martijndevos$

qstokkink · 2016-07-20T10:43:09Z

@devos50 thanks! That's much easier than the Windows build.

qstokkink · 2017-01-18T11:01:16Z

@devos50 Any idea why all of my PR tests are being aborted by anonymous?

...
[TemplateProject] Starting builders from: Test_tribler_devel
Build was aborted
Aborted by anonymous
Archiving artifacts
...

devos50 · 2017-01-18T11:04:50Z

@qstokkink probably because if the spelling checker fails, it aborts the execution of the whole job. You should check if that's the case and if so, fix it so the failure gets ignored and the job continues to execute.

qstokkink · 2017-01-18T11:22:32Z

@devos50 thanks, forgot a checkbox in GH_Tribler_PR :)
retest this please

qstokkink · 2017-01-19T15:16:12Z

Well now.. I appear to be a bit of a ding dong. The reason the graph wasn't showing any records being received is because I didn't tell it to plot "basemsg" messages. 😕

Anyway, there are no known bugs remaining for this PR now (/again).

EDIT: You can find all of the pretty graphs of both the old code and new code runs together here:
https://jenkins.tribler.org/job/pers/job/allchannel_basecommunity_qstokkink/141/
(Pay close attention to the file names when comparing)

devos50 · 2017-01-20T13:34:14Z

@qstokkink can you explain this error? Should I reinstall protobuf on the Windows machines?

Error Message

cannot import name _message
-------------------- >> begin captured logging << --------------------
DEBUG   1484870014         _legacy:twisted:154  PythonLoggingObserver hooked up
--------------------- >> end captured logging << ---------------------
Stacktrace

  File "C:\Python27\lib\unittest\case.py", line 329, in run
    testMethod()
  File "C:\Python27\lib\site-packages\nose\loader.py", line 418, in loadTestsFromName
    addr.filename, addr.module)
  File "C:\Python27\lib\site-packages\nose\importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "C:\Python27\lib\site-packages\nose\importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\Test\Core\Modules\RestApi\test_util.py", line 4, in <module>
    from Tribler.Core.Modules.restapi.util import convert_search_torrent_to_json, convert_db_channel_to_json, \
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\Core\Modules\restapi\util.py", line 12, in <module>
    from Tribler.community.channel.community import ChannelCommunity
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\channel\community.py", line 11, in <module>
    from Tribler.community.basecommunity import BaseCommunity
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\basecommunity.py", line 3, in <module>
    from Tribler.community.TriblerProtobufSerialization.serializer import Serializer
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\TriblerProtobufSerialization\serializer.py", line 10, in <module>
    import messages
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\TriblerProtobufSerialization\messages\__init__.py", line 4, in <module>
    MESSAGES = [__import_handle('allchannel_pb2'),
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\TriblerProtobufSerialization\messages\__init__.py", line 2, in __import_handle
    return __import__(name, globals(), locals(), [], -1)
  File "C:\workspace\jenkins\workspace\GH_Tribler_PR_tests_win32\tribler\Tribler\community\TriblerProtobufSerialization\messages\allchannel_pb2.py", line 6, in <module>
    from google.protobuf import descriptor as _descriptor
  File "C:\Python27\lib\site-packages\google\protobuf\descriptor.py", line 50, in <module>
    from google.protobuf.pyext import _message
'cannot import name _message\n-------------------- >> begin captured logging << --------------------\nDEBUG   1484870014         _legacy:twisted:154  PythonLoggingObserver hooked up\n--------------------- >> end captured logging << ---------------------'

devos50 · 2017-01-20T13:41:34Z

@qstokkink this breaks backwards compatibility right? If so, I would save this PR for Tribler 7.1 or maybe later.

qstokkink · 2017-01-20T14:16:39Z

@devos50 (1) It has never been installed on Windows (I still need someone to verify the Windows install instructions).

(2) In its current form the code IS backwards compatible (by means of additional code, which can be removed when backwards compatibility is broken); that said, I think this would be a good one to postpone until after the 7.0 release, yes.

qstokkink · 2017-09-26T12:41:16Z

Closing this in favor of performing a clean protobuf implementation on top of IPv8. Additional unit tests have already been exported to another PR (#2911).

qstokkink force-pushed the protobufserialization_rc branch 3 times, most recently from c5e0f7d to 295a4dc Compare July 2, 2016 18:25

qstokkink force-pushed the protobufserialization_rc branch 4 times, most recently from bf9fe83 to d198192 Compare July 5, 2016 13:22

qstokkink force-pushed the protobufserialization_rc branch from dab00e8 to cf0ed91 Compare July 18, 2016 06:52

whirm reviewed Jul 19, 2016

qstokkink force-pushed the protobufserialization_rc branch from 1b34d7d to e896577 Compare July 20, 2016 10:55

Merge remote-tracking branch 'upstream/devel' into protobufserialization_rc

3770bc9

qstokkink force-pushed the protobufserialization_rc branch 2 times, most recently from 91a3589 to ad7e76d Compare January 17, 2017 14:35

Jenkins requires submodule name to equal path

5b9356c

qstokkink force-pushed the protobufserialization_rc branch from ad7e76d to 5b9356c Compare January 17, 2017 14:52

qstokkink added 2 commits January 18, 2017 11:49

Removed session.prestart()

e9f2e73

Set serialization submodule to correct branch

fdc04cb

qstokkink added 2 commits January 18, 2017 13:49

Removed channel conversion test

2f031be

Removed old allchannel tests

3e55483

qstokkink force-pushed the protobufserialization_rc branch 8 times, most recently from 8e92a98 to 50543d7 Compare January 19, 2017 12:59

Added generic check to check_basemsg

9a8d18d

qstokkink force-pushed the protobufserialization_rc branch from 50543d7 to 9a8d18d Compare January 19, 2017 14:50

qstokkink changed the title ~~WIP: Communities ported to Protocol Buffers~~ NEEDS VOLUNTEER: Communities ported to Protocol Buffers Jan 19, 2017

qstokkink mentioned this pull request May 8, 2017

Tribler core freezes when we add a torrent to a channel that contains a too long tracker #2931

Closed

qstokkink closed this Sep 26, 2017
