Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Session god-mediator-object #6206

Merged
merged 62 commits into from
Aug 11, 2021
Merged

Remove Session god-mediator-object #6206

merged 62 commits into from
Aug 11, 2021

Conversation

ichorid
Copy link
Contributor

@ichorid ichorid commented Jul 4, 2021

🎉 💀 Session is dead 💀 🎉

Dear All!
This is our chance to slay ⚔️ the wicked Session 🐉 once and for all, thus fulfilling the ancient prophecy 🧙 of freedom from 🏇 race conditions.

I managed to completely remove Session from run_tribler which works fine (aside from statistics endpoint). Any kind of help with this PR is very appreciated, that's why I added the branch to our main repo 🙏. Just make PRs on this PR, as we usually do with "collective effort" stuff.

Upsides

Some arguments for removing Session:

  • no more dumb Session-mocks
  • explicit dependencies between components (almost killed myself with 🤦 a couple of times while untangling those)
  • the ability to make startup sequence asynchronous in the future
  • much cleaner code
  • lean and simple startup and shutdown sequences
  • proper dependency injection for testability and object reuse
  • much easier to assemble custom variations of the Core, such as headless runs, seeders, etc.
  • no more object is None race conditions on startup/shutdown
  • IDE type hints are by your side now
  • tests should become much faster and more parallelizable

⚠️ Caveats ⚠️

  • AIOHTTP does not allow adding endpoints after the app has been started, so Upgrader notifications do not work sadly. We'll have to figure that out.
  • IPv8 code references Session explicitly, which is a bad antipattern (i.e. backreference) that bitten 🐺 us now. I had to add ProstethicSession workaround object to make the thing work. Dear @qstokkink , would you kindly make IPv8 loading procedure follow the dependency injection pattern and accept the exact dependencies instead of the Session mediator-object? 😉
  • DownloadManager is badly entangled with TunnelCommunity. We'll have to untangle them later, after we fix all the tests.
  • A couple of minor Statistics endpoints is missing. I removed them for the clarity of dependencies. We must redistribute their functions between other endpoints.
  • This is by far not the end of the Core startup evolution, but this is the crucial step. After this, everything will become so much easier.

UPDATE:
after completing refactoring of the tests suite, it can now be run completely in parallel with pytest-xdist. On my machine (Core i7 6 cores) it takes only 18 seconds to complete the whole suite. Yay! 🎉

@qstokkink
Copy link
Contributor

Dear @qstokkink , would you kindly make IPv8 loading procedure follow the dependency injection pattern and accept the exact dependencies instead of the Session mediator-object? 😉

Well you probably won't want to use the launcher for that then. The old format still supported by IPv8. You can pretty much copy this verbatim and insert it wherever you like:

def load_ipv8_overlays(self):
if self.config.get_trustchain_testnet():
peer = Peer(self.trustchain_testnet_keypair)
else:
peer = Peer(self.trustchain_keypair)
discovery_community = DiscoveryCommunity(peer, self.ipv8.endpoint, self.ipv8.network)
discovery_community.resolve_dns_bootstrap_addresses()
self.ipv8.overlays.append(discovery_community)
self.ipv8.strategies.append((RandomChurn(discovery_community), -1))
self.ipv8.strategies.append((PeriodicSimilarity(discovery_community), -1))
self.ipv8.strategies.append((RandomWalk(discovery_community), 20))
# TrustChain Community
if self.config.get_trustchain_enabled():
from ipv8.attestation.trustchain.community import TrustChainCommunity, \
TrustChainTestnetCommunity
community_cls = TrustChainTestnetCommunity if self.config.get_trustchain_testnet() else TrustChainCommunity
self.trustchain_community = community_cls(peer, self.ipv8.endpoint,
self.ipv8.network,
working_directory=self.config.get_state_dir())
self.ipv8.overlays.append(self.trustchain_community)
self.ipv8.strategies.append((EdgeWalk(self.trustchain_community), 20))
tc_wallet = TrustchainWallet(self.trustchain_community)
self.wallets[tc_wallet.get_identifier()] = tc_wallet
# DHT Community
if self.config.get_dht_enabled():
from ipv8.dht.discovery import DHTDiscoveryCommunity
self.dht_community = DHTDiscoveryCommunity(peer, self.ipv8.endpoint, self.ipv8.network)
self.ipv8.overlays.append(self.dht_community)
self.ipv8.strategies.append((RandomWalk(self.dht_community), 20))
self.ipv8.strategies.append((PingChurn(self.dht_community), -1))
# Tunnel Community
if self.config.get_tunnel_community_enabled():
from tribler_core.modules.tunnel.community.triblertunnel_community import TriblerTunnelCommunity,\
TriblerTunnelTestnetCommunity
from tribler_core.modules.tunnel.community.discovery import GoldenRatioStrategy
community_cls = TriblerTunnelTestnetCommunity if self.config.get_tunnel_testnet() else \
TriblerTunnelCommunity
random_slots = self.config.get_tunnel_community_random_slots()
competing_slots = self.config.get_tunnel_community_competing_slots()
dht_provider = DHTCommunityProvider(self.dht_community, self.config.get_ipv8_port())
settings = TunnelSettings()
settings.min_circuits = 3
settings.max_circuits = 10
self.tunnel_community = community_cls(peer, self.ipv8.endpoint, self.ipv8.network,
tribler_session=self,
dht_provider=dht_provider,
ipv8=self.ipv8,
bandwidth_wallet=self.wallets["MB"],
random_slots=random_slots,
competing_slots=competing_slots,
settings=settings)
self.ipv8.overlays.append(self.tunnel_community)
self.ipv8.strategies.append((RandomWalk(self.tunnel_community), 20))
self.ipv8.strategies.append((GoldenRatioStrategy(self.tunnel_community), -1))
# Market Community
if self.config.get_market_community_enabled() and self.config.get_dht_enabled():
from anydex.core.community import MarketCommunity, MarketTestnetCommunity
community_cls = MarketTestnetCommunity if self.config.get_trustchain_testnet() else MarketCommunity
self.market_community = community_cls(peer, self.ipv8.endpoint, self.ipv8.network,
trustchain=self.trustchain_community,
dht=self.dht_community,
wallets=self.wallets,
working_directory=self.config.get_state_dir(),
record_transactions=self.config.get_record_transactions())
self.ipv8.overlays.append(self.market_community)
self.ipv8.strategies.append((RandomWalk(self.market_community), 20))
# Popular Community
if self.config.get_popularity_community_enabled():
from tribler_core.modules.popularity.popularity_community import PopularityCommunity
self.popularity_community = PopularityCommunity(peer, self.ipv8.endpoint, self.ipv8.network,
metadata_store=self.mds,
torrent_checker=self.torrent_checker)
self.ipv8.overlays.append(self.popularity_community)
self.ipv8.strategies.append((RandomWalk(self.popularity_community), 20))
# Gigachannel Community
if self.config.get_chant_enabled():
from tribler_core.modules.metadata_store.community.gigachannel_community import GigaChannelCommunity, GigaChannelTestnetCommunity
from tribler_core.modules.metadata_store.community.sync_strategy import SyncChannels
community_cls = GigaChannelTestnetCommunity if self.config.get_chant_testnet() else GigaChannelCommunity
self.gigachannel_community = community_cls(peer, self.ipv8.endpoint, self.ipv8.network, self.mds,
notifier=self.notifier)
self.ipv8.overlays.append(self.gigachannel_community)
self.ipv8.strategies.append((RandomWalk(self.gigachannel_community), 20))
self.ipv8.strategies.append((SyncChannels(self.gigachannel_community), 20))
# Gigachannel RemoteQuery Community
from tribler_core.modules.metadata_store.community.remote_query_community \
import RemoteQueryCommunity, RemoteQueryTestnetCommunity
community_cls = RemoteQueryTestnetCommunity if self.config.get_chant_testnet() else RemoteQueryCommunity
self.remote_query_community = community_cls(peer, self.ipv8.endpoint, self.ipv8.network, self.mds,
notifier=self.notifier)
self.ipv8.overlays.append(self.remote_query_community)
self.ipv8.strategies.append((RandomWalk(self.remote_query_community), 50))

@ghost
Copy link

ghost commented Jul 7, 2021

Congratulations 🎉. DeepCode analyzed your code in 2.954 seconds and we found no issues. Enjoy a moment of no bugs ☀️.

👉 View analysis in DeepCode’s Dashboard | Configure the bot

👉 The DeepCode service and API will be deprecated in August, 2021. Here is the information how to migrate. Thank you for using DeepCode 🙏 ❤️ !

If you are using our plugins, you might be interested in their successors: Snyk's JetBrains plugin and Snyk's VS Code plugin.

Copy link
Contributor

@devos50 devos50 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have spent a few hours going through this PR, hereby my comments.

Let me start with the positive aspects of this PR. I particularly like that dependencies between Tribler modules are more explicit now. The Session object used to trickle down to all components, which was kind-of a lazy way of managing dependencies. Dependencies are much more explicit now, so regarding that, job well done! These changes are particularly apparent in the TriblerTunnelCommunity that heavily relies on various components in the Tribler session object. I see opportunities for follow-up PRs where we can revise some of these dependencies and improve them further (for example, the circular dependency between DownloadManager and Download).

Another thing I like is how the tests transition more to real unit tests, where we only initialise a particular component and assess the behaviour of that component in isolation. The changes in this PR are a good step towards that. I also believe that at one point we can think of extracting a few existing tests into dedicated integration tests, where we test the interactions between different components. One thing though is that I am not convinced by your claim that 'tests should become much faster and more parallelizable' by this PR only. I don't think the achieved speedup is that significant but maybe I missed something during my review. We also cannot judge that currently since there are still 156 tests failing.

Having said that, I have two main objections against this PR, one that relates to the content of the PR and another one regarding the nature of the PR.

We agreed last Friday that Tribler benefits from more explicit dependencies. While this PR includes these changes, I was surprised that the whole Session object has been removed, and I am not convinced that this is a required change to address the upsides you identified. Why not preserve the Session class but redistribute most of the functionality of this class to other components instead? One of the aspects I liked about the Session object was that it provides a convenient abstraction to work with Tribler (create a Session, start it, work with it, and shut it down). Now we have to work with Asyncio tasks to achieve this. We have always worked with the Session object and, as I already said, am not convinced it should be killed with fire right now. Even though you can argue that this argument based on history (and maybe emotion) is weak, the truth is that the Session object has always been an integral component of Tribler, and of software that builds on Tribler (especially Gumby). As apparent by the sheer size of this PR, it is not trivial to change this without introducing other bugs.

Let me also comment on a few of your arguments in favour of removing the session:

no more dumb Session-mocks

This is definitely a good point and these large mocks are something that always bothered me. But at the same time, I argue that the solution to these dumb mocks is not to remove the Session, but rather to refactor or tests to focus on a single argument and not on an entire Tribler session (something you have done in this PR actually). Tribler sessions would still be valuable when having more sophisticated tests (e.g., the tunnel tests).

explicit dependencies between components (almost killed myself with 🤦 a couple of times while untangling those)

You can still have a session object while making these dependencies explicit. I consider removing the session class and making dependencies more explicit as two different (mostly unrelated) tasks.

the ability to make startup sequence asynchronous in the future

How does the Session object prevent you from doing this? Also, I do not think that there are many components that can be asynchronously loaded.

lean and simple startup and shutdown sequences

In this PR, the entire startup/shutdown procedure is moved to a god method (core_session). This does not make it leaner. As an alternative, I think the way of launching communities (with the not_before, not_after etc) is a better way to achieve this goal.

So my second objection is related to the nature of this PR. This PR is currently an amalgamation of many (unrelated) changes. It was definitely not easy for me to navigate through all these changes and to understand the reason behind them. Reviewing this is not for the faint-hearted. To elaborate, here are a few of the (independent) changes that this PR contains:

  • Variable renaming (tribler_config -> config)
  • Cosmetic changes, for example, in torrent_checker.py line 28-30 or in rest_manager.py, line 60.
  • The exception handler is moved to a separate class (even though I’m in favour of this change, it could have been done in another PR, making it easier to review stuff and the affected tests).
  • Removing the setup function from the Download class.
  • Removing dead code (e.g., the safe_files_to_remove method).
  • Combining the ChannelsEndpointBase and ChannelsEndpoint classes.
  • The removal of community launchers.

To properly review this PR, one has to be an expert on all areas of Tribler, and having these unrelated changes in a single PR makes it unnecessarily harder to find potential bugs. It also does not help that this PR is mostly in an unfinished state. The PR could have been much more focussed, only touching the changes that we discussed last Friday (which I believed just concerned making dependencies more explicit). Is there a particular reason why all of these changes have to be included in a single super-commit/super-PR?

Minor comment: I believe there are a few unintended files committed to src/tribler-core/tribler_core/modules/bandwidth_accounting.

src/tribler-core/tribler_core/exception_handler.py Outdated Show resolved Hide resolved

def load_communities(config: TriblerConfig, trustchain_keypair, ipv8, dlmgr, metadata_store,
torrent_checker, notifier, bootstrapper):
""" This method will be splitted after grand Vadim's PR is merged
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not fix it in this PR? Doesn't seem to require much work and it would make the review easier.

(I do realise that this contradicts a bit with my other points on incremental PRs)

Copy link
Contributor

@drew2a drew2a Jul 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed it in this PR yesterday.

The core idea is to explicitly pass communities into core_session in DI style:

async def core_session(
        config: TriblerConfig,
        communities_cls: List[Factory],
        shutdown_event = Event()
):
    ...
    for community_cls in communities_cls:
        community = community_cls.create_class(peer, ipv8.endpoint, ipv8.network, mediator=mediator, **community_cls.kwargs)
        community.fill_mediator(mediator)

        for strategy in community.strategies:
            ipv8.strategies.append((strategy.create_class(community), strategy.target_peers))

        ipv8.overlays.append(community)

So, there are two types of applications that use core_session (except tests):

  1. Applications that rely on config "enable/disable" flags for the communities (Tribler itself)
  2. Applications that don't rely on config "enable/disable" flags for the communities (helpers and experiments)

For the first group the list can be created as:

run_tribler.py

def communities_gen(config: TriblerConfig):
    yield Factory(create_class=TriblerDiscoveryCommunity) if config.discovery_community.enabled else ...
    yield Factory(create_class=TriblerDHTDiscoveryCommunity) if config.dht.enabled else ...

    bandwidth_accounting_kwargs = {'database': config.state_dir / "sqlite" / "bandwidth.db"}
    bandwidth_accounting_cls = BandwidthAccountingTestnetCommunity if config.general.testnet or config.bandwidth_accounting.testnet else BandwidthAccountingCommunity
    yield Factory(create_class=bandwidth_accounting_cls, kwargs=bandwidth_accounting_kwargs)

    tribler_tunnel_cls = TriblerTunnelTestnetCommunity if config.general.testnet or config.tunnel_community.testnet else TriblerTunnelCommunity
    yield Factory(create_class=tribler_tunnel_cls) if config.tunnel_community.enabled else ...
    yield Factory(create_class=PopularityCommunity) if config.popularity_community.enabled else ...

    giga_channel_cls = GigaChannelTestnetCommunity if config.general.testnet else GigaChannelCommunity
    yield Factory(create_class=giga_channel_cls) if config.chant.enabled else ...

For the second group the list can be created as:

initial_filling.py

def communities(self):
    return [
        Factory(create_class=TriblerDiscoveryCommunity),
        Factory(create_class=TriblerDHTDiscoveryCommunity),
        Factory(create_class=ObservablePopularityCommunity,
                kwargs={'interval_in_sec': self._interval_in_sec,
                                 'output_file_path': self._output_file_path}),
    ]

self.handle.set_max_connections(self.session.config.libtorrent.max_connections_download)

# Set limit on download for a bootstrap file
if self.config.get_bootstrap_download():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though I know that the bootstrap download is currently disabled, removing this code might backfire in the future if we decide to revive this feature again.

self.trust_graph = TrustGraph(self.public_key, self.bandwidth_db)

# Start bootstrap download if not already done
if not self.session.bootstrap:
self.session.start_bootstrap_download()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my previous comment on removing code related to bootstrapping.

src/tribler-core/tribler_core/session.py Outdated Show resolved Hide resolved
src/tribler-core/tribler_core/session.py Outdated Show resolved Hide resolved
src/tribler-core/tribler_core/statistics.py Show resolved Hide resolved
config.popularity_community.enabled = True
config.chant.enabled = True

community_loader = create_default_loader(config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_default_loader is undefined at the moment. This test probably has to be redesigned.

@drew2a drew2a force-pushed the refactor/remove_session branch from 722ac74 to c69f55c Compare July 7, 2021 13:37
@ichorid
Copy link
Contributor Author

ichorid commented Jul 29, 2021

retest this please

1 similar comment
@ichorid
Copy link
Contributor Author

ichorid commented Jul 29, 2021

retest this please

@ichorid
Copy link
Contributor Author

ichorid commented Aug 4, 2021

retest this please

3 similar comments
@ichorid
Copy link
Contributor Author

ichorid commented Aug 5, 2021

retest this please

@ichorid
Copy link
Contributor Author

ichorid commented Aug 6, 2021

retest this please

@ichorid
Copy link
Contributor Author

ichorid commented Aug 7, 2021

retest this please

community = DiscoveryCommunity(self.peer, ipv8.endpoint, ipv8.network, max_peers=100)
ipv8.strategies.append((RandomChurn(community), INFINITE))
ipv8.strategies.append((PeriodicSimilarity(community), INFINITE))
ipv8.strategies.append((RandomWalk(community), INFINITE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ipv8.strategies.append((RandomWalk(community), INFINITE))
ipv8.strategies.append((RandomWalk(community), 20))

The current PR code would disproportionally (regarding the connections in other network overlays) introduce you to peers you're not interested in. Of course you want some random connections, but not an "infinite" amount.

Trivia: 20 is the default setting.

@ichorid ichorid marked this pull request as ready for review August 11, 2021 21:40
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot E 1 Security Hotspot
Code Smell A 58 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@ichorid ichorid merged commit a155330 into main Aug 11, 2021
@ichorid
Copy link
Contributor Author

ichorid commented Aug 11, 2021

изображение

This was referenced Oct 21, 2021
@drew2a drew2a deleted the refactor/remove_session branch May 2, 2023 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants