Improve tracker stats importer performance #569

Open
josecelano opened this issue Apr 26, 2024 · 1 comment
Labels
Optimization Make it Faster

Comments

@josecelano
Member

There is a background task that imports torrent stats from the tracker using the tracker REST API. The Index imports the number of seeders and leechers for every torrent in the Index (all torrents in the Index are also in the Tracker, but not all torrents in the Tracker are in the Index).

Old solution

loop
  start timer
  for all torrents in Index
    import stats
  end for
  check timer: if less than 1 hour has passed since the timer started, wait for the remainder of the hour
end loop

Pros:

  • It's the fastest way to import all torrents. The process does not stop until it imports all torrents.

Cons:

  • If the process is interrupted, it starts again from the beginning, so some torrents may never be imported.
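
To make the timer step of the old loop concrete, here is a minimal sketch. The names `remaining_wait` and `import_all` are hypothetical, and the `import_stats` closure only stands in for the call to the tracker REST API; this is not the code that was in the Index.

```rust
use std::time::Duration;

/// Hypothetical helper: how long the old loop still has to sleep after one
/// full pass so that consecutive passes start at least `interval` apart.
fn remaining_wait(elapsed: Duration, interval: Duration) -> Duration {
    // `saturating_sub` yields zero when the pass took longer than the interval.
    interval.saturating_sub(elapsed)
}

/// One full pass over every torrent in the Index. `import_stats` is a
/// placeholder for the tracker API call and returns whether it succeeded.
/// Returns the number of torrents imported successfully.
fn import_all<F>(torrent_ids: &[u64], mut import_stats: F) -> usize
where
    F: FnMut(u64) -> bool,
{
    torrent_ids.iter().filter(|&&id| import_stats(id)).count()
}
```

Because nothing records which torrents were already imported, an interruption in the middle of `import_all` loses all progress, which is the con listed above.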

Current solution

I changed the old solution by adding an updated_at field to the stats records.

loop
  import 50 torrents whose stats have not been imported in the last hour
  wait 2 seconds
end loop

Pros:

  • If the process (or server) is restarted, it resumes from the torrents that have not been imported in the last hour. It's more robust but slower.
  • The tracker does not receive too many requests.

Cons:

  • We are limited to 50 torrents every 2 seconds (at most 25 torrents per second).

NOTICE: we wait 2 seconds between importations because otherwise, when there is nothing to import, the process would constantly run queries against the database to get the updated list of torrents pending import, keeping the CPU busy.
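
The batch selection based on updated_at could look roughly like the sketch below. It uses an in-memory slice instead of the real database query, and `StatsRecord` and `pending_batch` are hypothetical names; treating a missing updated_at as "never imported" is also an assumption.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory stand-in for a stats record in the Index database.
#[derive(Debug, Clone, PartialEq)]
struct StatsRecord {
    torrent_id: u64,
    /// `None` means the stats were never imported (an assumption of this sketch).
    updated_at: Option<SystemTime>,
}

/// Returns the IDs of up to `batch_size` torrents whose stats are missing
/// or were last imported more than `max_age` ago.
fn pending_batch(
    records: &[StatsRecord],
    now: SystemTime,
    max_age: Duration,
    batch_size: usize,
) -> Vec<u64> {
    records
        .iter()
        .filter(|r| match r.updated_at {
            None => true,
            // A timestamp in the future makes `duration_since` fail; treat it as fresh.
            Some(t) => now.duration_since(t).map_or(false, |age| age > max_age),
        })
        .take(batch_size)
        .map(|r| r.torrent_id)
        .collect()
}
```

With `max_age` set to one hour and `batch_size` set to 50, an empty result is the "nothing to import" case that the 2 second wait protects against.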

Newly proposed solution

loop
  while there are torrents pending to import
    import 50 torrents whose stats have not been imported in the last hour
  end while
  no more torrents pending to update -> wait 2 seconds
end loop

NOTICE: we wait only when there is nothing to import. Instead of continuously checking whether there is more work to do, we only check every 2 seconds (config value). However, while there are torrents to update, we keep updating them without pausing.

Pros:

  • Faster importation
  • Avoid polling the DB too many times

Cons:

  • The tracker could receive too many requests. That cannot be controlled with a config option unless we also add a limit for the inner loop in this solution.
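
A minimal sketch of the proposed loop, including the optional limit for the inner loop mentioned in the cons. Everything here is an assumption made for illustration: the `BatchImporter` trait, the `FakeImporter` stand-in, and the `max_batches` parameter do not exist in the Index, and the outer loop is bounded only so the sketch terminates.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical abstraction over "import the next batch of pending torrents".
/// Returns how many torrents were imported; 0 means nothing is pending.
trait BatchImporter {
    fn import_next_batch(&mut self, batch_size: usize) -> usize;
}

/// In-memory stand-in that only counts pending torrents.
struct FakeImporter {
    pending: usize,
}

impl BatchImporter for FakeImporter {
    fn import_next_batch(&mut self, batch_size: usize) -> usize {
        let n = self.pending.min(batch_size);
        self.pending -= n;
        n
    }
}

/// Inner loop: keep importing batches until nothing is pending or the
/// optional `max_batches` cap is hit. Returns the number of torrents
/// imported and whether the loop stopped because nothing was pending.
fn drain_pending<I: BatchImporter>(
    importer: &mut I,
    batch_size: usize,
    max_batches: Option<usize>,
) -> (usize, bool) {
    let mut total = 0;
    let mut batches = 0;
    loop {
        if max_batches.map_or(false, |max| batches >= max) {
            return (total, false); // cap reached; torrents may still be pending
        }
        let imported = importer.import_next_batch(batch_size);
        if imported == 0 {
            return (total, true); // nothing pending
        }
        total += imported;
        batches += 1;
    }
}

/// Outer loop, bounded to `outer_iterations` for the sketch (the real task
/// would loop forever). It sleeps only after the inner loop has stopped.
fn run<I: BatchImporter>(
    importer: &mut I,
    batch_size: usize,
    max_batches: Option<usize>,
    idle_wait: Duration,
    outer_iterations: usize,
) -> usize {
    let mut total = 0;
    for _ in 0..outer_iterations {
        let (imported, _drained) = drain_pending(&mut *importer, batch_size, max_batches);
        total += imported;
        thread::sleep(idle_wait);
    }
    total
}
```

With `max_batches` set to `None` this is the loop as proposed. With `Some(n)` the tracker receives at most `n * batch_size` import requests per wait period, which would give a config option to control the load.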

cc @da2ce7

@josecelano josecelano added the Optimization Make it Faster label Apr 26, 2024
@josecelano
Member Author

Reminder: check if we need to create and drop the weak pointer (weak_tracker_statistics_importer.upgrade). Maybe we can always use the same one, because now it's executed more often. In the first implementation it was supposed to be used only once every hour for a few minutes, so it made sense to create it and drop it.
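
For reference, the two options look roughly like this with std's `Arc` and `Weak`. `StatisticsImporter` is a placeholder type, not the real importer, and the two function names are hypothetical.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the real statistics importer type.
struct StatisticsImporter;

impl StatisticsImporter {
    /// Pretends to import one batch and returns the number of torrents imported.
    fn import_batch(&self) -> usize {
        50
    }
}

/// Option A: upgrade the weak pointer on every iteration. The task never
/// keeps the importer alive by itself and stops once the owner drops it.
fn run_with_weak(weak_importer: &Weak<StatisticsImporter>, iterations: usize) -> usize {
    let mut imported = 0;
    for _ in 0..iterations {
        match weak_importer.upgrade() {
            // The temporary strong reference is dropped at the end of this arm.
            Some(importer) => imported += importer.import_batch(),
            None => break, // the owner is gone, so stop the task
        }
    }
    imported
}

/// Option B: hold one strong reference for the whole run. This avoids the
/// repeated upgrade, but the importer stays alive as long as the task runs.
fn run_with_strong(importer: Arc<StatisticsImporter>, iterations: usize) -> usize {
    (0..iterations).map(|_| importer.import_batch()).sum()
}
```

An upgrade is only an atomic reference-count update, so the choice is probably less about speed and more about whether the background task should keep the importer alive during shutdown.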
