Proposal for an alternative to read_piece_alert #6259
AllSeeingEyeTolledEweSew started this conversation in Ideas
-
I agree that there should be a more direct way to access the file storage, cutting through most of the layers right now. I would still hope that there could be some interface implemented by the underlying …
-
@elgatito for input
Current state
The details of set_piece_deadline() + read_piece_alert cause some trouble, especially in python:

- Callers must request read_piece_alerts for the entire range they want to read. Since alerts must be processed as soon as they're delivered, and read_piece_alerts may be delivered out-of-order anyway, this means callers may accumulate a lot of memory if they want to read a large range.
- Ideally a caller would limit how many read_piece_alerts are "in-flight" at a given time, globally in the app. But the design of set_piece_deadline(alert_when_available) makes this difficult: if a piece is available then it posts an alert immediately; otherwise it waits. The caller won't know which, so it must be conservative to stay within desired memory limits; but if few pieces are available, then this is wasteful as the buffer will just remain empty for a long time.
- To work around that, the caller would need to delay some set_piece_deadline calls, or else make many set_piece_deadline(p, 0) calls which are later overridden with set_piece_deadline(p, alert_when_available) when we want more alerts. But messing with piece deadlines is onerous due to the limited interface of set_piece_deadline.
- The read_piece code copies data into a heap buffer. In python it must get copied again into a bytes object, and again if we create slices with bytes[lo:hi]. If we want to send() or write() the data, that's another copy.
- read_piece_alert only reads on piece boundaries, but most I/O will naturally occur on file boundaries or aligned with files.

A caller who wants to read data can either use set_piece_deadline() + read_piece_alert, or subscribe to piece_finished_alerts (for all torrents) and manually open and read files (working around move_storage and rename_file, etc). Personally I think the second one is easier, more optimal and less error-prone.
Goals
Now that libtorrent 2.0 uses mmap, I claim our goal should be to make it easy for callers to work with the torrent files' page cache.

In particular, since my project is a python app to vend bittorrent data over http, I'd like to just use sendfile() to vend that data. It's guaranteed to be visible as soon as libtorrent is done writing, is always file-aligned, saves 3-4 copies, and saves the data from ever needing to be touched by python.
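As an illustration of that zero-copy path, here is a minimal stdlib-only sketch; the temp file and loopback socket are hypothetical stand-ins for a completed torrent file and an http client connection:

```python
import os
import socket
import tempfile

# Sketch: serve a completed, file-aligned byte range straight from the
# page cache with sendfile(), never touching the bytes in python. The
# temp file stands in for a torrent file; the loopback connection stands
# in for an http client. (Illustrative only; no libtorrent here.)
payload = b"x" * 4096

with tempfile.NamedTemporaryFile() as f:
    f.write(payload)
    f.flush()

    # Loopback socket pair standing in for the http client connection.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    client = socket.socket()
    client.connect(srv.getsockname())
    conn, _ = srv.accept()
    try:
        # os.sendfile(out_fd, in_fd, offset, count): kernel-to-kernel copy,
        # may send less than requested, so loop until done.
        offset = 0
        while offset < len(payload):
            offset += os.sendfile(conn.fileno(), f.fileno(), offset,
                                  len(payload) - offset)
        conn.close()  # let the receiver see EOF

        received = b""
        while True:
            buf = client.recv(65536)
            if not buf:
                break
            received += buf
        assert received == payload
    finally:
        client.close()
        srv.close()
```

The data never enters the python process: the kernel moves it from the file's page cache to the socket.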
Proposal

I propose the following changes:

- void torrent_handle::open_file(int file_index), which opens a new file descriptor to a torrent's file storage and posts the result in an open_file_alert.
- rename_file() and move_storage() should operate agnostic of any outstanding open file handles. This means open_file() followed by various operations (especially move_storage() to a different volume) may leave the file descriptor referring to old storage that won't receive new data.
- The file referenced by an open_file_alert will contain all the pieces referenced by piece_finished_alerts up to that point. The caller understands that further storage alerts may make the fd "obsolete".
- deadline_flags_t::lite_alert, such that set_piece_deadline(alert_when_available | lite_alert) will post piece_finished_alert instead of read_piece_alert. This lets callers receive piece_finished_alerts for particular pieces, but not all pieces for all torrents (as alert_mask & piece_progress would).
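The "old storage" caveat follows from POSIX file semantics: an open descriptor names an inode, not a path. A minimal sketch with plain files (the paths are hypothetical stand-ins for torrent storage; no libtorrent involved) showing why a same-volume rename is harmless to open handles, while a cross-volume copy-and-delete would not be:

```python
import os
import tempfile

# Sketch: a POSIX fd tracks the inode, not the path. A same-volume
# rename (analogous to rename_file()) leaves already-open descriptors
# fully usable; only a cross-volume move (copy + unlink, as with
# move_storage() to another volume) would strand them on the old data.
# Paths here are hypothetical stand-ins for torrent storage.
with tempfile.TemporaryDirectory() as d:
    old_path = os.path.join(d, "movie.mkv")
    new_path = os.path.join(d, "renamed.mkv")

    with open(old_path, "wb") as f:
        f.write(b"piece-0")

    fd = os.open(old_path, os.O_RDONLY)  # caller's open_file()-style handle
    os.rename(old_path, new_path)        # same-volume rename: inode unchanged

    # The descriptor still reads the same inode after the rename.
    data = os.read(fd, 7)
    os.close(fd)
    assert data == b"piece-0"
    assert not os.path.exists(old_path)  # only the old name is gone
```

This is why the proposal can let rename_file() ignore open handles entirely, and only needs the "obsolete fd" convention for cross-volume moves.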