Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add get_monotonic_batch(...) #114

Open
frankier opened this issue Dec 11, 2020 · 4 comments
Open

Add get_monotonic_batch(...) #114

frankier opened this issue Dec 11, 2020 · 4 comments

Comments

@frankier
Copy link
Contributor

The current get_batch(...) does all sorts of de-duplication and will seek anywhere needed. I wonder if it would be useful for people who need to get almost successive frames (e.g. own-keyframes [not video format keyframes] known beforehand or frames with a frameskip) to have a get_batch(...) specialized for this access pattern.

@zhreshold
Copy link
Member

Just my two cents for this particular use case:

to get almost successive frames (e.g. own-keyframes [not video format keyframes] known beforehand
if custom keyframes, not video keyframes are desired, then the best way to do is to cache them directly in memory or temporary file, because random important frames won't help decoding anyway.
Not sure if it's worth implementing it in c++ backend, since python level array caching is probably enough.

For example, to get_batch([1, 3, 5, 7, 9]) with [3, 5, 7] being important frames

  • the first run will submit arr = get_batch([1, 3, 5, 7, 9]), slice arr to cache frames [3, 5, 7]
  • the second run get_batch([1, 3, 7]) will check the cache, [3, 7] is already there, so it should submit get_batch([1,]) instead, combine the cache and return the array

@frankier Anything else you want to add to this?

@frankier
Copy link
Contributor Author

My use case involves only reading the video in a single pass, so no caching is needed. I suppose in terms of optimising I was thinking about at least two things:

  1. We don't have to deduplicate/sort the list in get_batch because we know it's already sorted. Therefore if we try to seek forward and find diff <= 0, we can just through an exception at this point.
  2. We don't have to keep bisecting for the current/next keyframe. We can just keep track of the current keyframe (this could help in general usage anyway) and scan forward from there to find the first before our next frame.

I don't know if 2. would be that big a speed up -- and perhaps the idea of keeping track of the current keyframe could be integrated in anyway but it would be nice to at least have 1. so we can opt out of the (useful in the general case) get_batch(...) pre-processing.

Feel free to close this issue if you don't think it's helpful. I just started thinking about this while trying to figure out #111

@frankier
Copy link
Contributor Author

Otherwise let me know which way you think is best, and I can put together a PR.

@zhreshold
Copy link
Member

@frankier Sorry I missed the update for a while. After a rethinking, I feel like the specific access pattern could benefit from another class, e.g., MonotonicReader with the constructor knowing what the desired keyframes you are requesting when you initially load the video, this will avoid messing up with the current foolproof design of VideoReader and can benefit more from specific optimizations if all you want is to quickly load some key frames and done.

  • you can skip constructing lots of redundant stuffs that are implemented in the existing VideoReader
  • you can of course implement 1&2 from your last comment

Does it sound good to you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants