
(WIP) Initial implementation of the new videoReader API #2683

Merged: 143 commits merged from bjuncek/base_api into pytorch:master on Oct 7, 2020

Conversation

@bjuncek (Contributor) commented Sep 17, 2020

Per the description in #2660, here is a proof-of-concept implementation for video reading and metadata access; a minimal usage sketch follows the feature lists below.
THIS API IS STILL EXPERIMENTAL AND WILL LIKELY BE CHANGED/MODIFIED

Some key features:

  1. Confirms identical results to the "video_reader" backend.
  2. If merged, would fix the VideoReader segfault on SOME videos (#2650), thanks to the underlying API change.

Missing features:

  • audio stream not properly tested
  • performance: at the moment there is a constant overhead for tensor allocation
  • no support for random byte, CC, and SUB streams
  • seek not properly tested
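
A minimal usage sketch, based only on the constructor shown in the review excerpt below and on behaviors named in the commit list; the method names next, seek, and get_metadata and their signatures are inferred and may not match the final API:

import torch
import torchvision  # importing torchvision registers the custom torch.classes bindings

# Constructor as it appears in the test under review: a path plus the default stream.
reader = torch.classes.torchvision.Video("path/to/video.mp4", "video")

# Hypothetical metadata accessor; the commit list only confirms that metadata
# registration works, not the exact method name or return shape.
metadata = reader.get_metadata()

# Per the commit list, a decode step returns a (tensor, pts) pair instead of a
# tensor list, with the frame laid out as RGB x H x W.
frame, pts = reader.next("video")  # whether `next` takes a stream argument is an assumption

# Seek accuracy is covered by the new tests; the argument is assumed to be a
# timestamp in seconds.
reader.seek(2.0)
frame, pts = reader.next("video")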

.circleci/config.yml.in (review thread marked outdated and resolved)
s = min(r)
e = max(r)

reader = torch.classes.torchvision.Video(full_path, "video")
Member

For a follow-up PR: we should expose Video in torchvision, so that you can access it via torchvision.io.Video or something like that.
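
A hedged sketch of what such a thin Python wrapper might look like; the module placement, class name, and delegated method are assumptions following the comment above, not the eventual torchvision.io surface:

import torch
import torchvision  # importing torchvision registers torch.classes.torchvision.Video


class Video:
    """Illustrative thin wrapper over the C++ binding; not the final torchvision.io API."""

    def __init__(self, path: str, stream: str = "video"):
        self._c = torch.classes.torchvision.Video(path, stream)

    def __iter__(self):
        return self

    def __next__(self):
        # Assumed to mirror the C++ `next`, which returns a (frame, pts) pair.
        return self._c.next("video")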

Contributor Author

Adding that to the #2660 feature tracker.

self.assertEqual(tv_result.size(), new_api.size())

def test_partial_video_reading_fn(self):
torchvision.set_video_backend("video_reader")
Member

We might need to comment this out for now. Many of the test issues we had before were due to switching globally to the video_reader backend during the tests.
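
An alternative to commenting the line out would be to confine the global switch to this test class by saving and restoring the backend; a sketch using torchvision's public get_video_backend()/set_video_backend() helpers (the setUp/tearDown placement is illustrative, not what was actually pushed):

import unittest

import torchvision


class TestPartialVideoReading(unittest.TestCase):
    def setUp(self):
        # Remember whichever backend was active so other tests are unaffected.
        self._prev_backend = torchvision.get_video_backend()
        torchvision.set_video_backend("video_reader")

    def tearDown(self):
        # Restore the original backend even if the test body failed.
        torchvision.set_video_backend(self._prev_backend)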

Contributor Author

Sure - pushed the changes

@bjuncek (Contributor Author) commented Oct 6, 2020

@fmassa it seems to be segfaulting on Travis.
Looking at the log, it picks up av from conda, and the segfault is likely due to ffmpeg 4.3.1.

The relevant output from the raw log:

  av                 conda-forge/linux-64::av-8.0.2-py36hf21bf4b_1
  bzip2              conda-forge/linux-64::bzip2-1.0.8-h516909a_3
  ffmpeg             conda-forge/linux-64::ffmpeg-4.3.1-h167e202_0

which makes sense given that Travis is installing av from conda with no ffmpeg version check.
Should we add `conda install -c conda-forge ffmpeg=4.2.2` before this?

@fmassa (Member) commented Oct 7, 2020

Should we add conda install -c conda-forge ffmpeg=4.2.2 before this

We have disabled all IO tests in Travis; TravisCI now only compiles those blocks. So for now I would say you can just skip test_video, as we do for TestIO and TestVideoReader in:

- pytest --cov-config .coveragerc --cov torchvision --cov $TV_INSTALL_PATH -k 'not TestVideoReader and not TestVideoTransforms and not TestIO' test --ignore=test/test_datasets_download.py

@codecov bot commented Oct 7, 2020

Codecov Report

Merging #2683 into master will increase coverage by 0.68%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #2683      +/-   ##
==========================================
+ Coverage   72.42%   73.11%   +0.68%     
==========================================
  Files          96       96              
  Lines        8313     8332      +19     
  Branches     1293     1299       +6     
==========================================
+ Hits         6021     6092      +71     
+ Misses       1903     1848      -55     
- Partials      389      392       +3     
Impacted Files                  Coverage Δ
torchvision/__init__.py         68.75% <0.00%> (+3.12%) ⬆️
torchvision/ops/boxes.py        99.07% <0.00%> (+5.81%) ⬆️
torchvision/io/video.py         80.47% <0.00%> (+11.83%) ⬆️
torchvision/io/_video_opt.py    39.37% <0.00%> (+16.25%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6e639d3...aae1d4f

@fmassa (Member) left a comment

Merging this to move forward; there are a few follow-up cleanups that can be done, but let's do them in a different PR.

@fmassa fmassa merged commit 87c7864 into pytorch:master Oct 7, 2020
@bjuncek bjuncek deleted the bjuncek/base_api branch October 7, 2020 16:35
bryant1410 pushed a commit to bryant1410/vision-1 that referenced this pull request Nov 22, 2020
* adding base files

* setup modification to actually build the thing

* video api constructor registration

* FAIL metadata

* FAIL update for QS

* revert

* debugging with Victor

* adding base files

* setup modification to actually build the thing

* video api constructor registration

* FAIL metadata

* FAIL update for QS

* revert

* debugging with Victor

* metadata registration works

* API build next

* test

* Merge change

* formatting parameters to avoid the segfault

* next now works on a video

* make size of the output tensor format dependent

* Make next work on audio stream only as well

* refactoring the _setCurrentStream param

* Fixing the last frame return and sensor

* todo docs

* Formatting

* cleanup and comments

* introducing new tests for the API

* cleanup

* Comment out unnecesary format (will add following FFMPEG fix)

* Reformat parsing function

* removing the seek bug `get_decoder_params`

* Removing unnecessary code/variables

* enforce RGB24 as a reading format (will crash before ffmpeg fix)

* permute the dimensions to return (RGB x H x W)

* Changing the return type to std::tuple<torch::Tensor, double> as opposed to tensor list

* Adjusting tests for the new return type

* remove unnecessary jitter

* clangangangang

* Metadata return changes (pytorch#1)

* remove implicit calls to set a current stream (pytorch#2)

* Adding new tests to check the accuracy of the seek

* cleanup debugging statements

* adding base files

* setup modification to actually build the thing

* video api constructor registration

* FAIL metadata

* FAIL update for QS

* revert

* debugging with Victor

* adding base files

* video api constructor registration

* FAIL metadata

* FAIL update for QS

* revert

* debugging with Victor

* metadata registration works

* API build next

* test

* Merge change

* formatting parameters to avoid the segfault

* next now works on a video

* make size of the output tensor format dependent

* Make next work on audio stream only as well

* refactoring the _setCurrentStream param

* Fixing the last frame return and sensor

* todo docs

* Formatting

* cleanup and comments

* introducing new tests for the API

* cleanup

* Comment out unnecesary format (will add following FFMPEG fix)

* Reformat parsing function

* removing the seek bug `get_decoder_params`

* Removing unnecessary code/variables

* enforce RGB24 as a reading format (will crash before ffmpeg fix)

* permute the dimensions to return (RGB x H x W)

* Changing the return type to std::tuple<torch::Tensor, double> as opposed to tensor list

* Adjusting tests for the new return type

* remove unnecessary jitter

* clangangangang

* Metadata return changes (pytorch#1)

* remove implicit calls to set a current stream (pytorch#2)

* Adding new tests to check the accuracy of the seek

* cleanup debugging statements

* Addressing PR comments

* addressing Francisco's comments

* CLANG build formatting

* Updated testing to test against pyav for the video tensor reads

* Formatting

* remove pyav from pip deps and add it to conda build

* add pyav and ffmeped to conda builds

* Formatting?

* Setting up linter once and for all hopefully

* Testing pyav

* Fix to 8.0.0

* Try 6.2.0

* See what happens with av from pip

* Remove FFMPEG blocker

* What is going on?

* More tests

* Forgot something

* unblocker

* Check if cache is messing up with things

* Now try with different ffmpeg

* Now try with different ffmpeg

* Testing pyav

* Fix to 8.0.0

* Try 6.2.0

* See what happens with av from pip

* What is going on?

* More tests

* Forgot something

* Check if cache is messing up with things

* Now try with different ffmpeg

* Now try with different ffmpeg

* Do not install av

* Test with ffmpeg 4.2

* clean up video tests

* cleaning up the tests a bit to better test partial reading

* arrgh linter

* Forgot the av test

* forgot av test

* checkout build files from master

* revert circleci

* addressing Franciscos comments

* addressing Franciscos comments

* Ignore ffmpeg in travis

Co-authored-by: Francisco Massa <[email protected]>
Co-authored-by: Edgar Andrés Margffoy Tuay <[email protected]>
vfdev-5 pushed a commit to Quansight/vision that referenced this pull request Dec 4, 2020