VideoReader segfault on SOME videos. #2650

Closed
bjuncek opened this issue Sep 8, 2020 · 4 comments

Comments

@bjuncek
Contributor

bjuncek commented Sep 8, 2020

🐛 Bug

VideoReader segmentation faults on some long videos when using the video_reader backend. This issue is a continuation of #2259: torchvision segfaults when reading an entire test video.

I used to believe this affected long videos only, but it also happens on the test videos we provide, which suggests it might be related to the FFMPEG version installed on the system (the fact that our tests don't catch it points the same way).

To Reproduce

Steps to reproduce the behavior:

  1. Install torchvision from source.
  2. From the repository root, call
    vframes, _, _ = torchvision.io.read_video(path, pts_unit="sec")
    where path=$TVDIR/test/assets/videos/TrumanShow_wave_f_nm_np1_fr_med_26.avi. A full reproduction script is sketched below.
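
For reference, a minimal reproduction script (a sketch; it assumes the TVDIR environment variable points at a torchvision checkout and that the video_reader backend was built):

    import os
    import torchvision

    # Use the C++ video_reader backend, where the crash is reported.
    torchvision.set_video_backend("video_reader")

    path = os.path.join(
        os.environ["TVDIR"],
        "test/assets/videos/TrumanShow_wave_f_nm_np1_fr_med_26.avi",
    )
    # Reading the whole clip triggers the segfault inside libswscale.
    vframes, _, _ = torchvision.io.read_video(path, pts_unit="sec")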

The backtrace suggests the crash happens inside libswscale.

#0  0x00007fff88224cf2 in ?? () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5
#1  0x00007fff88223bb4 in ?? () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5
#2  0x00007fff881f9af4 in sws_scale () from /home/bjuncek/miniconda3/envs/vb/lib/libswscale.so.5

I've previously found this can be caused by conflicting scaling inputs (though it might also be due to a new/different FFMPEG version).

Expected behavior

The video is read successfully, without a segmentation fault.

Environment

Collecting environment information...
PyTorch version: 1.6.0
Is debug build: False
CUDA used to build PyTorch: 10.2

OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: Could not collect

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Quadro RTX 8000
GPU 1: Quadro RTX 8000

Nvidia driver version: 440.33.01
cuDNN version: Probably one of the following:
/usr/local/cuda-10.1/targets/x86_64-linux/lib/libcudnn.so.7
/usr/local/cuda-10.2.89/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries:
[pip3] numpy==1.19.1
[pip3] torch==1.6.0
[pip3] torchvision==0.7.0a0+78ed10c
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py38he904b0f_0
[conda] mkl_fft 1.1.0 py38h23d657b_0
[conda] mkl_random 1.1.1 py38hcb8c335_0 conda-forge
[conda] numpy 1.19.1 py38hbc911f0_0
[conda] numpy-base 1.19.1 py38hfa32c7d_0
[conda] pytorch 1.6.0 py3.8_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchvision 0.7.0a0+78ed10c pypi_0 pypi

Suggested fix

Removing the hidden inputs (specifically size/aspect ratio/crop) from the _read_video op could in principle fix this, but it might be backwards-compatibility breaking if users expose these manually in their code.
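
In the meantime, a possible user-side workaround (a sketch; it assumes the crash is specific to the video_reader backend, as reported above) is to switch to the pyav backend, which does not go through this C++ decoding path:

    import torchvision

    # Decode with the pure-Python PyAV backend instead of video_reader.
    torchvision.set_video_backend("pyav")

    path = "test/assets/videos/TrumanShow_wave_f_nm_np1_fr_med_26.avi"
    vframes, _, _ = torchvision.io.read_video(path, pts_unit="sec")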

cc @bjuncek

@andfoy
Contributor

andfoy commented Sep 16, 2020

More information:

Program received signal SIGSEGV, Segmentation fault.
0x00007f4d206b7fc2 in ff_yuv_420_rgb24_ssse3.loop0 ()
    at libswscale/x86/yuv_2_rgb.asm:376
376	libswscale/x86/yuv_2_rgb.asm: No such file or directory.
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.el7.x86_64
(gdb) bt
#0  0x00007f4d206b7fc2 in ff_yuv_420_rgb24_ssse3.loop0 ()
    at libswscale/x86/yuv_2_rgb.asm:376
#1  0x00007f4d206b6e84 in yuv420_rgb24_ssse3 (c=0x564bca2e59c0, src=0x7ffd52b1c9e0, 
    srcStride=0x7ffd52b1c9c0, srcSliceY=0, srcSliceH=256, dst=0x7ffd52b1ca00, 
    dstStride=0x7ffd52b1c9d0) at libswscale/x86/yuv2rgb_template.c:177
#2  0x00007f4d2068eb45 in sws_scale (c=<optimized out>, srcSlice=<optimized out>, 
    srcStride=<optimized out>, srcSliceY=<optimized out>, srcSliceH=256, 
    dst=<optimized out>, dstStride=0x7ffd52b1cd10) at libswscale/swscale.c:969
#3  0x00007f4d21f62d2b in ffmpeg::(anonymous namespace)::transformImage (
    context=0x564bca2e59c0, srcSlice=0x564bca35f100, srcStride=0x564bca35f140, 
    inFormat=..., outFormat=..., 
    out=0x564bca4c5b60 "\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\025\016\030\024\r\027\024\r\031\024\r\031\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\023\f\030\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\024\r\031\023\f\030\023\f\030\023\f"..., planes=0x7ffd52b1cd20, lines=0x7ffd52b1cd10)
    at /root/vision/torchvision/csrc/cpu/decoder/video_sampler.cpp:46
#4  0x00007f4d21f639a8 in ffmpeg::VideoSampler::sample (this=0x564bca5a4220, 
    srcSlice=0x564bca35f100, srcStride=0x564bca35f140, out=0x564bca4c3490)
    at /root/vision/torchvision/csrc/cpu/decoder/video_sampler.cpp:182
#5  0x00007f4d21f63c1e in ffmpeg::VideoSampler::sample (this=0x564bca5a4220, 
    frame=0x564bca35f100, out=0x564bca4c3490)

The segfaults only occur when MMX/SSE/AVX optimizations are enabled in FFmpeg.

@fmassa
Member

fmassa commented Sep 18, 2020

I believe this issue might be a bug in FFmpeg introduced in FFmpeg/FFmpeg@fc6a588 that has since been fixed in FFmpeg/FFmpeg@ba3e771.

The bug report for this issue is at https://trac.ffmpeg.org/ticket/8747.

If that's the case, then recompiling FFmpeg would solve the issue.

@andfoy
Contributor

andfoy commented Sep 18, 2020

Effectively, this issue is directly related to the regression introduced in FFmpeg 4.3 and fixed in FFmpeg/FFmpeg@ba3e771. On FFmpeg 4.2, the video reader tests pass.
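
For anyone hitting this, a quick way to check which FFmpeg you have (a sketch; it assumes the ffmpeg binary on PATH is the same build whose libswscale torchvision links against):

    import re
    import subprocess

    # Ask the ffmpeg CLI for its version string, e.g. "ffmpeg version 4.3 ...".
    out = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=True
    ).stdout
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", out)
    if match:
        major, minor = map(int, match.groups())
        if (major, minor) >= (4, 3):
            print(f"FFmpeg {major}.{minor}: may be affected unless it includes ba3e771")
        else:
            print(f"FFmpeg {major}.{minor}: predates the regression")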

@bjuncek
Contributor Author

bjuncek commented Oct 15, 2020

Given that this was a known issue in FFmpeg, and that it is fixed by using a different FFmpeg version, I'm closing this issue.
