Opened 5 years ago
Last modified 5 years ago
#8379 new defect
problem with h264_qsv, lookahead and scale_qsv
Reported by: | Francesco Santagata | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avfilter |
Version: | git-master | Keywords: | qsv |
Cc: | linjie.fu@intel.com, p4rancesc0@gmail.com | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no |
Description
Summary of the bug:
When using scale_qsv together with look-ahead and look_ahead_depth > 20, ffmpeg fails with:
Error while filtering: Cannot allocate memory
Failed to inject frame into filter network: Cannot allocate memory
The same command line with vpp_qsv instead of scale_qsv works.
How to reproduce:
% ffmpeg -v verbose -y -hwaccel qsv -init_hw_device qsv=qsv0:hw -filter_hw_device qsv0 -i tango-and-cash-short-640.mpg \
> -vf setfield=prog,hwupload=extra_hw_frames=80,scale_qsv=512:288:mode=2 -async_depth 1 \
> -acodec libfdk_aac -ab 64000 -profile:a aac_he -cutoff 18000 -metadata:s:a:0 language=ita \
> -vcodec h264_qsv -b:v 512000 -minrate 460800 -maxrate 563200 -bufsize 512000 -g 75 -forced_idr 1 -preset veryslow -rdo 0 -look_ahead 1 -look_ahead_depth 40 -profile:v high -level 4 not-working.mp4
ffmpeg version 4.2.1-moonshot Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7 (GCC)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --mandir=/usr/share/man --extra-version=moonshot --enable-shared --enable-runtime-cpudetect --enable-gpl --enable-version3 --enable-postproc --enable-avfilter --enable-pthreads --enable-libgsm --enable-libxvid --enable-bzlib --enable-nonfree --enable-libx264 --disable-static --disable-debug --enable-libx265 --enable-pic --enable-libfreetype --enable-libfontconfig --enable-libfdk-aac --enable-libmfx --enable-libmp3lame --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --extra-cflags='-fPIC -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/local/include -L=/usr/local/lib -Wl, -Bdynamic ' --disable-stripping
  libavutil 56. 31.100 / 56. 31.100
  libavcodec 58. 54.100 / 58. 54.100
  libavformat 58. 29.100 / 58. 29.100
  libavdevice 58. 8.100 / 58. 8.100
  libavfilter 7. 57.100 / 7. 57.100
  libswscale 5. 5.100 / 5. 5.100
  libswresample 3. 5.100 / 3. 5.100
  libpostproc 55. 5.100 / 55. 5.100
[AVHWDeviceContext @ 0xc0c080] Trying to use DRM render node for device 0.
[AVHWDeviceContext @ 0xc0c080] libva: VA-API version 1.5.0
[AVHWDeviceContext @ 0xc0c080] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0xc0c080] libva: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
[AVHWDeviceContext @ 0xc0c080] libva: Found init function __vaDriverInit_1_5
[AVHWDeviceContext @ 0xc0c080] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0xc0c080] Initialised VAAPI connection: version 1.5
[AVHWDeviceContext @ 0xc0c080] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0xc0c080] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0xc0bb40] Initialize MFX session: API version is 1.30, implementation version is 1.30
[AVHWDeviceContext @ 0xc0bb40] MFX compile/runtime API: 1.30/1.30
[mpeg @ 0xc8c340] max_analyze_duration 5000000 reached at 5000000 microseconds st:0
Input #0, mpeg, from 'tango-and-cash-short-640.mpg':
  Duration: 00:01:00.72, start: 0.540000, bitrate: 534 kb/s
    Stream #0:0[0x1e0]: Video: mpeg2video (Main), 1 reference frame, yuv420p(tv, progressive, left), 640x480 [SAR 4:3 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 192 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (mp2 (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[graph_1_in_0_1 @ 0xcf1f00] tb:1/48000 samplefmt:s16p samplerate:48000 chlayout:0x3
[format_out_0_1 @ 0xc0ab40] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
[auto_resampler_0 @ 0xcf5080] ch:2 chl:stereo fmt:s16p r:48000Hz -> ch:2 chl:stereo fmt:s16 r:48000Hz
[graph 0 input from stream 0:0 @ 0xeac300] w:640 h:480 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:4/3 sws_param:flags=2
[auto_scaler_0 @ 0xeabc00] w:iw h:ih flags:'bicubic' interl:0
[Parsed_hwupload_1 @ 0xeaaf40] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_setfield_0' and the filter 'Parsed_hwupload_1'
[auto_scaler_0 @ 0xeabc00] w:640 h:480 fmt:yuv420p sar:4/3 -> w:640 h:480 fmt:nv12 sar:4/3 flags:0x4
[AVHWDeviceContext @ 0xeac000] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0xeac000] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0xebff80] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0xebff80] Driver not found in known nonstandard list, using standard behaviour.
[Parsed_scale_qsv_2 @ 0xeab540] Scaling mode: 2
[Parsed_scale_qsv_2 @ 0xeab540] w:640 h:480 -> w:512 h:288
[h264_qsv @ 0xca8940] Using the VBR with lookahead (LA) ratecontrol method
[h264_qsv @ 0xca8940] MFMode:2
[AVHWDeviceContext @ 0xf39380] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0xf39380] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0x1046400] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x1046400] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0xca8940] profile: high; level: 21
[h264_qsv @ 0xca8940] GopPicSize: 75; GopRefDist: 4; GopOptFlag: closed ; IdrInterval: 0
[h264_qsv @ 0xca8940] TargetUsage: 1; RateControlMethod: LA
[h264_qsv @ 0xca8940] TargetKbps: 512; LookAheadDepth: 40; BRCParamMultiplier: 1
[h264_qsv @ 0xca8940] NumSlice: 1; NumRefFrame: 3
[h264_qsv @ 0xca8940] RateDistortionOpt: OFF
[h264_qsv @ 0xca8940] RecoveryPointSEI: OFF IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0xca8940] MaxFrameSize: 110592; MaxSliceSize: 0;
[h264_qsv @ 0xca8940] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0xca8940] Trellis: auto
[h264_qsv @ 0xca8940] VDENC: OFF
[h264_qsv @ 0xca8940] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: off
[h264_qsv @ 0xca8940] AdaptiveI: OFF; AdaptiveB: OFF; BRefType: pyramid
[h264_qsv @ 0xca8940] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0xca8940] Entropy coding: CABAC; MaxDecFrameBuffering: 3
[h264_qsv @ 0xca8940] NalHrdConformance: OFF; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: OFF
[h264_qsv @ 0xca8940] FrameRateExtD: 1; FrameRateExtN: 25
Output #0, mp4, to 'not-working.mp4':
  Metadata:
    encoder : Lavf58.29.100
    Stream #0:0: Video: h264 (h264_qsv), 1 reference frame (avc1 / 0x31637661), qsv(left), 512x288 [SAR 1:1 DAR 16:9], q=-1--1, 512 kb/s, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder : Lavc58.54.100 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 563200/460800/512000 buffer size: 512000 vbv_delay: -1
    Stream #0:1(ita): Audio: aac (libfdk_aac) (HE-AAC) (mp4a / 0x6134706D), 48000 Hz, stereo, s16, delay 5058, 64 kb/s
    Metadata:
      encoder : Lavc58.54.100 libfdk_aac
Error while filtering: Cannot allocate memory
Failed to inject frame into filter network: Cannot allocate memory
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0xc94740] Statistics: 0 seeks, 1 writeouts
[libfdk_aac @ 0xc93280] 3 frames left in the queue on closing
[AVIOContext @ 0xc94fc0] Statistics: 774288 bytes read, 2 seeks
Conversion failed!
% ffmpeg -v verbose -y -hwaccel qsv -init_hw_device qsv=qsv0:hw -filter_hw_device qsv0 -i tango-and-cash-short-640.mpg \
> -vf setfield=prog,hwupload=extra_hw_frames=80,vpp_qsv=w=512:h=288 -async_depth 1 \
> -acodec libfdk_aac -ab 64000 -profile:a aac_he -cutoff 18000 -metadata:s:a:0 language=ita \
> -vcodec h264_qsv -b:v 512000 -minrate 460800 -maxrate 563200 -bufsize 512000 -g 75 -forced_idr 1 -preset veryslow -rdo 0 -look_ahead 1 -look_ahead_depth 40 -profile:v high -level 4 working.mp4
ffmpeg version 4.2.1-moonshot Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7 (GCC)
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64 --mandir=/usr/share/man --extra-version=moonshot --enable-shared --enable-runtime-cpudetect --enable-gpl --enable-version3 --enable-postproc --enable-avfilter --enable-pthreads --enable-libgsm --enable-libxvid --enable-bzlib --enable-nonfree --enable-libx264 --disable-static --disable-debug --enable-libx265 --enable-pic --enable-libfreetype --enable-libfontconfig --enable-libfdk-aac --enable-libmfx --enable-libmp3lame --extra-cflags='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -fPIC' --extra-cflags='-fPIC -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -I/usr/local/include -L=/usr/local/lib -Wl, -Bdynamic ' --disable-stripping
  libavutil 56. 31.100 / 56. 31.100
  libavcodec 58. 54.100 / 58. 54.100
  libavformat 58. 29.100 / 58. 29.100
  libavdevice 58. 8.100 / 58. 8.100
  libavfilter 7. 57.100 / 7. 57.100
  libswscale 5. 5.100 / 5. 5.100
  libswresample 3. 5.100 / 3. 5.100
  libpostproc 55. 5.100 / 55. 5.100
[AVHWDeviceContext @ 0x1710080] Trying to use DRM render node for device 0.
[AVHWDeviceContext @ 0x1710080] libva: VA-API version 1.5.0
[AVHWDeviceContext @ 0x1710080] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x1710080] libva: Trying to open /opt/intel/mediasdk/lib64/iHD_drv_video.so
[AVHWDeviceContext @ 0x1710080] libva: Found init function __vaDriverInit_1_5
[AVHWDeviceContext @ 0x1710080] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x1710080] Initialised VAAPI connection: version 1.5
[AVHWDeviceContext @ 0x1710080] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x1710080] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0x170fb40] Initialize MFX session: API version is 1.30, implementation version is 1.30
[AVHWDeviceContext @ 0x170fb40] MFX compile/runtime API: 1.30/1.30
[mpeg @ 0x1790340] max_analyze_duration 5000000 reached at 5000000 microseconds st:0
Input #0, mpeg, from 'tango-and-cash-short-640.mpg':
  Duration: 00:01:00.72, start: 0.540000, bitrate: 534 kb/s
    Stream #0:0[0x1e0]: Video: mpeg2video (Main), 1 reference frame, yuv420p(tv, progressive, left), 640x480 [SAR 4:3 DAR 16:9], 25 fps, 25 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x1c0]: Audio: mp2, 48000 Hz, stereo, s16p, 192 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (mpeg2video (native) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (mp2 (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[graph_1_in_0_1 @ 0x17f5f00] tb:1/48000 samplefmt:s16p samplerate:48000 chlayout:0x3
[format_out_0_1 @ 0x170eb40] auto-inserting filter 'auto_resampler_0' between the filter 'Parsed_anull_0' and the filter 'format_out_0_1'
[auto_resampler_0 @ 0x17f9080] ch:2 chl:stereo fmt:s16p r:48000Hz -> ch:2 chl:stereo fmt:s16 r:48000Hz
[graph 0 input from stream 0:0 @ 0x19afe80] w:640 h:480 pixfmt:yuv420p tb:1/90000 fr:25/1 sar:4/3 sws_param:flags=2
[auto_scaler_0 @ 0x19b17c0] w:iw h:ih flags:'bicubic' interl:0
[Parsed_hwupload_1 @ 0x19aee40] auto-inserting filter 'auto_scaler_0' between the filter 'Parsed_setfield_0' and the filter 'Parsed_hwupload_1'
[auto_scaler_0 @ 0x19b17c0] w:640 h:480 fmt:yuv420p sar:4/3 -> w:640 h:480 fmt:nv12 sar:4/3 flags:0x4
[AVHWDeviceContext @ 0x19af100] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x19af100] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0x19e9200] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x19e9200] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x17ac940] Using the VBR with lookahead (LA) ratecontrol method
[h264_qsv @ 0x17ac940] MFMode:2
[AVHWDeviceContext @ 0x19de940] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x19de940] Driver not found in known nonstandard list, using standard behaviour.
[AVHWDeviceContext @ 0x1b6ef80] VAAPI driver: Intel iHD driver - 19.3.0.
[AVHWDeviceContext @ 0x1b6ef80] Driver not found in known nonstandard list, using standard behaviour.
[h264_qsv @ 0x17ac940] profile: high; level: 21
[h264_qsv @ 0x17ac940] GopPicSize: 75; GopRefDist: 4; GopOptFlag: closed ; IdrInterval: 0
[h264_qsv @ 0x17ac940] TargetUsage: 1; RateControlMethod: LA
[h264_qsv @ 0x17ac940] TargetKbps: 512; LookAheadDepth: 40; BRCParamMultiplier: 1
[h264_qsv @ 0x17ac940] NumSlice: 1; NumRefFrame: 3
[h264_qsv @ 0x17ac940] RateDistortionOpt: OFF
[h264_qsv @ 0x17ac940] RecoveryPointSEI: OFF IntRefType: 0; IntRefCycleSize: 0; IntRefQPDelta: 0
[h264_qsv @ 0x17ac940] MaxFrameSize: 110592; MaxSliceSize: 0;
[h264_qsv @ 0x17ac940] BitrateLimit: ON; MBBRC: OFF; ExtBRC: OFF
[h264_qsv @ 0x17ac940] Trellis: auto
[h264_qsv @ 0x17ac940] VDENC: OFF
[h264_qsv @ 0x17ac940] RepeatPPS: OFF; NumMbPerSlice: 0; LookAheadDS: off
[h264_qsv @ 0x17ac940] AdaptiveI: OFF; AdaptiveB: OFF; BRefType: pyramid
[h264_qsv @ 0x17ac940] MinQPI: 0; MaxQPI: 0; MinQPP: 0; MaxQPP: 0; MinQPB: 0; MaxQPB: 0
[h264_qsv @ 0x17ac940] Entropy coding: CABAC; MaxDecFrameBuffering: 3
[h264_qsv @ 0x17ac940] NalHrdConformance: OFF; SingleSeiNalUnit: ON; VuiVclHrdParameters: OFF VuiNalHrdParameters: OFF
[h264_qsv @ 0x17ac940] FrameRateExtD: 1; FrameRateExtN: 25
Output #0, mp4, to 'working.mp4':
  Metadata:
    encoder : Lavf58.29.100
    Stream #0:0: Video: h264 (h264_qsv), 1 reference frame (avc1 / 0x31637661), qsv(left), 512x288 [SAR 4:3 DAR 64:27], q=-1--1, 512 kb/s, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder : Lavc58.54.100 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 563200/460800/512000 buffer size: 512000 vbv_delay: -1
    Stream #0:1(ita): Audio: aac (libfdk_aac) (HE-AAC) (mp4a / 0x6134706D), 48000 Hz, stereo, s16, delay 5058, 64 kb/s
    Metadata:
      encoder : Lavc58.54.100 libfdk_aac
No more output streams to write to, finishing.
frame= 1500 fps=135 q=25.0 Lsize= 4265kB time=00:01:00.77 bitrate= 574.9kbits/s speed=5.47x
video:3750kB audio:476kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.928828%
Input file #0 (tango-and-cash-short-640.mpg):
  Input stream #0:0 (video): 1500 packets read (2556038 bytes); 1500 frames decoded;
  Input stream #0:1 (audio): 2531 packets read (1457856 bytes); 2531 frames decoded (2915712 samples);
  Total: 4031 packets (4013894 bytes) demuxed
Output file #0 (working.mp4):
  Output stream #0:0 (video): 1500 frames encoded; 1500 packets muxed (3840285 bytes);
  Output stream #0:1 (audio): 1424 frames encoded (2915712 samples); 1427 packets muxed (487106 bytes);
  Total: 2927 packets (4327391 bytes) muxed
[AVIOContext @ 0x1798740] Statistics: 2 seeks, 20 writeouts
[AVIOContext @ 0x1798fc0] Statistics: 4698256 bytes read, 2 seeks
Thanks
Change History (9)
comment:1 by , 5 years ago
Cc: | added |
---|
comment:2 by , 5 years ago
Hi, I tried up to 800 frames, but nothing changes.
I first tried with full HD, then scaled the source file down to 640x480; the result is always the same.
Here are the GEM object statistics while idle:
23 objects, 544768 bytes
0 unbound objects, 0 bytes
23 bound objects, 544768 bytes
1 purgeable objects, 4096 bytes
10 mapped objects, 368640 bytes
0 display objects (pinned), 0 bytes
4294967296 [268435456] gtt total
[k]contexts: 10 objects, 225280 bytes (0 active, 225280 inactive, 225280 global, 0 shared, 0 unbound)
Here they are while running the non-working command line:
44 objects, 16154624 bytes
0 unbound objects, 0 bytes
44 bound objects, 16154624 bytes
16 purgeable objects, 15364096 bytes
11 mapped objects, 462848 bytes
0 display objects (pinned), 0 bytes
4294967296 [268435456] gtt total
[k]contexts: 10 objects, 225280 bytes (0 active, 225280 inactive, 225280 global, 0 shared, 0 unbound)
And here they are while running the working command line:
474 objects, 106700800 bytes
28 unbound objects, 14798848 bytes
277 bound objects, 33972224 bytes
1 purgeable objects, 4096 bytes
12 mapped objects, 557056 bytes
0 display objects (pinned), 0 bytes
4294967296 [268435456] gtt total
[k]contexts: 18 objects, 585728 bytes (0 active, 585728 inactive, 585728 global, 0 shared, 0 unbound)
ffmpeg: 447 objects, 105934848 bytes (3641344 active, 50814976 inactive, 1064960 global, 0 shared, 72728576 unbound)
B.R.
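To make the three dumps above easier to compare, here is a minimal sketch that extracts the leading "N objects, M bytes" total from each dump and prints the difference between two of them. It assumes each dump was saved to a text file (for example from the i915 debugfs entry, typically /sys/kernel/debug/dri/0/i915_gem_objects, which is an assumption about where these numbers came from). The function names gem_totals and gem_delta are made up for this illustration.

```shell
#!/bin/sh
# gem_totals: read a GEM object dump on stdin and print "<objects> <bytes>",
# taken from the first line of the form "N objects, M bytes".
gem_totals() {
    awk '/^[0-9]+ objects, [0-9]+ bytes/ { print $1, $3; exit }'
}

# gem_delta BEFORE AFTER: print how many objects and bytes the AFTER dump
# holds on top of the BEFORE dump (both are file names).
gem_delta() {
    before=$(gem_totals < "$1")
    after=$(gem_totals < "$2")
    # "${x%% *}" is the object count, "${x##* }" is the byte count.
    echo "$(( ${after%% *} - ${before%% *} )) objects, $(( ${after##* } - ${before##* } )) bytes"
}
```

With the idle and non-working dumps saved as idle.txt and fail.txt (hypothetical file names), `gem_delta idle.txt fail.txt` reports the extra objects and bytes held during the failing run; running it against the working dump shows how much more the vpp_qsv pipeline allocates.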
comment:3 by , 5 years ago
Version: | 4.2 → unspecified |
---|
Please confirm that the issue is reproducible with current FFmpeg git head.
comment:4 by , 5 years ago
I was able to reproduce the issue with:
ffmpeg version 4.2.1
intel-media-stack 19.3.0
linux kernel 3.10.0-693.21.1.el7.x86_64
B.R.
comment:5 by , 5 years ago
Cc: | added |
---|---|
Version: | unspecified → 4.2 |
comment:6 by , 5 years ago
Version: | 4.2 → unspecified |
---|
Please confirm that the issue is reproducible with current FFmpeg git head, the only version supported on this bug tracker.
comment:7 by , 5 years ago
Thanks Carl, confirmed that it could be reproduced with git head.
The root cause seems to be missing support for extra_hw_frames in scale_qsv (which vpp_qsv supports).
Hi phrancesco, would you please try the following patch?
(Please provide detailed information from the latest FFmpeg git head next time, thanks.)
follow-up: 9 comment:8 by , 5 years ago
Version: | unspecified → git-master |
---|
Thanks,
I applied the patch and built a new rpm using git head.
The problem is still there: look-ahead depth <= 26 works, look-ahead depth > 26 doesn't.
Same "Cannot allocate memory" error.
BR
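The threshold can be located mechanically with a small sweep. The following is only a sketch: try_depth is a hypothetical helper that reruns the video part of the failing command line with a given -look_ahead_depth (audio is dropped and the output is discarded through the null muxer, which is a simplification of the original command), and last_working_depth scans a range and prints the last depth that still succeeded.

```shell
#!/bin/sh
# try_depth DEPTH: hypothetical helper. Runs a trimmed version of the failing
# command (no audio, null output) and returns ffmpeg's exit status.
# INPUT may be set to override the sample file name.
try_depth() {
    depth=$1
    ffmpeg -v error -y -hwaccel qsv -init_hw_device qsv=qsv0:hw \
        -filter_hw_device qsv0 -i "${INPUT:-tango-and-cash-short-640.mpg}" \
        -vf setfield=prog,hwupload=extra_hw_frames=80,scale_qsv=512:288:mode=2 \
        -async_depth 1 -an \
        -vcodec h264_qsv -b:v 512000 -minrate 460800 -maxrate 563200 \
        -bufsize 512000 -g 75 -forced_idr 1 -preset veryslow -rdo 0 \
        -look_ahead 1 -look_ahead_depth "$depth" -profile:v high -level 4 \
        -f null - >/dev/null 2>&1
}

# last_working_depth LO HI [RUNNER]
# Tries depths LO..HI in increasing order with RUNNER (default: try_depth),
# stops at the first failure, and prints the last depth that succeeded.
# Prints nothing if LO itself fails.
last_working_depth() {
    lo=$1
    hi=$2
    runner=${3:-try_depth}
    last=
    d=$lo
    while [ "$d" -le "$hi" ]; do
        if "$runner" "$d"; then
            last=$d
        else
            break
        fi
        d=$((d + 1))
    done
    if [ -n "$last" ]; then
        echo "$last"
    fi
}
```

For example, `last_working_depth 10 60` would be expected to print 26 on the setup described in this comment. A linear scan is used rather than a bisection because it is not established that the failure is monotonic in the depth.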
comment:9 by , 5 years ago
Replying to phrancesco:
> Thanks,
> I applied the patch and built a new rpm using git head.
> The problem is still there, lookahead depth <= 26 is working, lookahead depth > 26 doesn't work.
> Same "Cannot allocate memory error."
Hi phrancesco,
Would you please try setting extra_hw_frames on scale_qsv itself:
$ ffmpeg ... scale_qsv=512:288:mode=2:extra_hw_frames=64 ...
(The command line in the report did not actually set extra_hw_frames for the scale filter, only for hwupload.)
In fact, this works on my side with or without the patch: ff_filter_init_hw_frames() allocates enough memory/surfaces according to extra_hw_frames.
The reason for the difference between vpp_qsv and scale_qsv is the default initial_pool_size:
vpp_qsv: 64
scale_qsv: 32
Maybe a patch is needed here to give both filters the same default behavior.
The log seems to show that the allocated memory is not enough; you may try a larger extra_hw_frames value to get it to work. It is a bit strange, though, that vpp_qsv works with 80 extra_hw_frames while scale_qsv doesn't.
Would you please compare the GEM object usage between the two?
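For trying several values, the filter chain from the report can be generated with extra_hw_frames also set on scale_qsv. This is a small sketch only: qsv_scale_chain is a made-up helper name, and the default of 64 is simply the value suggested above, not a recommended setting.

```shell
#!/bin/sh
# qsv_scale_chain [EXTRA_FRAMES]: print the -vf chain from the report with
# extra_hw_frames additionally set on scale_qsv (default 64, as suggested above).
qsv_scale_chain() {
    extra=${1:-64}
    printf 'setfield=prog,hwupload=extra_hw_frames=80,scale_qsv=512:288:mode=2:extra_hw_frames=%s\n' "$extra"
}

# Possible use, with the remaining options taken unchanged from the report:
#   ffmpeg ... -vf "$(qsv_scale_chain 64)" ... -look_ahead 1 -look_ahead_depth 40 ...
```

Raising the argument step by step (64, 96, 128) while keeping look_ahead_depth fixed would show whether the failure really tracks the size of the scale_qsv output pool.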