Opened 4 years ago
Closed 3 years ago
#8849 closed defect (fixed)
sub2video does not work with overlay_cuda
Reported by: | Bogdan Ilisei | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | undetermined |
Version: | unspecified | Keywords: | |
Cc: | | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary of the bug:
sub2video seems to fail when attempting to overlay (burn in) a DVB subtitle onto a video using overlay_cuda.
The same chain works fine when a transparent PNG is used as the second input instead of the subtitle stream. A similar chain with the same source also works with the normal overlay filter, using hwdownload/hwupload around it while retaining hardware decoding and encoding.
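For reference, the software-overlay variant mentioned above might look roughly like the sketch below. The exact command used is not included in this report, so the filtergraph, the file names, and the reduced option set are assumptions, and the script has not been run against the sample file. It prints the filtergraph by default and only invokes ffmpeg when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical sketch of the software-overlay fallback; not taken from the report.
INPUT="${INPUT:-in.ts}"     # placeholder input name
OUTPUT="${OUTPUT:-out.ts}"  # placeholder output name

# Download the decoded CUDA frames to system memory, composite the sub2video
# stream with the software overlay filter, then upload the result for NVENC.
FILTER='[0:v] hwdownload,format=nv12 [base]; [base][0:s] overlay,format=yuv420p,hwupload [v]'

run_overlay() {
    ffmpeg -init_hw_device cuda=cuda -filter_hw_device cuda \
        -hwaccel cuda -hwaccel_output_format cuda -i "$INPUT" \
        -filter_complex "$FILTER" \
        -map "[v]" -map 0:a -c:v h264_nvenc -b:v 5M -c:a copy \
        -f mpegts -y "$OUTPUT"
}

# Dry run by default: print the filtergraph; set RUN=1 to actually invoke ffmpeg.
if [ "${RUN:-0}" = "1" ]; then
    run_overlay
else
    printf '%s\n' "$FILTER"
fi
```

The only difference from the failing chain is that the compositing step runs in system memory, which costs one download and one upload per frame.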
Sample file: https://0x0.st/iYuU.ts
I based the filter logic on these examples: https://patchwork.ffmpeg.org/project/ffmpeg/patch/20200318071955.2329-1-yyyaroslav@gmail.com/
How to reproduce:
# ./ffmpeg_npp -v verbose -report -dump_filtergraph fmt=dot:filename=./graph.dot -nostats -vsync 0 -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts -filter_complex "[0:s] format=yuva420p,hwupload [0s]; [0:v] scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda [v]" -map "[v]" -map 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M -profile:v main -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y out.ts
ffmpeg started on 2020-08-14 at 02:34:38
Report written to "ffmpeg-20200814-023438.log"
Log level: 48
ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --pkg-config=pkg-config --pkg-config-flags=--static --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic' --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86/ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-nonfree --enable-libfdk-aac
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.100.100 / 58.100.100
  libavformat    58. 50.100 / 58. 50.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
[h264 @ 0x564b2f71bb00] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x564b2f71bb00] Increasing reorder buffer to 2
[mpegts @ 0x564b2f7156c0] max_analyze_duration 5000000 reached at 5016000 microseconds st:1
WARNING: defaulting hwaccel_output_format to cuda for compatibility with old commandlines. This behaviour is DEPRECATED and will be removed in the future. Please explicitly set "-hwaccel_output_format cuda".
Input #0, mpegts, from 'in.ts':
  Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
[h264_mp4toannexb @ 0x564b2f7eef40] The input looks like it is Annex B already
[h264_cuvid @ 0x564b315982c0] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 0x564b315982c0] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
[h264_cuvid @ 0x564b315982c0] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
[h264_cuvid @ 0x564b315982c0] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 (h264_cuvid) -> scale_npp
  Stream #0:3 (dvbsub) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
Press [q] to stop, [?] for help
[h264_cuvid @ 0x564b315982c0] Formats: Original: cuda | HW: cuda | SW: nv12
[mpegts @ 0x564b2f7156c0] sub2video: using 1920x1080 canvas
[graph 0 input from stream 0:3 @ 0x564b2f902ac0] w:1920 h:1080 pixfmt:bgra tb:1/90000 fr:0/1 sar:0/1
[graph 0 input from stream 0:0 @ 0x564b2f903740] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:25/1 sar:1/1
[auto_scaler_0 @ 0x564b2f906a00] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 0x564b30f98540] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:3' and the filter 'Parsed_format_0'
[Parsed_scale_npp_2 @ 0x564b30f99600] w:1920 h:1080 -> w:1920 h:1080
[auto_scaler_0 @ 0x564b2f906a00] w:1920 h:1080 fmt:bgra sar:0/1 -> w:1920 h:1080 fmt:yuva420p sar:0/1 flags:0x2
[Parsed_overlay_cuda_3 @ 0x564b2f901ac0] [framesync @ 0x564b2f901bf8] Sync level 2
[h264_nvenc @ 0x564b2f8125c0] Using input frames context (format cuda) with h264_nvenc encoder.
[h264_nvenc @ 0x564b2f8125c0] Loaded Nvenc version 10.0
[h264_nvenc @ 0x564b2f8125c0] Nvenc initialized successfully
[h264_nvenc @ 0x564b2f8125c0] Temporal AQ enabled.
[mpegts @ 0x564b2f8c8d80] service 1 using PCR in pid=256, pcr_period=80ms
[mpegts @ 0x564b2f8c8d80] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'out.ts':
  Metadata:
    encoder         : Lavf58.50.100
    Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.100.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000 vbv_delay: N/A
    Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
Error while add the frame to buffer source(Internal bug, should not have happened).
Error while filtering: Internal bug, should not have happened
Failed to inject frame into filter network: Internal bug, should not have happened
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x564b2f7fc100] Statistics: 0 seeks, 0 writeouts
[h264_nvenc @ 0x564b2f8125c0] Nvenc unloaded
[AVIOContext @ 0x564b2f71e580] Statistics: 5525648 bytes read, 2 seeks
Conversion failed!
This seems to be working fine with a transparent PNG, for example:
# ./ffmpeg_npp -v verbose -nostats -vsync 0 -init_hw_device cuda=cuda -filter_hw_device cuda -hwaccel cuvid -c:v h264_cuvid -i in.ts -i t.png -filter_complex "[1:v] format=yuva420p,hwupload [0s]; [0:v] scale_npp=format=yuv420p [0v]; [0v][0s] overlay_cuda=shortest=false [v]" -map "[v]" -map 0:a -c:v h264_nvenc -preset medium -b:v 5M -bufsize 10M -profile:v main -temporal-aq 1 -acodec copy -copy_unknown -f mpegts -y out.ts
ffmpeg version N-98725-gcfc6552032 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --pkg-config=pkg-config --pkg-config-flags=--static --disable-libxcb --disable-debug --enable-cuda-llvm --enable-cuvid --enable-nvenc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --extra-cflags='-mtune=generic' --extra-cflags=-O3 --enable-static --disable-shared --prefix=/home/ibm86/ffmpeg-windows-build-helpers/sandbox/cross_compilers/native --enable-nonfree --enable-libfdk-aac
  libavutil      56. 58.100 / 56. 58.100
  libavcodec     58.100.100 / 58.100.100
  libavformat    58. 50.100 / 58. 50.100
  libavdevice    58. 11.101 / 58. 11.101
  libavfilter     7. 87.100 /  7. 87.100
  libswscale      5.  8.100 /  5.  8.100
  libswresample   3.  8.100 /  3.  8.100
[h264 @ 0x555fe4920b00] Reinit context to 1920x1088, pix_fmt: yuv420p
[h264 @ 0x555fe4920b00] Increasing reorder buffer to 2
[mpegts @ 0x555fe491a640] max_analyze_duration 5000000 reached at 5016000 microseconds st:1
WARNING: defaulting hwaccel_output_format to cuda for compatibility with old commandlines. This behaviour is DEPRECATED and will be removed in the future. Please explicitly set "-hwaccel_output_format cuda".
Input #0, mpegts, from 'in.ts':
  Duration: 00:00:15.93, start: 1.400000, bitrate: 8561 kb/s
  Program 1
    Metadata:
      service_name    : Service01
      service_provider: FFmpeg
    Stream #0:0[0x100]: Video: h264 (High), 1 reference frame ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, top first, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 25 fps, 50 tbr, 90k tbn, 50 tbc
    Stream #0:1[0x101](rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2[0x102](qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:3[0x103](rum): Subtitle: dvb_subtitle ([6][0][0][0] / 0x0006)
Input #1, png_pipe, from 't.png':
  Duration: N/A, bitrate: N/A
    Stream #1:0: Video: png, 1 reference frame, rgba(pc), 1024x721, 25 tbr, 25 tbn, 25 tbc
[h264_mp4toannexb @ 0x555fe4a37400] The input looks like it is Annex B already
[h264_cuvid @ 0x555fe4a3d580] CUVID capabilities for h264_cuvid:
[h264_cuvid @ 0x555fe4a3d580] 8 bit: supported: 1, min_width: 48, max_width: 4096, min_height: 16, max_height: 4096
[h264_cuvid @ 0x555fe4a3d580] 10 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
[h264_cuvid @ 0x555fe4a3d580] 12 bit: supported: 0, min_width: 0, max_width: 0, min_height: 0, max_height: 0
Stream mapping:
  Stream #0:0 (h264_cuvid) -> scale_npp
  Stream #1:0 (png) -> format
  overlay_cuda -> Stream #0:0 (h264_nvenc)
  Stream #0:1 -> #0:1 (copy)
  Stream #0:2 -> #0:2 (copy)
Press [q] to stop, [?] for help
[h264_cuvid @ 0x555fe4a3d580] Formats: Original: cuda | HW: cuda | SW: nv12
[graph 0 input from stream 1:0 @ 0x555fe674b980] w:1024 h:721 pixfmt:rgba tb:1/25 fr:25/1 sar:0/1
[graph 0 input from stream 0:0 @ 0x555fe674c740] w:1920 h:1080 pixfmt:cuda tb:1/90000 fr:25/1 sar:1/1
[auto_scaler_0 @ 0x555fe4b22440] w:iw h:ih flags:'bilinear' interl:0
[Parsed_format_0 @ 0x555fe4a32540] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 1:0' and the filter 'Parsed_format_0'
[Parsed_scale_npp_2 @ 0x555fe4a0bf40] w:1920 h:1080 -> w:1920 h:1080
[auto_scaler_0 @ 0x555fe4b22440] w:1024 h:721 fmt:rgba sar:0/1 -> w:1024 h:721 fmt:yuva420p sar:0/1 flags:0x2
[Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync level 2
[h264_nvenc @ 0x555fe69e3e40] Using input frames context (format cuda) with h264_nvenc encoder.
[h264_nvenc @ 0x555fe69e3e40] Loaded Nvenc version 10.0
[h264_nvenc @ 0x555fe69e3e40] Nvenc initialized successfully
[h264_nvenc @ 0x555fe69e3e40] Temporal AQ enabled.
[mpegts @ 0x555fe4acd9c0] service 1 using PCR in pid=256, pcr_period=80ms
[mpegts @ 0x555fe4acd9c0] muxrate VBR, sdt every 500 ms, pat/pmt every 100 ms
Output #0, mpegts, to 'out.ts':
  Metadata:
    encoder         : Lavf58.50.100
    Stream #0:0: Video: h264 (h264_nvenc) (Main), 1 reference frame, cuda, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 25 fps, 90k tbn, 25 tbc (default)
    Metadata:
      encoder         : Lavc58.100.100 h264_nvenc
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 10000000 vbv_delay: N/A
    Stream #0:1(rum): Audio: mp2 ([3][0][0][0] / 0x0003), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:2(qaa): Audio: ac3 ([129][0][0][0] / 0x0081), 48000 Hz, 5.1(side), fltp, 640 kb/s
[Parsed_overlay_cuda_3 @ 0x555fe674a9c0] [framesync @ 0x555fe674aaf8] Sync level 0
No more output streams to write to, finishing.
frame= 354 fps=0.0 q=15.0 Lsize= 10349kB time=00:00:15.84 bitrate=5352.1kbits/s speed=16.8x
video:8383kB audio:1628kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 3.378023%
Input file #0 (in.ts):
  Input stream #0:0 (video): 714 packets read (14014289 bytes); 354 frames decoded;
  Input stream #0:1 (audio): 620 packets read (476160 bytes);
  Input stream #0:2 (audio): 465 packets read (1190400 bytes);
  Input stream #0:3 (subtitle): 0 packets read (0 bytes);
  Total: 1799 packets (15680849 bytes) demuxed
Input file #1 (t.png):
  Input stream #1:0 (video): 1 packets read (10935 bytes); 1 frames decoded;
  Total: 1 packets (10935 bytes) demuxed
Output file #0 (out.ts):
  Output stream #0:0 (video): 354 frames encoded; 354 packets muxed (8584346 bytes);
  Output stream #0:1 (audio): 620 packets muxed (476160 bytes);
  Output stream #0:2 (audio): 465 packets muxed (1190400 bytes);
  Total: 1439 packets (10250906 bytes) muxed
[AVIOContext @ 0x555fe4a011c0] Statistics: 0 seeks, 41 writeouts
[h264_nvenc @ 0x555fe69e3e40] Nvenc unloaded
[AVIOContext @ 0x555fe4923580] Statistics: 21886488 bytes read, 2 seeks
[AVIOContext @ 0x555fe49f3880] Statistics: 10935 bytes read, 0 seeks
Filter Graph - https://bit.ly/33ZjUE8
digraph G {
node [shape=box]
rankdir=LR
"Parsed_format_0\n(format)" -> "Parsed_hwupload_1\n(hwupload)" [ label= "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000" ];
"Parsed_hwupload_1\n(hwupload)" -> "Parsed_overlay_cuda_3\n(overlay_cuda)" [ label= "inpad:default -> outpad:overlay\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"Parsed_scale_npp_2\n(scale_npp)" -> "Parsed_overlay_cuda_3\n(overlay_cuda)" [ label= "inpad:default -> outpad:main\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"Parsed_overlay_cuda_3\n(overlay_cuda)" -> "format\n(format)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"graph 0 input from stream 0:3\n(buffer)" -> "auto_scaler_0\n(scale)" [ label= "inpad:default -> outpad:default\nfmt:bgra w:1920 h:1080 tb:1/90000" ];
"graph 0 input from stream 0:0\n(buffer)" -> "Parsed_scale_npp_2\n(scale_npp)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"format\n(format)" -> "out_0_0\n(buffersink)" [ label= "inpad:default -> outpad:default\nfmt:cuda w:1920 h:1080 tb:1/90000" ];
"auto_scaler_0\n(scale)" -> "Parsed_format_0\n(format)" [ label= "inpad:default -> outpad:default\nfmt:yuva420p w:1920 h:1080 tb:1/90000" ];
}
Attachments (1)
Change History (3)
by , 4 years ago
Attachment: | ffmpeg-20200814-023438.log added |
---|
comment:1 by , 4 years ago
comment:2 by , 3 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Just tested now with latest builds, against the same CUDA version, and it seems to work just fine:
ffmpeg -threads 1 -v verbose -nostats -init_hw_device cuda=cuda:0 -hwaccel_device cuda -filter_hw_device cuda -extra_hw_frames 3 -hwaccel cuda -hwaccel_output_format cuda \
  -reinit_filter 1 -filter_threads 1 -filter_complex_threads 1 \
  -i "${INPUT}" \
  -filter_complex "[0:s] scale=1920:1080,format=yuva420p,hwupload_cuda [sub]; [0:v] scale_npp=format=yuv420p [main]; [main][sub] overlay_cuda [v]" \
  -c:v h264_nvenc -preset medium \
  -map "[v]" -b:v 5M -minrate 2.5M -maxrate 7.5M -bufsize 10M -profile:v main -temporal-aq 1 \
  -map 0:a -acodec libfdk_aac -vbr 5 \
  -copy_unknown \
  -f mpegts -y -
Follow up on this:
I'm not sure that sub2video is actually the main culprit here.
I copied the subtitle stream to an external file:
And I retried using overlay_cuda: