Opened 3 years ago

Last modified 2 years ago

#9458 reopened defect

ffmpeg's outputs differ a lot when compiled with gcc and clang

Reported by: Shaohua Li
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: race
Cc: Shaohua Li
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug: I compiled ffmpeg with gcc11 and clang13. For some inputs, the two resulting binaries produce outputs that differ substantially.

Compile args: ./configure

Compiler: gcc11 and clang13

How to reproduce: run the following command with two ffmpeg binaries built with different compilers, then compare the sizes of the outputs. I used -threads 1 to avoid possible threading issues.

% ffmpeg -threads 1 -y -i  input_diff_1  -f mp4 output
ffmpeg version N-104353-g2c734a8496 Copyright (c) 2000-2021 the FFmpeg developers
  built with Ubuntu clang version 13.0.1-++20211015123032+cf15ccdeb6d5-1~exp1~20211015003613.5
  configuration: --cc=/usr/bin/clang13 --cxx=/usr/bin/clang++-13 --ld='/usr/bin/clang++-13 -fno-sanitize=vptr -fno-sanitize=vptr -fno-sanitize=vptr -std=c++11' --extra-cflags=-I/ffmpeg/repo/clang13-O1/ffmpeg_deps/include --extra-ldflags=-L/ffmpeg/repo/clang13-O1/ffmpeg_deps/lib --prefix=/ffmpeg/repo/clang13-O1/ffmpeg_deps --pkg-config-flags=--static --optflags=-O1 --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libopus --enable-libtheora --enable-libvorbis --enable-nonfree --disable-shared
  libavutil      57.  7.100 / 57.  7.100
  libavcodec     59. 12.100 / 59. 12.100
  libavformat    59.  6.100 / 59.  6.100
  libavdevice    59.  0.101 / 59.  0.101
  libavfilter     8. 14.100 /  8. 14.100
  libswscale      6.  1.100 /  6.  1.100
  libswresample   4.  0.100 /  4.  0.100
  libpostproc    56.  0.100 / 56.  0.100
[avi @ 0x3091f40] Something went wrong during header parsing, tag [24][229]q[0] has size 3220715849, I will ignore it and try to continue anyway.
Input #0, avi, from 'bugs/diff_1':
  Duration: 00:00:01.67, start: 0.000000, bitrate: 528 kb/s
  Stream #0:0: Video: rawvideo, pal8, 352x240, 29.97 fps, 29.97 tbr, 29.97 tbn
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> mpeg4 (native))
Press [q] to stop, [?] for help
[mpeg4 @ 0x30984c0] too many threads/slices (16), reducing to 15
Output #0, mp4, to '.output':
  Metadata:
    encoder         : Lavf59.6.100
  Stream #0:0: Video: mpeg4 (mp4v / 0x7634706D), yuv420p(tv, progressive), 352x240, q=2-31, 200 kb/s, 29.97 fps, 11988 tbn
    Metadata:
      encoder         : Lavc59.12.100 mpeg4
    Side data:
      cpb: bitrate max/min/avg: 0/0/200000 buffer size: 0 vbv_delay: N/A
[rawvideo @ 0x3096ac0] Packet too small (8)ime=00:00:00.00 bitrate=4241.0kbits/s speed=N/A    
Error while decoding stream #0:0: Invalid data found when processing input
[rawvideo @ 0x3096ac0] Packet too small (8)
Error while decoding stream #0:0: Invalid data found when processing input
frame=   40 fps=0.0 q=31.0 Lsize=     434kB time=00:00:01.30 bitrate=2732.4kbits/s dup=2 drop=0 speed=2.05x    
video:433kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.237240%


Attachments (1)

input_diff_1 (107.6 KB ) - added by Shaohua Li 3 years ago.
input_diff_1


Change History (9)

by Shaohua Li, 3 years ago

Attachment: input_diff_1 added

input_diff_1

comment:1 by Elon Musk, 3 years ago

Resolution: invalid
Status: new → closed

We can not fix compilers to your imagination.

comment:2 by Shaohua Li, 3 years ago

You might have misunderstood me. This is not a compiler issue. These unstable results are probably caused by some issue in ffmpeg itself, for example undefined behaviour. For this specific test case, if you compile ffmpeg with clang13 but with different optimisation flags (e.g., -O0 and -Os), the outputs still differ (you may need to run it multiple times to observe the difference).

comment:3 by Shaohua Li, 3 years ago

Resolution: invalid
Status: closed → reopened

Well, I would insist that this issue is important, as clang and gcc are both very widely used compilers. It is harmful to have inconsistent program results across compilers or optimisation levels (as I explained in my last comment). Inconsistent results mean that at least one (potentially several) of the compiled binaries has incorrect semantics.

Since I reported this issue, I've been debugging ffmpeg to find the root cause of the difference. I used gdb (with rr) to debug ffmpeg_g built with gcc11 and with clang13, both with optflags=-O0.

I found that in libavcodec/motion_est.c:914, s->mpvencdsp.pix_sum(pix, s->linesize) evaluates to different values in the two builds. The difference only appeared when analysing some of the streams. Because "pix_sum" seems to be implemented in assembly, I was not able to continue my analysis. I provide detailed reproduction steps below in case you're interested.

1) run gdb :

% gdb --args ffmpeg_g  -threads  1 -y -i   input_diff_1  -f mp4 output 

2) set breakpoint at ffmpeg.c:4817:

% b ffmpeg.c:4817 

3) start the program and continue until you hit the breakpoint 11 times.

4) set breakpoint at motion_est.c:914 and continue. Note that when you hit this breakpoint, mb_x and mb_y should both be 0. This is important because ffmpeg is multi-threaded by default and somehow -threads 1 does not make it single-threaded in gdb.

5) check the value of sum. For clang13 compiled ffmpeg, it was 60160 while for gcc11, it was 57751.
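
For orientation, a block pixel sum of the kind pix_sum appears to compute can be sketched in portable C as below. This is a minimal sketch and not the FFmpeg source: the name pix_sum_sketch and the 16x16 block size are assumptions made for illustration. A sum like this depends only on the bytes it reads, so two builds returning 60160 and 57751 at the same macroblock are most likely being handed different pixel data rather than adding it up differently. As a side observation, 60160 = 256 × 235, which is what a 16x16 block filled uniformly with the value 235 would give.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum all pixels of a 16x16 block of 8-bit samples.
 * pix points at the top-left sample, line_size is the row stride in bytes.
 * Hypothetical stand-in for the real pix_sum; name and block size are
 * assumptions for illustration only. */
static int pix_sum_sketch(const uint8_t *pix, ptrdiff_t line_size)
{
    int sum = 0;

    assert(line_size >= 16);  /* each row must hold at least 16 samples */
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += pix[x];
        pix += line_size;     /* advance to the next row */
    }
    return sum;
}
```

Calling this on a buffer filled with 235 and any stride of at least 16 returns 60160; changing a single input byte changes the result, which is why inspecting the contents of pix at the breakpoint is a natural next step.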

comment:4 by Elon Musk, 3 years ago

Can you confirm that it also happens with -cpuflags 0 as the first input option?

comment:5 by Shaohua Li, 3 years ago

Hi, I ran ffmpeg with -cpuflags 0 -threads 1 -y -i input_diff_1 -f mp4 output and it still happened.

comment:6 by Carl Eugen Hoyos, 2 years ago

Component: ffmpeg → undetermined
Resolution: invalid
Status: reopened → closed

Please reopen if you can also reproduce once you force one encoder thread.

comment:7 by mkver, 2 years ago

Component: undetermined → avcodec
Keywords: race added
Resolution: invalid
Status: closed → reopened

There is a race in the MPEG-4 encoder (which is of course unreproducible with one encoder thread). Reads in ff_mpeg4_pred_dc() are not synced with writes in ff_mpeg4_pred_dc() as well as ff_clean_intra_table_entries(). The same races happen in the vsynth*-mpeg4-thread tests (where * is 1, 2, 3 or _lena).

But there is also a use of an uninitialized value in swscale; this is probably the real issue here:

==4058042== Use of uninitialised value of size 8
==4058042==    at 0x105675C: palToY_c (input.c:476)
==4058042==    by 0x10B9E8B: lum_convert (hscale.c:108)
==4058042==    by 0x10464B5: swscale (swscale.c:464)
==4058042==    by 0x10464B5: scale_internal (swscale.c:1043)
==4058042==    by 0x1046DC0: sws_receive_slice (swscale.c:1175)
==4058042==    by 0x1046E7D: sws_scale_frame (swscale.c:1190)
==4058042==    by 0x44225E: scale_frame (vf_scale.c:844)
==4058042==    by 0x442699: filter_frame (vf_scale.c:860)
==4058042==    by 0x305CF5: ff_filter_frame_framed (avfilter.c:990)
==4058042==    by 0x305CF5: ff_filter_frame_to_filter (avfilter.c:1134)
==4058042==    by 0x305CF5: ff_filter_activate_default (avfilter.c:1183)
==4058042==    by 0x305CF5: ff_filter_activate (avfilter.c:1342)
==4058042==    by 0x30A057: push_frame (buffersrc.c:169)
==4058042==    by 0x30A057: av_buffersrc_add_frame_flags (buffersrc.c:258)
==4058042==    by 0x2DC51D: ifilter_send_frame (ffmpeg.c:2037)
==4058042==    by 0x2DC51D: send_frame_to_filters (ffmpeg.c:2106)
==4058042==    by 0x2DCA76: decode_video.constprop.0 (ffmpeg.c:2295)
==4058042==    by 0x2DF394: process_input_packet (ffmpeg.c:2449)
==4058042==    by 0x2DF394: process_input (ffmpeg.c:3886)
==4058042==    by 0x2DF394: transcode_step (ffmpeg.c:4021)
==4058042==    by 0x2DF394: transcode (ffmpeg.c:4072)
==4058042==  Uninitialised value was created by a heap allocation
==4058042==    at 0x484DE30: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==4058042==    by 0x484DF92: posix_memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==4058042==    by 0x10D6E64: av_malloc (mem.c:105)
==4058042==    by 0x10C1BA9: av_buffer_alloc (buffer.c:82)
==4058042==    by 0xA4599F: raw_decode (rawdec.c:248)
==4058042==    by 0x7A377C: decode_simple_internal (decode.c:307)
==4058042==    by 0x7A377C: decode_simple_receive_frame (decode.c:563)
==4058042==    by 0x7A377C: decode_receive_frame_internal (decode.c:584)
==4058042==    by 0x7A4487: avcodec_send_packet (decode.c:665)
==4058042==    by 0x2DC76D: decode (ffmpeg.c:2084)
==4058042==    by 0x2DC76D: decode_video.constprop.0 (ffmpeg.c:2209)
==4058042==    by 0x2DF394: process_input_packet (ffmpeg.c:2449)
==4058042==    by 0x2DF394: process_input (ffmpeg.c:3886)
==4058042==    by 0x2DF394: transcode_step (ffmpeg.c:4021)
==4058042==    by 0x2DF394: transcode (ffmpeg.c:4072)
==4058042==    by 0x2B652A: main (ffmpeg.c:4243)
==4058042== 
==4058042== Use of uninitialised value of size 8
==4058042==    at 0x10567A4: palToUV_c (input.c:489)
==4058042==    by 0x10BA0DF: chr_convert (hscale.c:227)
==4058042==    by 0x1046016: swscale (swscale.c:471)
==4058042==    by 0x1046016: scale_internal (swscale.c:1043)
==4058042==    by 0x1046DC0: sws_receive_slice (swscale.c:1175)
==4058042==    by 0x1046E7D: sws_scale_frame (swscale.c:1190)
==4058042==    by 0x44225E: scale_frame (vf_scale.c:844)
==4058042==    by 0x442699: filter_frame (vf_scale.c:860)
==4058042==    by 0x305CF5: ff_filter_frame_framed (avfilter.c:990)
==4058042==    by 0x305CF5: ff_filter_frame_to_filter (avfilter.c:1134)
==4058042==    by 0x305CF5: ff_filter_activate_default (avfilter.c:1183)
==4058042==    by 0x305CF5: ff_filter_activate (avfilter.c:1342)
==4058042==    by 0x30A057: push_frame (buffersrc.c:169)
==4058042==    by 0x30A057: av_buffersrc_add_frame_flags (buffersrc.c:258)
==4058042==    by 0x2DC51D: ifilter_send_frame (ffmpeg.c:2037)
==4058042==    by 0x2DC51D: send_frame_to_filters (ffmpeg.c:2106)
==4058042==    by 0x2DCA76: decode_video.constprop.0 (ffmpeg.c:2295)
==4058042==    by 0x2DF394: process_input_packet (ffmpeg.c:2449)
==4058042==    by 0x2DF394: process_input (ffmpeg.c:3886)
==4058042==    by 0x2DF394: transcode_step (ffmpeg.c:4021)
==4058042==    by 0x2DF394: transcode (ffmpeg.c:4072)
==4058042==  Uninitialised value was created by a heap allocation
==4058042==    at 0x484DE30: memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==4058042==    by 0x484DF92: posix_memalign (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==4058042==    by 0x10D6E64: av_malloc (mem.c:105)
==4058042==    by 0x10C1BA9: av_buffer_alloc (buffer.c:82)
==4058042==    by 0xA4599F: raw_decode (rawdec.c:248)
==4058042==    by 0x7A377C: decode_simple_internal (decode.c:307)
==4058042==    by 0x7A377C: decode_simple_receive_frame (decode.c:563)
==4058042==    by 0x7A377C: decode_receive_frame_internal (decode.c:584)
==4058042==    by 0x7A4487: avcodec_send_packet (decode.c:665)
==4058042==    by 0x2DC76D: decode (ffmpeg.c:2084)
==4058042==    by 0x2DC76D: decode_video.constprop.0 (ffmpeg.c:2209)
==4058042==    by 0x2DF394: process_input_packet (ffmpeg.c:2449)
==4058042==    by 0x2DF394: process_input (ffmpeg.c:3886)
==4058042==    by 0x2DF394: transcode_step (ffmpeg.c:4021)
==4058042==    by 0x2DF394: transcode (ffmpeg.c:4072)
==4058042==    by 0x2B652A: main (ffmpeg.c:4243)
==4058042== 
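
The pattern Valgrind reports here, a palette lookup indexed by bytes that were never written, can be illustrated with the sketch below. It is a simplified model under stated assumptions, not the swscale or rawdec code: pal_to_y_sketch and alloc_frame_zeroed are hypothetical names, the "short packet" scenario is made up for the example, and zero-filling is only one possible way to make the result deterministic, not necessarily the fix FFmpeg adopted. If the buffer came from a plain malloc-style allocation instead, the bytes past the copied data would be indeterminate, and the looked-up values could vary between builds and runs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a frame buffer of `size` bytes, copy in the
 * `written` bytes that a (possibly short) packet provides, and leave the
 * remainder zeroed so that every byte has a defined value. */
static uint8_t *alloc_frame_zeroed(const uint8_t *packet, size_t written,
                                   size_t size)
{
    uint8_t *buf;

    assert(written <= size);
    buf = calloc(size, 1);          /* calloc zero-fills the allocation */
    if (!buf)
        return NULL;
    if (written)
        memcpy(buf, packet, written);
    return buf;
}

/* Simplified palette lookup: each source byte is an index into a
 * 256-entry luma table. An uninitialised source byte would make the
 * table address itself depend on garbage. */
static void pal_to_y_sketch(uint8_t *dst, const uint8_t *src, size_t n,
                            const uint8_t luma[256])
{
    for (size_t i = 0; i < n; i++)
        dst[i] = luma[src[i]];
}
```

With the zero-filled buffer, every pixel past the copied data maps to palette entry 0, so repeated runs and different compilers produce the same converted frame.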


comment:8 by Carl Eugen Hoyos, 2 years ago

For the given sample this is a regression since 100167451af5b385c7c82e214e10bff410ba3516
