Opened 14 years ago
Closed 14 years ago
#272 closed defect (fixed)
altivec specific code segfaults on ppc when compiled with --enable-pic
Reported by: | Kim Nguyen | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avcodec |
Version: | git-master | Keywords: | ppc, PIC |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | yes | |
Analyzed by developer: | no |
Description
When compiled with --enable-pic, ffplay segfault when using code in libavcodec/ppc/fft_altivec_s.S on various files. For instance when decoding aac or ac3 audio streams. This is always reproducible and do not happen when disabling pic (the default):
Tested with latest git (47d2ca3205b53665328fe301879c339449db7a1d) but this has been the case for a long time I believe, I started remarking this when upgrading from mythtv 0.23 to 0.24, several months ago.
from master:
./configure --enable-pic
make
gdb ./ffplay_g ~myth/aac.dump
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x493b5460 (LWP 32531)]
0x10361e3c in ff_fft_calc_altivec () at libavcodec/ppc/fft_altivec_s.S:447
447 DECL_FFTS 0
Full gdb output attached (with bt, disass and register dump) attached.
I believe the cause of the crash to be the loading of a global symbol in a non pic compatible way.
Just doing ./configure and make, the file plays normally.
Attachments (2)
Change History (17)
by , 14 years ago
comment:1 by , 14 years ago
Status: | new → open |
---|---|
Version: | unspecified → git-master |
If you believe the crash did not happen in the past, please find the version introducing it.
comment:2 by , 14 years ago
Hi again,
the offending commit is this one (hurray for git bissect):
commit a46b84d1204d3cd2de14f08de29afee08f8f86d0
Author: Måns Rullgård <mans@mansr.com>
Date: Sun Jul 4 18:33:47 2010 +0000
PPC: convert Altivec FFT to pure assembler
On PPC a leaf function has a 288-byte red zone below the stack pointer,
sparing these functions the chore of setting up a full stack frame.
When a function call is disguised within an inline asm block, the
compiler might not adjust the stack pointer as required before a
function call, resulting in the red zone being clobbered.
Moving the entire function to pure asm avoids this problem and also
results in somewhat better code.
Originally committed as revision 24044 to svn://svn.ffmpeg.org/ffmpeg/trunk
Reverting this commit on top of master (which is f9ecb849ef39bc337d9439b829fe08da5c95cc3d at the moment) fixes the issue for me. I guess it went unnoticed for so long because
1) ppc is pretty rare these days
2) other software using ffmpeg as a shared library (thus using -fPIC) used to embbed it and link statically until fairly recently, or just shipped older versions.
(From the commit message though, reverting might introduce some other bugs that the commit fixed).
comment:3 by , 14 years ago
I compiled current FFmpeg on a G4 with --enable-pic (made sure about #define HAVE_ALTIVEC 1), tested two files containing aac audio, made sure ff_fft_calc_altivec() was called, but cannot reproduce a crash here.
(Your output is incomplete, unfortunately.)
comment:4 by , 14 years ago
I have confirmed this with a stream from my internet/tv box. The sample is available here:
http://idea.nguyen.vg/~kim/aac2.dump
(few seconds of a french commercial). Video is H264, Audio is AAC. When compiled with --enable-pic
with commit a46b84d1204 reverted, ffplay plays audio and video fine. With plain master
(f9ecb849ef3) it segfaults as explained above. The stream also plays fine on x86 vlc (just to check that it's not too corrupt).
Here are more infos and I'm attaching a gdb trace with full disassembly of the function (instead of pc-32, pc+32).
Machine is a G4 MacMini :
/proc/cpuinfo:
processor : 0
cpu : 7447A, altivec supported
clock : 1416.666661MHz
revision : 1.2 (pvr 8003 0102)
bogomips : 83.24
timebase : 41620997
platform : PowerMac
model : PowerMac10,1
machine : PowerMac10,1
motherboard : PowerMac10,1 MacRISC3 Power Macintosh
detected as : 287 (Mac mini)
pmac flags : 00000010
L2 cache : 512K unified
pmac-generation : NewWorld
Memory : 1024 MB
gcc version:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/powerpc-linux-gnu/gcc/powerpc-linux-gnu/4.5.2/lto-wrapper
Target: powerpc-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-8ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=powerpc-linux-gnu --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib/powerpc-linux-gnu --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.5 --libdir=/usr/lib/powerpc-linux-gnu --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --enable-secureplt --disable-softfloat --enable-targets=powerpc-linux,powerpc64-linux --with-cpu=default32 --disable-werror --with-long-double-128 --enable-checking=release --build=powerpc-linux-gnu --host=powerpc-linux-gnu --target=powerpc-linux-gnu
Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4)
as version:
GNU assembler version 2.21.0 (powerpc-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.21.0.20110327
distrib is ubuntu natty, ppc (libc6 version is 2.13 egcs)
by , 14 years ago
Attachment: | full_gdb.txt added |
---|
GDB trace for the file aac2.dump with full disassembly
comment:5 by , 14 years ago
Since I am unable to reproduce the crash with the sample on a G4 after compiling with --enable-pic (and with altivec enabled):
Could you please provide complete, uncut output of ffmpeg when crashing?
(If ffmpeg does not crash, but ffplay does, please mention it and provide complete, uncut output of ffplay.)
$ ffmpeg -i aac2.dump -f null - ffmpeg version git-N-30662-gf9ecb84, Copyright (c) 2000-2011 the FFmpeg developers built on Jun 9 2011 09:58:10 with gcc 4.0.1 (Apple Inc. build 5493) configuration: --enable-pic libavutil 51. 8. 0 / 51. 8. 0 libavcodec 53. 6. 1 / 53. 6. 1 libavformat 53. 2. 0 / 53. 2. 0 libavdevice 53. 1. 1 / 53. 1. 1 libavfilter 2. 14. 1 / 2. 14. 1 libswscale 0. 14. 1 / 0. 14. 1 [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [aac @ 0x1012400] channel element 2.10 is not allocated [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [h264 @ 0x1010800] non-existing PPS referenced [h264 @ 0x1010800] non-existing PPS 0 referenced [h264 @ 0x1010800] decode_slice_header error [h264 @ 0x1010800] no frame! [mpegts @ 0x1005400] max_analyze_duration 5000000 reached at 5034667 Input #0, mpegts, from 'aac2.dump': Duration: 00:00:10.24, start: 3033.232422, bitrate: 1791 kb/s Program 1352 Stream #0.0[0x23](fra): Subtitle: [6][0][0][0] / 0x0006 Stream #0.1[0xa3]: Video: h264 (High), yuv420p, 480x576 [PAR 32:15 DAR 16:9], 29.06 fps, 25 tbr, 90k tbn, 50 tbc Stream #0.2[0x778](fra): Audio: aac, 48000 Hz, stereo, s16, 73 kb/s [buffer @ 0xc0e110] w:480 h:576 pixfmt:yuv420p tb:1/1000000 sar:32/15 sws_param: Output #0, null, to 'pipe:': Metadata: encoder : Lavf53.2.0 Stream #0.0: Video: rawvideo, yuv420p, 480x576 [PAR 32:15 DAR 16:9], q=2-31, 200 kb/s, 90k tbn, 25 tbc Stream #0.1(fra): Audio: pcm_s16be, 48000 Hz, stereo, s16, 1536 kb/s Stream mapping: Stream #0.1 -> #0.0 Stream #0.2 -> #0.1 Press [q] to stop, [?] for help [aac @ 0x1012400] channel element 2.10 is not allocated Error while decoding stream #0.2 frame= 85 fps= 39 q=0.0 Lsize= -0kB time=00:00:02.51 bitrate= -0.1kbits/s dup=13 drop=0 video:0kB audio:472kB global headers:0kB muxing overhead -100.004552%
comment:6 by , 14 years ago
Reproduced by developer: | set |
---|
Reproducible on Linux (not on OS X where gas-preprocessor.pl is used).
follow-up: 8 comment:7 by , 14 years ago
Are you sure this isn't specific to your compiler version?
LLVM and possible also gcc made an ABI change that optimized the TOC register r2 away.
The current code thus can't work anymore.
comment:8 by , 14 years ago
Replying to reimar:
Are you sure this isn't specific to your compiler version?
LLVM and possible also gcc made an ABI change that optimized the TOC register r2 away.
The current code thus can't work anymore.
I think that's what is happening. However I'm using a fairly standard compiler (which I think is the default in any recent distro: gcc 4.5.x) Also tested with gcc 4.4.5 with the same results so I guess this change in gcc is quite old. However I came up with a proof of concept patch to load the address of local symbols in a pic friendly way. (I use the standard trick of computing the space between a fixed label and the symbol to be loaded and then add this to the adresss of the fixed label which is determined at runtime using a phony branch instruction). I've tested it on linux ppc 32bit and I'm now able to play the above sample without problems.
diff --git a/libavcodec/ppc/asm.S b/libavcodec/ppc/asm.S index e372d53..30e0953 100644 --- a/libavcodec/ppc/asm.S +++ b/libavcodec/ppc/asm.S @@ -67,7 +67,11 @@ X(\name): .macro movrel rd, sym #if CONFIG_PIC - lwz \rd, \sym@got(r2) + bcl 20, 31, lab_pic_\@ +lab_pic_\@: + mflr \rd + addis \rd, \rd, (\sym - lab_pic_\@)@ha + addi \rd, \rd, (\sym - lab_pic_\@)@l #else lis \rd, \sym@ha la \rd, \sym@l(\rd)
comment:9 by , 14 years ago
Apple gcc-4.2 with --enable-pic fails compilation:
$ make AS libavcodec/ppc/fft_altivec_s.o libavcodec/ppc/fft_altivec_s.S:747:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:747:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:787:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:787:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:794:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:794:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1282:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1282:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1322:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1322:Invalid mnemonic 'got(r2)' libavcodec/ppc/fft_altivec_s.S:1329:Parameter syntax error (parameter 2) libavcodec/ppc/fft_altivec_s.S:1329:Invalid mnemonic 'got(r2)' make: *** [libavcodec/ppc/fft_altivec_s.o] Error 1
comment:10 by , 14 years ago
This is strange, my patch is removing the @got(r2), not adding it. I don't know why it remains
when compiled on osx. Maybe it's due to gas-preprocessor.pl or something.
comment:11 by , 14 years ago
(Sorry, I had originally written my comment before you posted your patch.)
Apple gcc 4.2 fails with current, unpatched FFmpeg git with --enable-pic
comment:12 by , 14 years ago
With the patch, compilation succeeds with Apple gcc 4.2 and decoding also works on Linux.
comment:13 by , 14 years ago
what performance effect does this patch have?
it could slow things down for cases that dont need the extra code
follow-up: 15 comment:14 by , 14 years ago
what performance effect does this patch have?
it could slow things down for cases that dont need the extra code
Depends on what you mean by 'cases that don't need the extra code'. If you compile without PIC, the patch is not used (#ifdef) so no performance loss here, and if you compile with -fPIC and a 'recent' gcc nothing works anyway (gcc 4.2 which triggers the bug was released in 2007). I can also confirm that ASM + patch remains much faster than using the plain C version of the function.
By the way, gcc is using the same trick to load global symbol in -fPIC code nowadays:
static int GLOBAL[3] = {1, 2, 3}; int f(int x) { return GLOBAL[x]; }
Compiling with gcc -O0 -S -c foo.c one can see that the address of GLOBAL is computed as so:
bcl 20,31,.L2 .L2: mflr 30 addis 30,30,.LCTOC1-.L2@ha addi 30,30,.LCTOC1-.L2@l
where .LCTOC1 is the label marking the begining of the TOC section. This is exactly what the patch does.
Anyway the movrel macro at issue is always followed by *several* memory accesses so one extra unconditional jump can hardly impact performances. I also belive this patch to be backward compatible so the code can still be compiled with older versions of gcc/as (this is untested though).
What the patch does not handle at the moment is the PPC64 architecture (I'm not even sure this arch is affected by this bug).
comment:15 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | open → closed |
Patch applied
i assume this fixes this ticket
Thanks for the patch
bt, disass, info all-registers of ffplay_g