Opened 13 years ago
Closed 12 years ago
#1227 closed defect (fixed)
crash in ff_put_h264_chroma_mc8_neon
Reported by: | elioxia | Owned by: | |
---|---|---|---|
Priority: | important | Component: | avcodec |
Version: | git-master | Keywords: | arm crash h264 |
Cc: | bruce-wu | Blocked By: | |
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
attached video: 1024x384 h264 annexb byte stream
the video size is important (half of 1024x768) as a similar full sized video works fine
reproducible on master (fc882b6e), 6.5, 10.2
built with gcc4.4.3 (android ndk r7b), gcc4.6.2 (linaro android toolchain)
tested on various android devices
Attachments (3)
Change History (23)
by , 13 years ago
comment:1 by , 13 years ago
Keywords: | arm added |
---|
Please provide gdb backtrace etc. as described on http://ffmpeg.org/bugreports.html
comment:2 by , 12 years ago
Not reproduceable on OMAP4 Panda board you will have to provide a backtrace or its very unlikely we will be able to fix this.
by , 12 years ago
Attachment: | bug-test-video.mp4 added |
---|
format: mp4, video: h264, audio: aac, size:672x384
comment:3 by , 12 years ago
I have the same problem. My test android phone is google nexus S, its information is:
CPU FEATURES: swp half thumb fastmult vfp edsp thumbee neon vfpv3
ANDROID VERSION: 4.03
I use android-ndk-r8b to build ffmpeg for Android.
The test video is attached. Here is the detailed information about this video:
format: mp4
video codec: h24
audio codec: aac
size: 672x384
And gdb backtrack extracted by logcat:
I/DEBUG ( 77): * * * * * * * * * * * * * * * *
I/DEBUG ( 77): Build fingerprint: 'unknown'
I/DEBUG ( 77): pid: 1035, tid: 1071 >>> com.abc.ui <<<
I/DEBUG ( 77): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 52f2601d
I/DEBUG ( 77): r0 530dc600 r1 52f2601d r2 00000160 r3 00000000
I/DEBUG ( 77): r4 00000018 r5 00000000 r6 00000000 r7 00000000
I/DEBUG ( 77): r8 5305fd5b r9 00000180 10 0000009d fp 00000001
I/DEBUG ( 77): ip 00000028 sp 5251eae4 lr 50e8e834 pc 511040dc cpsr 60000010
I/DEBUG ( 77): d0 1818181818181818 d1 2828282828282828
I/DEBUG ( 77): d2 0046004f0051004f d3 003e003e003e003f
I/DEBUG ( 77): d4 8080808080808080 d5 8080808080808080
I/DEBUG ( 77): d6 7b7b7b7b7c7d7e7e d7 7b7b7b7b7b7c7d7e
I/DEBUG ( 77): d8 0000000042900000 d9 43f0000000000000
I/DEBUG ( 77): d10 0000000000000000 d11 0000000000000000
I/DEBUG ( 77): d12 0000000000000000 d13 0000000000000000
I/DEBUG ( 77): d14 0000000000000000 d15 0000000000000000
I/DEBUG ( 77): d16 7b7b7b7b7b7c7d7e d17 7b7b7b7b7b7c7d7e
I/DEBUG ( 77): d18 1ed81f181f581f80 d19 1ec01ec01ec01ec0
I/DEBUG ( 77): d20 00480048004d0050 d21 003e003e003f0045
I/DEBUG ( 77): d22 007e007e007e007e d23 007f007f007f007f
I/DEBUG ( 77): d24 0101010101010101 d25 ffffffffffffffff
I/DEBUG ( 77): d26 ffffffffffffffff d27 2c2c2c2c2c2c2c2c
I/DEBUG ( 77): d28 1f1f1f2026292827 d29 1f1f1f2026292827
I/DEBUG ( 77): d30 1f1f1f1f1f1f2026 d31 1f1f1f1f1f1f2026
I/DEBUG ( 77): scr 20000012
I/DEBUG ( 77):
I/DEBUG ( 77): #00 pc 0047b0dc /data/data/com.abc.ui/lib/libffmpeg.so (ff_put_h264_chroma_mc8_neon)
I/DEBUG ( 77): #01 lr 50e8e834 /data/data/com.abc.ui/lib/libffmpeg.so
I/DEBUG ( 77):
I/DEBUG ( 77): code around pc:
I/DEBUG ( 77): 511040bc f3c50801 f4214ac2 f3c62c00 f3c72801 .....J!..,...(..
I/DEBUG ( 77): 511040cc f5d1f000 f2b45105 f2ca0870 f2ca1872 .....Q..p...r...
I/DEBUG ( 77): 511040dc f4216ac2 f2b67107 f44007d2 f44017d2 .j!..q....@...@.
I/DEBUG ( 77): 511040ec caffffef e8bd80f0 e92d40f0 e1cd41d4 .........@-..A..
I/DEBUG ( 77): 511040fc e1a0e000 f5d1f000 f7d1f002 e0170594 ................
I/DEBUG ( 77):
I/DEBUG ( 77): code around lr:
I/DEBUG ( 77): 50e8e814 e1a01006 e7942002 e59d007c e58dc004 ..... ..|.......
I/DEBUG ( 77): 50e8e824 e59d3068 e58de000 e59dc08c e12fff3c h0..........<./.
I/DEBUG ( 77): 50e8e834 e28dd044 e8bd8ff0 e59d0034 e0831000 D.......4.......
I/DEBUG ( 77): 50e8e844 e285300f e1530001 aaffff7d e59d1030 .0....S.}...0...
I/DEBUG ( 77): 50e8e854 e289300f e0822001 e1530002 b3a02000 .0... ....S.. ..
I/DEBUG ( 77):
I/DEBUG ( 77): memory map around addr 52f2601d:
I/DEBUG ( 77): 52f14000-52f26000
I/DEBUG ( 77): (no map for address)
I/DEBUG ( 77): 52f2d000-52f2e000 /dev/pvrsrvkm
I/DEBUG ( 77):
I/DEBUG ( 77): stack:
I/DEBUG ( 77): 5251eaa4 0000046c
I/DEBUG ( 77): 5251eaa8 5251ead0
I/DEBUG ( 77): 5251eaac 5118d618 /data/data/com.abc.ui/lib/libffmpeg.so
I/DEBUG ( 77): 5251eab0 51741c20
I/DEBUG ( 77): 5251eab4 51741aa0
I/DEBUG ( 77): 5251eab8 504c2140
I/DEBUG ( 77): 5251eabc 00000aac
I/DEBUG ( 77): 5251eac0 000019e0
I/DEBUG ( 77): 5251eac4 5118d618 /data/data/com.abc.ui/lib/libffmpeg.so
I/DEBUG ( 77): 5251eac8 5118d618 /data/data/com.abc.ui/lib/libffmpeg.so
I/DEBUG ( 77): 5251eacc 5047e010
I/DEBUG ( 77): 5251ead0 00000000
I/DEBUG ( 77): 5251ead4 00000002
I/DEBUG ( 77): 5251ead8 df0027ad
I/DEBUG ( 77): 5251eadc 00000000
I/DEBUG ( 77): 5251eae0 00000005
I/DEBUG ( 77): #00 5251eae4 5047e010
I/DEBUG ( 77): 5251eae8 000000c0
I/DEBUG ( 77): 5251eaec 52f253bd
I/DEBUG ( 77): 5251eaf0 0000332c
I/DEBUG ( 77): 5251eaf4 50e8e834 /data/data/com.abc.ui/lib/libffmpeg.so
I/DEBUG ( 77): 5251eaf8 00000005
I/DEBUG ( 77): 5251eafc 00000000
I/DEBUG ( 77): 5251eb00 504810de
I/DEBUG ( 77): 5251eb04 504810fe
I/DEBUG ( 77): 5251eb08 5048111e
I/DEBUG ( 77): 5251eb0c 00003000
I/DEBUG ( 77): 5251eb10 00003808
I/DEBUG ( 77): 5251eb14 00003808
I/DEBUG ( 77): 5251eb18 504c3450
I/DEBUG ( 77): 5251eb1c 0004213b
I/DEBUG ( 77): 5251eb20 00000001
I/DEBUG ( 77): 5251eb24 00000000
I/DEBUG ( 77): 5251eb28 00000180
follow-up: 5 comment:4 by , 12 years ago
By the way, I use ffmpeg 0.8.10 and the bug can be reproduced on ffmpeg 0.11
comment:5 by , 12 years ago
Replying to bruce-wu:
By the way, I use ffmpeg 0.8.10 and the bug can be reproduced on ffmpeg 0.11
Please provide a backtrace for the crash with FFmpeg 0.11 (or, even better, current git head) as explained on http://ffmpeg.org/bugreports.html
comment:6 by , 12 years ago
Cc: | added |
---|
comment:7 by , 12 years ago
Keywords: | crash added |
---|
comment:8 by , 12 years ago
After investigation, I found what the problem is: read memory outside of the bound of the array pointed to by register R1 in MACRO h264_chroma_mc8 or MACRO h264_chroma_mc4 in libavcodec/arm/h264dsp_neon.S (verion 0.8.10) or libavcodec/arm/h264cmc_neon.S(version 0.11.1). I fixed the bug by modifying those two macros. Here is updated macros in version 0.8.10:
/* chroma_mc8(uint8_t *dst, uint8_t *src, int stride, int h, int x, int y) */
.macro h264_chroma_mc8 type
function ff_\type\()_h264_chroma_mc8_neon, export=1
.ifc \type,avg
mov lr, r0
.endif
muls r7, r4, r5
rsb r6, r7, r5, lsl #3
rsb ip, r7, r4, lsl #3
sub r4, r7, r4, lsl #3
sub r4, r4, r5, lsl #3
add r4, r4, #64
beq 2f
vdup.8 d0, r4
lsl r4, r2, #1
vdup.8 d1, ip
vld1.64 {d4, d5}, [r1], r4
vdup.8 d2, r6
vdup.8 d3, r7
vext.8 d5, d4, d5, #1
1:
vld1.64 {d6, d7}, [r5], r4
pld [r5]
vmull.u8 q8, d4, d0
vext.8 d7, d6, d7, #1
vmlal.u8 q8, d5, d1
vld1.64 {d4, d5}, [r1], r4
vmlal.u8 q8, d6, d2
vext.8 d5, d4, d5, #1
vmlal.u8 q8, d7, d3
vmull.u8 q9, d6, d0
subs r3, r3, #2
vmlal.u8 q9, d7, d1
vmlal.u8 q9, d4, d2
vmlal.u8 q9, d5, d3
vrshrn.u16 d16, q8, #6
/*vld1.64 {d6, d7}, [r5], r4*/
pld [r1]
vrshrn.u16 d17, q9, #6
.ifc \type,avg
vld1.64 {d20}, [lr,:64], r2
vld1.64 {d21}, [lr,:64], r2
vrhadd.u8 q8, q8, q10
.endif
/*vext.8 d7, d6, d7, #1*/
vst1.64 {d16}, [r0,:64], r2
vst1.64 {d17}, [r0,:64], r2
bgt 1b
beq 4f
3:
pld [r5]
vmull.u8 q8, d4, d0
vmlal.u8 q8, d6, d1
vmull.u8 q9, d6, d0
vmlal.u8 q9, d4, d1
vrshrn.u16 d16, q8, #6
vrshrn.u16 d17, q9, #6
.ifc \type,avg
vld1.64 {d20}, [lr,:64], r2
vld1.64 {d21}, [lr,:64], r2
vrhadd.u8 q8, q8, q10
.endif
subs r3, r3, #2
pld [r1]
vst1.64 {d16}, [r0,:64], r2
vst1.64 {d17}, [r0,:64], r2
bgt 3b
4: vld1.64 {d4, d5}, [r1], r2
vld1.64 {d6, d7}, [r1], r2
vext.8 d5, d4, d5, #1
vext.8 d7, d6, d7, #1
pld [r1]
subs r3, r3, #2
vmull.u8 q8, d4, d0
vmlal.u8 q8, d5, d1
vmull.u8 q9, d6, d0
vmlal.u8 q9, d7, d1
pld [r1]
vrshrn.u16 d16, q8, #6
vrshrn.u16 d17, q9, #6
.ifc \type,avg
vld1.64 {d20}, [lr,:64], r2
vld1.64 {d21}, [lr,:64], r2
vrhadd.u8 q8, q8, q10
.endif
vst1.64 {d16}, [r0,:64], r2
vst1.64 {d17}, [r0,:64], r2
bgt 4b
endfunc
.endm
/* chroma_mc4(uint8_t *dst, uint8_t *src, int stride, int h, int x, int y) */
.macro h264_chroma_mc4 type
function ff_\type\()_h264_chroma_mc4_neon, export=1
.ifc \type,avg
mov lr, r0
.endif
muls r7, r4, r5
rsb r6, r7, r5, lsl #3
rsb ip, r7, r4, lsl #3
sub r4, r7, r4, lsl #3
sub r4, r4, r5, lsl #3
add r4, r4, #64
beq 2f
vdup.8 d0, r4
lsl r4, r2, #1
vdup.8 d1, ip
vld1.64 {d4}, [r1], r4
vdup.8 d2, r6
vdup.8 d3, r7
vext.8 d5, d4, d5, #1
vtrn.32 d0, d1
vtrn.32 d2, d3
vtrn.32 d4, d5
1:
vld1.64 {d6}, [r5], r4
pld [r5]
vext.8 d7, d6, d7, #1
vmull.u8 q8, d4, d0
vtrn.32 d6, d7
vld1.64 {d4}, [r1], r4
vmlal.u8 q8, d6, d2
vext.8 d5, d4, d5, #1
vmull.u8 q9, d6, d0
vtrn.32 d4, d5
vmlal.u8 q9, d4, d2
vadd.i16 d16, d16, d17
vadd.i16 d17, d18, d19
vrshrn.u16 d16, q8, #6
subs r3, r3, #2
pld [r1]
.ifc \type,avg
vld1.32 {d20[0]}, [lr,:32], r2
vld1.32 {d20[1]}, [lr,:32], r2
vrhadd.u8 d16, d16, d20
.endif
vst1.32 {d16[0]}, [r0,:32], r2
vst1.32 {d16[1]}, [r0,:32], r2
bgt 1b
beq 4f
vext.32 d1, d0, d1, #1
add r5, r1, r2
lsl r4, r2, #1
vld1.32 {d4[0]}, [r1], r4
3:
vld1.32 {d4[1]}, [r5], r4
pld [r5]
vmull.u8 q8, d4, d0
vld1.32 {d4[0]}, [r1], r4
vmull.u8 q9, d4, d1
vadd.i16 d16, d16, d17
vadd.i16 d17, d18, d19
vrshrn.u16 d16, q8, #6
.ifc \type,avg
vld1.32 {d20[0]}, [lr,:32], r2
vld1.32 {d20[1]}, [lr,:32], r2
vrhadd.u8 d16, d16, d20
.endif
subs r3, r3, #2
pld [r1]
vst1.32 {d16[0]}, [r0,:32], r2
vst1.32 {d16[1]}, [r0,:32], r2
bgt 3b
4: vld1.64 {d4}, [r1], r2
vld1.64 {d6}, [r1], r2
vext.8 d5, d4, d5, #1
vext.8 d7, d6, d7, #1
vtrn.32 d4, d5
vtrn.32 d6, d7
vadd.i16 d16, d16, d17
vadd.i16 d17, d18, d19
pld [r1]
vrshrn.u16 d16, q8, #6
.ifc \type,avg
vld1.32 {d20[0]}, [lr,:32], r2
vld1.32 {d20[1]}, [lr,:32], r2
vrhadd.u8 d16, d16, d20
.endif
pld [r1]
vst1.32 {d16[0]}, [r0,:32], r2
vst1.32 {d16[1]}, [r0,:32], r2
bgt 4b
endfunc
.endm
As shown in the code, register R1 points to ARRAY src (type is uint_t*). The idea in the modification is to test if register R3 (ARGUMENT h in caller of C program) is less or equal zero before reading elements pointed to by registe R1. If it is, then skip reading and jump to the end of function.
I tested the code using several videos, and it works. For version 0.11.1, the modification is the same.
comment:9 by , 12 years ago
please submit a "git format-patch" or a normal patch as generated by git diff of the relevant changes. Its not possible nor reliable to extract from the posted code what has been changed
follow-up: 11 comment:10 by , 12 years ago
I had the same problem on iOS, and did a quick test porting the changes to the ffmpeg version we use (same neon file as 0.11.2), works fine!
comment:11 by , 12 years ago
Replying to Gregory:
I had the same problem on iOS, and did a quick test porting the changes to the ffmpeg version we use (same neon file as 0.11.2), works fine!
Please consider either posting the patch here or sending it to ffmpeg-devel (where it will probably receive more attention) to allow other users to profit from the fix!
by , 12 years ago
Attachment: | 0001-modify-macro-h264_chroma_mc8-and-h264_chroma_mc4-pat.patch added |
---|
a patch for bug #1227
follow-up: 13 comment:12 by , 12 years ago
A patch for bug #1227 is attached. I use git version of ffmpeg, which I guess it's newest.
comment:13 by , 12 years ago
follow-up: 15 comment:14 by , 12 years ago
This patch may slow the code down a little bit, but it works. The original code seems too ambitious to be correct. Anyway, it is just my proposal. If someone have more efficient code to fix the bug, I will appreciate it.
comment:15 by , 12 years ago
Replying to bruce-wu:
This patch may slow the code down a little bit, but it works.
Are all of your changes necessary to fix the crash?
follow-up: 19 comment:16 by , 12 years ago
I suggest 32byte-padded buffer as a workaround, but cannot understand why 16byte padding is not enough in the first place.
comment:17 by , 12 years ago
Keywords: | h264 added |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Ive applied the patch to fix the crash for now, patches that make the code faster without reintroducing the crash are very welcome !
comment:18 by , 12 years ago
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Ive made a mistake during testing, the patch is not working, it breaks h264 decoding as can be seen with "make fate" with --samples=...
Thus patch reverted, please submit a working patch and make sure that speed loss is kept to a minimum if it cannot be entirely avoided
comment:19 by , 12 years ago
Replying to elioxia:
I suggest 32byte-padded buffer as a workaround, but cannot understand why 16byte padding is not enough in the first place.
The problem the way i understand it (i couldnt reproduce a crash) is that this code reads a line too much not just a few bytes
comment:20 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | reopened → closed |
This should have been fixed a month ago or so, please reopen if its still crashing with latest ffmpeg
sample video file