Opened 13 years ago

Closed 13 years ago

Last modified 12 years ago

#407 closed defect (fixed)

Assertion fails in h264_refs.c

Reported by: alex
Owned by:
Priority: important
Component: avcodec
Version: git-master
Keywords: h264 abort crash
Cc:
Blocked By:
Blocking:
Reproduced by developer: yes
Analyzed by developer: no

Description

Platform: iPhone OS, Cortex-A8 CPU.
Library compiled with NEON asm and optimizations.

Occasionally (quite rarely) I get a crash in ff_put_pixels16_neon while decoding H.264 frames. There are a lot of decoding errors due to device load, network losses, etc. As far as I can tell, the crash happens when the decoder tries to fix up (conceal) decoding errors.

Feel free to contact me if you need something (if this information is not detailed enough).

Callstack looks like:

ff_put_pixels16_neon
mc_dir_part
hl_motion
hl_decode_mb_simple
decode_mb
guess_mv
ff_er_frame_end
field_end
decode_frame
avcodec_decode_video2

Attachments (3)

crash_frame.h264 (786 bytes ) - added by alex 13 years ago.
Sample frame
frames (av_assert crash).zip (13.8 KB ) - added by alex 13 years ago.
assert.h264 (13.4 KB ) - added by Carl Eugen Hoyos 13 years ago.


Change History (22)

comment:1 by alex, 13 years ago

Oh, I forgot, SVN revision is 26402 (latest available on SVN).

comment:2 by Carl Eugen Hoyos, 13 years ago

Please test git head - see http://ffmpeg.org/download.html
And please provide complete backtrace: See http://ffmpeg.org/bugreports.html

in reply to:  2 comment:3 by alex, 13 years ago

Tried git head, same issue.
Yep, I should've known how to submit bugs, sorry.

GDB backtrace:

#0  0x00153d64 in ff_put_pixels16_neon () at common.h:46
#1  0x00200a6c in mc_dir_part (h=0x58a2000, pic=0x58c894c, n=<value temporarily unavailable, due to optimizations>, square=1, chroma_height=8, delta=0, list=0, dest_y=0xbe2610 '\200' <repeats 200 times>..., dest_cb=0xc12f10 '\200' <repeats 200 times>..., dest_cr=0xc31910 '\200' <repeats 200 times>..., src_x_offset=0, src_y_offset=0, qpix_op=0x58a3410, chroma_op=0x155808 <ff_put_h264_chroma_mc8_neon>, pixel_shift=0, chroma444=0) at libavcodec/h264.c:473
#2  0x0020165c in mc_part (h=0x58a2000, n=0, square=1, chroma_height=8, delta=0, dest_y=0xbe2610 '\200' <repeats 200 times>..., dest_cb=0xc12f10 '\200' <repeats 200 times>..., dest_cr=0xc31910 '\200' <repeats 200 times>..., x_offset=0, y_offset=0, qpix_put=0x58a3410, chroma_put=0x155808 <ff_put_h264_chroma_mc8_neon>, qpix_avg=0x58a3510, chroma_avg=0x155964 <ff_avg_h264_chroma_mc8_neon>, weight_op=0x58a44d4, weight_avg=0x58a44fc, list0=4096, list1=0, pixel_shift=0, chroma444=0) at libavcodec/h264.c:549
#3  0x00213a60 in hl_decode_mb_simple_8 (h=0x58a2000) at libavcodec/h264.c:696
#4  0x0021487c in ff_h264_hl_decode_mb (h=0x58a2000) at libavcodec/h264.c:2103
#5  0x001f8b10 in decode_mb (s=0x58a2000, ref=0) at libavcodec/error_resilience.c:59
#6  0x001f9cfc in guess_mv (s=0x58a2000) at libavcodec/error_resilience.c:414
#7  0x001fb65c in ff_er_frame_end (s=0x58a2000) at libavcodec/error_resilience.c:1066
#8  0x00204244 in field_end (h=0x58a2000, in_setup=<value temporarily unavailable, due to optimizations>) at libavcodec/h264.c:2418
#9  0x002176cc in decode_frame (avctx=0xb7f400, data=0x5e39dc0, data_size=0x2fee0c4c, avpkt=<value temporarily unavailable, due to optimizations>) at libavcodec/h264.c:3904
#10 0x0029b09c in avcodec_decode_video2 (avctx=0xb7f400, picture=0x5e39dc0, got_picture_ptr=0x2fee0c4c, avpkt=0x2fee0c10) at libavcodec/utils.c:769

Disasm near pc:

Dump of assembler code from 0x153d44 to 0x153d84:
0x00153d44 <ff_clear_blocks_neon+172>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d48 <ff_clear_blocks_neon+176>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d4c <ff_clear_blocks_neon+180>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d50 <ff_clear_blocks_neon+184>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d54 <ff_clear_blocks_neon+188>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d58 <ff_clear_blocks_neon+192>:	vst1.16	{d0-d1}, [r0, :128]!
0x00153d5c <ff_clear_blocks_neon+196>:	bx	lr
0x00153d60 <ff_put_h264_qpel16_mc00_neon+0>:	mov	r3, #16	; 0x10
0x00153d64 <ff_put_pixels16_neon+0>:	vld1.64	{d0-d1}, [r1], r2
0x00153d68 <ff_put_pixels16_neon+4>:	vld1.64	{d2-d3}, [r1], r2
0x00153d6c <ff_put_pixels16_neon+8>:	vld1.64	{d4-d5}, [r1], r2
0x00153d70 <ff_put_pixels16_neon+12>:	pld	[r1, r2, lsl #2]
0x00153d74 <ff_put_pixels16_neon+16>:	vld1.64	{d6-d7}, [r1], r2
0x00153d78 <ff_put_pixels16_neon+20>:	pld	[r1]
0x00153d7c <ff_put_pixels16_neon+24>:	pld	[r1, r2]
0x00153d80 <ff_put_pixels16_neon+28>:	pld	[r1, r2, lsl #1]

Registers:

r0             0xbe2610	12461584
r1             0x73d1a10	121444880
r2             0x40	64
r3             0x10	16
r4             0x58a2000	92938240
r5             0x73d1a10	121444880
r6             0x0	0
r7             0x2fee06dc	804128476
r8             0x0	0
r9             0x1	1
r10            0x0	0
r11            0x0	0
r12            0x2aa0	10912
sp             0x2fee0660	804128352
lr             0x200a6c	2099820
pc             0x153d64	1391972
cpsr           {
  0x80000010, 
  n = 0x1, 
  z = 0x0, 
  c = 0x0, 
  v = 0x0, 
  q = 0x0, 
  j = 0x0, 
  ge = 0x0, 
  e = 0x0, 
  a = 0x0, 
  i = 0x0, 
  f = 0x0, 
  t = 0x0, 
  mode = 0x10
}	{
  0x80000010, 
  n = 1, 
  z = 0, 
  c = 0, 
  v = 0, 
  q = 0, 
  j = 0, 
  ge = 0, 
  e = 0, 
  a = 0, 
  i = 0, 
  f = 0, 
  t = 0, 
  mode = usr
}

comment:4 by reimar, 13 years ago

Do you have a data dump that can be used to reproduce the issue?
Do you know whether or not the crash also occurs when running on x86 or with NEON support disabled (or even just this specific function disabled)?
Assuming it supports NEON, can you run this through valgrind?
Alignment seems sufficient (actually vld1 seems to not require any), so it seems likely this should not be ARM-specific.
Since it is the load instruction, it should be the source that is invalid.
Due to the edge emulation code, the MVs should not be able to cause this.
So the source picture probably is invalid.
Pure speculation, but one theory is that either it has been freed (though the data pointer should usually be zeroed then) or it wasn't properly discarded on a size change and is too small.
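
The "source picture is too small" theory can be checked mechanically. The following is a minimal sketch in C, not FFmpeg code: `plane_t` and `block_read_in_bounds` are hypothetical names, and the plane is assumed to be a single contiguous allocation with a positive stride.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one picture plane (not an FFmpeg type). */
typedef struct {
    const uint8_t *base;   /* first byte of the allocation */
    size_t         size;   /* allocation size in bytes */
    size_t         stride; /* bytes between the starts of consecutive rows */
} plane_t;

/*
 * Returns true if reading a block of w x h bytes that starts at byte
 * offset `offset` stays inside the plane. The last row only needs w
 * bytes, so the requirement is offset + (h - 1) * stride + w <= size.
 */
bool block_read_in_bounds(const plane_t *p, size_t offset,
                          size_t w, size_t h)
{
    if (!p || !p->base || w == 0 || h == 0 || p->stride == 0)
        return false;
    if (w > p->stride)
        return false;
    /* Overflow-safe form of the inequality above. */
    if (offset > p->size || w > p->size - offset)
        return false;
    size_t room = p->size - offset - w; /* bytes left after the first row */
    return (h - 1) <= room / p->stride;
}
```

For a plane of 512 bytes with stride 16 (a 16x32 luma plane), a 16x16 read at offset 256 passes and one at offset 272 fails; a stale reference that is smaller than the current picture would fail in the same way.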

in reply to:  4 comment:5 by alex, 13 years ago

No asm, no NEON, ARM CPU.
Similar crash:

#0  <unknown function> [inlined] () at :0
#1  <unknown function> [inlined] () at :0
#2  0x0018fca0 in ff_put_pixels16x16_8_c (dst=0x146de10 '\200' <repeats 200 times>..., src=0x7860a10 <Address 0x7860a10 out of bounds>, stride=96) at dsputil_template.c:0
#3  0x00283128 in mc_dir_part (h=0x587d000, pic=0x58a394c, n=<value temporarily unavailable, due to optimizations>, square=1, chroma_height=8, delta=0, list=0, dest_y=0x146de10 '\200' <repeats 200 times>..., dest_cb=0x149ab88 '\200' <repeats 200 times>..., dest_cr=0x149b388 '\200' <repeats 200 times>..., src_x_offset=0, src_y_offset=0, qpix_op=0x587e410, chroma_op=0x166770 <put_h264_chroma_mc8_8_c>, pixel_shift=0, chroma444=0) at libavcodec/h264.c:473
#4  0x00283d18 in mc_part (h=0x587d000, n=0, square=1, chroma_height=8, delta=0, dest_y=0x146de10 '\200' <repeats 200 times>..., dest_cb=0x149ab88 '\200' <repeats 200 times>..., dest_cr=0x149b388 '\200' <repeats 200 times>..., x_offset=0, y_offset=0, qpix_put=0x587e410, chroma_put=0x166770 <put_h264_chroma_mc8_8_c>, qpix_avg=0x587e510, chroma_avg=0x166e5c <avg_h264_chroma_mc8_8_c>, weight_op=0x587f4d4, weight_avg=0x587f4fc, list0=4096, list1=0, pixel_shift=0, chroma444=0) at libavcodec/h264.c:549
#5  0x002969dc in hl_decode_mb_simple_8 (h=0x587d000) at libavcodec/h264.c:696
#6  0x002977f8 in ff_h264_hl_decode_mb (h=0x587d000) at libavcodec/h264.c:2103
#7  0x0027aec4 in decode_mb (s=0x587d000, ref=0) at libavcodec/error_resilience.c:59
#8  0x0027c0b0 in guess_mv (s=0x587d000) at libavcodec/error_resilience.c:414
#9  0x0027da10 in ff_er_frame_end (s=0x587d000) at libavcodec/error_resilience.c:1066
#10 0x00286900 in field_end (h=0x587d000, in_setup=<value temporarily unavailable, due to optimizations>) at libavcodec/h264.c:2418
#11 0x0029a648 in decode_frame (avctx=0x1415a00, data=0x80c1b0, data_size=0x77a6c44, avpkt=<value temporarily unavailable, due to optimizations>) at libavcodec/h264.c:3904
#12 0x00326848 in avcodec_decode_video2 (avctx=0x1415a00, picture=0x80c1b0, got_picture_ptr=0x77a6c44, avpkt=0x77a6c00) at libavcodec/utils.c:769

Disasm near pc:

0x0018fc80 <put_tpel_pixels_mc00_c+476>:	cmp	lr, r5
0x0018fc84 <put_tpel_pixels_mc00_c+480>:	orr	r3, r12, r3, lsl #8
0x0018fc88 <put_tpel_pixels_mc00_c+484>:	add	r1, r1, r2
0x0018fc8c <put_tpel_pixels_mc00_c+488>:	strh	r3, [r0], r2
0x0018fc90 <put_tpel_pixels_mc00_c+492>:	bne	0x18fc74 <put_tpel_pixels_mc00_c+464>
0x0018fc94 <put_tpel_pixels_mc00_c+496>:	pop	{r4, r5, r7, pc}
0x0018fc98 <ff_put_pixels16x16_8_c+0>:	push	{r4, r7, lr}
0x0018fc9c <ff_put_pixels16x16_8_c+4>:	add	r7, sp, #4	; 0x4
0x0018fca0 <ff_put_pixels16x16_8_c+8>:	ldrb	r3, [r1, #1]
0x0018fca4 <ff_put_pixels16x16_8_c+12>:	ldrb	r12, [r1]
0x0018fca8 <ff_put_pixels16x16_8_c+16>:	add	r4, r0, r2
0x0018fcac <ff_put_pixels16x16_8_c+20>:	add	r9, r2, r4
0x0018fcb0 <ff_put_pixels16x16_8_c+24>:	orr	r12, r12, r3, lsl #8
0x0018fcb4 <ff_put_pixels16x16_8_c+28>:	ldrb	r3, [r1, #2]
0x0018fcb8 <ff_put_pixels16x16_8_c+32>:	orr	r12, r12, r3, lsl #16
0x0018fcbc <ff_put_pixels16x16_8_c+36>:	ldrb	r3, [r1, #3]

Registers:

r0             0x146de10	21421584
r1             0x7860a10	126224912
r2             0x60	96
r3             0x18fc98	1637528
r4             0x587d000	92786688
r5             0x7860a10	126224912
r6             0x0	0
r7             0x77a664c	125462092
r8             0x0	0
r9             0x1	1
r10            0x0	0
r11            0x0	0
r12            0x2aa0	10912
sp             0x77a6648	125462088
lr             0x283128	2634024
pc             0x18fca0	1637536
cpsr           {
  0x80000010, 
  n = 0x1, 
  z = 0x0, 
  c = 0x0, 
  v = 0x0, 
  q = 0x0, 
  j = 0x0, 
  ge = 0x0, 
  e = 0x0, 
  a = 0x0, 
  i = 0x0, 
  f = 0x0, 
  t = 0x0, 
  mode = 0x10
}	{
  0x80000010, 
  n = 1, 
  z = 0, 
  c = 0, 
  v = 0, 
  q = 0, 
  j = 0, 
  ge = 0, 
  e = 0, 
  a = 0, 
  i = 0, 
  f = 0, 
  t = 0, 
  mode = usr
}

comment:6 by Carl Eugen Hoyos, 13 years ago

Could you attach a sample?
Or upload to http://www.datafilehost.com/ ?

in reply to:  6 comment:7 by alex, 13 years ago

Replying to cehoyos:

Could you attach a sample?
Or upload to http://www.datafilehost.com/ ?

This is real-time streaming from an IP camera. I'll try to capture some frames that reproduce the crash.

by alex, 13 years ago

Attachment: crash_frame.h264 added

Sample frame

comment:8 by alex, 13 years ago

Added a sample frame, the last one before the crash. I'll try to capture all the frames since the last I-frame if you need them.

Also, I've noticed another crash while decoding h.264. Assertion failed in:
h264_refs.c: 482
av_assert(h->long_ref_count + h->short_ref_count <= h->sps.ref_frame_count);

long_ref_count = 167772170
short_ref_count = -442687480
sps.ref_frame_count = 60829697

These values look odd, I guess. This never happened with the latest SVN revision.
Backtrace is:

#0  0x3348fa1c in __pthread_kill ()
#1  0x310c63ba in pthread_kill ()
#2  0x310bebfe in abort ()
#3  0x0023c664 in ff_h264_decode_ref_pic_marking (h=0x7271000, gb=<value temporarily unavailable, due to optimizations>) at libavcodec/h264_refs.c:482
#4  0x00208744 in decode_slice_header (h=0x7271000, h0=0x7271000) at libavcodec/h264.c:2917
#5  0x00217018 in decode_nal_units (h=0x7271000, buf=0x5e8afc0 "", buf_size=409) at libavcodec/h264.c:3697
#6  0x00217a18 in decode_frame (avctx=0xb9ba00, data=0x70b7f0, data_size=0x700cbe4, avpkt=<value temporarily unavailable, due to optimizations>) at libavcodec/h264.c:3884
#7  0x0029b3bc in avcodec_decode_video2 (avctx=0xb9ba00, picture=0x70b7f0, got_picture_ptr=0x700cbe4, avpkt=0x700cdd8) at libavcodec/utils.c:769

Also, I couldn't reproduce the put_pixels16 crash on x86 (iPhone simulator), but the assertion fails on both ARM and x86.
Possibly the access violation happens on x86 too, but the memory being read happens to be valid. That's just a theory.

To summarize:
x86 crashes only on av_assert;
ARM crashes on both av_assert and put_pixels16.

comment:9 by Carl Eugen Hoyos, 13 years ago

I fear more frames are needed...

$ ffmpeg -i crash_frame.h264
ffmpeg version N-32061-g5b71ae2, Copyright (c) 2000-2011 the FFmpeg developers
  built on Aug 22 2011 17:01:52 with gcc 4.5.3
  configuration: --enable-libopencore-amrnb --enable-version3 --cc=/usr/local/gcc-4.5.3/bin/gcc
  libavutil    51. 13. 0 / 51. 13. 0
  libavcodec   53. 11. 0 / 53. 11. 0
  libavformat  53.  9. 0 / 53.  9. 0
  libavdevice  53.  3. 0 / 53.  3. 0
  libavfilter   2. 34. 2 /  2. 34. 2
  libswscale    2.  0. 0 /  2.  0. 0
[h264 @ 0x129f420] Format h264 detected only with low score of 1, misdetection possible!
[h264 @ 0x12a1500] non-existing PPS referenced
[h264 @ 0x12a1500] non-existing PPS 0 referenced
[h264 @ 0x12a1500] decode_slice_header error
[h264 @ 0x12a1500] no frame!
[h264 @ 0x129f420] Could not find codec parameters (Video: h264)
[h264 @ 0x129f420] Estimating duration from bitrate, this may be inaccurate
crash_frame.h264: could not find codec parameters
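
The "non-existing PPS referenced" messages indicate that the sample contains slice data without the parameter sets it refers to. A capture tool could check for this before a sample is uploaded. The sketch below is an illustration in C, not part of FFmpeg: it assumes an Annex B byte stream (NAL units separated by 00 00 01 start codes), and `scan_annexb` and `nal_summary_t` are hypothetical names. The NAL type values (5 = IDR slice, 7 = SPS, 8 = PPS) are those defined by the H.264 specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of what a captured buffer contains. */
typedef struct {
    bool has_sps;   /* nal_unit_type 7 seen */
    bool has_pps;   /* nal_unit_type 8 seen */
    bool has_idr;   /* nal_unit_type 5 seen */
    int  nal_count; /* number of start codes followed by a header byte */
} nal_summary_t;

/* Scan an Annex B buffer for 00 00 01 start codes and classify each NAL. */
nal_summary_t scan_annexb(const uint8_t *buf, size_t len)
{
    nal_summary_t s = { false, false, false, 0 };

    for (size_t i = 0; i + 3 < len; i++) {
        if (buf[i] != 0 || buf[i + 1] != 0 || buf[i + 2] != 1)
            continue;
        /* The low 5 bits of the byte after the start code are the type. */
        int type = buf[i + 3] & 0x1F;
        s.nal_count++;
        if (type == 7) s.has_sps = true;
        if (type == 8) s.has_pps = true;
        if (type == 5) s.has_idr = true;
        i += 2; /* skip the rest of the start code; the loop adds one more */
    }
    return s;
}
```

A sample whose summary has `has_sps` or `has_pps` unset cannot be decoded on its own; capturing from the most recent SPS/PPS/IDR group onwards avoids that.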

by alex, 13 years ago

Attachment: frames (av_assert crash).zip added

comment:10 by alex, 13 years ago

Attached a frame sequence that reproduces the crash on av_assert. The codec is H.264, of course.

by Carl Eugen Hoyos, 13 years ago

Attachment: assert.h264 added

comment:11 by Carl Eugen Hoyos, 13 years ago

Keywords: h264 assertion added; Crash Error resilience removed
Priority: normal → important
Reproduced by developer: set
Status: new → open
Summary: Crash in ff_put_pixels16_neon (EXC_BAD_ACCESS) → Assertion fails in h264_refs.c
Version: unspecified → git-master

$ ffmpeg -v 9 -loglevel 99 -i assert.h264
ffmpeg version N-32157-g0629b1f, Copyright (c) 2000-2011 the FFmpeg developers
  built on Aug 30 2011 09:51:47 with gcc 4.5.3
  configuration: --cc=/usr/local/gcc-4.5.3/bin/gcc
  libavutil    51. 14. 0 / 51. 14. 0
  libavcodec   53. 12. 0 / 53. 12. 0
  libavformat  53. 10. 0 / 53. 10. 0
  libavdevice  53.  3. 0 / 53.  3. 0
  libavfilter   2. 37. 0 /  2. 37. 0
  libswscale    2.  0. 0 /  2.  0. 0
[h264 @ 0x12a0420] Format h264 probed with size=2048 and score=51
[h264 @ 0x12a2540] Unsupported bit depth: 0
[h264 @ 0x12a2540] Unknown NAL code: 18 (31 bits)
    Last message repeated 3 times
[h264 @ 0x12a2540] Unknown NAL code: 18 (29 bits)
[h264 @ 0x12a2540] Unknown NAL code: 18 (31 bits)
[h264 @ 0x12a2540] Unknown NAL code: 18 (30 bits)
[h264 @ 0x12a2540] Unknown NAL code: 18 (31 bits)
    Last message repeated 2 times
[h264 @ 0x12a2540] Unknown NAL code: 18 (30 bits)
[h264 @ 0x12a2540] reference picture missing during reorder
[h264 @ 0x12a2540] reference count overflow
[h264 @ 0x12a2540] decode_slice_header error
[h264 @ 0x12a2540] Unknown NAL code: 18 (31 bits)
    Last message repeated 1 times
[h264 @ 0x12a2540] mmco: unref short failure
[h264 @ 0x12a2540] concealing 2 DC, 2 AC, 2 MV errors
Assertion h->long_ref_count + h->short_ref_count <= h->sps.ref_frame_count failed at libavcodec/h264_refs.c:482
Aborted

in reply to:  11 comment:12 by alex, 13 years ago

Maybe it would be better to open another ticket for the assertion issue?
The crashes in ff_put_pixels16_neon still exist.
Also, is it more convenient for you to receive the sample in one piece?

comment:13 by Carl Eugen Hoyos, 13 years ago

Does the crash happen before the assert?

Do you believe it is more convenient to unzip a larger file and concatenate the samples to get a smaller file that can be fed to ffmpeg?

Reimar may have been right that there is a size change involved:

$ ffmpeg -i assert.h264 -f null -
ffmpeg version N-32157-g0629b1f, Copyright (c) 2000-2011 the FFmpeg developers
  built on Aug 30 2011 09:51:47 with gcc 4.5.3
  configuration: --cc=/usr/local/gcc-4.5.3/bin/gcc
  libavutil    51. 14. 0 / 51. 14. 0
  libavcodec   53. 12. 0 / 53. 12. 0
  libavformat  53. 10. 0 / 53. 10. 0
  libavdevice  53.  3. 0 / 53.  3. 0
  libavfilter   2. 37. 0 /  2. 37. 0
  libswscale    2.  0. 0 /  2.  0. 0
[h264 @ 0x12a2540] reference picture missing during reorder
[h264 @ 0x12a2540] reference count overflow
[h264 @ 0x12a2540] decode_slice_header error
[h264 @ 0x12a2540] mmco: unref short failure
[h264 @ 0x12a2540] concealing 2 DC, 2 AC, 2 MV errors
[h264 @ 0x12a2540] number of reference frames (0+2) exceeds max (0; probably corrupt input), discarding one
[h264 @ 0x12a0420] Estimating duration from bitrate, this may be inaccurate

Seems stream 0 codec frame rate differs from container frame rate: 50.00 (50/1) -> 25.00 (50/2)
Input #0, h264, from 'assert.h264':
  Duration: N/A, bitrate: N/A
    Stream #0.0: Video: h264, yuv420p, 16x32, 25 fps, 25 tbr, 1200k tbn, 50 tbc
[buffer @ 0x12a0360] w:16 h:32 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
Output #0, null, to 'pipe:':
  Metadata:
    encoder         : Lavf53.10.0
    Stream #0.0: Video: rawvideo (I420 / 0x30323449), yuv420p, 16x32, q=2-31, 200 kb/s, 90k tbn, 25 tbc
Stream mapping:
  Stream #0.0 -> #0.0: h264 -> rawvideo
Press [q] to stop, [?] for help
[buffer @ 0x12a0360] Buffer video input changed from size:16x32 fmt:yuv420p to size:640x480 fmt:yuv420p
[buffer @ 0x12a0360] Inserting scaler filter
[buffersink @ 0x12a6aa0] auto-inserting filter 'Input equalizer' between the filter 'src' and the filter 'out'
[scale @ 0x12a3500] w:640 h:480 fmt:yuv420p -> w:16 h:32 fmt:yuv420p flags:0x2
[h264 @ 0x12a2540] reference picture missing during reorder
[h264 @ 0x12a2540] reference count overflow
[h264 @ 0x12a2540] decode_slice_header error
[h264 @ 0x12a2540] mmco: unref short failure
[h264 @ 0x12a2540] concealing 2 DC, 2 AC, 2 MV errors
[h264 @ 0x12a2540] number of reference frames (0+2) exceeds max (0; probably corrupt input), discarding one
[buffer @ 0x12a0360] Buffer video input changed from size:640x480 fmt:yuv420p to size:16x32 fmt:yuv420p
[scale @ 0x12a3500] w:16 h:32 fmt:yuv420p -> w:16 h:32 fmt:yuv420p flags:0x2
frame=    5 fps=  0 q=0.0 Lsize=      -0kB time=00:00:00.20 bitrate=  -0.9kbits/s
video:0kB audio:0kB global headers:0kB muxing overhead -inf%
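
The "exceeds max ... discarding one" message in the output above suggests that an over-full reference list can be handled by dropping a frame instead of aborting. The following C sketch shows that pattern under stated assumptions; it is not the actual FFmpeg change. `ref_state_t` and `enforce_ref_limit` are hypothetical names, and the choice of which reference to drop (short-term ones first) is arbitrary and only for illustration.

```c
#include <stdio.h>

/* Hypothetical, simplified reference bookkeeping (not FFmpeg's H264Context). */
typedef struct {
    int long_ref_count;
    int short_ref_count;
    int max_ref_frames;   /* stands in for sps.ref_frame_count */
} ref_state_t;

/*
 * Enforce long + short <= max by discarding references rather than
 * asserting. Returns the number of references dropped.
 */
int enforce_ref_limit(ref_state_t *st)
{
    int dropped = 0;

    while (st->long_ref_count + st->short_ref_count > st->max_ref_frames) {
        if (st->short_ref_count > 0)
            st->short_ref_count--;
        else if (st->long_ref_count > 0)
            st->long_ref_count--;
        else
            break; /* nothing left to drop; the counts are inconsistent */
        dropped++;
    }
    if (dropped)
        fprintf(stderr, "too many reference frames, discarded %d\n", dropped);
    return dropped;
}
```

With this approach a corrupt stream costs some references and picture quality, but the process keeps running, which matters for a live camera feed.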

in reply to:  13 comment:14 by alex, 13 years ago

Replying to cehoyos:

Does the crash happen before the assert?

Which one?
The assertion crash occurs more often, so I'm stuck on it, and it is now hard to catch the crash in put_pixels.

comment:15 by Carl Eugen Hoyos, 13 years ago

The assert is fixed; could you test whether the crash you originally saw is still reproducible?

comment:16 by alex, 13 years ago

Frames from the previous sample cause no problems now, but we still have problems with the assert. Also, I've seen a crash in ff_put_pixels16_neon once (no sample was captured).
New sample attached.

comment:17 by Carl Eugen Hoyos, 13 years ago

Resolution: fixed
Status: open → closed

comment:18 by Carl Eugen Hoyos, 12 years ago

Keywords: abort added; assertion removed

comment:19 by Carl Eugen Hoyos, 12 years ago

Keywords: crash added