Opened 3 years ago
Last modified 8 weeks ago
#9471 open defect
EAC3 native encoder is only gapless in the beginning, not in the end
Reported by: | Balling | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avformat |
Version: | git-master | Keywords: | eac3 gapless mp4 editlist |
Cc: | MasterQuestionable | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description (last modified by )
Summary of the bug:
The edts atom (edit list) carries a media time and a media duration. The media time is written correctly for both EAC3 and AAC: the native EAC3 encoder prepends 256 samples of silence (a.k.a. encoder delay) and the native AAC encoder prepends 1024 samples, and in both cases those samples are correctly removed from the beginning on decoding. The media duration, however, is not written correctly. And even if it were written correctly, the media duration is not applied on decoding, not even for AAC; see for example https://bugs.chromium.org/p/chromium/issues/detail?id=668999, which is still present in git-master!
How to reproduce:
% ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -c:a eac3 outeac3.mp4
ffmpeg version N-104341-g933765aa0e-20211013 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 10-win32 (GCC) 20210408
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --enable-shared --disable-static --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libvorbis --enable-opencl --enable-libvmaf --enable-vulkan --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-avisynth --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libglslang --enable-libgme --enable-libass --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libmfx --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --disable-vaapi --enable-libvidstab --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20211013
  libavutil 57. 7.100 / 57. 7.100
  libavcodec 59. 12.100 / 59. 12.100
  libavformat 59. 6.100 / 59. 6.100
  libavdevice 59. 0.101 / 59. 0.101
  libavfilter 8. 14.100 / 8. 14.100
  libswscale 6. 1.100 / 6. 1.100
  libswresample 4. 0.100 / 4. 0.100
  libpostproc 56. 0.100 / 56. 0.100
Input #0, lavfi, from 'sine=frequency=1000:duration=5':
  Duration: N/A, start: 0.000000, bitrate: 705 kb/s
  Stream #0:0: Audio: pcm_s16le, 44100 Hz, mono, s16, 705 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> eac3 (native))
Press [q] to stop, [?] for help
Output #0, mp4, to 'outeac3.mp4':
  Metadata:
    encoder : Lavf59.6.100
  Stream #0:0: Audio: eac3 (ec-3 / 0x332D6365), 44100 Hz, mono, fltp, 96 kb/s
    Metadata:
      encoder : Lavc59.12.100 eac3
size= 60kB time=00:00:05.00 bitrate= 98.2kbits/s speed= 487x
video:0kB audio:59kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.166617%
Track duration: 5010 (0x00001392) - 5010 (0x1392) ms
Media time: 256 (0x00000100) - 256 (0x100) ms
Media rate: 65536 (0x00010000) - 1.000
Now compare it to AAC:
% ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -c:a aac fileaac.mp4
Track duration: 5000 (0x00001388) - 5000 (0x1388) ms
Media time: 1024 (0x00000400) - 1024 (0x400)
Media rate: 65536 (0x00010000) - 1.000
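The two track durations above can be checked with a little arithmetic. The sketch below is only a sanity check, not a description of what the muxer does internally. It assumes a movie timescale of 1000 (so the track duration is in milliseconds), a frame size of 1536 samples for the native EAC3 encoder and 1024 for native AAC; the helper names are made up for illustration.

```python
import math

SAMPLE_RATE = 44100      # from the ffmpeg log above
MOVIE_TIMESCALE = 1000   # assumption: track duration is expressed in ms
SOURCE_SECONDS = 5       # sine=frequency=1000:duration=5


def padded_samples(source, delay, frame):
    """Samples actually encoded: delay + source, rounded up to whole frames."""
    return math.ceil((source + delay) / frame) * frame


def to_movie_units(samples):
    """Convert a sample count to movie-timescale units (rounded to nearest)."""
    return round(samples * MOVIE_TIMESCALE / SAMPLE_RATE)


source = SAMPLE_RATE * SOURCE_SECONDS  # 220500 samples of real audio

# (codec, encoder delay in samples, assumed frame size in samples)
for name, delay, frame in (("eac3", 256, 1536), ("aac", 1024, 1024)):
    total = padded_samples(source, delay, frame)
    print(
        name,
        "gapless duration:", to_movie_units(source),
        "| padded minus delay:", to_movie_units(total - delay),
        "| trailing padding (samples):", total - delay - source,
    )
```

Under these assumptions the gapless duration is 5000 for both codecs, which is what the AAC file reports. For EAC3, the padded length minus the 256-sample delay comes out as 5010, the same number the EAC3 file reports. That match is an observation from the arithmetic, not a confirmed diagnosis.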
Unfortunately a) The MediaInfo tracer is buggy in that part: https://github.com/MediaArea/MediaInfoLib/issues/1441
b) I am not sure that the media duration is really buggy, since it is not applied anyway!
c) I checked all of this by decoding to WAV and inspecting the result in Audacity.
d) I don't know whether sbgp and sgpd are needed (whether EAC3 depends on previous frames).
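Since the MediaInfo tracer is unreliable here, the edit list can also be read directly from the file. The following is a minimal sketch, assuming the usual moov/trak/edts/elst nesting and the elst entry layout from ISO/IEC 14496-12 (12 bytes per entry in version 0). The function names are hypothetical, and it reads the whole file into memory, so it is only meant for small test files.

```python
import struct

# Boxes that only contain other boxes on the path moov/trak/edts/elst.
CONTAINERS = {b"moov", b"trak", b"edts"}


def iter_boxes(buf, start, end):
    """Yield (type, payload_start, payload_end) for each box in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # a 64-bit largesize follows the type
            if pos + 16 > end:
                break
            (size,) = struct.unpack_from(">Q", buf, pos + 8)
            header = 16
        elif size == 0:  # box extends to the end of the enclosing data
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or corrupt box
        yield kind, pos + header, pos + size
        pos += size


def parse_elst(buf, pos):
    """Decode an elst payload into a list of
    (segment_duration, media_time, rate_integer, rate_fraction) tuples."""
    version = buf[pos]
    (count,) = struct.unpack_from(">I", buf, pos + 4)
    fmt = ">Qqhh" if version == 1 else ">Iihh"
    step = struct.calcsize(fmt)
    return [struct.unpack_from(fmt, buf, pos + 8 + i * step) for i in range(count)]


def edit_lists(buf, start=0, end=None):
    """Return one list of elst entries per edit list found in the buffer."""
    end = len(buf) if end is None else end
    found = []
    for kind, body, stop in iter_boxes(buf, start, end):
        if kind in CONTAINERS:
            found.extend(edit_lists(buf, body, stop))
        elif kind == b"elst":
            found.append(parse_elst(buf, body))
    return found


# Usage (file name taken from the reproduction steps above):
# with open("outeac3.mp4", "rb") as f:
#     print(edit_lists(f.read()))
```

Note that segment_duration is expressed in the movie timescale (mvhd), while media_time is expressed in the track's media timescale (mdhd), so the two numbers are not in the same unit.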
Patches should be submitted to the ffmpeg-devel mailing list and not this bug tracker.
Change History (8)
comment:1 by , 3 years ago
Description: | modified (diff) |
---|
comment:3 by , 3 years ago
Status: | new → open |
---|
Okay, so now, after c2424b1f35a1c6c06f1f9fe5f77a7157ed84e1cd, it writes the duration before the edit list is applied (the same as you get with -ignore_editlist 1 when decoding to WAV) in mdhd, and the dumb MediaInfo warning about mdhd_Duration: xxxxx being wrong is gone. But the wrong media duration is still written in the edit list and elsewhere.
comment:4 by , 2 years ago
What is strange is that with -b:a 2048k it writes not 5010 there, but 4975! What???
comment:5 by , 2 years ago
Component: | ffmpeg → undetermined |
---|---|
Keywords: | mp4 editlist removed |
Reproduced by developer: | unset |
comment:6 by , 2 years ago
I will also point out that Adobe Audition 17, which was the last version to support the native Dolby encoders (even with 7.1 encoding), uses 1792 samples of initial priming (7 frames of 256 samples each).
And the Plex encoder, which uses the Mediaconcept encoder linked with the Dolby SDK (Easyaudioencoder.exe), uses 768 samples, that is, 3 frames. Most content encoded in the wild uses 768 samples.
comment:7 by , 8 months ago
Cc: | added |
---|---|
Component: | undetermined → avformat |
Keywords: | mp4 editlist added |
I think similar problems are essentially caused by the unjustifiable complexity and poor design of many things.
See also: https://trac.ffmpeg.org/ticket/11002#comment:9
comment:8 by , 8 weeks ago
EC3A Entry (12 bytes)
EC3A Track duration: 4994 (0x00001382) - 4994 (0x1382) ms
EC3E Media time: 256 (0x00000100) - 256 (0x100) ms
EC42 Media rate: 65536 (0x00010000) - 1.000
is now printed... The editlist is still not fully applied (media time is applied, track duration is not).
Does anyone know what the spec says about this? I looked into TS 103 420 and found nothing there.
Oh, found it in TS 102 366 (only media time part though)!
J.1.3.2 Priming and delay
The codec uses audio blocks of a fixed length of 256 samples, and a transform which applies over two audio blocks. To obtain the correct audio from a block, both blocks in the transform are needed, and hence both the prior encoded block and the current encoded block need to be decoded to output the first frame. This is sometimes called "priming" and may be signaled using the 'roll' sample group. Thus, a full reconstruction of the first 256 audio samples is sometimes not possible since there is no previous access unit. If it is desired to achieve full reconstruction of these samples, it is possible to add silence to the beginning of the audio signal. In practice, an encoder might prepend an arbitrary amount of silent audio waveform samples to the signal. This portion of the audio signal is sometimes called "encoder delay" and varies depending on the implementation. This can be compensated using one of the following delay compensation approaches.
So roll is not written, wow!! (In sgpd.)
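To check for the 'roll' sample group without relying on a tracer, the payload of an sgpd box can be decoded by hand. This is a rough sketch based on my reading of the SampleGroupDescriptionBox and AudioRollRecoveryEntry layouts in ISO/IEC 14496-12; the function name is hypothetical, it expects the sgpd payload (everything after the 8-byte box header), and it does not locate the box inside stbl for you.

```python
import struct


def roll_distances(sgpd_payload):
    """Return the roll_distance values from an sgpd payload,
    or None if the grouping type is not 'roll'."""
    version = sgpd_payload[0]
    if sgpd_payload[4:8] != b"roll":
        return None
    pos = 8
    default_length = 0
    if version == 1:
        (default_length,) = struct.unpack_from(">I", sgpd_payload, pos)
        pos += 4
    elif version >= 2:
        pos += 4  # skip default_sample_description_index
    (count,) = struct.unpack_from(">I", sgpd_payload, pos)
    pos += 4
    distances = []
    for _ in range(count):
        length = default_length
        if version == 1 and default_length == 0:
            # Each entry carries its own description_length.
            (length,) = struct.unpack_from(">I", sgpd_payload, pos)
            pos += 4
        # AudioRollRecoveryEntry is a single signed 16-bit roll_distance.
        (distance,) = struct.unpack_from(">h", sgpd_payload, pos)
        distances.append(distance)
        pos += length if length else 2
    return distances


# Illustrative payload only: version 1, default_length 2, one entry with
# roll_distance -1. The value -1 is an example, not a claim about what
# an E-AC-3 muxer should write.
example = struct.pack(">B3x4sII", 1, b"roll", 2, 1) + struct.pack(">h", -1)
print(roll_distances(example))
```

A negative roll_distance means that earlier samples have to be decoded before the output of the current one is correct, which is the "priming" situation the quoted section describes.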