Opened 6 years ago
Last modified 19 months ago
#7828 open defect
gapless playback doesn't work with AAC (remainder and Apple style)
| Reported by: | Christoph Anton Mitterer | Owned by: | Elon Musk |
|---|---|---|---|
| Priority: | normal | Component: | undetermined |
| Version: | git-master | Keywords: | aac gapless |
| Cc: | | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description
With current git master ffmpeg 1125277:
The following uses a series of test files based on https://commons.wikimedia.org/w/index.php?title=File%3ATelemann_-_2violin_Sonata_1-1.ogg.
Each filename starts with a number, where the same number indicates the files belong together.
00.test.wav
Shortened PCM WAV version of the Wikimedia Commons demo file above
01.split-track01.wav
01.split-track02.wav
00.test.wav split into two halves (these are the actual base test files used with encoders)
02.split-track01.mp3
02.split-track02.mp3
LAME encoded versions of 01.split-track01.wav and 01.split-track02.wav
03.split-track01.opus
03.split-track02.opus
opusenc encoded versions of 01.split-track01.wav and 01.split-track02.wav
and so on.
1) How the base test files (01.split-track01.wav and 01.split-track02.wav) were created
The demo file was first decoded to PCM WAV with opusdec, and split into two halves with
$ shnsplit 00.test.wav
enter split points:
0:05.317
shnsplit: warning: rounding 0:05.317 (offset: 937919) to nearest sector boundary (offset: 938448)
shnsplit: warning: file 2 will not be cut on a sector boundary
Splitting [test.wav] (0:15.66) --> [01.split-track01.wav] (0:05.24) : 100% OK
Splitting [test.wav] (0:15.66) --> [01.split-track02.wav] (0:10.42) : 100% OK
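The offsets in the warning above can be reproduced by hand. A minimal sketch, assuming the WAV holds 44.1 kHz, 16-bit stereo PCM and that the "sector" shnsplit refers to is the 2352-byte CD audio sector (75 per second); neither assumption is stated in the output above:

```python
# Assumptions (not stated in the shnsplit output): 44.1 kHz, 16-bit, stereo PCM
# and 2352-byte CD audio sectors, i.e. 75 sectors per second.
SAMPLE_RATE = 44100
BYTES_PER_FRAME = 2 * 2          # 2 channels x 2 bytes per sample
SECTOR_BYTES = 2352              # 588 stereo frames
SECTORS_PER_SECOND = 75


def byte_offset(seconds: float) -> int:
    """Byte offset into the PCM data for a split point given in seconds."""
    return round(seconds * SAMPLE_RATE * BYTES_PER_FRAME)


def nearest_sector_boundary(offset: int) -> int:
    """Round a byte offset to the nearest multiple of the sector size."""
    return round(offset / SECTOR_BYTES) * SECTOR_BYTES


def as_msf(offset: int) -> str:
    """Format a sector-aligned byte offset as m:ss.ff (ff counts 1/75 s sectors)."""
    sectors = offset // SECTOR_BYTES
    seconds, frames = divmod(sectors, SECTORS_PER_SECOND)
    minutes, seconds = divmod(seconds, 60)
    return f"{minutes}:{seconds:02d}.{frames:02d}"


requested = byte_offset(5.317)
aligned = nearest_sector_boundary(requested)
print(requested, aligned, as_msf(aligned))
```

Under these assumptions the numbers agree with the 937919 to 938448 rounding and with the 0:05.24 length that shnsplit reports for the first track.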
For cross checking, the resulting files were joined again:
$ sox 01.split-track01.wav 01.split-track02.wav joined.wav
The concatenation is binary identical to the original file:
$ diff 00.test.wav joined.wav
$
which can also be seen visually in e.g. Audacity or Sonic Visualiser (i.e. there are no gaps or other distortions between 01.split-track01.wav and 01.split-track02.wav).
2) What is tested?
The split files are now encoded with several reference encoders and afterwards either played back or decoded (to PCM WAV) again, checking for the following:
- Does "gapless" playback even work for the plain PCM WAV?
- During playback, can any gap, crack, pop, etc. be heard between the two files (i.e. does "gapless playback" work)?
- When decoding to PCM WAV, is there any shift at the start of the 1st file or at the end of the 2nd file?
- When decoding to PCM WAV, is there any gap/shift/other distortion at the end of the 1st file and start of the 2nd file when these two are concatenated, in other words at the joining position?
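The last check in the list above does not have to be done by eye. As a rough sketch (helper names are hypothetical, and 16-bit PCM is assumed), one can look for the longest run of silent or near-silent frames in a joined, decoded WAV file and see whether it sits at the joining position:

```python
import array
import sys
import wave


def longest_quiet_run(samples, channels, threshold=0):
    """Return (start_frame, length) of the longest run of frames in which
    every channel satisfies |sample| <= threshold; (0, 0) if there is none."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for frame in range(len(samples) // channels):
        chunk = samples[frame * channels:(frame + 1) * channels]
        if all(abs(s) <= threshold for s in chunk):
            if run_len == 0:
                run_start = frame
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def load_s16le(path):
    """Read a 16-bit PCM WAV file; return (samples, channels, sample_rate)."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        samples = array.array("h", w.readframes(w.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()   # WAV data is little-endian
        return samples, w.getnchannels(), w.getframerate()
```

For a joined file, load_s16le() followed by longest_quiet_run(samples, channels, threshold) gives the position and length of the quietest stretch in frames; dividing by the sample rate turns that into seconds. A quiet passage in the music itself would of course show up as well, so the position still has to be compared with the known split point.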
Hearing tests were repeated multiple times, so the files were already in the OS cache and one should expect basically no delay from a slow storage medium (which was in any case one of the fastest SSDs).
Unless otherwise noted, all programs and libraries were from Debian unstable.
Encoders with these options were used:
- lame --verbose -q 0 -v -V 4 --noreplaygain --id3v2-utf16 --add-id3v2 --id3v1-only (LAME 64bits version 3.100)
- opusenc --vbr --bitrate 96 split-track01.wav (opus-tools 0.1.10)
- fdkaac -p 29 -m 4 <gapless modes, part of the filename> (0.6.3); gapless modes: 0 = iTunSMPB, 1 = ISO standard (edts and sgpd), 2 = both
- aac-enc -t 29 -v 4 (0.1.6) => may not even set any gapless information, so can possibly be completely ignored
- faac -q 100 -w (1.29.9.2) => may not even set any gapless information, so can possibly be completely ignored
And for playback respectively decoding:
ffmpeg version N-93527-g1125277 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 20160609
configuration:
libavutil 56. 26.100 / 56. 26.100
libavcodec 58. 48.100 / 58. 48.100
libavformat 58. 26.101 / 58. 26.101
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 48.100 / 7. 48.100
libswscale 5. 4.100 / 5. 4.100
libswresample 3. 4.100 / 3. 4.100
3) Results
a) Hearing Tests
I couldn't use ffplay, because I had to run everything from git-master ffmpeg on another node.
Instead, I used mpv to play back the files decoded with git-master ffmpeg (see (b) below).
mpv 01.split-track0*.wav => OK, no gap/pop/click/etc. (original files, no ffmpeg here)
mpv ffmpeg1125277.02.split-track0*.mp3.wav => OK, no gap/pop/click/etc.
mpv ffmpeg1125277.03.split-track0*.opus.wav => OK, no gap/pop/click/etc.
but all with AAC:
mpv ffmpeg1125277.04.split-track0*.fdkaac.gapless-mode-0.m4a.wav
...
mpv ffmpeg1125277.08.split-track0*.faac.m4a.wav => BAD, clearly audible gap
b) Visual tests
For these, all the encoded files were again decoded with:
ffmpeg -vn -i input.file output.wav
(this and only this was done with the ffmpeg from git master on some *buntu machine).
to PCM WAV files like:
ffmpeg1125277.02.split-track01.mp3.wav
ffmpeg1125277.02.split-track02.mp3.wav
These are the decoded files used in the visual tests (with Audacity/Sonic Visualiser); e.g. ffmpeg1125277.02.split-track01.mp3.wav was decoded with ffmpeg 1125277 from 02.split-track01.mp3.
For each such pair an image is attached, e.g.:
ffmpeg1125277.02.mp3.wav.png
Comparing the intersection point:
top: the original 00.test.wav
middle: the joined ffmpeg1125277.02.split-track01.mp3.wav and ffmpeg1125277.02.split-track02.mp3.wav (named ffmpeg1125277.02.joined.mp3.wav)
(joins were made with sox 1.wav 2.wav 1-joined-with-2.wav)
bottom: ffmpeg1125277.02.split-track01.mp3.wav alone, serving just as reference as to where the intersection is
Opus seems to always be decoded at 48 kHz, so I enabled the option in Sonic Visualiser that resamples automatically on opening.
ffmpeg1125277.02.mp3.wav.png => OK, mostly, there might be a small distortion (red circle), but I guess nothing that anyone will be able to hear
ffmpeg1125277.03.opus.wav.png => OK, seems perfect
all the AAC ones:
ffmpeg1125277.*.*.m4a.wav.png => BAD, not only huge gaps, but it also seems as if the end and start of the joined files were "faded out/in" (no idea whether this is an encoder or decoder error)
In detail:
fdkaac+iTunSMPB: gap + fade in AND out
fdkaac+ISO: gap + fade out
fdkaac+Both: gap + fade out
(mpv seems to always have gap + fade in AND out)
aac-enc: gap + fade out AND in
faac: gap + fade out
(same for mpv)
So while we can probably toss aac-enc and faac,... one sees that something is already different with fdkaac depending on the gap detection method (though both have still big gaps).
Long story short:
I would guess that *somewhere* there's a bug with respect to gapless encoding and/or decoding of AAC.
Since fdkaac claims it would support gapless playback, one might assume the error is on ffmpeg's side.
Problem is, I have no encoder/decoder pair for which I know that it works... maybe one could try it with itunes?
I'd be happy to evaluate further if any developer has an idea how to move on (i.e. how/where to get AAC files that are definitely considered to be correctly encoded for gapless playback and that one can test with ffmpeg); until then I'd assume that the files created by fdkaac are correctly prepared for gapless playback.
Thanks, Chris.
The test files and images can be found at:
https://drive.google.com/drive/folders/1SIt1z3FtlYa-zMEDzF-m8jsMCe2PFsyz?usp=sharing
FYI: I did the same tests with mpv (however with 4.1.1 ffmpeg):
https://github.com/mpv-player/mpv/issues/2284
Attachments (4)
Change History (16)
comment:1 by , 6 years ago
comment:2 by , 6 years ago
The used ffmpeg command line was already described in the text above, but again for reference:
./ffmpeg -vn -i 02.split-track01.mp3 ffmpeg1125277.02.split-track01.mp3.wav
./ffmpeg -vn -i 02.split-track02.mp3 ffmpeg1125277.02.split-track02.mp3.wav
./ffmpeg -vn -i 03.split-track01.opus ffmpeg1125277.03.split-track01.opus.wav
./ffmpeg -vn -i 03.split-track02.opus ffmpeg1125277.03.split-track02.opus.wav
./ffmpeg -vn -i 04.split-track01.fdkaac.gapless-mode-0.m4a ffmpeg1125277.04.split-track01.fdkaac.gapless-mode-0.m4a.wav
./ffmpeg -vn -i 04.split-track02.fdkaac.gapless-mode-0.m4a ffmpeg1125277.04.split-track02.fdkaac.gapless-mode-0.m4a.wav
./ffmpeg -vn -i 05.split-track01.fdkaac.gapless-mode-1.m4a ffmpeg1125277.05.split-track01.fdkaac.gapless-mode-1.m4a.wav
./ffmpeg -vn -i 05.split-track02.fdkaac.gapless-mode-1.m4a ffmpeg1125277.05.split-track02.fdkaac.gapless-mode-1.m4a.wav
./ffmpeg -vn -i 06.split-track01.fdkaac.gapless-mode-2.m4a ffmpeg1125277.06.split-track01.fdkaac.gapless-mode-2.m4a.wav
./ffmpeg -vn -i 06.split-track02.fdkaac.gapless-mode-2.m4a ffmpeg1125277.06.split-track02.fdkaac.gapless-mode-2.m4a.wav
./ffmpeg -vn -i 07.split-track01.aac-enc.m4a ffmpeg1125277.07.split-track01.aac-enc.m4a.wav
./ffmpeg -vn -i 07.split-track02.aac-enc.m4a ffmpeg1125277.07.split-track02.aac-enc.m4a.wav
./ffmpeg -vn -i 08.split-track01.faac.m4a ffmpeg1125277.08.split-track01.faac.m4a.wav
./ffmpeg -vn -i 08.split-track02.faac.m4a ffmpeg1125277.08.split-track02.faac.m4a.wav
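Since all of these invocations follow one naming pattern, they can also be generated instead of typed. A small sketch (the function names and the dictionary are illustrative only; the file names and the -vn -i options are the ones from the commands above):

```python
import shlex

# Number prefix -> file suffix, as used in the file names of this ticket.
ENCODED = {
    "02": "mp3",
    "03": "opus",
    "04": "fdkaac.gapless-mode-0.m4a",
    "05": "fdkaac.gapless-mode-1.m4a",
    "06": "fdkaac.gapless-mode-2.m4a",
    "07": "aac-enc.m4a",
    "08": "faac.m4a",
}


def decode_command(input_name, ffmpeg="./ffmpeg", tag="ffmpeg1125277"):
    """Argument list that decodes one encoded file to '<tag>.<input>.wav'."""
    return [ffmpeg, "-vn", "-i", input_name, f"{tag}.{input_name}.wav"]


def all_decode_commands():
    """One command per encoder and track, in the order used in this ticket."""
    return [
        decode_command(f"{number}.split-track{track:02d}.{suffix}")
        for number, suffix in ENCODED.items()
        for track in (1, 2)
    ]


for command in all_decode_commands():
    print(shlex.join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) unchanged if the decoding should be run from Python rather than only printed.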
The console output of these is already in the list of files https://drive.google.com/drive/folders/1SIt1z3FtlYa-zMEDzF-m8jsMCe2PFsyz
but I can copy&paste it here as well:
ffmpeg version N-93527-g1125277 Copyright (c) 2000-2019 the FFmpeg developers
built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.11) 20160609
configuration:
libavutil 56. 26.100 / 56. 26.100
libavcodec 58. 48.100 / 58. 48.100
libavformat 58. 26.101 / 58. 26.101
libavdevice 58. 7.100 / 58. 7.100
libavfilter 7. 48.100 / 7. 48.100
libswscale 5. 4.100 / 5. 4.100
libswresample 3. 4.100 / 3. 4.100
Input #0, mp3, from '02.split-track01.mp3':
Duration: 00:00:05.36, start: 0.025057, bitrate: 156 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 156 kb/s
Metadata:
encoder : LAME3.100
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.02.split-track01.mp3.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Metadata:
encoder : Lavc58.48.100 pcm_s16le
size= 917kB time=00:00:05.32 bitrate=1411.3kbits/s speed= 316x
video:0kB audio:916kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008312%

Input #0, mp3, from '02.split-track02.mp3':
Duration: 00:00:10.61, start: 0.025057, bitrate: 143 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 143 kb/s
Metadata:
encoder : LAME3.100
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.02.split-track02.mp3.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Metadata:
encoder : Lavc58.48.100 pcm_s16le
size= 1819kB time=00:00:10.55 bitrate=1411.3kbits/s speed= 350x
video:0kB audio:1819kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004188%

[ogg @ 0x246b400] 657 bytes of comment header remain
Input #0, ogg, from '03.split-track01.opus':
Duration: 00:00:05.33, start: 0.000000, bitrate: 126 kb/s
Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
Metadata:
ENCODER : opusenc from opus-tools 0.1.10
ENCODER_OPTIONS : --vbr --bitrate 96
Stream mapping:
Stream #0:0 -> #0:0 (opus (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.03.split-track01.opus.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
ENCODER_OPTIONS : --vbr --bitrate 96
encoder : Lavc58.48.100 pcm_s16le
size= 998kB time=00:00:05.32 bitrate=1536.1kbits/s speed= 163x
video:0kB audio:998kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.007636%

[ogg @ 0x2bb9400] 657 bytes of comment header remain
Input #0, ogg, from '03.split-track02.opus':
Duration: 00:00:10.57, start: 0.000000, bitrate: 122 kb/s
Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
Metadata:
ENCODER : opusenc from opus-tools 0.1.10
ENCODER_OPTIONS : --vbr --bitrate 96
Stream mapping:
Stream #0:0 -> #0:0 (opus (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.03.split-track02.opus.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
ENCODER_OPTIONS : --vbr --bitrate 96
encoder : Lavc58.48.100 pcm_s16le
size= 1980kB time=00:00:10.55 bitrate=1536.1kbits/s speed= 170x
video:0kB audio:1980kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.003847%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '04.split-track01.fdkaac.gapless-mode-0.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:17:22.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
iTunSMPB : 00000000 00000C00 000005C6 000000000001CA3A 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Duration: 00:00:05.53, start: 0.069660, bitrate: 46 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 44 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:17:22.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.04.split-track01.fdkaac.gapless-mode-0.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
iTunSMPB : 00000000 00000C00 000005C6 000000000001CA3A 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:17:22.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 928kB time=00:00:05.45 bitrate=1393.3kbits/s speed= 279x
video:0kB audio:928kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008208%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '04.split-track02.fdkaac.gapless-mode-0.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:17:33.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
iTunSMPB : 00000000 00000C00 00000281 0000000000038D7F 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Duration: 00:00:10.73, start: 0.069660, bitrate: 42 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 40 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:17:33.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.04.split-track02.fdkaac.gapless-mode-0.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
iTunSMPB : 00000000 00000C00 00000281 0000000000038D7F 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:17:33.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 1824kB time=00:00:10.65 bitrate=1402.0kbits/s speed= 217x
video:0kB audio:1824kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004176%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '05.split-track01.fdkaac.gapless-mode-1.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:18:03.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
Duration: 00:00:05.53, start: 0.000000, bitrate: 46 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 44 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:03.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.05.split-track01.fdkaac.gapless-mode-1.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:03.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 920kB time=00:00:05.34 bitrate=1411.3kbits/s speed= 265x
video:0kB audio:920kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008280%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '05.split-track02.fdkaac.gapless-mode-1.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:18:10.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
Duration: 00:00:10.73, start: 0.000000, bitrate: 42 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 40 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:10.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.05.split-track02.fdkaac.gapless-mode-1.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:10.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 1824kB time=00:00:10.58 bitrate=1411.3kbits/s speed= 212x
video:0kB audio:1824kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004176%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '06.split-track01.fdkaac.gapless-mode-2.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:18:28.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
iTunSMPB : 00000000 00000C00 000005C6 000000000001CA3A 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Duration: 00:00:05.53, start: 0.000000, bitrate: 46 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 44 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:28.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.06.split-track01.fdkaac.gapless-mode-2.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
iTunSMPB : 00000000 00000C00 000005C6 000000000001CA3A 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:28.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 920kB time=00:00:05.34 bitrate=1411.3kbits/s speed= 268x
video:0kB audio:920kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008280%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '06.split-track02.fdkaac.gapless-mode-2.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:18:29.000000Z
encoder : fdkaac 0.6.3, libfdk-aac 3.4.22, VBR mode 4
iTunSMPB : 00000000 00000C00 00000281 0000000000038D7F 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Duration: 00:00:10.73, start: 0.000000, bitrate: 42 kb/s
Stream #0:0(und): Audio: aac (HE-AACv2) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 40 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:29.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.06.split-track02.fdkaac.gapless-mode-2.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
iTunSMPB : 00000000 00000C00 00000281 0000000000038D7F 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ISFT : Lavf58.26.101
Stream #0:0(und): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:18:29.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 1824kB time=00:00:10.58 bitrate=1411.3kbits/s speed= 214x
video:0kB audio:1824kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004176%

[aac @ 0x28bb400] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from '07.split-track01.aac-enc.m4a':
Duration: 00:00:05.88, bitrate: 43 kb/s
Stream #0:0: Audio: aac (HE-AACv2), 44100 Hz, stereo, fltp, 43 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.07.split-track01.aac-enc.m4a.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Metadata:
encoder : Lavc58.48.100 pcm_s16le
size= 952kB time=00:00:05.52 bitrate=1411.3kbits/s speed= 278x
video:0kB audio:952kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008001%

[aac @ 0x2a3b400] Estimating duration from bitrate, this may be inaccurate
Input #0, aac, from '07.split-track02.aac-enc.m4a':
Duration: 00:00:09.31, bitrate: 48 kb/s
Stream #0:0: Audio: aac (HE-AACv2), 44100 Hz, stereo, fltp, 48 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.07.split-track02.aac-enc.m4a.wav':
Metadata:
ISFT : Lavf58.26.101
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s
Metadata:
encoder : Lavc58.48.100 pcm_s16le
size= 1848kB time=00:00:10.72 bitrate=1411.3kbits/s speed= 212x
video:0kB audio:1848kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004122%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '08.split-track01.faac.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:25:17.000000Z
encoder : FAAC 1.29.9.2
Duration: 00:00:05.32, start: 0.000000, bitrate: 100 kb/s
Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 98 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:25:17.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.08.split-track01.faac.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
ISFT : Lavf58.26.101
Stream #0:0(eng): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:25:17.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 920kB time=00:00:05.36 bitrate=1405.2kbits/s speed= 434x
video:0kB audio:920kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.008280%

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '08.split-track02.faac.m4a':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
creation_time : 2019-04-03T16:25:30.000000Z
encoder : FAAC 1.29.9.2
Duration: 00:00:10.56, start: 0.000000, bitrate: 97 kb/s
Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 95 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:25:30.000000Z
Stream mapping:
Stream #0:0 -> #0:0 (aac (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'ffmpeg1125277.08.split-track02.faac.m4a.wav':
Metadata:
major_brand : M4A
minor_version : 0
compatible_brands: M4A mp42isom
ISFT : Lavf58.26.101
Stream #0:0(eng): Audio: pcm_s16le ([1][0][0][0] / 0x0001), 44100 Hz, stereo, s16, 1411 kb/s (default)
Metadata:
creation_time : 2019-04-03T16:25:30.000000Z
encoder : Lavc58.48.100 pcm_s16le
size= 1820kB time=00:00:10.58 bitrate=1408.2kbits/s speed= 492x
video:0kB audio:1820kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.004185%
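The iTunSMPB values in the output above can be decoded by hand. The sketch below assumes the commonly described layout of that tag (a zero word, then encoder delay, end padding and original sample count, all hexadecimal); the ticket itself does not spell the layout out, so treat the field meanings as an assumption:

```python
from typing import NamedTuple


class GaplessInfo(NamedTuple):
    delay: int      # priming samples at the start (assumed meaning of field 2)
    padding: int    # filler samples at the end (assumed meaning of field 3)
    length: int     # original sample count (assumed meaning of field 4)


def parse_itunsmpb(value: str) -> GaplessInfo:
    """Parse the first four whitespace-separated hex fields of an iTunSMPB tag."""
    fields = value.split()
    if len(fields) < 4:
        raise ValueError("iTunSMPB needs at least four fields")
    return GaplessInfo(int(fields[1], 16), int(fields[2], 16), int(fields[3], 16))


# The two values that appear in the ffmpeg output of this ticket.
TRACK1 = "00000000 00000C00 000005C6 000000000001CA3A 00000000 00000000"
TRACK2 = "00000000 00000C00 00000281 0000000000038D7F 00000000 00000000"

for tag in (TRACK1, TRACK2):
    info = parse_itunsmpb(tag)
    total = info.delay + info.padding + info.length
    print(info, "total:", total, "frames of 1024:", total / 1024)
```

For both tracks delay + padding + length is a whole number of 1024-sample frames (119 and 231). The length of the first track, 117306, is exactly half of the 234612 frames that a split at byte offset 938448 would give for 16-bit stereo, which suggests the values are counted at half the 44100 Hz output rate; whether that matters for the observed gaps is only a guess.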
comment:3 by , 6 years ago
I found a set of AAC test vector files that are supposed to play back gapless (using metadata according to MPEG standards, not the iTunes stuff which, of course (Apple), does it differently):
https://www2.iis.fraunhofer.de/AAC/gapless.html
(turn down your volume as these files have a pretty loud and annoying sound, IMO)
When decoding these files with ffmpeg to e.g. WAV, a "gap" is added at the end (just as described on the website above). Interestingly, AFAIU the website, decoders "not doing it right" would also add a "gap" at the beginning, but it seems this works properly with ffmpeg (but maybe I'm doing something wrong here).
comment:4 by , 6 years ago
Also note that the Fraunhofer website contains a link to Apple's documentation:
https://developer.apple.com/library/archive/technotes/tn2258/_index.html#//apple_ref/doc/uid/DTS40009396
Maybe that helps with adding support for the iTunes gapless metadata.
comment:5 by , 3 years ago
Keywords: | aac gapless added |
---|---|
Summary: | gapless playback (probably) doesn't work with AAC → gapless playback doesn't work with AAC (remainder and Apple style) |
comment:6 by , 3 years ago
Some other samples test1_nero.m4a
FS#12185 : Fix gapless playback for Nero AAC
https://www.rockbox.org/tracker/task/12185
comment:7 by , 2 years ago
Status: | new → open |
---|
AFAIU the website, decoders "not doing it right" would also add a "gap" at the beginning, but it seems this works properly with ffmpeg (but maybe I'm doing something wrong here).
We do it right, just only at the beginning (now that HE-AAC (and v2) are fixed, it was eating too much in the beginning).
Chromium fixed remainder here (first commit): https://bugs.chromium.org/p/chromium/issues/detail?id=668999
Of course it cannot be ported, since it uses ffmpeg's wrapper: https://chromium-review.googlesource.com/c/chromium/src/+/1114094/
comment:8 by , 2 years ago
This should fix it, right? https://patchwork.ffmpeg.org/project/ffmpeg/patch/20190429225027.81295-1-fumoboy007@me.com/
FATE breaks right where it should.
stts discussed here: https://github.com/MediaArea/MediaInfoLib/issues/1570#issuecomment-1410997142
comment:9 by , 2 years ago
Owner: | set to |
---|
Can you fix that? Chrome has this correct, so this code path has insane visibility!! See the patch in the previous comment, which will also fix subtitle code for mpv (again, mpv has this code path correct, since it has no such bug with subtitles).
Why is this so slow and so many bugs are present...
comment:10 by , 2 years ago
Please note it in the bug, once this is merged (and ideally with which version)... once that lands in Debian I could test it again.
Thanks,
Chris.
follow-up: 12 comment:11 by , 2 years ago
Oh btw: for which gapless playback "notation" does your patch fix it? As far as I understood there are different formats for AAC (one following MPEG standards... and some proprietary Apple system - see comments above).
comment:12 by , 19 months ago
Replying to Christoph Anton Mitterer:
Oh btw: for which gapless playback "notation" does your patch fix it? As far as I understood there are different formats for AAC (one following MPEG standards... and some proprietary Apple system - see comments above).
That listed patch is for the MPEG standard method - it relies on the stts box to find the last signaled sample duration. The Apple method relies on an MP4 tag that it parses.
I'm not sure if that patch would handle fragmented MP4 files - they usually have an empty stts box, and instead rely on the Track Fragment Header Box (tfhd) or the Track Fragment Run Box (trun).
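To tell which of the two cases a given file falls into, a minimal sketch in Python that walks the top-level boxes and looks for a moof box. This is only a heuristic and an assumption on my side (a fragmented file normally also carries an mvex box inside moov, which this does not check); the function names are made up for illustration.

```python
import struct

def top_level_boxes(data: bytes):
    """Yield (box_type, size) for each top-level box in an MP4 byte string."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - offset
        if size < header:
            break  # malformed box, stop instead of looping forever
        yield box_type.decode("latin-1"), size
        offset += size

def is_fragmented(data: bytes) -> bool:
    """Heuristic: treat the file as fragmented if any top-level moof exists."""
    return any(box_type == "moof" for box_type, _ in top_level_boxes(data))
```

Usage would be something like is_fragmented(open("fragmented-aac-trun.mp4", "rb").read()).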
As far as I know here are the available methods to signal gapless playback:
- Use an Edit List Box to list the duration and priming samples, and optionally an stts box. [1]
- Use an Edit List Box to only list the priming samples, and rely on either:
  - The trun box to list packet durations of the fragment.
  - The tfhd box to list a default packet duration. [2]
  - The stts box to list packet durations (maybe? see footnote). [3]
- Use an MP4 custom metadata item (the Apple method).
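For the variants that rely on packet durations, the arithmetic can be sketched roughly like this in Python. The numbers are assumptions chosen for illustration: 95 AAC packets of 1024 samples, 1024 priming samples in the edit list, and a last packet signaled as 768 samples. The function name is hypothetical.

```python
def playable_samples(stts_entries, priming_samples):
    """Sum the signaled packet durations, then drop the priming samples.

    stts_entries is a list of (sample_count, sample_delta) pairs, in the
    same run-length form the stts box uses.
    """
    total = sum(count * delta for count, delta in stts_entries)
    return total - priming_samples

# Assumed layout: 94 full packets plus a final packet signaled as 768 samples.
signaled = [(94, 1024), (1, 768)]
# What a decoder outputs if it ignores the shorter final duration.
ignored = [(95, 1024)]

print(playable_samples(signaled, 1024))  # 96000
print(playable_samples(ignored, 1024))   # 96256
```

Under these assumptions, ignoring the final packet duration leaves 256 extra samples at the end.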
Bug #10458 was closed as a duplicate of this one. The process outlined above to demonstrate the issue is complex (lots of downloading and concatenating files, etc). Thought it might be helpful to share my example for detecting an error with gapless decoding - just encode a file using some known number of samples, decode it back, and check whether the number of decoded samples differs.
% ffmpeg -f lavfi -i anullsrc=r=48000:d=2 source.wav
# verify the created audio file has exactly 96000 samples
% soxi -s source.wav
96000
# encode to aac
% ffmpeg -i source.wav -c:a aac encoded.m4a
# (verify that encoded.m4a has an Edit List Box and STTS box using a tool like boxdumper)
# decode back to wav
% ffmpeg -i encoded.m4a destination.wav
# observe the sample count != 96000
% soxi -s destination.wav
96256
You can replace the aac codec with any codec that can go in MP4, and relies on the MP4 file to signal the duration - so aac, mp3, opus all have the issue.
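If soxi is not at hand, the same comparison can be sketched with Python's standard wave module. This assumes both files are plain PCM WAV (which is what ffmpeg writes by default for .wav); the helper names are made up.

```python
import wave

def wav_sample_count(path):
    """Return the number of sample frames (samples per channel) in a PCM WAV."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes()

def gapless_error(source_path, decoded_path):
    """Return how many extra (positive) or missing (negative) samples the
    decoded file has compared to the source."""
    return wav_sample_count(decoded_path) - wav_sample_count(source_path)
```

A result of 0 from gapless_error("source.wav", "destination.wav") would mean the round trip preserved the length.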
Attaching sample files:
- one that can rely on either the elst or the stts box (nonfragmented-aac-stts.mp4)
- one that relies on the tfhd box (fragmented-aac-tfhd.mp4)
- one that relies on the trun box (fragmented-aac-trun.mp4)
- one that relies on the iTunes-style comment (nonfragmented-aac-itunes.mp4)
When properly decoded they should all be exactly 2 seconds @ 48kHz (96000 samples).
1: Even if the edit list has duration, you may need the stts box as well, say if you concatenated multiple streams into a single mp4 file - smooth playback would need multiple Edit List Boxes, and you may have packets mid-stream that need to be truncated.
2: Usually seen when the final fragment contains a single packet.
3: I'm not sure if having an Edit List Box with a duration of zero is valid if you can produce an stts box. If you know all the packet lengths to create an stts box, you can fill in the total duration (minus padding) into the Edit List Box.
by , 19 months ago
Attachment: | nonfragmented-aac-stts.mp4 added |
---|
Non-fragmented MP4 file with AAC audio. Contains an edit list box signaling priming samples to discard, and an stts box signaling the last packet has 768 samples. Should decode to exactly 96000 samples.
by , 19 months ago
Attachment: | fragmented-aac-tfhd.mp4 added |
---|
Fragmented MP4 file with AAC audio. Contains an edit list box signaling priming samples to discard, the final fragment uses the tfhd box to signal the last packet has 768 samples. Should decode to exactly 96000 samples.
by , 19 months ago
Attachment: | fragmented-aac-trun.mp4 added |
---|
Fragmented MP4 file with AAC audio. Contains an edit list box signaling priming samples to discard, the final fragment uses the trun box to signal the last packet has 768 samples. Should decode to exactly 96000 samples.
by , 19 months ago
Attachment: | nonfragmented-aac-itunes.mp4 added |
---|
Non-fragmented MP4 file with AAC audio. Uses a custom metadata (iTunSMPB) to signal the priming and padding samples to discard. Should decode to exactly 96000 samples.
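For reference, a rough Python sketch of reading the iTunSMPB value. The field layout used here (a zero field, then priming, padding, and the original sample count, all as hexadecimal) is the commonly described one and is an assumption, not taken from an official specification. The example string is hypothetical and not read from the attached file.

```python
def parse_itunsmpb(value: str):
    """Parse the space-separated hexadecimal fields of an iTunSMPB string."""
    fields = value.split()
    if len(fields) < 4:
        raise ValueError("iTunSMPB value has fewer than 4 fields")
    # fields[0] is normally all zeros; the next three carry the gapless info.
    priming, padding, samples = (int(field, 16) for field in fields[1:4])
    return {"priming": priming, "padding": padding, "samples": samples}

# Hypothetical value: 1024 priming, 256 padding, 96000 original samples.
example = " 00000000 00000400 00000100 0000000000017700"
print(parse_itunsmpb(example))
```

A decoder following this method would drop "priming" samples at the start and "padding" samples at the end.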