Opened 5 years ago
Closed 5 years ago
#8578 closed defect (fixed)
FFmpeg 4.2 breaks Matroska streaming
Reported by: | Sesse | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avformat |
Version: | git-master | Keywords: | mkv regression |
Cc: | andreas.rheinhardt@gmail.com | Blocked By: | |
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
Hi,
I'm having an application that creates Matroska streams on-the-fly by reusing chunks between HTTP clients:
- Set up a callback in avio->pb.
- Call avformat_write_header(), store what is returned in variable.
- Whenever a client connects, immediately send the contents of said variable.
- For each packet to mux, call avformat_interleaved_write() as usual, and send the results to every connected client.
In other words, there might be intermediate packets from the start of the stream that a client won't be seeing, but this is fairly common (I believe e.g. VLC does the same thing). It has worked fine in FFmpeg for a few years, but after 4.2, clients now cannot decode anything:
[mkv] Invalid EBML length at position 716
[ffmpeg] Cannot seek backward in linear streams!
[mkv] Unexpected end of file (no clusters found)
I've bisected, and although the results are a bit fuzzy, I believe this is the commit that broke it:
commit add68dcca958f0f6b42cabea6f546ceae5c7f16d Author: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> Date: Sat Apr 20 01:41:06 2019 +0200 avformat/matroskaenc: Write CRC-32 in non-seekable mode Given that in both the seekable as well as the non-seekable mode dynamic buffers are used to write level 1 elements and that now no seeks are used in the seekable case any more, the two modes can be combined; as a consequence, the non-seekable mode automatically inherits the ability to write CRC-32 elements. There are no differences in case the output is seekable; when it is not and writing CRC-32 elements is disabled, there can still be minor differences because before this commit, the EBML ID and length field were counted towards the cluster size limit; now they no longer are. Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com> Signed-off-by: James Almer <jamrial@gmail.com>
I can't find any way to fix it with flags. I've tried setting "live" to 1, and I've tried setting "write_crc32" to 0 (both of them with av_dict_set()), but it doesn't seem to change anything.
Attachments (1)
Change History (11)
comment:1 by , 5 years ago
Keywords: | mkv regression added; matroska removed |
---|---|
Version: | 4.2 → git-master |
comment:3 by , 5 years ago
Martin Storsjö helped me debug this. The gist of it is: Earlier, there would be only one write callback per chunk. Now, there is two; one for the EBML ID (4 bytes) and one for the actual data (50–60 kB). Since the Matroska muxer doesn't use data markers (type in the callback, after the header has been written, is always AVIO_DATA_MARKER_UNKNOWN), my code assumes that the point right after the EBML ID is a valid starting point for the stream, whereas it isn't.
Worse yet, since there's a sizeable delay in wall clock time after the EBML ID is written, it's practically guaranteed that the client will indeed hit that point. If I add a hack to my code saying that
if (buf_size == 4) type = AVIO_DATA_MARKER_SYNC_POINT;
then things instantly start working again.
comment:4 by , 5 years ago
- The "variable" returned by avformat_write_header() that you store and send to the client is the header written by avformat_write_header() and not the return value of avformat_write_header(), isn't it? (Yes, a stupid question, but I just want to be sure.)
- The AVIOContext's is flagged as non-seekable (i.e. pb->seekable & AVIO_SEEKABLE_NORMAL is wrong), I presume?
- Before the patchset that the cited commit belongs to was applied there were two interwoven codepaths for writing clusters (and other level 1 elements). Both assembled the clusters in memory via a dynamic buffer (an AVIOContext that you can open with avio_open_dyn_buf()) (let's call it dyn) before outputting via the real AVIOContext (called pb in the following).
a) The codepath for seekable output: Write the Cluster element ID via pb; also write Matroska's "unknown-length" length via pb (using the maximal length of 8 bytes for the length field). If a CRC-32 element is going to be written, reserve six bytes for it by writing an EBML-Void element in dyn. Then write content of the cluster as usual into dyn. When the cluster is to be finished, output the CRC-32 (if it is to be written) via pb and then write the whole content of dyn (containing the Cluster's content) with the exception of the bytes reserved for the CRC-32 (if any) via pb. Then seek back (in pb) to length field to update it (i.e. overwrite it with the real length that is now known) and seek back to the end of the Cluster again to write the next Cluster (or the Cues if this was the last one and Cues should be written at the end).
b) The codepath for unseekable output: Write the Cluster element ID into dyn; also write an "unknown-length" length field into dyn. Then write the Cluster into dyn (without writing CRC-32 elements, because that was just not supported by the implementation). When closing the Cluster, the length field was updated (requiring seeks but that is possible because we are seeking with dyn, not pb) and the whole Cluster has been output (i.e. sent to pb) in one avio_write().
- Given that both codepaths relied on dynamic buffers they could be merged into one: Write Cluster element ID to pb; open dyn and reserve space for the CRC-32 in dyn (if it is to be written). Write Cluster content into dyn. When the Cluster is to be finished, write the correct length field for the Cluster (said size is now known!).* Then write the CRC-32 to pb (if ...) and then write the actual Cluster content (excluding the stuff at the beginning reserved for the CRC-32). This is what the patchset to which the cited patch belongs does. This patch is the very one that stopped the Cluster to be output in one go for nonseekable output; so your fuzzy result makes sense.
- As it happens, I already have a patch that should fix this. It modifies the process as follows: Open dyn and reserve space for CRC-32 (if ...). Write Cluster content into dyn. When the Cluster is to be finished, write the Cluster element ID, the length field, the CRC-32 (if ...) and the Cluster content (excluding the part reserved for the CRC-32 element (if ...)). You can find it here; yet it is part of a longer patchset and you will also have to apply this patch, this patch, this patch, this patch, this patch, this patch, this patch and finally this patch. Can you test whether this indeed fixes your problem?
*: We save a few bytes here: Matroska (or actually EBML, the thing Matroska is built on) uses variable-length length-fields where smaller lengths take less bytes to encode and given that the length to write is known at this point the smallest possible length field is chosen.
comment:5 by , 5 years ago
Now that you have mentioned the markers: They are broken: The check for the beginning of a new cluster should be "if (mkv->cluster_pos == -1)" and not "if (!mkv->cluster_pos)". Will send a patch.
by , 5 years ago
Attachment: | 0001-avformat-matroskaenc-Write-level-1-elements-in-one-g.patch added |
---|
Patch to write level 1 elements in one go again for release 4.2
comment:6 by , 5 years ago
For your convenience I have added a patch for you to apply on top of release 4.2 that you can test. This is what will be backported if you confirm that it fixes the issue.
comment:7 by , 5 years ago
Cc: | added |
---|
comment:10 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Fixed in d9c21ec63999501afc39ffc26f363245e8624304, backported as 477275795865f074e60635d813a7b765284ca948.
Please confirm that the issue you see is reproducible with current FFmpeg git head.