Opened 19 months ago
Last modified 19 months ago
#10291 open enhancement
FFmpeg removes IETF BCP-47 language tags from MKV files during remuxing or encoding
Reported by: | ptr727 | Owned by: | |
---|---|---|---|
Priority: | normal | Component: | avformat |
Version: | git-master | Keywords: | mkv |
Cc: | Jérôme Martinez | Blocked By: | |
Blocking: | | Reproduced by developer: | no |
Analyzed by developer: | no | | |
Description
Summary:
When FFmpeg creates an MKV file from an MKV source, the LanguageIETF tags from the original file are not written, and the language granularity is lost.
For reference see:
- https://datatracker.ietf.org/doc/draft-ietf-cellar-matroska/
- https://gitlab.com/mbunkus/mkvtoolnix/-/wikis/Languages-in-Matroska-and-MKVToolNix
- https://github.com/ietf-wg-cellar/matroska-specification/blob/master/ebml_matroska.xml#L434
- https://en.wikipedia.org/wiki/IETF_language_tag
- https://r12a.github.io/app-subtags/
Create a media file snippet from an MKV file that contains IETF BCP-47 tags:
mkvmerge --split parts:00:00:00-00:01:00 --output MKV-IETF-Snippet.mkv MKV-IETF.mkv
Use MkvMerge to create a JSON file describing the MKV contents:
mkvmerge --identify MKV-IETF-Snippet.mkv --identification-format json
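The identification output can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming the JSON keeps per-track fields under `tracks[].properties` with the `language` and `language_ietf` keys quoted in this report. The `track_languages` helper and the trimmed `SAMPLE` document are hypothetical illustrations, not real mkvmerge output.

```python
import json
from typing import List, Optional, Tuple


def track_languages(identify_json: str) -> List[Tuple[int, Optional[str], Optional[str]]]:
    """Return (track id, language, language_ietf) for every track.

    Assumes the layout tracks[].properties.{language,language_ietf};
    a missing property is reported as None.
    """
    data = json.loads(identify_json)
    rows = []
    for track in data.get("tracks", []):
        props = track.get("properties", {})
        rows.append((track.get("id"), props.get("language"), props.get("language_ietf")))
    return rows


# Hypothetical, heavily trimmed stand-in for the identification JSON.
SAMPLE = """
{"tracks": [
  {"id": 0, "type": "video", "properties": {"language": "und"}},
  {"id": 1, "type": "audio", "properties": {"language": "srp", "language_ietf": "sr-Latn-RS"}}
]}
"""

for row in track_languages(SAMPLE):
    print(row)
```

To run it against a real file, redirect the output of the command above into a file and pass its contents to `track_languages`.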
Note the presence of language and language_ietf tags in the file:
"language": "srp"
"language_ietf": "sr-Latn-RS"
Similar output can be produced using MediaInfo and FfProbe:
mediainfo --Output=XML MKV-IETF-Snippet.mkv
<Language>sr-Latn-RS</Language>
ffprobe -loglevel quiet -show_streams -show_format -print_format json MKV-IETF-Snippet.mkv
"language": "srp"
Note that FfProbe only reports the ISO 639-2 language code, and ignores the IETF BCP-47 tag.
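The FfProbe side can be checked the same way. This is a rough sketch, assuming the JSON printed by `-show_streams -print_format json` lists streams under `streams[]` with per-stream metadata in a `tags` object. The `stream_languages` helper and the sample document are hypothetical.

```python
import json
from typing import Dict, Optional


def stream_languages(ffprobe_json: str) -> Dict[int, Optional[str]]:
    """Map stream index -> value of the 'language' tag (None if absent)."""
    data = json.loads(ffprobe_json)
    return {
        stream["index"]: stream.get("tags", {}).get("language")
        for stream in data.get("streams", [])
    }


# Hypothetical, trimmed stand-in for the ffprobe JSON of the snippet.
FFPROBE_SAMPLE = """
{"streams": [
  {"index": 0, "codec_type": "video"},
  {"index": 1, "codec_type": "audio", "tags": {"language": "srp"}}
]}
"""

languages = stream_languages(FFPROBE_SAMPLE)
print(languages)
# Only the three-letter code is visible; the IETF tag is nowhere in the output.
print("sr-Latn-RS" in languages.values())
```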
Remux the snippet using FFmpeg, copying all streams:
ffmpeg -i MKV-IETF-Snippet.mkv -map 0 -codec copy -f matroska MKV-IETF-Snippet-FfMpeg.mkv
Repeat the steps above to get the MKV tag information, and note that the IETF language tags have been stripped from the output file.
"language": "srp"
The detailed "sr-Latn-RS" language tag has been reduced to "srp", losing the regional specifics.
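To make concrete what is lost, the sketch below splits a tag into its language, script and region subtags. It is a deliberately simplified parser, assuming only the `language[-script][-region]` shape. Real BCP-47 tags may also carry extended language subtags, variants, extensions and private-use parts, which this ignores. The `LangTag` type and `split_tag` function are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


# Simplified pattern: 2-3 letter language, optional 4-letter script,
# optional 2-letter or 3-digit region. Not a full BCP-47 grammar.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_tag(tag: str) -> LangTag:
    """Split a simple language tag into (language, script, region)."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unsupported language tag: {tag!r}")
    script = match["script"]
    region = match["region"]
    return LangTag(
        match["language"].lower(),
        script.title() if script else None,
        region.upper() if region else None,
    )


print(split_tag("sr-Latn-RS"))  # source file: language, script and region
print(split_tag("srp"))         # remuxed file: language only
```

The script (Latin versus Cyrillic) and the region are exactly the parts that cannot be recovered from the three-letter code alone.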
Observed behavior: ffmpeg strips IETF language tags from files.
Expected behavior: ffmpeg retains IETF tags (or all Matroska tags even if not interpreted) from the source file.
Nice to have behavior: FfProbe emits IETF language tags.
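The expected behavior could be verified with a small regression check along these lines. It is a sketch only, assuming both files have been identified with mkvmerge as described above, that the JSON uses `tracks[].properties.language_ietf`, and that track ids are preserved by `-map 0 -codec copy`. The function names and both sample documents are hypothetical.

```python
import json
from typing import Dict, List, Optional, Tuple


def ietf_by_track(identify_json: str) -> Dict[int, Optional[str]]:
    """Map track id -> language_ietf (None when the property is absent)."""
    tracks = json.loads(identify_json).get("tracks", [])
    return {t["id"]: t.get("properties", {}).get("language_ietf") for t in tracks}


def lost_ietf_tags(source_json: str, remux_json: str) -> List[Tuple[int, str, Optional[str]]]:
    """List (track id, source tag, remuxed tag) where the IETF tag changed or vanished."""
    before = ietf_by_track(source_json)
    after = ietf_by_track(remux_json)
    return [
        (track_id, tag, after.get(track_id))
        for track_id, tag in sorted(before.items())
        if tag is not None and after.get(track_id) != tag
    ]


# Hypothetical, trimmed identification JSON for the source and the remux.
SOURCE = '{"tracks": [{"id": 0, "properties": {"language": "srp", "language_ietf": "sr-Latn-RS"}}]}'
REMUX = '{"tracks": [{"id": 0, "properties": {"language": "srp"}}]}'

print(lost_ietf_tags(SOURCE, REMUX))
```

An empty list would indicate that every IETF tag survived the remux.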
Attachments (1)
Change History (3)
by , 19 months ago
Attachment: | FfMpeg_IETF.zip added |
---|
comment:1 by , 19 months ago
Cc: | added |
---|
comment:2 by , 19 months ago
Component: | ffmpeg → avformat |
---|---|
Status: | new → open |
Type: | defect → enhancement |
There is no support for the LanguageBCP47 element in either muxer or demuxer, so this needs to be implemented.
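For orientation, this is roughly what the missing element looks like on disk. The sketch assumes the element IDs given in the Matroska specification linked in the description (0x22B59C for Language, 0x22B59D for LanguageBCP47, formerly LanguageIETF) and the usual EBML layout of ID, variable-length size, then data. It is an illustration of the byte layout, not FFmpeg's muxer code, and the helper names are hypothetical.

```python
# Element IDs as listed in the Matroska specification (assumption: unchanged).
LANGUAGE_ID = bytes.fromhex("22B59C")
LANGUAGE_BCP47_ID = bytes.fromhex("22B59D")


def encode_size(length: int) -> bytes:
    """Encode an EBML data size as the shortest variable-length integer."""
    for width in range(1, 9):
        # A width-byte size holds 7*width value bits; the all-ones value
        # is reserved for "unknown size", so it is excluded here.
        if length < (1 << (7 * width)) - 1:
            marker = 1 << (7 * width)
            return (marker | length).to_bytes(width, "big")
    raise ValueError("length too large for an EBML size")


def encode_string_element(element_id: bytes, value: str) -> bytes:
    """Serialize an EBML string element: ID, size, then ASCII payload."""
    data = value.encode("ascii")
    return element_id + encode_size(len(data)) + data


print(encode_string_element(LANGUAGE_ID, "srp").hex())
print(encode_string_element(LANGUAGE_BCP47_ID, "sr-Latn-RS").hex())
```

A muxer that preserves the tag would write both elements inside the track entry, and a demuxer would need to read the second one and expose it as metadata.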
JSON and XML and report LOG