Opened 6 years ago

Last modified 6 years ago

#7362 new enhancement

Newline in subtitles: sub.ass - CRLF and sub.srt - LF

Reported by: Danila
Owned by:
Priority: wish
Component: avcodec
Version: git-master
Keywords: ass srt
Cc: beroal
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style
Can you add option -eol to choose the newline style

% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v -eol "\r\n" "TEST_output.srt"
For example, MKVToolNix extracts .srt subtitles with CRLF (at least in the Windows release).

How to reproduce:

% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v "TEST_output.srt"
% ffmpeg.exe -i TEST_Input.mkv -map 0:m:language:eng -map -0:a -map -0:v "TEST_output.ass"
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
built with gcc 7.3.1 (GCC) 20180722

Attachments (6)

FFMPEG_Bug_sub_1.png (227.8 KB) - added by Danila 6 years ago.
FFMPEG_Bug_sub_2.png (517.9 KB) - added by Danila 6 years ago.
FFMPEG_Bug_sub_3.png (227.9 KB) - added by Danila 6 years ago.
FFMPEG_Bug_sub_4.png (524.9 KB) - added by Danila 6 years ago.
FFMPEG_Bug_sub_5.png (90.3 KB) - added by Danila 6 years ago.
Forced Sub Sample.ass (1.4 KB) - added by Carl Eugen Hoyos 6 years ago.

Change History (13)

by Danila, 6 years ago

Attachment: FFMPEG_Bug_sub_1.png added

by Danila, 6 years ago

Attachment: FFMPEG_Bug_sub_2.png added

in reply to:  description ; comment:1 by Carl Eugen Hoyos, 6 years ago

Component: ffmpeg → avcodec
Keywords: ass added; Newline character encoding subtitles CR+LF LF removed
Version: unspecified → git-master

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Can you add option -eol to choose the newline style

Wouldn’t such an option allow to write invalid files?

Which programs fail for a subtitle file produced by FFmpeg?

by Danila, 6 years ago

Attachment: FFMPEG_Bug_sub_3.png added

by Danila, 6 years ago

Attachment: FFMPEG_Bug_sub_4.png added

in reply to:  1 comment:2 by Danila, 6 years ago

Replying to cehoyos:

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Hm... I do not think it is wrong, but:
1) MKVToolNix extracts .srt subtitles with CRLF (at least in the Windows release)
2) Windows uses the CRLF newline style https://en.wikipedia.org/wiki/Newline#Representation (Atari TOS, Microsoft Windows, DOS (MS-DOS, PC DOS, etc.), DEC TOPS-10, RT-11, CP/M, MP/M, OS/2, Symbian OS, Palm OS, Amstrad CPC, and most other early non-Unix and non-IBM operating systems)
3) Aegisub creates .srt subtitles with CRLF (at least in the Windows release)

Can you add option -eol to choose the newline style

Wouldn’t such an option allow to write invalid files?

Maybe, but it could accept only two values: -eol CRLF for Windows or -eol LF for Unix :)
It is a difficult question :)

Which programs fail for a subtitle file produced by FFmpeg?

Hm... notepad.exe :D - it reads the file, but shows it without line breaks. https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_2.png
Maybe some DOS and Symbian DVD players fail too :D

Also, the .srt file does contain CRLF, but only in strange places (why is CRLF not used throughout the .srt file?).
I marked them in green:
.srt https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_4.png
.ass https://trac.ffmpeg.org/attachment/ticket/7362/FFMPEG_Bug_sub_3.png

by Danila, 6 years ago

Attachment: FFMPEG_Bug_sub_5.png added

in reply to:  1 ; comment:3 by beroal, 6 years ago

Replying to cehoyos:

Replying to KnightDanila:

Picture 1 - sub.ass has CRLF newline style
Picture 2 - sub.srt has LF newline style

Why do you believe that one of them is wrong?

Can you add option -eol to choose the newline style

Wouldn’t such an option allow to write invalid files?

I recently extracted text subtitles from "mkv" on Linux, and the result has a mix of "\r\n" and "\n" line ends which is clearly incorrect. So "ffmpeg" *already* produced an incorrect file, see below.

0000000    1  \n   0   0   :   0   0   :   0   1   ,   4   4   3       -
0000010    -   >       0   0   :   0   0   :   0   4   ,   5   3   6  \n
0000020                                  342 231 252       M   e   e   t
0000030        R   e   b   e   c   c   a     342 231 252  \n  \n   2  \n
0000040    0   0   :   0   0   :   0   4   ,   5   4   7       -   -   >
0000050        0   0   :   0   0   :   0   7   ,   3   3   0  \n        
0000060          342 231 252       S   h   e   '   s       t   h   e    
0000070    c   o   o   l   e   s   t       g   i   r   l  \r  \n        
0000080                    i   n       t   h   e       w   o   r   l   d
0000090    ,       w   a   i   t     342 231 252  \n  \n   3  \n   0   0
00000a0    :   0   0   :   0   7   ,   4   0   7       -   -   >       0
00000b0    0   :   0   0   :   1   0   ,   5   4   2  \n                

IMHO, there is no need for an option because, according to http://www.textfiles.com/uploads/kds-srt.txt , lines must end with "\r\n". So "ffmpeg" must produce "\r\n" for all line ends on all operating systems.
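Until the output is uniform, a file can be post-processed outside of FFmpeg. The following is a minimal sketch of such a workaround, not an FFmpeg feature; the function name `normalize_eol` is made up for illustration, and it assumes the subtitle file is small enough to read into memory.

```python
import re


def normalize_eol(data: bytes, eol: bytes = b"\r\n") -> bytes:
    """Rewrite every line ending (CRLF, bare CR, bare LF) as `eol`.

    The alternation tries CRLF first, so an existing "\r\n" is treated
    as one line ending and is not doubled.
    """
    return re.sub(rb"\r\n|\r|\n", eol, data)
```

Working on bytes keeps the text encoding untouched. Passing `eol=b"\n"` gives the Unix style instead.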

On the other hand, an option for adding a UTF-8 byte order mark would be useful, as some rare programs ("dsrt", for example) require it in UTF-8 text files.
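As a stopgap, the mark can be prepended after extraction. This is only a sketch under the assumption that the file is already UTF-8 encoded; `add_utf8_bom` is a hypothetical helper name, not an FFmpeg option.

```python
import codecs


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the UTF-8 byte order mark unless it is already present."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data
```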

I see that "\r\n" is used between the lines of a single subtitle record. I do not know how text subtitles are stored in "mkv", but I guess that the text of a subtitle record in the input file uses "\r\n" to separate lines, and "ffmpeg" inserts "\n" after the timing and the subtitle record numbers because my operating system is Linux.

in reply to:  3 ; comment:4 by Carl Eugen Hoyos, 6 years ago

Cc: beroal added

Replying to beroal:

I recently extracted text subtitles from "mkv" on Linux, and the result has a mix of "\r\n" and "\n" line ends which is clearly incorrect.

How can I reproduce this?

in reply to:  4 comment:5 by beroal, 6 years ago

Replying to cehoyos:

Replying to beroal:

I recently extracted text subtitles from "mkv" on Linux, and the result has a mix of "\r\n" and "\n" line ends which is clearly incorrect.

How can I reproduce this?

Run

ffmpeg -i "$INPUT_FILE" -map 0:2 -codec:subtitles subrip "$OUTPUT_FILE"

with INPUT_FILE containing the path to 2D Forced Subtitles Sample #1 (SRT) (from Kodi Samples). Here is the start of OUTPUT_FILE:

0000000    1  \n   0   0   :   0   0   :   0   2   ,   2   5   3       -
0000010    -   >       0   0   :   0   0   :   0   3   ,   4   2   0  \n
0000020    T   h   e   y   '   r   e       h   e   r   e   !  \n  \n   2
0000030   \n   0   0   :   0   0   :   0   4   ,   7   9   7       -   -
0000040    >       0   0   :   0   0   :   0   6   ,   2   1   4  \n   H
0000050    u   r   r   y   ,       M   u   s   e   !  \n  \n   3  \n   0
0000060    0   :   0   0   :   2   5   ,   1   5   1       -   -   >    
0000070    0   0   :   0   0   :   2   8   ,   3   2   0  \n   W   h   a
0000080    t       t   h   e       h   e   l   l       a   r   e       y
0000090    o   u       d   o   i   n   g   ?  \r  \n   W   h   y       a
00000a0    r   e   n   '   t       y   o   u       o   u   t       o   n
00000b0        t   h   e       w   a   t   e   r   ?  \n  \n   4  \n   0
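For anyone who wants to check an output file without reading a dump by eye, here is a small sketch that counts each line-ending style. The helper name `count_line_endings` is an assumption for illustration only.

```python
import re
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, bare LF and bare CR line endings in `data`."""
    names = {b"\r\n": "CRLF", b"\n": "LF", b"\r": "CR"}
    # CRLF is listed first in the pattern so it is counted as one unit.
    return Counter(names[m.group()] for m in re.finditer(rb"\r\n|\r|\n", data))
```

A file is mixed when more than one key has a non-zero count, for example when called as `count_line_endings(open(path, "rb").read())`.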

comment:6 by Carl Eugen Hoyos, 6 years ago

There may be a difference between a linebreak in the subtitle file and a linebreak that is meant to be shown when displaying the subtitles.

The main question is still if any application meant to read subtitles (read: other than Notepad) has issues reading subtitles written with (current) FFmpeg.

by Carl Eugen Hoyos, 6 years ago

Attachment: Forced Sub Sample.ass added

comment:7 by Carl Eugen Hoyos, 6 years ago

Apart from a BOM, mkvextract produces the same output as FFmpeg.

When transcoding the attached ass subtitle file to srt, FFmpeg produces \n for the line feeds of the srt file and \r\n for the CRLFs that are part of the subtitles. I don't know if this is expected.
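The observed behaviour can be modelled as follows. This is a toy sketch of the pattern described in this ticket, not FFmpeg's actual muxer code; `format_srt_record` and its parameters are hypothetical.

```python
def format_srt_record(index: int, start: str, end: str, text: str,
                      eol: str = "\n") -> str:
    """Build one SubRip record.

    The structural line breaks (after the index, after the timing line,
    and the blank line closing the record) use `eol`. The subtitle text
    is copied verbatim, so any "\r\n" it already contains survives.
    """
    return f"{index}{eol}{start} --> {end}{eol}{text}{eol}{eol}"
```

With a text payload that contains "\r\n" and the default `eol="\n"`, the record mixes both styles, which matches the dumps in comments 3 and 5.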
