Opened 11 years ago

Closed 10 years ago

Last modified 10 years ago

#3496 closed enhancement (fixed)

Support UTF-16 subtitles

Reported by: klpu
Owned by:
Priority: normal
Component: avformat
Version: git-master
Keywords: sub utf16
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
How to reproduce:

% ffmpeg -i *.ass

  ffmpeg version N-61585-ga1ce776 Copyright (c) 2000-2014 the FFmpeg developers
  built on Mar 20 2014 11:54:54 with gcc 4.8 (Ubuntu/Linaro 4.8.1-10ubuntu9)
  configuration: --enable-libfdk-aac --enable-libx264 --enable-openssl --enable-gpl --enable-nonfree --enable-librtmp --enable-x11grab
  libavutil      52. 67.100 / 52. 67.100
  libavcodec     55. 52.102 / 55. 52.102
  libavformat    55. 34.101 / 55. 34.101
  libavdevice    55. 11.100 / 55. 11.100
  libavfilter     4.  3.100 /  4.  3.100
  libswscale      2.  5.101 /  2.  5.101
  libswresample   0. 18.100 /  0. 18.100
  libpostproc    52.  3.100 / 52.  3.100
[mp3 @ 0x2ef99c0] Format mp3 detected only with low score of 1, misdetection possible!
[mp3 @ 0x2ef99c0] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from 'The.Wolf.of.Wall.Street.2013.720p.BluRay.X264-AMIABLE .·±Ìå.ass':
  Duration: 00:00:26.35, start: 0.000000, bitrate: 160 kb/s
    Stream #0:0: Audio: mp3, 32000 Hz, stereo, s16p, 160 kb/s
At least one output file must be specified

Download the subtitle from https://bbs.vitamio.org/files/5326ae02421aa98fcb000752?locale=en&version=origin

Attachments (1)

The.Wolf.of.Wall.Street.2013.720p.BluRay.X264-AMIABLE .·±Ìå.ass (514.7 KB ) - added by klpu 11 years ago.


Change History (8)

comment:1 by klpu, 11 years ago

FFmpeg cannot detect a little-endian UTF-16 Unicode text subtitle.

comment:2 by Clément Bœsch, 11 years ago

Status: new → open
Summary: FFmpeg dont detect correct subtitle → Support UTF-16 subtitles
Type: defect → enhancement
00000000  ff fe 5b 00 53 00 63 00  72 00 69 00 70 00 74 00  |..[.S.c.r.i.p.t.|
00000010  20 00 49 00 6e 00 66 00  6f 00 5d 00 0d 00 0a 00  | .I.n.f.o.].....|
00000020  3b 00 20 00 53 00 63 00  72 00 69 00 70 00 74 00  |;. .S.c.r.i.p.t.|
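
The dump starts with ff fe, the UTF-16 little-endian byte order mark, and every ASCII character is followed by a zero byte. A minimal sketch of checking for that mark, assuming Python; the function name sniff_utf16_bom is hypothetical and is not part of FFmpeg:

```python
import codecs

def sniff_utf16_bom(head: bytes):
    """Return 'utf-16-le', 'utf-16-be', or None based on the leading BOM.

    Hypothetical helper for illustration only. It does not tell UTF-16LE
    apart from UTF-32LE, whose BOM (ff fe 00 00) starts with the same bytes.
    """
    if head.startswith(codecs.BOM_UTF16_LE):  # ff fe
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):  # fe ff
        return "utf-16-be"
    return None

# First 16 bytes of the attached file, copied from the hex dump.
head = bytes.fromhex("ff fe 5b 00 53 00 63 00 72 00 69 00 70 00 74 00")

encoding = sniff_utf16_bom(head)
if encoding is not None:
    # Skip the 2-byte BOM and decode the remaining code units.
    print(encoding, head[2:].decode(encoding))
```

For these bytes the check reports utf-16-le, and the decoded text is the start of the "[Script Info]" header.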

Workaround:

% iconv -f utf16le -t utf8 /tmp/The.Wolf.of.Wall.Street.2013.720p.BluRay.X264-AMIABLE\ .·±Ìå.ass > test.ass
% ffprobe test.ass
ffprobe version N-61759-g9456a86 Copyright (c) 2007-2014 the FFmpeg developers
  built on Mar 24 2014 09:54:13 with gcc 4.8.2 (GCC) 20140206 (prerelease)
  configuration: --enable-nonfree --enable-gpl --enable-libx264 --enable-libmp3lame --enable-x11grab --enable-libvorbis --samples=/home/ux/fate-samples --enable-libvpx --cpu=native --enable-libfaac --cc='ccache cc'
  libavutil      52. 67.100 / 52. 67.100
  libavcodec     55. 52.103 / 55. 52.103
  libavformat    55. 34.101 / 55. 34.101
  libavdevice    55. 11.100 / 55. 11.100
  libavfilter     4.  3.100 /  4.  3.100
  libswscale      2.  5.102 /  2.  5.102
  libswresample   0. 18.100 /  0. 18.100
  libpostproc    52.  3.100 / 52.  3.100
Input #0, ass, from '../test.ass':
  Duration: N/A, bitrate: N/A
    Stream #0:0: Subtitle: ssa

Note: your link in the description is a .srt

Last edited 11 years ago by Clément Bœsch

comment:3 by gjdfgh, 11 years ago

This would be pretty simple to achieve:

  1. Add a readline function that can convert UTF-16 to UTF-8 on the fly, but can also read plain UTF-8, selected by a function parameter.
  2. In the probe function, try all 3 fundamental encodings: 8-bit (codepage/multibyte, ASCII-compatible), UTF-16BE, and UTF-16LE.

This is the approach used by mplayer, and it wouldn't need any complicated charset detection and conversion code.

If nobody objects, I could write a patch.
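
The two steps proposed in this comment could look roughly like the sketch below. It is written in Python for brevity and is not the patch that closed this ticket. The names ENCODINGS, read_line_utf8 and probe_ass are hypothetical, and the probe only checks for a "[Script Info]" first line, which is an assumption made for the example.

```python
import io

# The three "fundamental" encodings from the comment (labels are illustrative).
ENCODINGS = ("8bit", "utf-16-le", "utf-16-be")

def read_line_utf8(stream, encoding):
    """Read one line from a binary stream and return it as UTF-8 bytes.

    For "8bit" input the bytes are passed through unchanged; for UTF-16
    input the line is converted to UTF-8 on the fly.
    """
    if encoding == "8bit":
        return stream.readline().rstrip(b"\r\n")
    newline = "\n".encode(encoding)  # b"\n\x00" (LE) or b"\x00\n" (BE)
    raw = bytearray()
    while True:
        unit = stream.read(2)  # one 16-bit code unit at a time
        if len(unit) < 2 or unit == newline:
            break
        raw += unit
    text = raw.decode(encoding, errors="replace")
    # Drop a leading BOM and a trailing carriage return, then re-encode.
    return text.lstrip("\ufeff").rstrip("\r").encode("utf-8")

def probe_ass(data: bytes):
    """Return the first encoding whose first line is an ASS header, else None."""
    for encoding in ENCODINGS:
        line = read_line_utf8(io.BytesIO(data), encoding)
        if line.strip() == b"[Script Info]":
            return encoding
    return None

# Synthetic input shaped like the start of the attached file (BOM + CRLF lines).
sample = "\ufeff[Script Info]\r\n; Script generated".encode("utf-16-le")
print(probe_ass(sample))
```

Trying each encoding in turn keeps the probe free of any charset detection logic: a wrong guess simply produces a first line that does not match the expected header.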

comment:4 by Clément Bœsch, 11 years ago

Keywords: subtitles added; subtitle removed

comment:5 by Carl Eugen Hoyos, 11 years ago

Keywords: sub added; subtitles removed

comment:6 by Clément Bœsch, 10 years ago

Resolution: fixed
Status: open → closed

comment:7 by Carl Eugen Hoyos, 10 years ago

Keywords: utf16 added