diff options
author | Hin-Tak Leung <htl10@users.sourceforge.net> | 2014-06-06 14:36:21 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2014-06-06 16:08:09 -0700 |
commit | 017f8da43e92ddd9989884720b694a512e09ccce (patch) | |
tree | 2edff7cb8d45e6e859b91d8a7382799459f818f0 /kernel/uid16.c | |
parent | 6d6bd94f4d83d70cdff67d0bf2a64ef6878216e7 (diff) | |
download | linux-stable-017f8da43e92ddd9989884720b694a512e09ccce.tar.gz linux-stable-017f8da43e92ddd9989884720b694a512e09ccce.tar.bz2 linux-stable-017f8da43e92ddd9989884720b694a512e09ccce.zip |
hfsplus: fix worst-case unicode to char conversion of file names and attributes
This is a series of 3 patches which corrects issues in HFS+ concerning
the use of non-english file names and attributes. Names and attributes
are stored internally as UTF-16 units up to a fixed maximum size, and
convert to and from user-representation by NLS. The code incorrectly
assume that NLS string lengths are equal to unicode lengths, which is
only true for English ascii usage.
This patch (of 3):
The HFS Plus Volume Format specification (TN1150) states that file names
are stored internally as a maximum of 255 unicode characters, as defined
by The Unicode Standard, Version 2.0 [Unicode, Inc. ISBN
0-201-48345-9]. File names are converted by the NLS system on Linux
before presented to the user.
255 CJK characters converts to UTF-8 with 1 unicode character to up to 3
bytes, and to GB18030 with 1 unicode character to up to 4 bytes. Thus,
trying in a UTF-8 locale to list files with names of more than 85 CJK
characters results in:
$ ls /mnt
ls: reading directory /mnt: File name too long
The receiving buffer to hfsplus_uni2asc() needs to be 255 x
NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code has always been.
Similar consideration applies to attributes, which are stored internally
as a maximum of 127 UTF-16BE units. See XNU source for an up-to-date
reference on attributes.
Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6 is not
attainable in the case of conversion to UTF-8, as going beyond 3 bytes
requires the use of surrogate pairs, i.e. consuming two input units.
Thanks Anton Altaparmakov for reviewing an earlier version of this
change.
This patch fixes all callers of hfsplus_uni2asc(), and also enables the
use of long non-English file names in HFS+. The getting and setting,
and general usage of long non-English attributes requires further
forthcoming work, in the following patches of this series.
[akpm@linux-foundation.org: fix build]
Signed-off-by: Hin-Tak Leung <htl10@users.sourceforge.net>
Reviewed-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Vyacheslav Dubeyko <slava@dubeyko.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Sougata Santra <sougata@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Diffstat (limited to 'kernel/uid16.c')
0 files changed, 0 insertions, 0 deletions