joerg wrote:To get the charset you have to open locale.library and use locale = ILocale->OpenLocale(NULL). locale.library is in LIBS:, not a kickstart module, and it depends on several other files on SYS: as well. If SYS: is a SFS partition and SFS would try to open locale.library which has to be loaded from this SFS partition you'd get an endless loop or a deadlock ...
Ok, that's a pretty good reason. As this discussion started with filesysbox.lib / NTFS I didn't think of the automount filesystems.
joerg wrote:The names are in UTF-8, on any file system (most still need some bug fixes), it's no longer limited to ASCII (everything >= 160 was just undefined bytes until now).
That's new, but I guess it's needed as a first step to support multibyte charsets. Oh, I can hear the outcry of some of the community members: "But, but, but ... that will break the backward ... my super old program won't work anymore ...". Now, where is my popcorn?
UTF-8 entry won't be an issue (I am working on an IME that outputs UTF-8 encoded strings into the system's input event stream). I already have UTF-8 encoded filenames that are inaccessible until I can actually deal with the UTF-8 encoded names (display and input being separate, afaik).
The only thing I've still got is an internal encoding issue when composing input into single encoded characters (I will be taking composition inputs and generating the distinct characters as a "deadkey" processing response).
For my own internal use I am going to be using the UTF-8 "codepoint" values pretty much raw for internal buffering (a modified use of the TagItem structure for buffering purposes, overloaded as a Qualifiers (tag) and IE_Code (data) value pair).
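To make the buffering idea concrete, here is a rough sketch in portable C of decoding UTF-8 into raw code points and storing each one next to a qualifier value. The `KeyPair` struct and both function names are made up for illustration; it only mimics the tag/data layout of a TagItem and is not the actual IME code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the overloaded TagItem pair:
   the "tag" slot carries qualifier bits, the "data" slot the code point. */
struct KeyPair {
    uint32_t qualifiers;
    uint32_t codepoint;
};

/* Decode one UTF-8 sequence starting at s (at most len bytes available).
   Stores the code point in *out and returns the number of bytes consumed,
   or 0 if the sequence is truncated, overlong, a surrogate or out of range. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t need, i;
    uint32_t cp;

    if (len == 0) return 0;
    if (s[0] < 0x80)                { need = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else return 0;                  /* stray continuation or invalid lead byte */

    if (len < need) return 0;       /* truncated sequence */
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (cp < min_for_len[need]) return 0;        /* overlong encoding */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* UTF-16 surrogate */
    if (cp > 0x10FFFF) return 0;                 /* beyond Unicode range */
    *out = cp;
    return need;
}

/* Fill buf with up to cap pairs, all tagged with the same qualifiers.
   Returns the number of pairs stored, or (size_t)-1 on malformed input. */
size_t utf8_to_pairs(const unsigned char *s, size_t len, uint32_t qualifiers,
                     struct KeyPair *buf, size_t cap)
{
    size_t pos = 0, n = 0;

    while (pos < len && n < cap) {
        uint32_t cp;
        size_t used = utf8_decode(s + pos, len - pos, &cp);
        if (used == 0) return (size_t)-1;
        buf[n].qualifiers = qualifiers;
        buf[n].codepoint = cp;
        n++;
        pos += used;
    }
    return n;
}
```

Keeping the code point already decoded in the buffer means the later "deadkey" composition step can compare plain integers instead of re-parsing byte sequences.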
I was considering using the 7-bit safe "URLencode" scheme for requester recognition, but since colinw has said that dos.library is being updated for UTF-8 safe string usage, I'll deal with side-by-side language input selections by pushing them all to a common core UTF-8 encoding.
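For anyone wondering what the 7-bit safe "URLencode" fallback would look like, this is a minimal sketch of percent-encoding in plain C. The function names are hypothetical and nothing here is taken from dos.library; every byte outside the unreserved ASCII set is written as `%XX`, so a UTF-8 name survives any code path that only handles 7-bit text.

```c
#include <stddef.h>

/* Unreserved characters pass through unchanged (assumes an ASCII
   execution character set). */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '.' || c == '_' || c == '~';
}

/* Percent-encode len bytes from in into out (capacity cap, including the
   terminating NUL). Returns the encoded length without the NUL, or
   (size_t)-1 if out is too small. */
size_t percent_encode(const unsigned char *in, size_t len,
                      char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i, o = 0;

    if (cap == 0) return (size_t)-1;
    for (i = 0; i < len; i++) {
        size_t need = is_unreserved(in[i]) ? 1 : 3;
        if (cap - o <= need) return (size_t)-1;  /* keep room for the NUL */
        if (need == 1) {
            out[o++] = (char)in[i];
        } else {
            out[o++] = '%';
            out[o++] = hex[in[i] >> 4];
            out[o++] = hex[in[i] & 0x0F];
        }
    }
    out[o] = '\0';
    return o;
}
```

With this, the UTF-8 bytes C3 A9 followed by ".txt" come out as `%C3%A9.txt`. The cost is that names can grow to three times their size, which is one more reason to prefer real UTF-8 handling where it is available.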
This should allow English, Russian, Japanese and additional language support without any major weirdnesses and workarounds based on codepages and other things.
One thing I am definitely accepting is that I'll support only two output encodings: "raw original" (ISO Latin-1 encoding only) and "vanilla" (UTF-8 processed), so that there is some measure of backwards compatibility.
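As a rough illustration of how the two output encodings relate, here is a sketch in plain C of converting between ISO Latin-1 and UTF-8. The function names are hypothetical, the UTF-8 side only does minimal validation, and replacing anything above U+00FF with '?' is my own assumption about what a "raw original" fallback might do.

```c
#include <stddef.h>

/* Latin-1 -> UTF-8. Returns bytes written (without the NUL) or
   (size_t)-1 if out (capacity cap, including the NUL) is too small. */
size_t latin1_to_utf8(const unsigned char *in, size_t len,
                      unsigned char *out, size_t cap)
{
    size_t i, o = 0;

    if (cap == 0) return (size_t)-1;
    for (i = 0; i < len; i++) {
        size_t need = in[i] < 0x80 ? 1 : 2;
        if (cap - o <= need) return (size_t)-1;
        if (need == 1) {
            out[o++] = in[i];
        } else {
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[o] = 0;
    return o;
}

/* UTF-8 -> Latin-1. Code points above U+00FF become '?' (an assumption).
   Returns bytes written (without the NUL) or (size_t)-1 on malformed
   input or a too-small buffer. Overlong 3- and 4-byte forms are not
   rejected here; this is deliberately minimal. */
size_t utf8_to_latin1(const unsigned char *in, size_t len,
                      unsigned char *out, size_t cap)
{
    size_t i = 0, o = 0, k, extra;

    if (cap == 0) return (size_t)-1;
    while (i < len) {
        unsigned char c = in[i];
        unsigned char ch;

        if (c < 0x80)                    { extra = 0; ch = c; }
        else if (c == 0xC2 || c == 0xC3) { extra = 1; ch = 0; }
        else if (c >= 0xC4 && c <= 0xDF) { extra = 1; ch = '?'; }
        else if (c >= 0xE0 && c <= 0xEF) { extra = 2; ch = '?'; }
        else if (c >= 0xF0 && c <= 0xF4) { extra = 3; ch = '?'; }
        else return (size_t)-1;          /* invalid lead byte */

        if (len - i <= extra) return (size_t)-1;  /* truncated */
        for (k = 1; k <= extra; k++)
            if ((in[i + k] & 0xC0) != 0x80) return (size_t)-1;

        if (c == 0xC2 || c == 0xC3)      /* U+0080..U+00FF fits Latin-1 */
            ch = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));

        if (cap - o <= 1) return (size_t)-1;
        out[o++] = ch;
        i += extra + 1;
    }
    out[o] = 0;
    return o;
}
```

Latin-1 to UTF-8 is lossless because Latin-1 bytes map one-to-one onto U+0000..U+00FF; the reverse direction is where Russian or Japanese input has to be dropped or substituted.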
I'm thinking this will help the OS for more than just filesystems (and I am leaving loading to be referential, based on the language libraries being loaded from IPrefs).