Codeset conversion: the recommended way
Posted: Mon Nov 28, 2011 7:40 pm
by trixie
Can we please have an official Hyperion word on how the poor developers should implement codeset conversion in their programs? I mean, in a future-proof way: "the recommended practice", if you like. Is codesets.library the way? Or is there an alternative solution under development that is to become part of OS4? I'm asking because I have a number of projects under development that work with UTF-8 encoded text and therefore require codeset conversion.
Re: Codeset conversion: the recommended way
Posted: Tue Nov 29, 2011 5:52 pm
by ZeroG
I don't think that there is support for UTF-8 encoding, but you can get a Unicode mapping table using IDiskfont->ObtainCharsetInfo().
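To show what such a mapping table is good for, here is a minimal sketch in portable C. It assumes the table is a 256-entry array of Unicode code points indexed by byte value; that layout is an assumption, not something confirmed in this thread. The names `utf8_encode` and `charset_to_utf8` are made up for illustration, and obtaining the table through IDiskfont->ObtainCharsetInfo() is left out because its exact arguments are not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1-4), or 0 for an invalid code point. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid in UTF-8 */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Convert a NUL-terminated 8-bit string to UTF-8 using a mapping table
 * (hypothetical layout: map[byte] = Unicode code point).
 * Returns the number of bytes written, not counting the terminating NUL,
 * or (size_t)-1 if dst is too small. Unmappable bytes become '?'. */
static size_t charset_to_utf8(const uint32_t map[256], const char *src,
                              char *dst, size_t dstsize)
{
    size_t used = 0;

    assert(map != NULL && src != NULL && dst != NULL);
    if (dstsize == 0)
        return (size_t)-1;

    for (; *src != '\0'; src++) {
        unsigned char buf[4];
        size_t n = utf8_encode(map[(unsigned char)*src], buf);

        if (n == 0) {
            buf[0] = '?';
            n = 1;
        }
        if (used + n + 1 > dstsize) /* keep one byte for the NUL */
            return (size_t)-1;
        memcpy(dst + used, buf, n);
        used += n;
    }
    dst[used] = '\0';
    return used;
}
```

Going the other way (UTF-8 to an 8-bit charset) needs a reverse lookup over the same table, which is the part a ready-made converter such as iconv() takes off your hands.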
Re: Codeset conversion: the recommended way
Posted: Tue Nov 29, 2011 11:10 pm
by chris
...and once you have that, you can choose from newlib's iconv(), iconv.library/libiconv, codesets.library and/or parserutils.library (and I think there's a utf8.library floating around somewhere too).
I assume the "official" way is to use iconv() from newlib.library. I'm not sure if locale.library uses that too or has built-in functions for converting catalogs into the correct charset for display. Or maybe it doesn't even do that, I'm not quite sure.
Take your pick!
Re: Codeset conversion: the recommended way
Posted: Wed Jan 25, 2012 5:43 pm
by trixie
@chris
chris wrote:you can choose from newlib's iconv(), iconv.library/libiconv, codesets.library and/or parserutils.library (and I think there's a utf8.library floating around somewhere too)
All right, iconv() would probably be the best for what I need - newlib.library is part of the AOS kernel now, is that right? So I won't have to rely on the user having a third-party library installed.
I found an example for iconv(), and see that before you use it, you have to call
iconv_t iconv_open(const char *tocode, const char *fromcode);
Where do I get the "tocode" and "fromcode" codeset names? Are they the same as those used by the locale.library? Can I do something like this?
iconv_open("iso-8859-2", "utf-8");
(Sorry, I'd try it out myself, but unfortunately my SAM is broken at the moment, taking a holiday in Italy.)
Re: Codeset conversion: the recommended way
Posted: Thu Jan 26, 2012 11:16 pm
by chris
trixie wrote:All right, iconv() would probably be the best for what I need - newlib.library is part of the AOS kernel now, is that right? So I won't have to rely on the user having a third-party library installed.
Yep, that's the best bet generally if it does what you need.
trixie wrote:I found an example for iconv(), and see that before you use it, you have to call
iconv_t iconv_open(const char *tocode, const char *fromcode);
Where do I get the "tocode" and "fromcode" codeset names? Are they the same as those used by the locale.library? Can I do something like this?
iconv_open("iso-8859-2", "utf-8");
IIRC, yes.
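For reference, a minimal sketch of the full open/convert/close sequence using the POSIX iconv interface. The helper name `utf8_to_latin2` is made up for illustration. Which codeset names are accepted depends on the iconv implementation, so whether newlib on OS4 recognises exactly these spellings is an assumption; always check the return value of iconv_open().

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert a NUL-terminated UTF-8 string to ISO-8859-2.
 * Returns 0 on success, -1 on any failure (unknown codeset name,
 * unconvertible or malformed input, or dst too small). */
static int utf8_to_latin2(const char *src, char *dst, size_t dstsize)
{
    iconv_t cd;
    /* POSIX declares the input buffer as char **; some older
     * implementations use const char ** and may need a different cast. */
    char *in = (char *)src;
    char *out = dst;
    size_t inleft;
    size_t outleft;
    int ok;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return -1;

    inleft = strlen(src);
    outleft = dstsize - 1; /* keep one byte for the terminating NUL */

    /* Note the argument order: target codeset first, source second. */
    cd = iconv_open("ISO-8859-2", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() advances in/out and decrements inleft/outleft as it goes. */
    ok = iconv(cd, &in, &inleft, &out, &outleft) != (size_t)-1;
    iconv_close(cd);

    if (!ok)
        return -1;

    *out = '\0';
    return 0;
}
```

When iconv() fails, errno tells you why: E2BIG means the output buffer is full, EILSEQ an invalid input sequence, and EINVAL an incomplete one at the end of the input. A real program would probably want to handle these separately instead of returning a single error code.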
Re: Codeset conversion: the recommended way
Posted: Fri May 18, 2012 9:25 pm
by Belxjander
Glad to have come across this... I'm taking notes, as Perception-IME is also dealing with UTF-8 text strings...
@Trixie, I hope your own Sam is recoverable somehow.
Re: Codeset conversion: the recommended way
Posted: Fri Jun 01, 2012 5:20 pm
by billt
And the wxWidgets port will need UTF-8 as well. Seems to be a popular thing these days.
Re: Codeset conversion: the recommended way
Posted: Sun Jun 03, 2012 9:45 pm
by trixie
@Belxjander
Belxjander wrote:@Trixie, I hope your own Sam is recoverable somehow.
My Sam is probably dead as a dodo, but thanks to Steven Solie I got access to an affordable replacement, so I'm now fully set up and developing again!
Re: Codeset conversion: the recommended way
Posted: Thu Jun 28, 2012 11:48 am
by Belxjander
trixie wrote:@Belxjander
Belxjander wrote:@Trixie, I hope your own Sam is recoverable somehow.
My Sam is probably dead as a dodo, but thanks to Steven Solie I got access to an affordable replacement, so I'm now fully set up and developing again!
Excellent news at least...
I'm currently looking at how to plug extra materials into locale.library, and I'll consider remapping from the UTF tables already present to try to get Japanese displaying properly first, then follow up with getting the input handled properly.