Codeset conversion: the recommended way

This forum is for general developer support questions.
trixie
Posts: 411
Joined: Thu Jun 30, 2011 3:54 pm
Location: Czech Republic

Post by trixie »

Can we please have an official Hyperion word on how the poor developers should implement codeset conversion in their programs? I mean, in a future-proof way: "the recommended practice", if you like. Is codesets.library the way? Or is there an alternative solution under development that is to become part of OS4? I'm asking because I have a number of projects under development that work with UTF-8 encoded text and therefore require codeset conversion.
The Rear Window blog

AmigaOne X5000 @ 2GHz / 4GB RAM / Radeon RX 560 / ESI Juli@ / AmigaOS 4.1 Final Edition
SAM440ep-flex @ 667MHz / 1GB RAM / Radeon 9250 / AmigaOS 4.1 Final Edition
ZeroG
Posts: 124
Joined: Sat Jun 18, 2011 12:31 pm
Location: Germany

Re: Codeset conversion: the recommended way

Post by ZeroG »

I don't think that there is support for UTF-8 encoding, but you can get a Unicode mapping table using IDiskfont->ObtainCharsetInfo().
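
To illustrate what such a mapping table is good for, here is a minimal sketch in portable C that turns text in an 8-bit charset into UTF-8. It assumes the table is simply 256 Unicode code points, one per byte value. That layout, and the helper names utf8_encode() and charset_to_utf8(), are assumptions made for this example and are not taken from the diskfont.library documentation.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1 to 4), or 0 if the code point
 * is a surrogate or lies beyond U+10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid in UTF-8 */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: convert srclen bytes of 8-bit text to UTF-8.
 * table[b] is assumed to hold the Unicode code point for byte value b.
 * Returns the number of bytes written to dst (no NUL terminator is added),
 * or (size_t)-1 if dst is too small or the table holds an invalid code point. */
size_t charset_to_utf8(const unsigned char *src, size_t srclen,
                       const uint32_t table[256],
                       unsigned char *dst, size_t dstsize)
{
    size_t used = 0;

    for (size_t i = 0; i < srclen; i++) {
        unsigned char tmp[4];
        size_t n = utf8_encode(table[src[i]], tmp);

        if (n == 0 || n > dstsize - used)
            return (size_t)-1;
        for (size_t k = 0; k < n; k++)
            dst[used++] = tmp[k];
    }
    return used;
}
```

Going the other way (UTF-8 to an 8-bit charset) needs a decoder plus a reverse lookup in the same table, which is the part the conversion libraries do for you.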
chris
Posts: 564
Joined: Sat Jun 18, 2011 12:05 pm

Re: Codeset conversion: the recommended way

Post by chris »

...and once you have that, you can choose from newlib's iconv(), iconv.library/libiconv, codesets.library and/or parserutils.library (and I think there's a utf8.library floating around somewhere too).

I assume the "official" way is to use iconv() from newlib.library. I don't know whether locale.library uses that too, has its own built-in functions for converting catalogs into the correct charset for display, or doesn't convert at all.

Take your pick!
trixie
Posts: 411
Joined: Thu Jun 30, 2011 3:54 pm
Location: Czech Republic

Re: Codeset conversion: the recommended way

Post by trixie »

@chris
chris wrote:you can choose from newlib's iconv(), iconv.library/libiconv, codesets.library and/or parserutils.library (and I think there's a utf8.library floating around somewhere too)
All right, iconv() would probably be the best for what I need - newlib.library is part of the AOS kernel now, is that right? So I won't have to rely on the user having a third-party library installed.

I found an example for iconv(), and I see that before you use it, you have to call

Code:

iconv_t iconv_open(const char *tocode, const char *fromcode);
Where do I get the "tocode" and "fromcode" codeset names? Are they the same as those used by locale.library? Can I do something like this?

Code:

iconv_open("iso-8859-2", "utf-8");
(Sorry, I'd try it out myself, but unfortunately my SAM is broken at the moment, taking a holiday in Italy :-) )
chris
Posts: 564
Joined: Sat Jun 18, 2011 12:05 pm

Re: Codeset conversion: the recommended way

Post by chris »

trixie wrote:All right, iconv() would probably be the best for what I need - newlib.library is part of the AOS kernel now, is that right? So I won't have to rely on the user having a third-party library installed.
Yep, that's the best bet generally if it does what you need.
trixie wrote:I found an example for iconv(), and I see that before you use it, you have to call

Code:

iconv_t iconv_open(const char *tocode, const char *fromcode);
Where do I get the "tocode" and "fromcode" codeset names? Are they the same as those used by locale.library? Can I do something like this?

Code:

iconv_open("iso-8859-2", "utf-8");
IIRC, yes.
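
For reference, a minimal sketch of the whole open/convert/close cycle in C, using only the standard iconv_open(), iconv() and iconv_close() calls. The helper name utf8_to_latin2() is made up for this example, and the sketch assumes the iconv implementation in use recognises the names "UTF-8" and "ISO-8859-2", so check what iconv_open() returns on your target. Some headers declare the input argument of iconv() as const char ** rather than char **, in which case the cast needs adjusting.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to ISO-8859-2.
 * Returns the number of bytes written to dst (excluding the terminating NUL),
 * or (size_t)-1 if the converter cannot be opened, dst is too small, or the
 * input cannot be converted. */
size_t utf8_to_latin2(const char *src, char *dst, size_t dstsize)
{
    if (dstsize == 0)
        return (size_t)-1;

    /* Note the argument order: target codeset first, source codeset second. */
    iconv_t cd = iconv_open("ISO-8859-2", "UTF-8");
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *in = (char *)src;        /* iconv() advances this pointer */
    size_t inleft = strlen(src);
    char *out = dst;
    size_t outleft = dstsize - 1;  /* keep one byte for the NUL */

    size_t rc = iconv(cd, &in, &inleft, &out, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &out, &outleft); /* flush any shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;

    *out = '\0';
    return (size_t)(out - dst);
}
```

Called with the UTF-8 bytes for a Czech word and a large enough buffer, this should leave the ISO-8859-2 bytes in dst. What happens with a character that has no ISO-8859-2 equivalent depends on the implementation: some fail with EILSEQ, others substitute a replacement character.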
Belxjander
Posts: 315
Joined: Mon May 14, 2012 11:26 pm
Location: 日本千葉県松戸市 / Matsudo City, Chiba, Japan

Re: Codeset conversion: the recommended way

Post by Belxjander »

Glad to have come across this... I'm taking notes, as Perception-IME is also dealing with UTF-8 text strings...

@Trixie, I hope your own Sam is recoverable somehow
AmigaOS 4.1u4 Hand-Installed on a Sam440flex with Radeon 9250 Graphics card,
1GB of Memory present and 2 HDD (1x160GB primary, 1x500GB secondary),
Buffalo Wireless Keyboard and Mouse over USB in Japanese (BSKBW06)
Currently I am working on two projects called http://code.google.com/p/perception-ime/ and http://code.google.com/p/polymorph/
billt
Posts: 9
Joined: Fri Feb 10, 2012 5:35 pm

Re: Codeset conversion: the recommended way

Post by billt »

And the wxWidgets port will need UTF-8 as well. Seems to be a popular thing these days.
trixie
Posts: 411
Joined: Thu Jun 30, 2011 3:54 pm
Location: Czech Republic

Re: Codeset conversion: the recommended way

Post by trixie »

@Belxjander
Belxjander wrote:@Trixie, I hope your own Sam is recoverable somehow
My Sam is probably dead as a dodo, but thanks to Steven Solie I got access to an affordable replacement, so I'm now fully set up and developing again!
Belxjander
Posts: 315
Joined: Mon May 14, 2012 11:26 pm
Location: 日本千葉県松戸市 / Matsudo City, Chiba, Japan

Re: Codeset conversion: the recommended way

Post by Belxjander »

trixie wrote:@Belxjander
Belxjander wrote:@Trixie, I hope your own Sam is recoverable somehow
My Sam is probably dead as a dodo, but thanks to Steven Solie I got access to an affordable replacement, so I'm now fully set up and developing again!
Excellent news at least...

I'm currently looking at how to plug extra materials into locale.library, and I'll consider remapping from the UTF tables already present to get Japanese displaying properly first... then follow up with getting the input handled properly