filesysbox ntfs ubs massStorage problem

gazelle
Posts: 102
Joined: Sun Mar 04, 2012 12:49 pm
Location: Frohnleiten, Austria

Re: filesysbox ntfs ubs massStorage problem

Post by gazelle »

salass00 wrote:In fact this is its definition from SDK/newlib/include/stddef.h:

typedef int wchar_t;
Only for __VBCC__ as far as I can see.
joerg
Posts: 371
Joined: Sat Mar 01, 2014 5:42 am

Re: filesysbox ntfs ubs massStorage problem

Post by joerg »

salass00 wrote:To implement case insensitive string comparison and hash functions I need a toupper() function that supports unicode.

AFAICT if I use setlocale(LC_CTYPE, "C-UTF-8") first I should then be able to use towupper() for this purpose, but I guess this doesn't work so well in a shared library where it will be called from many different programs?
You can use most of the newlib.library functions only in programs linked with the C library startup code, which creates a separate context. Don't use such functions in the global context (only opening newlib.library and its interface).

But the functions are small, everything in newlib uses BSD-style licences and therefore you can simply add them to your sources (unless it's GPL/LGPL code), use the parts inside #ifdef _MB_CAPABLE and remove c = _jp2uc (c);
https://sourceware.org/cgi-bin/cvsweb.c ... vsroot=src
https://sourceware.org/cgi-bin/cvsweb.c ... vsroot=src
Last edited by joerg on Thu Mar 13, 2014 3:29 pm, edited 3 times in total.
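
As a rough illustration of what a self-contained, locale-independent upper-casing function for a case-insensitive compare might look like, here is a minimal sketch. It is not the newlib code joerg refers to: it only covers ASCII and the Latin-1 range, whereas the newlib source handles far more of Unicode. The names sketch_towupper and sketch_casecmp are made up for this example, and it assumes the strings have already been decoded into 32-bit code points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: upper-case one Unicode code point.
 * Covers only ASCII and Latin-1; everything else is returned unchanged. */
static uint32_t sketch_towupper(uint32_t c)
{
    if (c >= 'a' && c <= 'z')
        return c - 0x20;                 /* ASCII a..z */
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return c - 0x20;                 /* Latin-1 letters; 0xF7 is the division sign */
    if (c == 0xFF)
        return 0x178;                    /* y with diaeresis maps outside Latin-1 */
    return c;
}

/* Hypothetical helper: case-insensitive compare of two
 * zero-terminated arrays of code points. Returns <0, 0 or >0. */
static int sketch_casecmp(const uint32_t *a, const uint32_t *b)
{
    for (;; a++, b++) {
        uint32_t ca = sketch_towupper(*a);
        uint32_t cb = sketch_towupper(*b);

        if (ca != cb)
            return (ca < cb) ? -1 : 1;
        if (ca == 0)
            return 0;                    /* both strings ended together */
    }
}
```

Because the function keeps no state and never touches the locale, it can be called from a shared library in any caller's context. A hash function would apply the same mapping to each code point before mixing it in.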
joerg
Posts: 371
Joined: Sat Mar 01, 2014 5:42 am

Re: filesysbox ntfs ubs massStorage problem

Post by joerg »

gazelle wrote:
salass00 wrote:In fact this is its definition from SDK/newlib/include/stddef.h:

typedef int wchar_t;
Only for __VBCC__ as far as I can see.
For GCC it's using GCC's own stddef.h with #include_next, the GCC wchar_t is 32 bit as well.
salass00
AmigaOS Core Developer
Posts: 530
Joined: Sat Jun 18, 2011 3:12 pm
Location: Finland

Re: filesysbox ntfs ubs massStorage problem

Post by salass00 »

joerg wrote: But the functions are small, everything in newlib uses BSD-style licences and therefore you can simply add them to your sources (unless it's GPL/LGPL code), use the parts inside #ifdef _MB_CAPABLE and remove c = _jp2uc (c);
https://sourceware.org/cgi-bin/cvsweb.c ... vsroot=src
https://sourceware.org/cgi-bin/cvsweb.c ... vsroot=src
Thanks. This is probably what I will do since it seems to be the best solution so far. Filesysbox code is APL (AROS Public License) licensed BTW (this doesn't cause any problems for including BSD code, right?).
colinw
AmigaOS Core Developer
Posts: 207
Joined: Mon Aug 15, 2011 9:20 am
Location: Brisbane, QLD. Australia.

Re: filesysbox ntfs ubs massStorage problem

Post by colinw »

joerg wrote:
gazelle wrote:
salass00 wrote:In fact this is its definition from SDK/newlib/include/stddef.h:

typedef int wchar_t;
Only for __VBCC__ as far as I can see.
For GCC it's using GCC's own stddef.h with #include_next, the GCC wchar_t is 32 bit as well.
I have SDK:clib2/include/stddef.h saying that wchar_t is "unsigned short" (16 bits).
Better make sure you know which one you are getting. :shock:
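
Since the width of wchar_t depends on which headers you end up with (32 bits with newlib/GCC, 16 bits with clib2 according to the posts above), one way to avoid the question is to not use wchar_t for code points at all. The sketch below assumes a <stdint.h> that provides uint32_t is available; the names ucs4_t, wchar_bits and wchar_holds_all_unicode are made up for this example.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type name: a code point type whose width does not
 * depend on which C library's stddef.h gets included. */
typedef uint32_t ucs4_t;

/* Report how wide wchar_t is with the headers actually in use. */
static unsigned wchar_bits(void)
{
    return (unsigned)(sizeof(wchar_t) * CHAR_BIT);
}

/* Unicode code points go up to 0x10FFFF, which needs 21 bits.
 * A 16-bit wchar_t cannot hold them without surrogate pairs. */
static int wchar_holds_all_unicode(void)
{
    return wchar_bits() >= 21;
}
```

Printing wchar_bits() once from a test program built with each C library is a quick way to confirm which definition you are actually getting.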
salass00
AmigaOS Core Developer
Posts: 530
Joined: Sat Jun 18, 2011 3:12 pm
Location: Finland

Re: filesysbox ntfs ubs massStorage problem

Post by salass00 »

Just to give an idea what filenames containing non-ASCII characters will look like from now on:

Image

That weird looking À sequence should be the character ä BTW.
gazelle
Posts: 102
Joined: Sun Mar 04, 2012 12:49 pm
Location: Frohnleiten, Austria

Re: filesysbox ntfs ubs massStorage problem

Post by gazelle »

@salass00:

You really are a hard nut to crack ;)

Why are you so reluctant to the idea of using the local charset? Most users will only use characters in their own charset.

What happens now if I create a file with the name "täterätätä" as that is clearly not a legal UTF-8 sequence?
salass00
AmigaOS Core Developer
Posts: 530
Joined: Sat Jun 18, 2011 3:12 pm
Location: Finland

Re: filesysbox ntfs ubs massStorage problem

Post by salass00 »

gazelle wrote: What happens now if I create a file with the name "täterätätä" as that is clearly not a legal UTF-8 sequence?
It will fail with ERROR_INVALID_COMPONENT_NAME. This is because I perform UTF-8 validity checks on all strings that are passed to the filesystem from outside before trying to do anything with them.
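
To illustrate the kind of check described here, this is a minimal sketch of a UTF-8 validity test. It is not the actual filesysbox code, and the function name utf8_is_valid is made up for this example. In the filesystem, a failed check would be the point where ERROR_INVALID_COMPONENT_NAME is returned.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the NUL-terminated byte string is
 * well-formed UTF-8, 0 otherwise. */
static int utf8_is_valid(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;

    while (*s != 0) {
        unsigned long cp, min;
        int extra, i;

        if (*s < 0x80) {                     /* plain ASCII byte */
            s++;
            continue;
        } else if ((*s & 0xE0) == 0xC0) {    /* lead byte of 2-byte sequence */
            cp = *s & 0x1F; extra = 1; min = 0x80;
        } else if ((*s & 0xF0) == 0xE0) {    /* lead byte of 3-byte sequence */
            cp = *s & 0x0F; extra = 2; min = 0x800;
        } else if ((*s & 0xF8) == 0xF0) {    /* lead byte of 4-byte sequence */
            cp = *s & 0x07; extra = 3; min = 0x10000;
        } else {
            return 0;                        /* stray continuation or invalid lead byte */
        }
        s++;

        for (i = 0; i < extra; i++, s++) {
            if ((*s & 0xC0) != 0x80)
                return 0;                    /* not a continuation byte (also catches NUL) */
            cp = (cp << 6) | (*s & 0x3F);
        }

        if (cp < min)
            return 0;                        /* overlong encoding */
        if (cp > 0x10FFFF)
            return 0;                        /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                        /* UTF-16 surrogate */
    }
    return 1;
}
```

With this check, "täterätätä" typed in an 8-bit Latin charset is rejected: ä arrives as the single byte 0xE4, which looks like the lead byte of a 3-byte sequence, but the following "t" is not a continuation byte. The same name encoded as UTF-8 (ä as 0xC3 0xA4) passes.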
joerg
Posts: 371
Joined: Sat Mar 01, 2014 5:42 am

Re: filesysbox ntfs ubs massStorage problem

Post by joerg »

gazelle wrote:Why are you so reluctant to the idea of using the local charset? Most users will only use characters in their own charset.
The software used for displaying or entering the names has to do the conversion between UTF-8 and the local 8 bit charset. In this case it's the Workbench which has to be updated. Doing it in the file system (or dos.library) instead is wrong and can't work: currently only 8 bit charsets are supported by AmigaOS 4.x, but the file systems, especially the ones used for transferring data from/to other OSes, have to support all Unicode chars.
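
As a rough idea of what the display side would have to do, here is a minimal sketch that converts a UTF-8 name to an 8-bit string. It assumes the local charset is ISO-8859-1, which is only one of the charsets an AmigaOS 4.x system may be using, and it assumes the input has already passed a UTF-8 validity check. The function name utf8_to_latin1 is made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: convert validated UTF-8 to ISO-8859-1.
 * Code points that do not exist in Latin-1 are replaced by '?'.
 * Returns the number of characters written, not counting the NUL. */
static size_t utf8_to_latin1(const char *src, char *dst, size_t dst_size)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    if (dst_size == 0)
        return 0;

    while (*s != 0 && n + 1 < dst_size) {
        unsigned long cp;
        int extra;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else                          { cp = 0xFFFD;    extra = 0; }
        s++;

        /* Fold in the continuation bytes of this sequence. */
        while (extra > 0 && (*s & 0xC0) == 0x80) {
            cp = (cp << 6) | (*s & 0x3F);
            s++;
            extra--;
        }
        if (extra > 0)
            cp = 0xFFFD;                 /* sequence was cut short */

        /* Latin-1 code points are identical to U+0000..U+00FF. */
        dst[n++] = (cp <= 0xFF) ? (char)(unsigned char)cp : '?';
    }
    dst[n] = '\0';
    return n;
}
```

The '?' replacement shows why this cannot live in the file system: names containing characters outside the local charset would become indistinguishable, so the lossy step has to stay in the software that displays them.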
salass00
AmigaOS Core Developer
Posts: 530
Joined: Sat Jun 18, 2011 3:12 pm
Location: Finland

Re: filesysbox ntfs ubs massStorage problem

Post by salass00 »

@gazelle & joerg

You guys are reading too much into my post there. If I wasn't convinced that it is the right thing to do long term I wouldn't be making this change in the first place :-).