Re: filesysbox ntfs ubs massStorage problem

Posted: Sun Feb 23, 2014 10:09 pm
by salass00
gazelle wrote: But the czech locale does use the ISO-8859-2 charset, which should be able to display this character.
Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.

Re: filesysbox ntfs ubs massStorage problem

Posted: Sun Feb 23, 2014 10:35 pm
by gazelle
salass00 wrote:File names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
That's interesting, because I just did a small test with my RAM, SFS2 and JXF4 and all of them happily use whatever charset your locale specifies for the filenames. If you change the charset on the fly, the displayed filenames change too (if they use chars > 0xA0).

Re: filesysbox ntfs ubs massStorage problem

Posted: Mon Feb 24, 2014 10:14 am
by sailorMH
salass00 wrote:
The MorphOS "solution" seems a bit strange. Can you still access that file normally with the Czech character missing? What happens if you already have a file with the same name except that this character is missing?
I tried it with several filenames:
Y-ň.txt
Y-ř.txt
Y-ťďň-ŤĎŇ.txt

MOS omits only these letters: ř, ť, ď, ň. The others are visible (i.e. they probably have an equivalent in ISO-8859-1 or ISO-8859-15).
[Attachment: MOS-IdenticalFiles.jpg (MOS Ambient window file listing)]
Anyway, this happens only under Ambient. In the shell I see all characters normally.
In the Ambient window I can still edit the "identical" files separately (see filename Y-ň.txt on top):
[Attachment: MOS-openFile.jpg]
And if I use the rename function from the Ambient window, I also see all characters:
[Attachment: MOS-rename.jpg]
i.e. MOS handles names separately: in the shell everything is OK,
only some higher-level programs (Ambient window, Dopus) do not show all characters.

Re: filesysbox ntfs ubs massStorage problem

Posted: Mon Feb 24, 2014 10:28 am
by salass00
salass00 wrote: The NTFS 0.9 solution of converting the non-representable character into a backslash followed by the hexadecimal representation of the UTF-8 character seem so bad at least from an implementation standpoint. Unfortunately going that way I will probably have to use something else than iconv() for the character set conversion...
After some thought, this type of solution is only really suitable for a read-only filesystem implementation such as NTFS 0.9, not so much for a read-write filesystem like NTFS3G.
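
As an illustration only (this is not the NTFS 0.9 or filesysbox code), the escaping idea could look roughly like the sketch below. It assumes the escape is a backslash followed by two uppercase hex digits per UTF-8 byte, which the thread does not specify, and it treats only U+0000..U+00FF as representable (plain ISO-8859-1; the ISO-8859-15 differences are ignored). The names utf8_seq_len and utf8_to_latin1_escaped are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: length (1-4) of the UTF-8 sequence starting at s,
 * or 0 if the lead byte is invalid or the sequence is cut short. */
static size_t utf8_seq_len(const unsigned char *s)
{
    size_t len, i;

    if (s[0] < 0x80)                len = 1;
    else if ((s[0] & 0xE0) == 0xC0) len = 2;
    else if ((s[0] & 0xF0) == 0xE0) len = 3;
    else if ((s[0] & 0xF8) == 0xF0) len = 4;
    else return 0;

    /* A NUL terminator is not a continuation byte, so this never reads
     * past the end of the string. */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    return len;
}

/* Convert a UTF-8 name to ISO-8859-1. Characters above U+00FF are written
 * as "\XX" for each of their UTF-8 bytes (assumed escape format).
 * Returns 0 on success, -1 on invalid input or if out is too small.
 * Overlong 2-byte forms (lead byte 0xC0/0xC1) are simply escaped. */
int utf8_to_latin1_escaped(const char *in, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t o = 0;

    assert(in != NULL && out != NULL);

    while (*s != '\0') {
        size_t len = utf8_seq_len(s), i;

        if (len == 0) return -1;

        if (len == 1) {
            if (o + 1 >= out_size) return -1;
            out[o++] = (char)s[0];
        } else if (len == 2 && s[0] >= 0xC2 && s[0] <= 0xC3) {
            /* U+0080..U+00FF fits into a single ISO-8859-1 byte. */
            if (o + 1 >= out_size) return -1;
            out[o++] = (char)(((s[0] & 0x03) << 6) | (s[1] & 0x3F));
        } else {
            /* Not representable: escape every byte of the sequence. */
            for (i = 0; i < len; i++) {
                if (o + 3 >= out_size) return -1;
                snprintf(out + o, out_size - o, "\\%02X", (unsigned)s[i]);
                o += 3;
            }
        }
        s += len;
    }

    if (o >= out_size) return -1;
    out[o] = '\0';
    return 0;
}
```

With this sketch, the UTF-8 name Y-ň.txt (ň is the bytes C5 88) becomes Y-\C5\88.txt. A possible problem for read-write use, not spelled out in the thread: a file that really is named Y-\C5\88.txt can no longer be told apart from the escaped one.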

Re: filesysbox ntfs ubs massStorage problem

Posted: Mon Feb 24, 2014 6:22 pm
by ssolie
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
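
To illustrate the point about stored bytes versus display, here is a small sketch (not AmigaOS code) that interprets the same filename byte under two charsets. The function name and the enum are invented for the example, and the ISO-8859-2 branch deliberately covers only the four Czech letters mentioned earlier in the thread; any other high byte returns 0 to mean "not covered by this sketch".

```c
#include <stdint.h>

/* Hypothetical charset selector for the example. */
enum charset { CS_ISO_8859_1, CS_ISO_8859_2 };

/* Return the Unicode code point a stored filename byte stands for
 * under the given charset. The byte on disk never changes; only its
 * interpretation does. */
uint32_t byte_to_unicode(unsigned char b, enum charset cs)
{
    if (b < 0x80)
        return b;               /* ASCII is identical in both charsets */

    if (cs == CS_ISO_8859_1)
        return b;               /* ISO-8859-1 maps 1:1 onto U+0080..U+00FF */

    switch (b) {                /* ISO-8859-2, partial table only */
    case 0xF8: return 0x0159;   /* r with caron */
    case 0xF2: return 0x0148;   /* n with caron */
    case 0xBB: return 0x0165;   /* t with caron */
    case 0xEF: return 0x010F;   /* d with caron */
    default:   return 0;        /* not covered by this sketch */
    }
}
```

So a filename containing the byte 0xF8 shows ø with an ISO-8859-1 locale and ř with an ISO-8859-2 locale, even though nothing on disk differs.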

Re: filesysbox ntfs ubs massStorage problem

Posted: Mon Feb 24, 2014 10:40 pm
by chris
ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
Is there any chance of moving to UTF-8 for filename storage? I get similar problems with SMBFS where it looks fine on the Amiga side, but from elsewhere the non-ASCII filenames are nonsense.

Re: filesysbox ntfs ubs massStorage problem

Posted: Tue Feb 25, 2014 9:08 am
by Raziel
chris wrote:
ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
Is there any chance of moving to UTF-8 for filename storage? I get similar problems with SMBFS where it looks fine on the Amiga side, but from elsewhere the non-ASCII filenames are nonsense.
Very much seconded.

The same is true the other way round, btw.
I created some dirs on the Linux/Windows side, and they show up fine in any other program/hardware (e.g. a WLAN radio player)... only the Amiga side makes rubbish out of umlauts (and of course can't access those directories/files).

If I create them on the Amiga side (renaming the existing files is not possible; trying to does nothing, probably because it can't read/write the names in the first place) I can work fine with them on the Amiga, but then every other program/hardware acts up.

So I took the path of least stress and don't use them on the Amiga side.

Renaming them to ue, ae and such isn't an option, because I have thousands of files on my NAS (and the user shouldn't be forced to).

Re: filesysbox ntfs ubs massStorage problem

Posted: Mon Mar 10, 2014 10:25 pm
by salass00
gazelle wrote:It's pretty easy to convert between an 8bit charset and unicode with IDiskfont->ObtainCharsetInfo() and the DFCS_MAPTABLE.
Thanks for this suggestion. I've just been reading up on how UTF-8 works on Wikipedia. Given that I'm only interested in one mapping table (ISO-8859-15) and that it's not very large (only one kilobyte), I will probably compile and link it directly into filesysbox.library rather than relying on diskfont.library being available.
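
A rough sketch of what such a built-in mapping could look like (an illustration, not the filesysbox.library source). A switch over the eight positions where ISO-8859-15 differs from ISO-8859-1 stands in for the full 256-entry table here, and the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ISO-8859-15 byte -> Unicode code point. Only eight positions differ
 * from ISO-8859-1, which maps 1:1 onto U+0000..U+00FF. */
uint32_t latin9_to_unicode(unsigned char b)
{
    switch (b) {
    case 0xA4: return 0x20AC; /* euro sign */
    case 0xA6: return 0x0160; /* S with caron */
    case 0xA8: return 0x0161; /* s with caron */
    case 0xB4: return 0x017D; /* Z with caron */
    case 0xB8: return 0x017E; /* z with caron */
    case 0xBC: return 0x0152; /* OE ligature */
    case 0xBD: return 0x0153; /* oe ligature */
    case 0xBE: return 0x0178; /* Y with diaeresis */
    default:   return b;
    }
}

/* Encode a code point below U+10000 as UTF-8 (enough for ISO-8859-15).
 * Returns the number of bytes written (1 to 3). */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Convert a NUL-terminated ISO-8859-15 string to UTF-8.
 * Returns 0 on success, -1 if out is too small. */
int latin9_to_utf8(const char *in, char *out, size_t out_size)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t o = 0;

    assert(in != NULL && out != NULL);

    for (; *s != '\0'; s++) {
        unsigned char tmp[3];
        size_t n = utf8_encode(latin9_to_unicode(*s), tmp), i;

        if (o + n >= out_size) return -1;   /* keep room for the NUL */
        for (i = 0; i < n; i++)
            out[o++] = (char)tmp[i];
    }

    if (o >= out_size) return -1;
    out[o] = '\0';
    return 0;
}
```

For example, the ISO-8859-15 byte 0xA4 (euro sign) comes out as the three UTF-8 bytes E2 82 AC, and 0xE9 (é) as C3 A9.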

Re: filesysbox ntfs ubs massStorage problem

Posted: Tue Mar 11, 2014 9:20 am
by salass00
@sailorMH

Now that I remember, the MorphOS filesysbox.library (at least the APL 0.730 version) actually converts filenames to whatever happened to be the default system charset when the filesystem was mounted, so if this happened to be ISO-8859-2 then your filenames containing Czech letters won't be a problem. This is, however, not a very sound solution for a filesystem, as it causes inconsistent behaviour if, for example, the filesystem is mounted before IPrefs has been run, as may be the case for auto-mounted disks/partitions.

In the latest version, 53.27, of filesysbox.library I've made the directory scan function reject all filenames that contain Unicode characters that can't be accurately mapped into ISO-8859-15. Such filenames will never work reliably until AmigaOS gets proper support for UTF-8 filenames. As a consequence of this change, all files and directories with valid filenames will always be shown, because filenames containing invalid characters no longer interrupt the directory scanning process.
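
A minimal sketch of that kind of filtering, assuming the names arrive from the underlying filesystem as UTF-8. This is an illustration, not the actual filesysbox.library 53.27 code; all function names are invented, and the "directory scan" is reduced to a loop over an array of names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unicode code point -> ISO-8859-15 byte, or -1 if there is none. */
int unicode_to_latin9(uint32_t cp)
{
    switch (cp) {
    case 0x20AC: return 0xA4;
    case 0x0160: return 0xA6;
    case 0x0161: return 0xA8;
    case 0x017D: return 0xB4;
    case 0x017E: return 0xB8;
    case 0x0152: return 0xBC;
    case 0x0153: return 0xBD;
    case 0x0178: return 0xBE;
    /* ISO-8859-1 characters that ISO-8859-15 replaced */
    case 0xA4: case 0xA6: case 0xA8: case 0xB4:
    case 0xB8: case 0xBC: case 0xBD: case 0xBE:
        return -1;
    default:
        return cp <= 0xFF ? (int)cp : -1;
    }
}

/* Decode one UTF-8 sequence. Returns its length, or 0 if it is invalid,
 * truncated or overlong. */
static size_t utf8_decode(const unsigned char *s, uint32_t *cp)
{
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    uint32_t c;
    size_t len, i;

    if (s[0] < 0x80)                { c = s[0];        len = 1; }
    else if ((s[0] & 0xE0) == 0xC0) { c = s[0] & 0x1F; len = 2; }
    else if ((s[0] & 0xF0) == 0xE0) { c = s[0] & 0x0F; len = 3; }
    else if ((s[0] & 0xF8) == 0xF0) { c = s[0] & 0x07; len = 4; }
    else return 0;

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    if (c < min_cp[len]) return 0;  /* overlong encoding */

    *cp = c;
    return len;
}

/* Returns 1 if every character of the UTF-8 name maps to ISO-8859-15. */
int name_is_latin9_mappable(const char *utf8_name)
{
    const unsigned char *s = (const unsigned char *)utf8_name;

    assert(utf8_name != NULL);

    while (*s != '\0') {
        uint32_t cp;
        size_t n = utf8_decode(s, &cp);

        if (n == 0 || unicode_to_latin9(cp) < 0) return 0;
        s += n;
    }
    return 1;
}

/* Stand-in for the directory scan: unmappable names are skipped instead
 * of aborting the scan. Returns the number of names kept. */
size_t filter_dir_entries(const char *const *names, size_t count,
                          const char **accepted)
{
    size_t i, kept = 0;

    for (i = 0; i < count; i++)
        if (name_is_latin9_mappable(names[i]))
            accepted[kept++] = names[i];
    return kept;
}
```

The key design point is in filter_dir_entries: a bad name only affects its own entry, so the valid names around it are still listed.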

Re: filesysbox ntfs ubs massStorage problem

Posted: Tue Mar 11, 2014 9:33 am
by salass00
ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.
The filesysbox.library behaviour with regard to codesets is based on the answer I got from Olaf Barthel.

IIRC, even though programs may display the filenames as if they were using the locale-defined codeset, the filesystems still always handle the filenames internally as if they were ISO-8859-1/ISO-8859-15 encoded when doing operations on them, such as case-insensitive string comparison.
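
As a sketch of what "case-insensitive comparison as if ISO-8859-1" means in practice (an assumption about the general approach, not the code of any actual Amiga filesystem): letters are folded with the ISO-8859-1 rules, whatever the locale says the bytes look like. The function names are invented, and the extra ISO-8859-15 letter pairs (such as 0xA6/0xA8) are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Lower-case one byte using ISO-8859-1 rules only. */
static unsigned char latin1_tolower(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c + 0x20);
    /* 0xC0..0xDE are upper-case letters, except 0xD7 (multiplication sign) */
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)
        return (unsigned char)(c + 0x20);
    return c;
}

/* Case-insensitive comparison of two NUL-terminated names, treating the
 * bytes as ISO-8859-1. Returns <0, 0 or >0 like strcmp(). */
int latin1_stricmp(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    assert(a != NULL && b != NULL);

    for (;; pa++, pb++) {
        unsigned char ca = latin1_tolower(*pa);
        unsigned char cb = latin1_tolower(*pb);

        if (ca != cb) return ca < cb ? -1 : 1;
        if (ca == '\0') return 0;
    }
}
```

With these rules the bytes 0xC4 and 0xE4 (Ä and ä in ISO-8859-1) compare equal regardless of which glyphs the current locale draws for them.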