filesysbox ntfs ubs massStorage problem

salass00
AmigaOS Core Developer
Posts: 530
Joined: Sat Jun 18, 2011 3:12 pm
Location: Finland

Re: filesysbox ntfs ubs massStorage problem

Post by salass00 »

gazelle wrote: But the Czech locale does use the ISO-8859-2 charset, which should be able to display this character.
Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
gazelle
Posts: 102
Joined: Sun Mar 04, 2012 12:49 pm
Location: Frohnleiten, Austria

Post by gazelle »

salass00 wrote:File names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
That's interesting, because I just did a small test with my RAM, SFS2 and JXF4 and all of them happily use whatever charset your locale specifies for the filenames. If you change it on the fly, your filenames change too (if they use chars > 0xA0).
sailorMH
Posts: 230
Joined: Wed Aug 28, 2013 6:01 pm
Location: Czech republic

Post by sailorMH »

salass00 wrote:
The MorphOS "solution" seems a bit strange. Can you still access that file normally with the Czech character missing? What happens if you already have a file with the same name except that this character is missing?
I tried it with several filenames:
Y-ň.txt
Y-ř.txt
Y-ťďň-ŤĎŇ.txt

MOS omits only these letters: ř, ť, ď, ň. The others are visible (i.e. they probably have an equivalent in ISO-8859-1 or ISO-8859-15).
[Attachment: MOS-IdenticalFiles.jpg, MOS Ambient window file listing]
Anyway, this happens only under Ambient. In the shell I see all characters normally.
In the Ambient window I can still edit the "identical" files separately (see the filename Y-ň.txt at the top):
[Attachment: MOS-openFile.jpg]
And if I use the rename function from the Ambient window, I also see all characters:
[Attachment: MOS-rename.jpg]
I.e. MOS handles the names separately: in the shell everything is OK,
and only some higher-level programs (Ambient window, DOpus) do not show all characters.

Post by salass00 »

salass00 wrote: The NTFS 0.9 solution of converting the non-representable character into a backslash followed by the hexadecimal representation of the UTF-8 character doesn't seem so bad, at least from an implementation standpoint. Unfortunately, going that way I will probably have to use something other than iconv() for the character set conversion...
After some thinking, this type of solution is only really suitable for a read-only filesystem implementation such as NTFS 0.9, not so much for a read-write filesystem like NTFS3G.
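For readers unfamiliar with the escaping idea, here is a minimal sketch in C. The exact output format of NTFS 0.9 is not given in this thread, so the format used here (a backslash followed by the uppercase hex of every byte in the UTF-8 sequence) is an assumption, as are the names `utf8_len` and `escape_name`. For brevity it treats U+0000..U+00FF as representable (plain ISO-8859-1) and only lightly validates the UTF-8 input.

```c
#include <stddef.h>

/* Length of the UTF-8 sequence introduced by lead byte c, or 0 if c
 * cannot start a sequence. (Hypothetical helper, light validation only.) */
static size_t utf8_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if (c >= 0xC2 && c <= 0xDF) return 2;
    if (c >= 0xE0 && c <= 0xEF) return 3;
    if (c >= 0xF0 && c <= 0xF4) return 4;
    return 0;
}

/* Copy a UTF-8 name into dst as ISO-8859-1, replacing every character
 * above U+00FF with '\' plus the hex of its UTF-8 bytes (assumed format).
 * Returns 0 on success, -1 on malformed input or if dst is too small. */
int escape_name(const char *src, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s != '\0') {
        size_t len = utf8_len(*s), i;
        if (len == 0) return -1;
        for (i = 1; i < len; i++) {
            /* A missing continuation byte (including the terminator) is an error. */
            if ((s[i] & 0xC0) != 0x80) return -1;
        }
        if (len == 1 || *s == 0xC2 || *s == 0xC3) {
            /* U+0000..U+00FF fits into a single ISO-8859-1 byte. */
            if (n + 1 >= dstsize) return -1;
            dst[n++] = (char)(len == 1 ? *s : (((*s & 0x1F) << 6) | (s[1] & 0x3F)));
        } else {
            /* Not representable: emit the escape sequence. */
            if (n + 1 + 2 * len >= dstsize) return -1;
            dst[n++] = '\\';
            for (i = 0; i < len; i++) {
                dst[n++] = hex[s[i] >> 4];
                dst[n++] = hex[s[i] & 0x0F];
            }
        }
        s += len;
    }
    if (n >= dstsize) return -1;
    dst[n] = '\0';
    return 0;
}
```

With this sketch, `Y-ň.txt` (ň is the bytes C5 88 in UTF-8) would come out as `Y-\C588.txt`. It only covers the read direction; a read-write filesystem would also have to map such names back on create and rename, which is not attempted here.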
ssolie
Beta Tester
Posts: 1010
Joined: Mon Dec 20, 2010 8:51 pm
Location: Canada

Post by ssolie »

salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
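A small sketch of what "assumed" means in practice: the same stored byte decodes to different characters depending on which charset the reader assumes. The helper names are hypothetical, and the ISO-8859-2 table is deliberately partial (only the Czech letters mentioned earlier in this thread).

```c
#include <stdint.h>

/* ISO-8859-1 maps every byte to the Unicode code point of the same value. */
static uint32_t latin1_to_ucs(uint8_t b)
{
    return b;
}

/* Partial ISO-8859-2 decoder (hypothetical helper): covers only the
 * letters ř, ť, ď, ň and their capitals. Returns 0 for any other byte
 * at or above 0xA0. */
static uint32_t latin2_to_ucs_partial(uint8_t b)
{
    switch (b) {
    case 0xF8: return 0x0159; /* ř */
    case 0xBB: return 0x0165; /* ť */
    case 0xEF: return 0x010F; /* ď */
    case 0xF2: return 0x0148; /* ň */
    case 0xD8: return 0x0158; /* Ř */
    case 0xAB: return 0x0164; /* Ť */
    case 0xCF: return 0x010E; /* Ď */
    case 0xD2: return 0x0147; /* Ň */
    default:   return b < 0xA0 ? b : 0; /* ASCII and C1 controls are shared */
    }
}
```

The byte 0xF8 in a filename is therefore shown as ø (U+00F8) under an ISO-8859-1 locale and as ř (U+0159) under an ISO-8859-2 locale, even though the bytes stored by the filesystem never change.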
ExecSG Team Lead
chris
Posts: 562
Joined: Sat Jun 18, 2011 11:05 am

Post by chris »

ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
Is there any chance of moving to UTF-8 for filename storage? I get similar problems with SMBFS where it looks fine on the Amiga side, but from elsewhere the non-ASCII filenames are nonsense.
Raziel
Posts: 1171
Joined: Sat Jun 18, 2011 4:00 pm
Location: a dying planet

Post by Raziel »

chris wrote:
ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.

@all
The character encoding used by DOS and file systems is assumed. So yes, it will "work" in the sense that byte values will be stored and retrieved. But how they are displayed will depend on your current character encoding. Note the locale directory structure was reworked because of this architecture limitation.
Is there any chance of moving to UTF-8 for filename storage? I get similar problems with SMBFS where it looks fine on the Amiga side, but from elsewhere the non-ASCII filenames are nonsense.
Very much seconded.

The reverse is the same, btw.
I created some dirs on the Linux/Windows side, and they show up fine in any other program/hardware (e.g. a WLAN radio player)... only the Amiga side makes rubbish out of umlauts (and of course can't access those directories/files).

If I create them on the Amiga side (renaming those files is not possible; trying to does nothing, probably because it can't read/write them in the first place), I can work fine with them on the Amiga, but then every other program/hardware acts up.

So I took the path of least stress and don't use them on the Amiga side.

Renaming them to ue, ae and such isn't an option, because I have thousands of files on my NAS (and the user shouldn't be forced to).

Post by salass00 »

gazelle wrote:It's pretty easy to convert between an 8bit charset and unicode with IDiskfont->ObtainCharsetInfo() and the DFCS_MAPTABLE.
Thanks for this suggestion. I've just been reading up on how UTF-8 works on Wikipedia. Given that I'm only interested in one mapping table (ISO-8859-15) and that it's not very large (only one kilobyte), I will probably compile and link it directly into filesysbox.library rather than relying on diskfont.library being available.
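As an illustration of the kind of conversion discussed here, the following is a sketch only and not the actual filesysbox.library code; all function names are hypothetical. A full table of 256 32-bit entries would be the one kilobyte mentioned above; since ISO-8859-15 differs from ISO-8859-1 in just eight positions, the sketch stores only those.

```c
#include <stddef.h>
#include <stdint.h>

/* The eight positions where ISO-8859-15 differs from ISO-8859-1. */
static const struct { uint8_t byte; uint32_t ucs; } latin9_diff[] = {
    { 0xA4, 0x20AC }, { 0xA6, 0x0160 }, { 0xA8, 0x0161 }, { 0xB4, 0x017D },
    { 0xB8, 0x017E }, { 0xBC, 0x0152 }, { 0xBD, 0x0153 }, { 0xBE, 0x0178 }
};
#define LATIN9_NDIFF (sizeof(latin9_diff) / sizeof(latin9_diff[0]))

/* Map a code point to its ISO-8859-15 byte, or -1 if it has none. */
static int ucs_to_latin9(uint32_t ucs)
{
    size_t i;
    for (i = 0; i < LATIN9_NDIFF; i++)
        if (latin9_diff[i].ucs == ucs) return latin9_diff[i].byte;
    if (ucs > 0xFF) return -1;
    for (i = 0; i < LATIN9_NDIFF; i++)
        if (latin9_diff[i].byte == ucs) return -1; /* slot was reassigned */
    return (int)ucs;
}

/* Decode one UTF-8 sequence. Returns the number of bytes consumed,
 * or 0 for malformed, overlong, surrogate or out-of-range input. */
static size_t utf8_decode(const unsigned char *s, uint32_t *out)
{
    uint32_t cp, min;
    size_t len, i;
    if (s[0] < 0x80) { *out = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; len = 2; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; len = 3; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; len = 4; min = 0x10000; }
    else return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* also stops at the terminator */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    *out = cp;
    return len;
}

/* Convert a NUL-terminated UTF-8 string to ISO-8859-15. Returns 0 on
 * success, -1 on malformed input, an unmappable character, or if dst
 * is too small. */
int utf8_to_latin9(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;
    while (*s != '\0') {
        uint32_t cp;
        size_t len = utf8_decode(s, &cp);
        int b;
        if (len == 0) return -1;
        b = ucs_to_latin9(cp);
        if (b < 0 || n + 1 >= dstsize) return -1;
        dst[n++] = (char)b;
        s += len;
    }
    if (n >= dstsize) return -1;
    dst[n] = '\0';
    return 0;
}
```

For example, the euro sign (E2 82 AC in UTF-8) converts to the single byte 0xA4, while a name containing ň (U+0148) makes `utf8_to_latin9()` fail because ISO-8859-15 has no slot for it.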

Post by salass00 »

@sailorMH

Now that I remember, the MorphOS filesysbox.library (at least the APL 0.730 version) actually converts filenames to whatever happened to be the default system charset when the filesystem was mounted, so if this happened to be ISO-8859-2 then your filenames containing Czech letters won't be a problem. This is, however, not a very sound solution for a filesystem, as it causes inconsistent behaviour if, for example, the filesystem is mounted before IPrefs has been run, as may be the case for auto-mounted disks/partitions.

In the latest version 53.27 of filesysbox.library I've made the directory scan function reject all filenames that contain Unicode characters that can't be accurately mapped into ISO-8859-15. Such filenames will never work reliably until AmigaOS gets proper support for UTF-8 filenames. As a consequence of this change, all files and directories with valid filenames will always be shown, because filenames containing invalid characters no longer interrupt the directory scanning process.
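A rough sketch of this skip-instead-of-abort behaviour is shown below. It is an illustration under assumptions, not the library's actual code: `latin9_has`, `name_is_representable` and `scan_directory` are hypothetical names, and the directory is modelled as a plain array of UTF-8 strings.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if the code point has a slot in ISO-8859-15. */
static int latin9_has(uint32_t cp)
{
    switch (cp) {
    case 0x20AC: case 0x0160: case 0x0161: case 0x017D:
    case 0x017E: case 0x0152: case 0x0153: case 0x0178:
        return 1; /* characters ISO-8859-15 adds over ISO-8859-1 */
    case 0xA4: case 0xA6: case 0xA8: case 0xB4:
    case 0xB8: case 0xBC: case 0xBD: case 0xBE:
        return 0; /* ISO-8859-1 characters that were replaced */
    default:
        return cp <= 0xFF;
    }
}

/* Nonzero if the UTF-8 name is well formed and every character in it
 * can be mapped to ISO-8859-15. */
static int name_is_representable(const char *name)
{
    const unsigned char *s = (const unsigned char *)name;
    while (*s != '\0') {
        uint32_t cp, min;
        size_t len, i;
        if (*s < 0x80) { s++; continue; }
        if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; len = 2; min = 0x80; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; len = 3; min = 0x800; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; len = 4; min = 0x10000; }
        else return 0;
        for (i = 1; i < len; i++) {
            if ((s[i] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (s[i] & 0x3F);
        }
        if (cp < min || !latin9_has(cp)) return 0;
        s += len;
    }
    return 1;
}

/* Copy pointers to the acceptable names into visible[] (which must have
 * room for count entries) and return how many were kept. A rejected
 * name is skipped; it does not end the scan. */
size_t scan_directory(const char *const *entries, size_t count,
                      const char **visible)
{
    size_t i, n = 0;
    for (i = 0; i < count; i++) {
        if (name_is_representable(entries[i]))
            visible[n++] = entries[i];
    }
    return n;
}
```

Given the entries `readme.txt`, `Y-ř.txt`, `Übung.txt` and `Y-ťďň.txt`, only the first and third would be listed; the other two are silently left out instead of stopping the scan.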

Post by salass00 »

ssolie wrote:
salass00 wrote:Yes, but file names returned from filesystems should always be ISO-8859-1 or ISO-8859-15.
I remember we discussed this ages ago on the dev list but I don't remember the final answer right now.
The filesysbox.library behaviour with regard to codesets is based on the answer I got from Olaf Barthel.

IIRC, even though programs may display the filenames as if they were using the locale-defined codeset, the filesystems still always handle the filenames internally as if they were ISO-8859-1/ISO-8859-15 encoded when doing operations with them, such as case-insensitive string comparison.
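To make the case-insensitive comparison point concrete, here is a sketch of a comparison that folds case according to ISO-8859-1 only. This is an assumption about the general approach, not the code of any particular AmigaOS filesystem, and both function names are hypothetical.

```c
/* Upper-case a byte according to ISO-8859-1: a-z and the accented
 * letters 0xE0..0xFE (except 0xF7, the division sign) sit 0x20 above
 * their capitals. Hypothetical helper. */
static unsigned char latin1_upper(unsigned char c)
{
    if (c >= 'a' && c <= 'z') return (unsigned char)(c - 0x20);
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7) return (unsigned char)(c - 0x20);
    return c;
}

/* Case-insensitive comparison that always assumes ISO-8859-1,
 * regardless of the charset used for display. Returns <0, 0 or >0. */
int latin1_stricmp(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;
    for (;;) {
        unsigned char x = latin1_upper(*p++);
        unsigned char y = latin1_upper(*q++);
        if (x != y) return x < y ? -1 : 1;
        if (x == '\0') return 0;
    }
}
```

Under these rules the bytes 0xE4 and 0xC4 (ä and Ä) compare equal, but 0xB1 and 0xA1 do not, although a user with an ISO-8859-2 locale would see them as the case pair ą and Ą.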