Need help capturing Unicode directory names
#11
the only viable path is really dir/x, because otherwise even DIR shows ?? in the full name mode.
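
A rough sketch of one way to get at the short names without parsing the DIR /X columns: let cmd.exe expand each subdirectory to its 8.3 short path with the `%~sI` modifier. This is a swapped-in variation, not something tested in this thread. It assumes 8.3 name generation is enabled on the volume (otherwise the long name comes back unchanged); `Z:\test` and `short.txt` are placeholder names.

```basic
$Console:Only

' Sketch only: ask cmd.exe for the 8.3 short path of every subdirectory.
' Assumption: 8.3 short names are enabled on the volume holding root$.
q$ = Chr$(34) ' double quote, so paths containing spaces survive
root$ = "Z:\test" ' placeholder path

' FOR /D walks the subdirectories; %~sI expands each one to its short path.
cmd$ = "(for /d %I in (" + q$ + root$ + "\*" + q$ + ") do @echo %~sI) > short.txt"
Shell cmd$

Open "short.txt" For Input As #1
Do Until EOF(1)
    Line Input #1, shortPath$
    If Len(shortPath$) > 0 Then
        ' _DirExists returns -1 when the path can be reached, 0 otherwise
        Print shortPath$; " reachable: "; _DirExists(shortPath$)
    End If
Loop
Close #1
```

If the short paths come back as plain ASCII, they should be usable with the normal file commands, at the price of losing the readable long name.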

Spriggsy, I'm really curious how your program will deal with this problem.


#12
Here is a test file. I have compressed the JPG images to the smallest possible size. There are 2 subdirectories inside test. Both are named with special characters.
It seems you cannot post .7z or .rar files here, only a JPG, so I used depositfiles to post the test.rar.

https://depositfiles.com/files/o5ur5mrk1
#13
(Yesterday, 05:52 PM)Petr Wrote: the only viable path is really dir/x, because otherwise even DIR shows ?? in the full name mode.

Spriggsy, I'm really curious how your program will deal with this problem.

Once I make my code 4.0 compatible, I can test this out again and show it off. But my unicode stuff never had an issue with Asian characters. I actually got involved in Unicode specifically because I had lots of files with Asian characters in them that I wanted to be able to work with without having to rename them all.
Tread on those who tread on you

#14
(Yesterday, 07:03 PM)SpriggsySpriggs Wrote:
(Yesterday, 05:52 PM)Petr Wrote: the only viable path is really dir/x, because otherwise even DIR shows ?? in the full name mode.

Spriggsy, I'm really curious how your program will deal with this problem.

Once I make my code 4.0 compatible, I can test this out again and show it off. But my unicode stuff never had an issue with Asian characters. I actually got involved in Unicode specifically because I had lots of files with Asian characters in them that I wanted to be able to work with without having to rename them all.
I'm glad to hear that. I'd love to see how you do it and then try to use it to improve my file reading functions.

As I already know, the DIR way is wrong for one reason: some Windows versions simply use a different number of spaces in the listings, so it cannot be used reliably internationally. We tested this with Ashish a long time ago: Indian Windows uses different spacing than Czech Windows, so what worked for me did not work for Ashish, and vice versa. The problem was the different number of spaces in each row of the DIR output text file.
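
For what it's worth, the spacing problem on its own can probably be sidestepped with the bare format: `dir /b` prints one name per line with no date, time, or size columns, so there are no column offsets to count. A minimal sketch, assuming a placeholder path `Z:\test` and a scratch file `names.txt`; it does nothing about the ?? characters, only about the column layout:

```basic
$Console:Only

' Sketch only: list subdirectory names without parsing any columns.
' /b  = bare format, one name per line
' /ad = directories only
q$ = Chr$(34)
Shell "dir /b /ad " + q$ + "Z:\test" + q$ + " > names.txt"

count& = 0
Open "names.txt" For Input As #1
Do Until EOF(1)
    Line Input #1, entry$
    If Len(entry$) > 0 Then
        count& = count& + 1
        Print count&; ": "; entry$ ' the whole line is the name
    End If
Loop
Close #1
```

Because each line is the complete name, the same code should behave the same on Czech, Indian, or any other localized Windows.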


#15
Of course, if even my very favorite DirEntry.h fails in this case, then it is certainly not enough just to display the file name correctly on the monitor; it must also be possible to access the file via the Open command. I don't know if there is already a function for this (maybe there is), but if _MapUnicode can convert 2-byte text into correctly displayed strings, then I miss the possibility of using it to access the file.
It is probably not possible at all, because otherwise the IDE would support it directly, and it does not. Do a test: create a folder with characters in a language other than English, put a BAS file in it, and then try to open it via the IDE. It doesn't work. The folder is displayed (with a garbled name), and when you try to open it, its contents are not displayed because its name is incorrect. I can't deal with this problem.


#16
Good news, bad news. Good news: I found a workaround. Bad news: it requires an external program (called renamemaster). Really bad news: I still need a working solution which uses QB64pe to automate the process. Really, really bad news: I still need a Unicode file/dir name solution to use with QB64pe for other programs which could benefit.

Thanks, looking forward to SpriggsySpriggs coming to the rescue. (his v4 compatible code)
#17
[Image: image.png]

There's got to be some system settings which need tweaking.  As you can see from the screenshot above, I'm not having any issues whatsoever with this returning the proper values for me.
#18
Give the following program a test run (change paths as necessary):

Code: (Select All)
$Console:Only

' First pass: legacy code page 437
Shell "CHCP 437"
Shell "dir Z:\test\*.* > temp.txt"
Open "temp.txt" For Input As #1
Do Until EOF(1)
    Line Input #1, text$
    Print text$
Loop
Close

' Second pass: UTF-8 code page 65001
Shell "CHCP 65001"
Shell "dir Z:\test\*.* > temp.txt"
Open "temp.txt" For Input As #1
Do Until EOF(1)
    Line Input #1, text$
    Print text$
Loop
Close




Quote:
Active code page: 437
Volume in drive Z is RamDisk
Volume Serial Number is 30D1-E851

Directory of Z:\test

12/17/2024 10:04 PM <DIR> .
12/17/2024 12:45 PM <DIR> Vol.08 Ch.0040 (en) [?????? Scan]
12/17/2024 12:45 PM <DIR> Vol.09 Ch.0041 (en) [?????? Scan]
0 File(s) 0 bytes
3 Dir(s) 8,427,694,080 bytes free
Active code page: 65001
Volume in drive Z is RamDisk
Volume Serial Number is 30D1-E851

Directory of Z:\test

12/17/2024 10:04 PM <DIR> .
12/17/2024 12:45 PM <DIR> Vol.08 Ch.0040 (en) [一人の新しい Scan]
12/17/2024 12:45 PM <DIR> Vol.09 Ch.0041 (en) [一人の新しい Scan]
0 File(s) 0 bytes
3 Dir(s) 8,427,694,080 bytes free

Press any key to continue

You can see the difference in the output above for me.

You need a proper font set, and then the proper unicode codepage (65001), and then it works without issue on my machine.

And you should be able to use one of these to set the console font:

_ConsoleFont "Lucida Console", 20
_ConsoleFont "Consolas", 24
_ConsoleFont "Courier New", 16
#19
And to showcase that this approach works on my system, here's a deeper example which uses that file information to get at and interact with those internal directories:

Code: (Select All)
$Console:Only

'_ConsoleFont "Lucida Console", 20
_ConsoleFont "Consolas", 24
'_ConsoleFont "Courier New", 16
Shell "CHCP 65001" ' switch the console to the UTF-8 code page
Shell "dir /b Z:\test\*.* > temp.txt" ' bare listing: one name per line
Open "temp.txt" For Input As #1
Do Until EOF(1)
    Line Input #1, text$
    Print text$
    fulldir$ = "z:\test\" + text$
    If _DirExists(fulldir$) Then
        ' List the contents of each subdirectory found above
        Shell "dir /b " + Chr$(34) + fulldir$ + Chr$(34) + "\*.* > temp2.txt"
        Open "temp2.txt" For Input As #2
        Do Until EOF(2)
            Line Input #2, temp$
            Print Tab(10); "---->"; temp$
        Loop
        Close #2
    End If
Loop
Close



[Image: image.png]
#20
Thanks, Steve, for coming up with a QB64pe-only solution. And I realize it's the console window staying open that allows the code page to be found (or changed) by multiple Shell calls. I will experiment with different Unicode code pages in the console window to see if they can be identified separately. I suspect they can.
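
Something like the following might be a starting point for that experiment: set a code page in one Shell call, then ask for it again in a second one and read the number back. A minimal sketch, not tested across Windows versions; `cp.txt` is a placeholder scratch file, and the parsing assumes the CHCP reply ends with a colon followed by the number, as in "Active code page: 65001".

```basic
$Console:Only

' Sketch only: check that a code page set by one Shell call is still
' active when a later Shell call runs in the same console window.
Shell "chcp 65001 > nul" ' set UTF-8 and discard the confirmation line
Shell "chcp > cp.txt" ' separate call: ask which code page is active

Open "cp.txt" For Input As #1
Line Input #1, reply$
Close #1

' The label text is localized, so take whatever follows the last colon.
cp& = Val(Mid$(reply$, _InStrRev(reply$, ":") + 1))
Print "Code page seen by the second Shell call:"; cp&
```

Swapping 65001 for another code page number and rerunning should show whether each one is reported back separately.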

This has to be referenced in the wiki, on the Shell and $Console:Only pages, as footnotes. I can't be the only one stumped by this.

Thanks again

My external solution works too, but not as elegantly as yours. I hate having to rely on third-party programs.
If it can't be done with QB64pe, then keep bashing until it can.