QB64 Phoenix Edition
fast file find with wildcards, optional date range, match # bytes, binary compare?


fast file find with wildcards, optional date range, match # bytes, binary compare? - madscijr - 12-18-2024

I have a ton (many thousands) of files that need to be organized and deduped. In the past I made do with Beyond Compare 4 and Agent Ransack (a free, fast desktop file search utility for Windows), but this job is going to need some fancy logic, and speed is important, so I'm thinking QB64PE might be a good platform.

Has anyone used QB64PE to do any of these (preferably natively)? 
  • recursively search subfolders
  • compare filenames (matching strings with * ? wildcards)
  • retrieve & compare two files' size in bytes
  • retrieve & compare two files' modified dates
  • binary compare file contents
  • rename / move / copy / delete files
  • create folders
  • rename folders
  • retrieve & compare folder names
  • update a file's modified date to x

All those are things I'm going to need to do, but haven't done much of in the past in QB64PE, and examples would be most helpful. 

PS I have considered shelling out to Beyond Compare / Agent Ransack, but if this can be done natively in QB64PE with comparable performance, I'd prefer doing it natively in QB64PE, not only because it simplifies & reduces dependencies, but all these will come in handy for future QB64PE utilities. 

Any examples, links, info, much appreciated...
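
For the wildcard item in the list above, here is a minimal sketch of a * / ? matcher in plain QB64PE. The name WildMatch% is made up for illustration, and the sketch assumes case-insensitive matching (the way Windows treats file names). It is not taken from any of the programs linked in this thread.

```basic
Print WildMatch%("report.txt", "*.txt") ' prints -1 (match)
Print WildMatch%("report.txt", "rep?rt.*") ' prints -1 (match)
Print WildMatch%("report.txt", "*.bas") ' prints 0 (no match)
End

' Hypothetical helper: returns -1 if name$ matches pattern$, else 0.
' "*" matches any run of characters (including none), "?" matches exactly one.
Function WildMatch% (name$, pattern$)
    Dim As Long n, p, starP, starN
    Dim c$
    n = 1: p = 1: starP = 0
    Do While n <= Len(name$)
        c$ = Mid$(pattern$, p, 1) ' empty string once p runs past the pattern
        If c$ = "*" Then
            ' Remember where the star was and try matching zero characters first
            starP = p: starN = n: p = p + 1
        ElseIf c$ = "?" Or (c$ <> "" And UCase$(c$) = UCase$(Mid$(name$, n, 1))) Then
            n = n + 1: p = p + 1
        ElseIf starP > 0 Then
            ' Mismatch: back up and let the most recent star absorb one more character
            starN = starN + 1: n = starN: p = starP + 1
        Else
            WildMatch% = 0: Exit Function
        End If
    Loop
    ' Any pattern characters left over must all be stars
    Do While Mid$(pattern$, p, 1) = "*": p = p + 1: Loop
    WildMatch% = (p > Len(pattern$))
End Function
```

The matcher backtracks only to the most recent star, so it never recurses, which should keep it cheap when it is called once per file name over many thousands of files.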


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - Pete - 12-19-2024

https://qb64phoenix.com/forum/showthread.php?tid=3204

https://qb64phoenix.com/forum/showthread.php?tid=3208

https://qb64phoenix.com/forum/showthread.php?tid=3298

You might be able to cannibalize half the things you need from these three programs.

Pete


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - madscijr - 12-19-2024

Thank you, sir. I will check those out!!!


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - mdijkens - 12-19-2024


Attachment: qb64dir.zip (6.92 KB)
The two .bas programs in the zip cover most of this:
mdDir.bas is a bit like _Files$() but also recurses into subfolders and includes all attributes, timestamps, etc.
mdDiff.bas is a high-performance file compare (meant for text files, but you can use the same logic for binary).

For big files (>1GB) you might want to use this function to speed things up:
Code:
Function readBigFile~&& (file$) ' ~1.5 GB/sec
    Const BLOCKSIZE = 4194304 ' 64 * 65536 = 4 MB
    If Not _FileExists(file$) Then Exit Function
    Dim As _Unsigned _Integer64 fsize, blocks
    Dim block As String * BLOCKSIZE ' one 4 MB read buffer

    filenum% = FreeFile
    Open file$ For Random Access Read As filenum% Len = BLOCKSIZE
    fsize = LOF(filenum%) ' size in bytes (the post used fileSize(), a helper not included in the snippet)

    ' Round up so a partial final block is read too
    blocks = fsize \ BLOCKSIZE
    If fsize Mod BLOCKSIZE > 0 Then blocks = blocks + 1

    ' Padded by one block so the last full-size _MemPut cannot overrun
    Dim mem As _MEM: mem = _MemNew(fsize + BLOCKSIZE)

    $Checking:Off
    For blck~& = 1 To blocks
        Get filenum%, , block
        _MemPut mem, mem.OFFSET + mpos~&&, block
        mpos~&& = mpos~&& + BLOCKSIZE
    Next blck~&
    Close filenum%
    ' Example of processing the buffer: count how often each byte value occurs
    'For c~&& = 0 To fsize - 1
    '    ch% = _MemGet(mem, mem.OFFSET + c~&&, _Unsigned _Byte)
    '    char~&&(ch%) = char~&&(ch%) + 1
    'Next c~&&
    $Checking:On
    _MemFree mem
    readBigFile~&& = fsize
End Function
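
As a rough illustration of the "same logic for binary" remark, here is a minimal sketch of a byte-for-byte compare. It is not the code from mdDiff.bas. The name SameBytes%, the 1 MB chunk size, and the file names in the usage line are all assumptions made for the example.

```basic
' Hypothetical file names; substitute two real paths.
If SameBytes%("a.bin", "b.bin") Then Print "identical" Else Print "different or missing"
End

' Hypothetical helper: returns -1 if both files exist and hold identical bytes, else 0.
Function SameBytes% (fileA$, fileB$)
    Const CHUNK = 1048576 ' 1 MB per read; an arbitrary choice
    Dim As _Integer64 remaining
    Dim As Long n
    Dim As Integer fa, fb
    Dim bufA As String, bufB As String

    SameBytes% = 0
    If Not _FileExists(fileA$) Then Exit Function
    If Not _FileExists(fileB$) Then Exit Function

    fa = FreeFile: Open fileA$ For Binary Access Read As #fa
    fb = FreeFile: Open fileB$ For Binary Access Read As #fb

    ' Different sizes can never match, so the contents are read only when sizes agree
    If LOF(fa) = LOF(fb) Then
        SameBytes% = -1
        remaining = LOF(fa)
        Do While remaining > 0
            n = CHUNK: If remaining < CHUNK Then n = remaining
            bufA = Space$(n): bufB = Space$(n) ' Get reads exactly Len(buffer) bytes
            Get #fa, , bufA
            Get #fb, , bufB
            If bufA <> bufB Then SameBytes% = 0: Exit Do ' stop at the first differing chunk
            remaining = remaining - n
        Loop
    End If
    Close #fa
    Close #fb
End Function
```

For deduping, checking the size first means most non-duplicates are rejected without reading any file contents. Larger chunks, or the _MEM approach shown above, may be faster on very big files.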



RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - SpriggsySpriggs - 12-19-2024

@mdijkens

Oooo, recursion! Do me a favor... Try running it on "C:\Program Files" or "C:\Program Files (x86)" for all file types and do a timer to see how long it takes to get all the files, not counting the printing of the file names. Oh, and print how many files were found. I want to see if it is faster than Steve's direntry.h or my simple one-line PowerShell command that kicked his ass.


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - mdijkens - 12-19-2024

2.5 seconds for 80,507 files and 14,164 folders (no printing).
I just remarked out all the Print statements in mdDir.bas and ran it with: "C:\Program Files (x86)" /s


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - SpriggsySpriggs - 12-19-2024

Not bad, amigo! I think that does beat Steve's speed. I can't find the post on the old forums where he and I duked it out over this. His direntry.h was faster on smaller directories but slowed down considerably in larger ones. PowerShell kicked ass in big directories. The algorithm that PowerShell uses is technically faster than direntry.h at all times, but the slowdown happened due to including the time to call it from QB64.

I didn't find the one I was looking for but I did find this: https://qb64forum.alephc.xyz/index.php?topic=4360.msg137995#msg137995

It would have been even faster if I had added "-NoProfile" to my PowerShell call.


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - mdijkens - 12-19-2024

I see, but my solution deliberately collects all the file info (attributes, timestamps, etc.).
If you remark that part out in mdDir.bas, it will be a whole lot faster.


RE: fast file find with wildcards, optional date range, match # bytes, binary compare? - SpriggsySpriggs - 12-19-2024

If I remember (I probably won't), I'll try your code out soon.