fast file find with wildcards, optional date range, match # bytes, binary compare?
#1
Question 
I have a ton (many thousands) of files that need to be organized and deduped. In the past I made do with Beyond Compare 4 and Agent Ransack (a free, fast desktop file search utility for Windows), but I'm going to need some fancy logic for this, and speed is important, so I'm thinking QB64PE might be a good platform.

Has anyone used QB64PE to do any of these (preferably natively)? 
  • recursively search subfolders
  • compare filenames (matching strings with * ? wildcards)
  • retrieve & compare two files' size in bytes
  • retrieve & compare two files' modified dates
  • binary compare file contents
  • rename / move / copy / delete files
  • create folders
  • rename folders
  • retrieve & compare folder names
  • update a file's modified date to x

I'm going to need to do all of these, but I haven't done much of it in QB64PE before, so examples would be most helpful.

PS: I have considered shelling out to Beyond Compare / Agent Ransack, but if this can be done natively in QB64PE with comparable performance, I'd prefer that, not only because it reduces dependencies but also because these routines will come in handy for future QB64PE utilities.

Any examples, links, or info would be much appreciated...
#2
https://qb64phoenix.com/forum/showthread.php?tid=3204

https://qb64phoenix.com/forum/showthread.php?tid=3208

https://qb64phoenix.com/forum/showthread.php?tid=3298

You might be able to cannibalize half the things you need from these three programs.

Pete
#3
(12-19-2024, 12:30 AM)Pete Wrote: https://qb64phoenix.com/forum/showthread.php?tid=3204

https://qb64phoenix.com/forum/showthread.php?tid=3208

https://qb64phoenix.com/forum/showthread.php?tid=3298

You might be able to cannibalize half the things you need from these three programs.

Pete
Thank you, sir. I will check those out!!! :)
#4

Attachment: qb64dir.zip (Size: 6.92 KB)
The two .bas programs in the zip cover most of this:
  • mdDir.bas is a bit like _Files$() but also does recursion and returns all attributes, timestamps, etc.
  • mdDiff.bas is a high-performance file compare (meant for text files, but you can use the same logic for binary files)
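
As a rough illustration of what a block-wise binary compare can look like (this is not the code from mdDiff.bas; the function name filesIdentical%%, the 1 MB chunk size, and the file names are made up for this sketch):

```basic
' Usage (file names are placeholders):
If filesIdentical%%("a.bin", "b.bin") Then Print "identical" Else Print "different"

' Hypothetical helper: returns -1 if both files exist and hold the same bytes, 0 otherwise.
Function filesIdentical%% (fileA$, fileB$)
    Const CHUNK = 1048576 '1 MB per read; an arbitrary choice
    Dim As _Integer64 remaining
    Dim As Long fa, fb, n
    Dim As String bufA, bufB
    If _FileExists(fileA$) = 0 Or _FileExists(fileB$) = 0 Then Exit Function
    fa = FreeFile: Open fileA$ For Binary Access Read As #fa
    fb = FreeFile: Open fileB$ For Binary Access Read As #fb
    remaining = LOF(fa)
    If remaining = LOF(fb) Then 'different sizes can never match, so skip reading
        filesIdentical%% = -1 'assume equal until a chunk differs
        Do While remaining > 0
            If remaining < CHUNK Then n = remaining Else n = CHUNK
            bufA = Space$(n): bufB = Space$(n) 'Get reads Len(buffer) bytes in Binary mode
            Get #fa, , bufA
            Get #fb, , bufB
            If bufA <> bufB Then filesIdentical%% = 0: Exit Do
            remaining = remaining - n
        Loop
    End If
    Close #fa: Close #fb
End Function
```

Comparing the sizes first means most non-duplicates are rejected without reading any data, which should matter more for dedupe speed than the chunk size does.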

For big files (>1GB) you might want to use this function to speed things up:
Code:
Function readBigFile~&& (file$) '~1.5 GB/sec
    'Reads the whole file into a _MEM block in 4 MB records; returns the file size in bytes (0 if not found)
    Const BLOCKSIZE = 4194304 '=64*65536 = 4 MB
    If Not _FileExists(file$) Then Exit Function
    Dim As _Unsigned _Integer64 fsize, blocks
    Dim block As String * BLOCKSIZE
    filenum% = FreeFile
    Open file$ For Random Access Read As filenum% Len = BLOCKSIZE
    fsize = LOF(filenum%) 'LOF is used here so the snippet does not depend on the fileSize() helper
    Dim mem As _MEM: mem = _MemNew(fsize + BLOCKSIZE) 'one spare block so the last record always fits
    blocks = fsize \ BLOCKSIZE
    If fsize Mod BLOCKSIZE > 0 Then blocks = blocks + 1 'round up for a partial last block

$Checking:Off
    For blck~& = 1 To blocks
        Get filenum%, , block
        _MemPut mem, mem.OFFSET + mpos~&&, block
        mpos~&& = mpos~&& + BLOCKSIZE
    Next blck~&
    Close filenum%
    'Example of processing the buffer: count how often each byte value occurs
    'For c~&& = 0 To fsize - 1
    ' ch% = _MemGet(mem, mem.OFFSET + c~&&, _Unsigned _Byte)
    ' char~&&(ch%) = char~&&(ch%) + 1
    'Next c~&&
$Checking:On
    _MemFree mem
    readBigFile~&& = fsize
End Function
45y and 2M lines of MBASIC>BASICA>QBASIC>QBX>QB64 experience
#5
@mdjkens

Oooo, recursion! Do me a favor... Try running it on "C:\Program Files" or "C:\Program Files (x86)" for all file types and do a timer to see how long it takes to get all the files, not counting the printing of the file names. Oh, and print how many files were found. I want to see if it is faster than Steve's direntry.h or my simple one-line PowerShell command that kicked his ass.
The noticing will continue
#6
(12-19-2024, 01:49 PM)SpriggsySpriggs Wrote: @mdjkens

Oooo, recursion! Do me a favor... Try running it on "C:\Program Files" or "C:\Program Files (x86)" for all file types and do a timer to see how long it takes to get all the files, not counting the printing of the file names. Oh, and print how many files were found. I want to see if it is faster than Steve's direntry.h or my simple one-line PowerShell command that kicked his ass.

2.5 seconds for 80,507 files and 14,164 folders (no printing).
I just remarked out all the Print statements in mdDir.bas and ran it with: "C:\Program Files (x86)" /s
#7
Not bad, amigo! I think that does beat Steve's speed. I can't find the post on the old forums where he and I duked it out over this. His direntry.h was faster on smaller directories but slowed down considerably in larger ones. PowerShell kicked ass in big directories. The algorithm that PowerShell uses is technically faster than direntry.h at all times, but the slowdown happened due to including the time to call it from QB64.

I didn't find the one I was looking for but I did find this: https://qb64forum.alephc.xyz/index.php?t...#msg137995

It would have been even faster if I had added "-NoProfile" to my PowerShell call.
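
For anyone who wants to time that approach themselves, here is a rough sketch of shelling out to PowerShell with "-NoProfile" from QB64. It is not the one-liner from the old thread: the Get-ChildItem pipeline, the target folder, and the filelist.txt output name are all assumptions for illustration.

```basic
' Sketch only: the PowerShell pipeline and the output file name are assumptions.
Dim q As String: q = Chr$(34) 'double quote, used to wrap the -Command argument
Dim cmd As String, ln As String
Dim As Long f, count
Dim t As Double

cmd = "powershell -NoProfile -Command " + q
cmd = cmd + "Get-ChildItem -LiteralPath 'C:\Program Files (x86)' -Recurse -File -Force -ErrorAction SilentlyContinue"
cmd = cmd + " | ForEach-Object { $_.FullName } | Set-Content 'filelist.txt'" + q

t = Timer(.001)
Shell _Hide cmd 'waits until PowerShell has finished
t = Timer(.001) - t 'elapsed time includes PowerShell startup

'Count the lines PowerShell wrote: one full path per file
If _FileExists("filelist.txt") Then
    f = FreeFile
    Open "filelist.txt" For Input As #f
    Do Until EOF(f)
        Line Input #f, ln
        count = count + 1
    Loop
    Close #f
End If
Print count; "files in"; t; "seconds"
```

Because the timer wraps the whole Shell call, the number includes process startup, which is the overhead "-NoProfile" is meant to trim.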
#8
I see, but my solution deliberately collects all the file info (attributes, timestamps, etc.).
If you remark that part out in mdDir.bas, it would be a whole lot faster.
#9
If I remember (I probably won't), I'll try your code out soon.