This is my first program contribution; it worked so well I thought I'd pass it on.
Here is the backstory about why I wrote this program.
I had a problem. My Windows computer started BSODing during initial boot/startup. Since it was an actual BSOD with the frowny face, I took it to be a Windows problem, not a hardware one. After several attempts with every recovery method that would leave my files intact, I came to the conclusion that I was in a pickle. Then I found the recovery partition no longer worked. Now I came to the conclusion that I was, quite frankly, screwed. So I used the time-tested method of attacking a problem: I threw money at it.
I went on Amazon and purchased a refurbished machine. The new (to me) machine is actually better than the one I had: a "Dell OptiPlex 7020 Desktop Computer,Intel Quad Core i7 4790 3.6Ghz, 32GB Ram New 2TB SSD." After I received it, I discovered that while it has 4 cores (as I had expected), it has 8 threads (which I had not). So the computer is actually better than what I thought I was buying. Along with it I purchased something I really needed: a 4 TB ruggedized external hard drive, so I can back up my computers without worrying about someone dropping it. I already have a 6 TB external, but I don't feel good about it being moved around.
The price was terrific: with Windows 10 Professional, it was $265.00. Add the external drive and sales tax, $401. So I set up the new computer and had it build a Windows recovery drive on an SD card. I plugged the reader into the old computer, reset the BIOS to allow booting from an external drive, and tried again. Nothing worked. So, from my new computer, I downloaded a Linux distribution, Xubuntu. I repeated the process and it booted fine: the file manager could see the internal drive, and it could even see the 6 TB external, but not the 4 TB ruggedized one (even though the new machine does). So I copied my most recent files from my working directory to the 6 TB.
The Problem
I do have a backup of my huge collection of downloaded open-source software on the 6 TB, but it's old, from last year. It does not have the local changes I made while writing programs. On Windows, I would just have FreeFileSync scan the work directory and mirror it to the backup. With Windows not booting, that is out. I had, however, downloaded the Linux version of QB64PE, but the attempt to install it failed because the Wifi adapter apparently was not recognized, so it couldn't download the required packages.
Well, I could just copy the new archive to replace the backup. That is about 1.3 million files, 100,000+ directories, and 411 GB, and it would take about 500 hours. So that's not an answer. How can I solve this problem? Then it hit me: run an 'ls' directory scan with recursive subdirectory search, redirected to a file, then take that file over to my new computer and write a filtering program to run there. I had ls exclude owner and group, and list one file per line. Output from ls looks like this:
Output:
Paul (From LENOVO)/:
total 39638
drwxrwxrwx 1 163840 Apr 30 17:45 gatekeeper
drwxrwxrwx 1 20480 Feb 21 17:06 MERGER-raw
drwxrwxrwx 1 4096 Feb 21 16:49 cvs2svn
...
-rwxrwxrwx 1 631462 Nov 23 2017 .cardpeek.log
-rwxrwxrwx 1 52475 Aug 27 2017 reasonable-argument.png
Paul (From LENOVO)/gatekeeper:
total 604715
-rwxrwxrwx 1 527259 Apr 30 17:45 Marnie.odt
What can be determined from this is:
- The current directory is shown followed by a colon.
- The first letter of a file entry is 'd' for a directory. Ignore these; we get specific directories from the prior item.
- Size summary starts with "total ".
- There is a blank line before a new directory.
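- Recent items (here, those from 2023) have a colon in the time field; older files have a year in that field instead. Strictly, ls shows the time for anything modified within roughly the last six months, so the colon marks "recent" rather than an exact calendar year.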
- Entries are separated by one space, with the file name last.
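To make the last two rules concrete, here is a minimal sketch in QB64 that applies them to a single entry from the sample listing. The variable names entry and p are just for illustration, and it assumes the first colon on a file line is always the one in the time field.

```basic
' Sketch only: apply the "colon means recent" and "file name is last" rules
' to one sample line copied from the listing above.
Dim entry As String
Dim p As Long

entry = "-rwxrwxrwx 1 527259 Apr 30 17:45 Marnie.odt"

p = InStr(entry, ":") ' a colon in the time field marks a recent file
If p > 0 Then
    p = InStr(p, entry, " ") ' first space after the time field
    Print Mid$(entry, p + 1) ' everything after it is the file name
Else
    Print "older file, skipped" ' a year was listed instead of a time
End If
```

Taking everything after that space, rather than the last word on the line, keeps file names that contain spaces in one piece.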
I have one additional problem. Just the listing of files itself is an 88 megabyte text file!
The solution:
' Process ls program output to keep only recent files
' (those ls lists with a time instead of a year)
FN$ = "k:\files.list"
outFile$ = "k:\keepfiles.list"
Print
Locate 5, 1
Print Time$ ' start time
FF& = FreeFile
Total$ = "total "
Open FN$ For Input Access Read As #FF&
OutFile& = FreeFile
Open outFile$ For Output As #OutFile&
While Not EOF(FF&)
    Line Input #FF&, Line$
    Line$ = _Trim$(Line$)
    If Line$ = "" Then GoTo SKIP ' blank line before a new directory
    If Left$(Line$, Len(Total$)) = Total$ Then GoTo SKIP ' avoid summary
    Colon = InStr(Line$, ":")
    If Colon = Len(Line$) Then ' it's a directory being listed
        Curdir$ = _Trim$(Left$(Line$, Colon - 1))
        If Right$(Curdir$, 1) <> "/" Then Curdir$ = Curdir$ + "/"
        Locate 9, 1: Print Space$(240): Locate 9, 1
        Print "Current dir="; Curdir$
        ListDir = ListDir + 1
        GoTo SKIP
    End If
    If Left$(Line$, 1) = "d" Then ' subdirectory entry; it gets its own section later
        DirCount = DirCount + 1
        GoTo SKIP
    End If
    FileCount = FileCount + 1
    ' First, skip attributes
    SpacePos = InStr(1, Line$, " ")
    ' Skip over link count
    SpacePos = InStr(SpacePos + 1, Line$, " ")
    ' Skip over file size
    SpacePos = InStr(SpacePos + 1, Line$, " ")
    ' Determine if recent: a time (with a colon) is listed instead of a year
    Colon = InStr(SpacePos, Line$, ":")
    If Colon = 0 Then
        skipFile = skipFile + 1
        ' Print "colon at "; Colon; " skipping "
        ' Print Line$
        GoTo SKIP
    End If
    ' The file name is everything after the first space following the time
    SpacePos = InStr(Colon + 1, Line$, " ")
    Print #OutFile&, Curdir$ + Mid$(Line$, SpacePos + 1)
    Chosen = Chosen + 1
    SKIP: '
    If FileCount Mod 5000 = 0 Then ' periodic progress display
        Locate 2, 1
        Print FileCount; " ";
        Locate 5, 20
        Print Time$
    End If
Wend
Close #FF&
Close #OutFile&
Locate 6, 1
Print Time$ ' end time
Print "Search directories "; ListDir
Print "Subdirectories "; DirCount
Print FileCount; " Files Found"
Print skipFile; " Files skipped"
Print Chosen; " Files chosen for review"
End
Result? Of more than 900,000 files scanned, I need to copy 53. That's all. The program took 5 minutes, processing an average of about 2500 items per second. A really satisfying conclusion, and one that should put paid to the claim that Basic, and specifically QuickBasic, is not relevant for solving real-world problems.
Paul
While 1
Fix Bugs
report all bugs fixed
receive bug report
end while