Posts: 649
Threads: 95
Joined: Apr 2022
Reputation:
22
02-26-2024, 04:20 AM
(This post was last modified: 02-26-2024, 04:22 AM by PhilOfPerth.)
Another simple one:
What's the best way to skip a data item when searching a file? (the number of items is not known, and the item lengths are varied).
For example, to skip an item and jump to a GetData line in this listing:
MyFile$ = "MyFile"
Open MyFile$ For Input As #1
' put GetData: here?
While Not EOF(1)
    ' or GetData: here?
    Input #1, data$
    If Len(data$) <> 5 Then
        GoTo GetData
    Else
        DealWithIt
    End If
    ' or GetData: here?
Wend
Close
Forgetting our "phobias" about GoTo for now:
If the GetData label is before the While statement, will this create another loop and leave the first one unresolved?
If it's just after the While statement, does the While restart input at the beginning of the file?
Is the position just before the Wend the correct place?
Posts: 1,277
Threads: 120
Joined: Apr 2022
Reputation:
100
(02-26-2024, 04:20 AM)PhilOfPerth Wrote: What's the best way to skip a data item when searching a file? [...]
Why not do this:
MyFile$ = "MyFile"
OPEN MyFile$ FOR INPUT AS #1
WHILE NOT EOF(1)
    INPUT #1, data$
    IF LEN(data$) = 5 THEN DealWithIt
WEND
CLOSE #1
Using GOTO to jump to a label inside the same loop is mostly OK, and jumping out of the loop (to a point after the closing loop statement) is OK, but jumping back to a point before the loop statement is going to cause headaches.
Also, the "phobias" against using GOTO are well founded. If possible, always try to find another way.
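One such "other way", as a rough sketch only: QB64 has a _Continue statement that skips the rest of the loop body and goes straight to the next iteration, so no label is needed. The variable item$ and the DealWithIt stub below are placeholders made up for illustration, and the file MyFile is assumed to exist.

```basic
MyFile$ = "MyFile"
Open MyFile$ For Input As #1
While Not EOF(1)
    Input #1, item$
    If Len(item$) <> 5 Then _Continue ' skip this item and go back to the While test
    DealWithIt item$
Wend
Close #1

Sub DealWithIt (item$) ' placeholder, stands in for the real handler
    Print item$
End Sub
```

Because the While test runs again after _Continue, the file simply carries on from the next item; nothing is reopened or restarted.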
Posts: 597
Threads: 110
Joined: Apr 2022
Reputation:
34
(02-26-2024, 04:20 AM)PhilOfPerth Wrote: What's the best way to skip a data item when searching a file? [...]
I am very comfortable with a GoTo in the following scenario. It stays within the loop, and the jump target is close by, so it is easy to see what is going on.
Code:
MyFile$ = "MyFile"
Open MyFile$ For Input As #1
While Not EOF(1)
    Input #1, data$
    If Len(data$) <> 5 Then
        GoTo GetData
    Else
        DealWithIt
    End If
    GetData: ' label goes here, just before the Wend
Wend
Close
Posts: 3,936
Threads: 175
Joined: Apr 2022
Reputation:
216
02-26-2024, 03:14 PM
(This post was last modified: 02-26-2024, 05:45 PM by bplus.)
@PhilOfPerth If this is for picking things out of a dictionary, you should load the dictionary into an array, because file access is so slow compared to the RAM the program uses. Surely you would use the dictionary more than once in an app.
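A minimal sketch of that idea, assuming the same sequential file as in the first post and items that Input # can read one at a time. The array name words and the starting size of 1000 are arbitrary choices for illustration.

```basic
ReDim words(1 To 1000) As String ' dynamic array, grown as needed
Dim count As Long

Open "MyFile" For Input As #1
While Not EOF(1)
    count = count + 1
    ' double the array whenever it fills up, keeping what is already loaded
    If count > UBound(words) Then ReDim _Preserve words(1 To UBound(words) * 2) As String
    Input #1, words(count)
Wend
Close #1

Print count; "items now in memory"
```

After this runs once, every later search works on words(1) to words(count) without touching the file again.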
Posts: 649
Threads: 95
Joined: Apr 2022
Reputation:
22
02-26-2024, 11:37 PM
(This post was last modified: 02-26-2024, 11:39 PM by PhilOfPerth.)
Thanks bplus, yes, it is for the dictionary.
I was using file access so that I could give users the option of downloading the dictionary files or not.
Maybe that isn't such an overhead; I'll look at including them in the prog.
When I first constructed the Wordlists, I was including the word-meanings, and this was much larger.
Posts: 3,936
Threads: 175
Joined: Apr 2022
Reputation:
216
02-27-2024, 01:41 AM
(This post was last modified: 02-27-2024, 01:57 AM by bplus.)
Yeah, loading a dictionary file into an array takes a while.
I took a dictionary file and made a Random access file of just the words. Once I know which index a word is at, I also know the index of the word plus definition in the full file. The index is the same as the record number, i.e. Get #1, recNumber...
For the Random access words file I used a record length of 15 letters, the biggest word I could use.
For the other Random access file, holding the word and definition records, I set up something like a fixed string length of 237; no word and definition was longer than 237 letters.
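For anyone wanting to try this, a rough sketch of how a words-only RA file could be built. This is not the builder used for the Collins files; the file names WordList.txt and WordList.RA are made up, and the source is assumed to be plain text with one word per line, already in upper case and sorted (a binary search needs sorted records).

```basic
Dim rec15 As String * 15 ' fixed-length record buffer
Dim n As Long
Dim w$

Open "WordList.txt" For Input As #1 ' hypothetical source file
Open "WordList.RA" For Random As #2 Len = 15 ' hypothetical output file
While Not EOF(1)
    Line Input #1, w$
    w$ = UCase$(_Trim$(w$))
    If Len(w$) >= 3 And Len(w$) <= 15 Then
        n = n + 1
        rec15 = w$ ' assigning to a fixed-length string pads it with spaces
        Put #2, n, rec15
    End If
Wend
Close #1, #2
Print n; "records written"
```

If WordList.RA already exists, delete it first, since records beyond n from an earlier run would otherwise stay in the file.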
In setup for game play:
Code:
Open "Collins_Word_List.RA" For Random As #1 Len = 15
Then I used this code to find a word using Random Access, where you get records by record number:
Code:
Function Find& (x$) ' if I am using this only to find words in dictionary, I can mod to optimize
    ' the RA file is opened and ready for gets
    ' returns the record number of x$, or 0 if it is not found
    Dim As Long low, hi, test
    Dim w$
    If Len(x$) < 3 Then Exit Function ' words need to be at least 3 letters
    low = 1: hi = NTopWord
    While low <= hi
        test = Int((low + hi) / 2) ' middle record of the remaining range
        Get #1, test, rec15
        w$ = _Trim$(rec15)
        If w$ = x$ Then
            Find& = test: Exit Function
        Else
            If w$ < x$ Then low = test + 1 Else hi = test - 1
        End If
    Wend
End Function
This is similar to that Hi Lo Game I showed a couple weeks ago for Binary Search.
Here is code I used to grab the definition when the user requested it:
Code:
Function defineWord$ (w$) ' this will not edit out definitions that have () in them
    Dim nDef As Long
    w$ = UCase$(w$)
    nDef = Find&(w$)
    If nDef Then
        Open "Collins Words and Defs.RA" For Random As #2 Len = 237
        Get #2, nDef, rec237
        Close #2
        defineWord$ = _Trim$(rec237) ' set only on a hit, so a miss returns "" and not the previous definition
    End If
End Function
So I did not load the dictionary file into an array and use RAM. I opened the Random Access file for the words at the start of the program and did the binary search for words when needed. Again, the index of a word was the same as the index of the word and definition in the other Random Access file. It worked very well: there was no long delay at the start of the program to load an array, and a Binary Search for a word in a Random Access file did not take noticeable time either.
So that method is something to consider: it takes a little bit of work to set up the words file and the words-plus-definitions file, but it saves time when using the game.
rec15 and rec237 are fixed-length strings I used as record buffers for getting words or words with definitions.
NTopWord is the highest record number the files contain. NTopWord = 279496 for my Collins Dictionary after removing words longer than 15 letters.
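The declarations themselves are not shown above, so the following is only a guess at what the setup could look like. Dim Shared is used because Find& and defineWord$ read these names from inside functions. Working out NTopWord from the file size is an alternative to hard-coding it, and assumes the file holds nothing but whole 15-byte records.

```basic
Dim Shared rec15 As String * 15 ' word record buffer
Dim Shared rec237 As String * 237 ' word-plus-definition record buffer
Dim Shared NTopWord As Long

Open "Collins_Word_List.RA" For Random As #1 Len = 15
NTopWord = LOF(1) \ 15 ' file size in bytes divided by the record length
Print NTopWord; "words available"
```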
@PhilOfPerth if you want I can zip the two RA files to you.
Posts: 649
Threads: 95
Joined: Apr 2022
Reputation:
22
(02-27-2024, 01:41 AM)bplus Wrote: Yeah loading a dictionary file into an array takes awhile. [...]
Thanks bplus.
It took me a while (I tend to lose track of things these days), but I worked through both the RA file builder and the search that you wrote, and after experimenting a bit I have an RA file of words up to 9 letters (that's all I need). It takes a while to build the file, but once built it's very quick; quicker than my 26-file linear access version, anyway.
Posts: 2,164
Threads: 222
Joined: Apr 2022
Reputation:
103
Old age sucks. I pulled something off 15 or so years ago, using a hash table method to do some dictionary-like thing. I have little memory of anything any more detailed than that to go on. Oops, there's pudding being served in the rec room. Gotta go.
Pete
Posts: 2,697
Threads: 327
Joined: Apr 2022
Reputation:
217
How many words are we talking about here? Can you share them and their definitions for us? We'll have fun seeing who can come up with the fastest way to index and access them.
Posts: 649
Threads: 95
Joined: Apr 2022
Reputation:
22
02-29-2024, 06:56 AM
(This post was last modified: 02-29-2024, 07:08 AM by PhilOfPerth.)
Well, there are about 280000 words in the file, with meanings for each (nearly 18M all up), but we may not need all of these - I only use up to 11 chars in length for the words, and weeded out a lot of guff from the meanings. It would be good to have a quick search for all of the words, and optionally, meanings separately.
I can attach it, but it's a bit huge!