Function IsWord%(test$)
#1
Inspired by mdijkens' comment here: https://qb64phoenix.com/forum/showthread...9#pid40209

This is a really easy way to confirm whether a word is in a file of words, say from the Collins Dictionary:
Code: (Select All)
_Title "Test IsWord function" 'bplus 2026-02-25
Dim Shared FStr$
FStr$ = _ReadFile$("Words.txt") ' once and for all time
While 1
    Input "Enter a word to see if in Collins Dictionary"; w$
    If IsWord%(w$) Then Print w$; ", is a word." Else Print w$; ", is Not a word"
Wend
Function IsWord% (test$)
    testeol$ = Chr$(13) + Chr$(10) + UCase$(test$) + Chr$(13) + Chr$(10)
    If InStr(FStr$, testeol$) Then IsWord% = -1 Else IsWord% = 0
End Function

"Words.txt" is a text file that uses CRLFs to end lines. Because IsWord% looks for the word with a CRLF on both sides, all I had to do was insert a blank line at the beginning of the file so that the first word, "AA", could be verified as a word from the file.
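
If you would rather not edit the file, the same effect can probably be had by adding the delimiters in code. This is only a sketch: raw$ is a made-up three-word stand-in for what _ReadFile$("Words.txt") would return, with no leading blank line and no trailing CRLF, so that both the first and the last word are exercised.

```basic
Dim Shared FStr$
Dim CRLF$
CRLF$ = Chr$(13) + Chr$(10)

' Assumed test data standing in for _ReadFile$("Words.txt")
raw$ = "AA" + CRLF$ + "AAH" + CRLF$ + "ZZZ"
FStr$ = CRLF$ + raw$ + CRLF$ ' wrap so the first and last words have a CRLF on both sides

Print IsWord%("aa"), IsWord%("zzz"), IsWord%("a")

Function IsWord% (test$)
    Dim testeol$
    testeol$ = Chr$(13) + Chr$(10) + UCase$(test$) + Chr$(13) + Chr$(10)
    IsWord% = (InStr(FStr$, testeol$) > 0) ' -1 if found, 0 otherwise
End Function
```

With this stand-in data the first two lookups should report -1 and the last one 0, since "A" on its own never appears between two CRLFs.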

This is way better IMHO than using a Random Access file!

Here is the file to run your own tests:


Attached Files
.txt   Words.txt (Size: 2.96 MB / Downloads: 10)
  724  855  599  923  575  468  400  206  147  564  878  823  652  556 bxor cross forever
#2
Isn't the easiest, and fastest way, simply to load the entire file in one go and then parse it and save the data in an array?  Then just binary search on the index of the array until you find out if it's in there or not.  No random file access needed, no instr searches which go byte by byte looking for a match.  Just one array which is completely stored in memory and then a fast search for less than a few dozen binary comparisons to get the answer.

It's how we've done it countless times before on the forums, with a bazillion demos and examples to draw from out there.  Why is everyone feeling the need to look for some different way to do things now?  Just stick to the tried and true little routines that work so effortlessly.
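
To put a rough number on "a few dozen comparisons": a binary search halves the remaining range on every probe, so the worst case can be counted by halving the list size until nothing is left. The sketch below assumes a list of 279423 entries, the bound that appears in the binary search posted later in this thread; substitute your own count.

```basic
Dim As Long n, steps
n = 279423 ' assumed list size, taken from the bound used later in this thread
Do While n > 0
    n = n \ 2 ' each probe discards half of the remaining range
    steps = steps + 1
Loop
Print "Worst-case probes:"; steps
```

For that size it reports 19 probes. A probe that tests for equality and then for less-than costs at most two string comparisons, so the worst case stays under 40 comparisons.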
#3
For example:

Code: (Select All)
Screen _NewImage(1024, 720, 32)
$Color:32

ReDim dict(0) As String
t1 = Timer(0.001)
LoadWordList "Scrabble WordList 2006.txt", dict()
t2 = Timer(0.001)

Print Using "#.##### seconds to load ###,### words."; t2 - t1, UBound(dict)
For i = 1 To 10
    Print dict(i),
Next
Print
For i = UBound(dict) - 10 To UBound(dict)
    Print dict(i),
Next
Print

'and just to showcase how quick arrays are, let's do a random search of the array from start to bottom
For i = 1 To 100 'let's look for 100 random words
    word$ = Chr$(65 + Rnd * 26) + Chr$(65 + Rnd * 26) + Chr$(65 + Rnd * 26)
    match = _FALSE
    For j = 1 To UBound(dict)
        Color Yellow
        If word$ = dict(j) Then Print "MATCH:"; word$,: match = _TRUE
        Color White
    Next
    If _Negate (match) Then Print "No match:"; word$,
Next

'note that this is doing nothing to binary search, nor is it exiting after finding a match, nor doing anything else to speed up the process
'this is simply testing the entire word array one by one looking for a match for our words.
'and it takes... no noticeable time at all.

Sub LoadWordList (file$, WordList() As String) 'this sub loads a list of words for use later
    $Checking:Off
    Dim As String temp, t1
    Dim As Long count, p, p1
    ReDim WordList(250000) As String 'let's make a nice large array to hold the words.
    '                                    I doubt any word list is going to have more than 250,000 words in it!
    If _FileExists(file$) Then 'then we have a found word list.  Let's load and parse it
        temp = _ReadFile$(file$)
        p = 1
        Do
            p1 = InStr(p, temp, Chr$(10)) 'look for a chr$(10) end of line marker
            If p1 = 0 Then p1 = InStr(p, temp, Chr$(13)) 'if no chr$(10) then look for a chr$(13) for odd files with it as the CRLF
            If p1 Then 'then we have a delimiter
                t1 = _Trim$(Mid$(temp, p, p1 - p))
                If Right$(t1, 1) = Chr$(13) Then t1 = Left$(t1, Len(t1) - 1) 'if CRLF then strip off chr$(13)
                If t1 <> "" Then 'don't add blank lines to the list
                    count = count + 1
                    WordList(count) = t1
                End If
                p = p1 + 1
            End If
        Loop Until p1 = 0
        If p <= Len(temp) Then 'if there's no CRLF at the end of the file, we still want the last word here
            t1 = _Trim$(Mid$(temp, p))
            If t1 <> "" Then 'again, don't add if it's a blank line
                count = count + 1
                WordList(count) = t1
            End If
        End If
    End If
    ReDim _Preserve WordList(count) As String
    $Checking:On
End Sub


Attached Files
.txt   Scrabble WordList 2006.txt (Size: 1.85 MB / Downloads: 1)
#4
And with the dictionary bplus used for his:

Code: (Select All)
Screen _NewImage(1024, 720, 32)
$Color:32

ReDim dict(0) As String
t1 = Timer(0.001)
LoadWordList "Collins.txt", dict(), 300000
t2 = Timer(0.001)

Print Using "#.##### seconds to load ###,### words."; t2 - t1, UBound(dict)
For i = 1 To 10
    Print dict(i),
Next
Print
For i = UBound(dict) - 10 To UBound(dict)
    Print dict(i),
Next
Print

'and just to showcase how quick arrays are, let's do a random search of the array from start to bottom
For i = 1 To 100 'let's look for 100 random words
    word$ = Chr$(65 + Rnd * 26) + Chr$(65 + Rnd * 26) + Chr$(65 + Rnd * 26)
    match = _FALSE
    For j = 1 To UBound(dict)
        Color Yellow
        If word$ = dict(j) Then Print "MATCH:"; word$,: match = _TRUE
        Color White
    Next
    If _Negate (match) Then Print "No match:"; word$,
Next

'note that this is doing nothing to binary search, nor is it exiting after finding a match, nor doing anything else to speed up the process
'this is simply testing the entire word array one by one looking for a match for our words.
'and it takes... no noticeable time at all.

Sub LoadWordList (file$, WordList() As String, Limit As Long) 'this sub loads a list of words for use later
    $Checking:Off
    Dim As String temp, t1
    Dim As Long count, p, p1
    ReDim WordList(Limit) As String 'let's make a nice large array to hold the words.  Set the limit you want for yourself
    If _FileExists(file$) Then 'then we have a found word list.  Let's load and parse it
        temp = _ReadFile$(file$)
        p = 1
        Do
            p1 = InStr(p, temp, Chr$(10)) 'look for a chr$(10) end of line marker
            If p1 = 0 Then p1 = InStr(p, temp, Chr$(13)) 'if no chr$(10) then look for a chr$(13) for odd files with it as the CRLF
            If p1 Then 'then we have a delimiter
                t1 = _Trim$(Mid$(temp, p, p1 - p))
                If Right$(t1, 1) = Chr$(13) Then t1 = Left$(t1, Len(t1) - 1) 'if CRLF then strip off chr$(13)
                If t1 <> "" Then 'don't add blank lines to the list
                    count = count + 1
                    WordList(count) = t1
                End If
                p = p1 + 1
            End If
        Loop Until p1 = 0
        If p <= Len(temp) Then 'if there's no CRLF at the end of the file, we still want the last word here
            t1 = _Trim$(Mid$(temp, p))
            If t1 <> "" Then 'again, don't add if it's a blank line
                count = count + 1
                WordList(count) = t1
            End If
        End If
    End If
    ReDim _Preserve WordList(count) As String
    $Checking:On
End Sub

Less than a second or so from start to finish for me, and this is without trying to optimize anything or make it speedy.  Load and parse 270k words into an array.  Search that array from top to bottom 100 times... and the whole process takes less than a second.  For most applications with words, it's going to be more than fast enough for whatever type of lookup you might be doing.
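
The test loop above deliberately keeps scanning after a hit. For anyone who wants the cheap speed-up anyway, here is a minimal sketch of the same linear scan with an early exit. LinearFind& is a hypothetical helper name, and the five-word array is made-up data standing in for the list that LoadWordList fills.

```basic
ReDim dict(1 To 5) As String
dict(1) = "AA": dict(2) = "AAH": dict(3) = "AAHED": dict(4) = "AAL": dict(5) = "ZZZ"

Print LinearFind&(dict(), "AAL"), LinearFind&(dict(), "QQQ")

' Returns the index of target in list(), or 0 when it is not there.
' Assumes the list starts at index 1, so 0 can safely mean "not found".
Function LinearFind& (list() As String, target As String)
    Dim i As Long
    For i = LBound(list) To UBound(list)
        If list(i) = target Then LinearFind& = i: Exit Function ' stop at the first match
    Next
End Function
```

With the sample data this should print 4 for "AAL" and 0 for "QQQ".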
#5
OK, and to round out this discussion, here is a version that checks whether a word is in the same dictionary file as the first post,
using a binary search and the dandy, handy Split routine.
Code: (Select All)
_Title "Test FindW&() function" 'bplus 2026-02-25
Dim Shared CRLF$
CRLF$ = Chr$(13) + Chr$(10)
ReDim Shared words$(1)
Split _ReadFile$("Words.txt"), CRLF$, words$()
Print UBound(words$)

While 1
    Input "Enter a word to see if in Collins Dictionary"; w$
    If FindW&(w$) Then Print w$; ", is a word." Else Print w$; ", is Not a word"
Wend

Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
    curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub

Function FindW& (wd$)
    Dim As Long lo, hi, m
    Dim wrd As String: wrd = UCase$(wd$)
    lo = 1: hi = 279423 'bounds are hard-coded for this Words.txt; adjust them for another list
    While lo <= hi
        m = (hi + lo) \ 2
        If words$(m) = wrd Then
            FindW& = m: Exit Function
        ElseIf words$(m) < wrd Then
            lo = m + 1
        Else
            hi = m - 1
        End If
    Wend
End Function
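
If the word list might change, the bounds can be read from the array instead of being typed in. A minimal sketch, assuming a sorted, upper-case list whose lower bound is 1 (so a return value of 0 can only mean "not found"); FindW2& is a hypothetical name and the five words are made-up test data.

```basic
ReDim Shared words$(1 To 5)
words$(1) = "AA": words$(2) = "AAH": words$(3) = "AAHED": words$(4) = "AAL": words$(5) = "ZZZ"

Print FindW2&("aal"), FindW2&("zzz"), FindW2&("ab")

' Binary search over the shared words$() array; returns the index or 0.
Function FindW2& (wd$)
    Dim As Long lo, hi, m
    Dim wrd As String: wrd = UCase$(wd$)
    lo = LBound(words$): hi = UBound(words$) ' take the range from the array itself
    While lo <= hi
        m = (hi + lo) \ 2
        If words$(m) = wrd Then
            FindW2& = m: Exit Function
        ElseIf words$(m) < wrd Then
            lo = m + 1
        Else
            hi = m - 1
        End If
    Wend
End Function
```

With the sample data this should print 4, 5 and 0. If the array starts at 0, a match at index 0 would be indistinguishable from a miss, so keep the words at index 1 and up or return -1 for "not found".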
#6
(02-25-2026, 08:58 PM)bplus Wrote: Inspired by mdijkens comment here: https://qb64phoenix.com/forum/showthread...9#pid40209

That's indeed what I had in mind. (Don't forget FStr$ = Chr$(13) + Chr$(10) + _ReadFile$("Words.txt") so that the first word in the file is also found.)

I fully agree with Steve's approach of loading the list into an array, but it depends on whether you need efficiency and speed more than short, simple code. If you need to search thousands of times, you are better off building an array.
45y and 2M lines of MBASIC>BASICA>QBASIC>QBX>QB64 experience