Word-list creator - PhilOfPerth - 02-09-2025
In many of my games, I need to look words up in a word-list, either sequentially or randomly.
I use this programme to prepare lists of words of the appropriate length, up to 15 letters.
It creates 4 lists from the Full Oxford Dictionary:
- a sequential file of words up to the selected length
- a sequential file with words of exactly the selected length
- a random-access file with words up to the selected length
- a random-access file with words of exactly the selected length
Maybe someone else will find it useful.
Code:
' Creates 2 Random Access files and 2 Sequential lists of words of a selected length
' from the Collins Scrabble Dictionary, which has about 280000 words of 2 to 15 letters.
' One of the Random and one of the Sequential lists has all words UP TO AND INCLUDING
' the selected length. The other has only words OF EXACTLY the selected length.
' Time taken should not exceed 10 seconds.
ScreenSetup:
Screen _NewImage(1200, 800, 32)
SetFont: f& = _LoadFont("C:\WINDOWS\fonts\courbd.ttf", 24, "monospace"): _Font f&
lhs = (_DesktopWidth - 1200) / 2
_ScreenMove lhs, 86 ' centre display on screen
Common Shared CPL, CTR
CPL = 1200 / _PrintWidth("X")
Data "MaxThree","MaxFour","MaxFive","MaxSix","MaxSeven","MaxEight","MaxNine"
DefLng A ' 4 bytes allows for all word lengths
Yellow: Centre "Word-List Maker", 2: white: Print
Print " This prog prepares four files with words to a length you select:": Print
Print " 1 Random-access, words of the selected length only (Rn) "
Print " 2 Random-access, words up to and including selected length (R_ALLn) "
Print " 3 Sequential, words of the selected length only (Sn)"
Print " 4 Sequential, words up to and including selected length (S_ALLn)"
Print Tab(5); "(where n is the trimmed string of your selected length)"
Red: Centre "If files exist with any of those names, they will be replaced!", 12
Yellow: Centre "Choose the max word size you require (3 to 15 letters).", 14
GEtSize:
Yellow: Locate 16, 32: Input MaxL$: white: Cls
MaxL = Val(MaxL$)
If MaxL < 3 Or MaxL > 15 Then GoTo GEtSize
Rn:
n = 0
Open "w15.txt" For Input As #1 ' source word list
FileName$ = "R" + LTrim$(Str$(MaxL)): RecSize = MaxL + 4
If _FileExists(FileName$) Then Kill FileName$
Txt$ = "Creating Random Access file " + FileName$
Yellow: Centre Txt$, 2: Print
Open FileName$ For Random As #2 Len = RecSize
While Not EOF(1)
    Input #1, wd$
    If Len(wd$) = MaxL Then ' keep only words of exactly the selected length
        Put #2, , wd$
        n = n + 1
    End If
Wend
ConfirmRn:
' Show the first five words and the last one as a quick check
Txt$ = FileName$ + " has" + Str$(n) + " words of" + Str$(MaxL) + " letters, for Random Access": white
Centre Txt$, CsrLin
For a = 1 To 5
    Get #2, a, wd$: Print wd$
Next
Print "..."
Get #2, n, wd$: Print wd$
Close
Sleep 2
Cls
R_ALLn:
n = 0
Open "w15.txt" For Input As #1 ' source word list
FileName$ = "R_ALL" + LTrim$(Str$(MaxL)): RecSize = MaxL + 4
If _FileExists(FileName$) Then Kill FileName$
Txt$ = "Creating Random Access file " + FileName$
Yellow: Centre Txt$, 2: Print
Open FileName$ For Random As #2 Len = RecSize
While Not EOF(1)
    Input #1, wd$
    If Len(wd$) <= MaxL Then ' keep all words up to and including the selected length
        Put #2, , wd$
        n = n + 1
    End If
Wend
ConfirmR_alln:
' Show the first five words and the last one as a quick check
Txt$ = FileName$ + " has" + Str$(n) + " words, to" + Str$(MaxL) + " letters max, for Random Access": white
Centre Txt$, CsrLin
For a = 1 To 5
    Get #2, a, wd$: Print wd$
Next
Print "..."
Get #2, n, wd$: Print wd$
Close
Sleep 2
Cls
Sn:
n = 0
Open "w15.txt" For Input As #1 ' source word list
FileName$ = "S" + LTrim$(Str$(MaxL))
If _FileExists(FileName$) Then Kill FileName$
Txt$ = "Creating Sequential file " + FileName$
Yellow: Centre Txt$, 2: Print
Open FileName$ For Output As #2
While Not EOF(1)
    Input #1, wd$
    If Len(wd$) = MaxL Then ' keep only words of exactly the selected length
        Write #2, wd$
        n = n + 1
    End If
Wend
Close
ConfirmSn:
' Show the first five words and the last one as a quick check
Open FileName$ For Input As #1
Txt$ = FileName$ + " has" + Str$(n) + " words of" + Str$(MaxL) + " letters, for Sequential Access": white
Centre Txt$, CsrLin
For a = 1 To 5
    Input #1, wd$: Print wd$
Next
Print "..."
For a = 6 To n - 1 ' skip ahead to the last word
    Input #1, wd$
Next
Input #1, wd$: Print wd$
Close
Sleep 2
Cls
S_ALLn:
n = 0
Open "w15.txt" For Input As #1 ' source word list
FileName$ = "S_ALL" + LTrim$(Str$(MaxL))
If _FileExists(FileName$) Then Kill FileName$
Txt$ = "Creating Sequential file " + FileName$
Yellow: Centre Txt$, 2: Print
Open FileName$ For Output As #2
While Not EOF(1)
    Input #1, wd$
    If Len(wd$) <= MaxL Then ' keep all words up to and including the selected length
        Write #2, wd$
        n = n + 1
    End If
Wend
Close
ConfirmS_alln:
' Show the first five words and the last one as a quick check
Txt$ = FileName$ + " has" + Str$(n) + " words, to" + Str$(MaxL) + " letters max, for Sequential Access": white
Centre Txt$, CsrLin
Open FileName$ For Input As #1
For a = 1 To 5
    Input #1, wd$: Print wd$
Next
Print "..."
For a = 6 To n - 1 ' skip ahead to the last word
    Input #1, wd$
Next
Input #1, wd$: Print wd$
Close
Print: Print "Created 4 files"
Sleep
Sub WIPE (ln$)
    If Len(ln$) = 1 Then ln$ = "0" + ln$ ' catch single-digit line numbers
    For a = 1 To Len(ln$) - 1 Step 2
        wl = Val(Mid$(ln$, a, 2))
        Locate wl, 1: Print Space$(100)
    Next
End Sub
Sub Centre (txt$, linenum)
    CTR = Int(CPL / 2 - Len(txt$) / 2) + 1
    Locate linenum, CTR
    Print txt$
End Sub
Sub white
    Color _RGB(255, 255, 255)
End Sub
Sub Yellow
    Color _RGB(255, 255, 0)
End Sub
Sub Red
    Color _RGB(255, 100, 100)
End Sub
RE: Word-list creator - Sanmayce - 02-17-2025
Did you make a benchmark measuring how many words per second you can check against your wordlist?
How rich is your wordlist?
You may find this post instrumental in making cross-lists:
https://forums.fedoraforum.org/showthread.php?334070-Extra-High-Quality-English-Wordlist-cooking-amp-refining&p=1890383#post1890383
Do you have the feeling you need to speed it up, bigtime?
Even an inferior hash chain in a simplistic C program works wonders; 1 million words per second should be the worst lookup rate, if you ask me - just making the Q stand for Quick again, hee-hee.
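A chained hash lookup of the kind mentioned above could look roughly like this in C. This is only a sketch under stated assumptions: the names `add_word`, `contains_word` and `BUCKETS`, the table size, and the choice of the FNV-1a hash are all picked for illustration and are not taken from any existing program, and no lookup rate is claimed for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table size: a power of two, so the hash can be masked. */
#define BUCKETS (1u << 18)

typedef struct Node {
    struct Node *next; /* next word in the same bucket (the "chain") */
    char word[16];     /* up to 15 letters plus the terminating NUL */
} Node;

static Node *table[BUCKETS]; /* zero-initialised: every chain starts empty */

/* FNV-1a 32-bit string hash (an assumption; any decent hash would do). */
uint32_t hash_word(const char *s)
{
    uint32_t h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h & (BUCKETS - 1u);
}

/* Returns 1 if the word is in the table, 0 otherwise. */
int contains_word(const char *w)
{
    const Node *n;
    assert(w != NULL);
    for (n = table[hash_word(w)]; n != NULL; n = n->next) {
        if (strcmp(n->word, w) == 0)
            return 1;
    }
    return 0;
}

/* Adds a word. Returns 1 on success (or if it is already present),
   0 if the word is longer than 15 letters or memory runs out. */
int add_word(const char *w)
{
    Node *n;
    uint32_t b;
    assert(w != NULL);
    if (strlen(w) > 15)
        return 0;
    if (contains_word(w))
        return 1;
    n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    strcpy(n->word, w);
    b = hash_word(w);
    n->next = table[b]; /* push onto the front of the bucket's chain */
    table[b] = n;
    return 1;
}
```

The idea is to call `add_word` once for every word in the list, after which each check is a single `contains_word` call that walks one short chain instead of scanning the file.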
RE: Word-list creator - PhilOfPerth - 02-17-2025
(02-17-2025, 06:50 AM)Sanmayce Wrote: Did you make a benchmark measuring how many words per second you can check against your wordlist?
How rich is your wordlist?
You may find this post instrumental in making cross-lists:
https://forums.fedoraforum.org/showthread.php?334070-Extra-High-Quality-English-Wordlist-cooking-amp-refining&p=1890383#post1890383
Do you have the feeling you need to speed it up, bigtime?
Even an inferior hash chain in a simplistic C program works wonders; 1 million words per second should be the worst lookup rate, if you ask me - just making the Q stand for Quick again, hee-hee.
@Sanmayce No, no and no.
I was not interested in speed; it creates the four lists with all words of up to 15 characters from the Oxford dictionary (about 500000 words), and takes about 45 seconds to do this - long enough for me to grab a drink. It's just a tool that I use occasionally for my word-game preparations; no biggie. Thought someone might find it convenient.