Posts: 343
Threads: 24
Joined: Jul 2022
Reputation:
20
Hi DSMan195276
intersting your post:
the result of test that compares the different algorithms for storing and finding data using a value as index.
I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.
And so this way of storing data shows the weakness of string delimiter in this case!
The bottom line of output should be the Hashtable with buckets, the algorithm that you have shared here.
Please can you post also the performance of HashTable with buckets using the previous way to store data?
Better have two points of observation than one.
Posts: 714
Threads: 102
Joined: Apr 2022
Reputation:
14
(04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All) ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given. rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub
I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).
Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array.
I've been using Luke's forever, but preserving lbound might be a nice feature.
I tried testing both versions of split, but yours is blowing up on me.
If you get a minute, could you give this a try?
Code: (Select All) ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' /////////////////////////////////////////////////////////////////////////////
' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929
' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!
' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though
' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27
Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long
Dim arrpos As Long
Dim LD As Long
Dim dpos As Long
curpos = 1
arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then
ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
End If
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit
' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().
' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.
' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().
' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.
' Modified to handle multi-character delimiters
Sub LukeSplit (in$, delimiter$, result$())
Dim start As Integer
Dim finish As Integer
Dim iDelimLen As Integer
ReDim result$(-1)
iDelimLen = Len(delimiter$)
start = 1
Do
While Mid$(in$, start, iDelimLen) = delimiter$
start = start + iDelimLen
If start > Len(in$) Then
Exit Sub
End If
Wend
finish = InStr(start, in$, delimiter$)
If finish = 0 Then
finish = Len(in$) + 1
End If
ReDim _Preserve result$(0 To UBound(result$) + 1)
result$(UBound(result$)) = Mid$(in$, start, finish - start)
start = finish + 1
Loop While start <= Len(in$)
End Sub ' LukeSplit
Posts: 301
Threads: 10
Joined: Apr 2022
Reputation:
44
(04-11-2023, 05:36 PM)TempodiBasic Wrote: I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.
Kinda, it's more that Luke's hash table doesn't support multiple values with the same key. I should have included the code for my "fix":
Code: (Select All) SUB InsertLookup (key$, lngIndex&)
DIM hIndex~&
hIndex~& = GetHash~&(key$, UBOUND(m_arrLookup))
DO
IF m_arrLookup(hIndex~&) = 0 THEN EXIT DO
IF m_arrDict(m_arrLookup(hIndex~&)).key = key$ THEN EXIT DO ' I added this line
hIndex~& = (hIndex~& + 1) MOD (UBOUND(m_arrLookup) + 1)
LOOP
m_arrLookup(hIndex~&) = lngIndex&
END SUB ' InsertLookup
The change causes the insert to return an existing entry if the same key is already in the hash table. Otherwise, the loop in that function has to loop over all the existing identical keys before it finds the next free spot, which is very slow and gets slower the more keys you add.
My hash table is the same and also doesn't support multiple entries with the same key (lookup can only return a single entry for a given key), but it doesn't have the performance issue Luke's does because `HashTableAdd&` doesn't have to loop over all the existing entries in a bucket. It still doesn't work though, when you add an entry with the same key the old ones are still in the bucket but can't be accessed via the lookup function.
Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.
That said, I'd probably call this the user's problem A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.
Additionally, for a speed comparison, I can add a similar check in my `HashTableAdd&` that blocks adding entries with the same key. Technically in typical usage this actually makes it slower, because now the `Add` function has to loop over all the entries. For your particular test though it actually makes it faster, because it means avoiding some ReDim's on the `HashEntries()` array:
It's not particularly consistent on my machine though (you can see everything is a bit slower today).
Posts: 3,885
Threads: 174
Joined: Apr 2022
Reputation:
201
04-11-2023, 11:49 PM
(This post was last modified: 04-11-2023, 11:56 PM by bplus.)
(04-11-2023, 08:12 PM)madscijr Wrote: (04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All) ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given. rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub
I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).
Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array.
I've been using Luke's forever, but preserving lbound might be a nice feature.
I tried testing both versions of split, but yours is blowing up on me.
If you get a minute, could you give this a try?
Code: (Select All) ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' /////////////////////////////////////////////////////////////////////////////
' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929
' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!
' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though
' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27
Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long
Dim arrpos As Long
Dim LD As Long
Dim dpos As Long
curpos = 1
arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then
ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
End If
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit
' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().
' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.
' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().
' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.
' Modified to handle multi-character delimiters
Sub LukeSplit (in$, delimiter$, result$())
Dim start As Integer
Dim finish As Integer
Dim iDelimLen As Integer
ReDim result$(-1)
iDelimLen = Len(delimiter$)
start = 1
Do
While Mid$(in$, start, iDelimLen) = delimiter$
start = start + iDelimLen
If start > Len(in$) Then
Exit Sub
End If
Wend
finish = InStr(start, in$, delimiter$)
If finish = 0 Then
finish = Len(in$) + 1
End If
ReDim _Preserve result$(0 To UBound(result$) + 1)
result$(UBound(result$)) = Mid$(in$, start, finish - start)
start = finish + 1
Loop While start <= Len(in$)
End Sub ' LukeSplit
Wow how can ubound be -1 and lbound be 0 ???
Change
ReDim MyArray(-1) as String ' this messes up my assumptions about Ubound and LBound
to
ReDim MyArray(-1 to -1) as String
and it works fine.
I've always assumed the Ubound >= Lbound, silly me! ;-))
I even had this note above the Split routine:
Quote: ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
You make same blunder with (-1).
b = b + ...
Posts: 714
Threads: 102
Joined: Apr 2022
Reputation:
14
(04-11-2023, 11:49 PM)bplus Wrote: (04-11-2023, 08:12 PM)madscijr Wrote: (04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All) ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given. rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub
I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).
Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array.
I've been using Luke's forever, but preserving lbound might be a nice feature.
I tried testing both versions of split, but yours is blowing up on me.
If you get a minute, could you give this a try?
Code: (Select All) ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before LukeSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print
MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print
MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String
Print "Before BplusSplit:"
Print " ReDim MyArray(-1) As String:"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print " MyString =" + Chr$(34) + MyString + Chr$(34)
Print " MyDelim =" + Chr$(34) + MyDelim + Chr$(34)
Print
BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print " lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print " ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
Print " MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print
Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1
' /////////////////////////////////////////////////////////////////////////////
' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929
' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!
' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though
' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27
Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
Dim curpos As Long
Dim arrpos As Long
Dim LD As Long
Dim dpos As Long
curpos = 1
arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
LD = Len(delim)
dpos = InStr(curpos, SplitMeString, delim)
Do Until dpos = 0
loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
arrpos = arrpos + 1
If arrpos > UBound(loadMeArray) Then
ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
End If
curpos = dpos + LD
dpos = InStr(curpos, SplitMeString, delim)
Loop
loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit
' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().
' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.
' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().
' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.
' Modified to handle multi-character delimiters
Sub LukeSplit (in$, delimiter$, result$())
Dim start As Integer
Dim finish As Integer
Dim iDelimLen As Integer
ReDim result$(-1)
iDelimLen = Len(delimiter$)
start = 1
Do
While Mid$(in$, start, iDelimLen) = delimiter$
start = start + iDelimLen
If start > Len(in$) Then
Exit Sub
End If
Wend
finish = InStr(start, in$, delimiter$)
If finish = 0 Then
finish = Len(in$) + 1
End If
ReDim _Preserve result$(0 To UBound(result$) + 1)
result$(UBound(result$)) = Mid$(in$, start, finish - start)
start = finish + 1
Loop While start <= Len(in$)
End Sub ' LukeSplit
Wow how can ubound be -1 and lbound be 0 ???
Change
ReDim MyArray(-1) as String ' this messes up my assumptions about Ubound and LBound
to
ReDim MyArray(-1 to -1) as String
and it works fine.
I've always assumed the Ubound >= Lbound, silly me! ;-))
I even had this note above the Split routine:
Quote: ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
You make same blunder with (-1).
When I was first learning QB64 back in 2020, I read that the way to initialize
an array was to redim it to (-1) so that's where that came from.
Thanks for getting to the bottom of the problem!
Posts: 2,673
Threads: 325
Joined: Apr 2022
Reputation:
214
(04-11-2023, 09:17 PM)DSMan195276 Wrote: (04-11-2023, 05:36 PM)TempodiBasic Wrote: I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.
Kinda, it's more that Luke's hash table doesn't support multiple values with the same key. I should have included the code for my "fix":
Code: (Select All) SUB InsertLookup (key$, lngIndex&)
DIM hIndex~&
hIndex~& = GetHash~&(key$, UBOUND(m_arrLookup))
DO
IF m_arrLookup(hIndex~&) = 0 THEN EXIT DO
IF m_arrDict(m_arrLookup(hIndex~&)).key = key$ THEN EXIT DO ' I added this line
hIndex~& = (hIndex~& + 1) MOD (UBOUND(m_arrLookup) + 1)
LOOP
m_arrLookup(hIndex~&) = lngIndex&
END SUB ' InsertLookup
The change causes the insert to return an existing entry if the same key is already in the hash table. Otherwise, the loop in that function has to loop over all the existing identical keys before it finds the next free spot, which is very slow and gets slower the more keys you add.
My hash table is the same and also doesn't support multiple entries with the same key (lookup can only return a single entry for a given key), but it doesn't have the performance issue Luke's does because `HashTableAdd&` doesn't have to loop over all the existing entries in a bucket. It still doesn't work though, when you add an entry with the same key the old ones are still in the bucket but can't be accessed via the lookup function.
Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.
That said, I'd probably call this the user's problem A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.
Additionally, for a speed comparison, I can add a similar check in my `HashTableAdd&` that blocks adding entries with the same key. Technically in typical usage this actually makes it slower, because now the `Add` function has to loop over all the entries. For your particular test though it actually makes it faster, because it means avoiding some ReDim's on the `HashEntries()` array:
It's not particularly consistent on my machine though (you can see everything is a bit slower today).
Here's how I tend to do this type of little thing:
Code: (Select All) Dim Shared dict(1000003) As String
Print "One moment, loading dictionary..."
ImportDict "370099 Word List.txt" 'change to your favorite local dictionary
Screen _NewImage(800, 600, 32)
Do
Print "Enter a word to check for => ";
Input word$
If word$ = "" Then System
If DictLookup(word$) Then
Print word$; " is found in the loaded dictionary."
Else
Print word$; " is not found in the loaded dictionary. Sorry."
End If
Loop
Sub ImportDict (dict$)
f = FreeFile
Open dict$ For Binary As #f
Do Until EOF(1)
Line Input #1, word$
DictAdd word$
Loop
Close #f
End Sub
Function DictLookup (word$)
v = WordValue(word$)
If InStr(dict(v), Chr$(0) + word$ + Chr$(0)) Then DictLookup = -1
End Function
Sub DictAdd (word$)
v = WordValue(word$)
dict(v) = dict(v) + Chr$(0) + word$ + Chr$(0)
End Sub
Function WordValue (word$)
For i = 1 To Len(word$)
a = Asc(word$, i)
If i < 7 Then m = 10 ^ (7 - i)
a = a * m
v = v + a
Next
WordValue = v Mod 1000003
End Function
370099 Word List.7z (Size: 880.44 KB / Downloads: 45)
<< Get the word list here.
Now, with a dictionary of 370000 words, this generates a dictionary which contains a quick lookup table with a max of 15 entries per value, and in most cases has a 1 to 1 lookup value. (A binary search would have a max search of 19 passes to find our word, just for comparison...)
Quick. Simple. Efficient. What's not to love?
Posts: 3,885
Threads: 174
Joined: Apr 2022
Reputation:
201
04-14-2023, 05:18 PM
(This post was last modified: 04-14-2023, 05:48 PM by bplus.)
What's throwing me off is that the way I learned a Dictionary Structure it has 2 parts for each item, a Key and a Value for that Key.
1) Key is lookup Identifier like names of variables or colors or words
2) the associated value(s) would be qualities of the name, values for the variable, or colors in _RGB32 say for the color name or definitions for word keys.
So Steve's example is too simple. It is a regular dictionary function you would use for checking if words existed in a given dictionary. You aren't even getting the associated definition for a word. It's very nice for what it is though.
ref:
https://en.wikibooks.org/wiki/A-level_Co...ctionaries
b = b + ...
Posts: 301
Threads: 10
Joined: Apr 2022
Reputation:
44
Yeah we've been talking about the CS data structure "dictionary", which maps between a key and a value. Steve's data structure is a set, you can add entries to it and then ask it whether a particular entry is in the set (DictLookup).
Posts: 2,673
Threads: 325
Joined: Apr 2022
Reputation:
214
(04-14-2023, 05:18 PM)bplus Wrote: What's throwing me off is that the way I learned a Dictionary Structure it has 2 parts for each item, a Key and a Value for that Key.
1) Key is lookup Identifier like names of variables or colors or words
2) the associated value(s) would be qualities of the name, values for the variable, or colors in _RGB32 say for the color name or definitions for word keys.
So Steve's example is too simple. It is a regular dictionary function you would use for checking if words existed in a given dictionary. You aren't even getting the associated definition for a word. It's very nice for what it is though.
ref:
https://en.wikibooks.org/wiki/A-level_Co...ctionaries
Aye. I was just showing how I'd tend to link more items onto the same hash entry, so I wanted to keep it nice and simple to just highlight that type of usage. For the associative values, I'd tend to do the same thing, except I'd structure my data so that the associative data came after the dictionary data.
Foe example, let's say I was working with color values, such as you mention above. The DictValue would stay the same, but when I wrote my data, I'd write it as:
CHR$(0) + color$ + chr$(0) + color_value + chr$(1)
Thus when I did my lookup, I'd simply look for the CHR$(0) + color$ + CHR$(0) value as I'm doing now, to find my dictionary key, and then I'd grab the color value from that last CHR$(0) until the CHR$(1).
It's basically the combination of a linked list with a hash table, allowing for a quick search and lookup for the items inside them.
My philosophy tends to be: When showing a demo or an example of something, try to focus *only* on what you're wanting to demonstrate as much as possible and leave anything extra for later. What I was showing here was what Matt was talking about with:
Quote:Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.
That said, I'd probably call this the user's problem A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.
Posts: 476
Threads: 25
Joined: Nov 2022
Reputation:
45
QB64PE needs a native dictionary
*gets his coat*
|