Thread Rating:
  • 0 Vote(s) - 0 Average
  • 1
  • 2
  • 3
  • 4
  • 5
An hash array dictonary step by step
#21
Hi DSMan195276

intersting your post:
the result of test that compares the different algorithms for storing and finding data using a value as index.

I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.
And so this way of storing data shows the weakness of string delimiter in this case!
The bottom line of output should be the Hashtable with buckets, the algorithm that you have shared here.

Please can you post also the performance of HashTable with buckets  using the previous way to store data?
Better have two points of observation than one.
Reply
#22
(04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All)
' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given.  rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
    curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub

I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).

Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array. 
I've been using Luke's forever, but preserving lbound might be a nice feature. 
I tried testing both versions of split, but yours is blowing up on me. 
If you get a minute, could you give this a try? 

Code: (Select All)
ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' /////////////////////////////////////////////////////////////////////////////

' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929

' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!

' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though

' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.

' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27

Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long
    Dim arrpos As Long
    Dim LD As Long
    Dim dpos As Long
    curpos = 1
    arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
    LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then
            ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        End If
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit

' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().

' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.

' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().

' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.

' Modified to handle multi-character delimiters

Sub LukeSplit (in$, delimiter$, result$())
    Dim start As Integer
    Dim finish As Integer
    Dim iDelimLen As Integer
    ReDim result$(-1)

    iDelimLen = Len(delimiter$)

    start = 1
    Do
        While Mid$(in$, start, iDelimLen) = delimiter$
            start = start + iDelimLen
            If start > Len(in$) Then
                Exit Sub
            End If
        Wend
        finish = InStr(start, in$, delimiter$)
        If finish = 0 Then
            finish = Len(in$) + 1
        End If

        ReDim _Preserve result$(0 To UBound(result$) + 1)

        result$(UBound(result$)) = Mid$(in$, start, finish - start)
        start = finish + 1
    Loop While start <= Len(in$)
End Sub ' LukeSplit
Reply
#23
(04-11-2023, 05:36 PM)TempodiBasic Wrote: I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.

Kinda, it's more that Luke's hash table doesn't support multiple values with the same key. I should have included the code for my "fix":

Code: (Select All)
SUB InsertLookup (key$, lngIndex&)
    DIM hIndex~&

    hIndex~& = GetHash~&(key$, UBOUND(m_arrLookup))
    DO
        IF m_arrLookup(hIndex~&) = 0 THEN EXIT DO
        IF m_arrDict(m_arrLookup(hIndex~&)).key = key$ THEN EXIT DO ' I added this line
        hIndex~& = (hIndex~& + 1) MOD (UBOUND(m_arrLookup) + 1)
    LOOP
    m_arrLookup(hIndex~&) = lngIndex&
END SUB ' InsertLookup

The change causes the insert to return an existing entry if the same key is already in the hash table. Otherwise, the loop in that function has to loop over all the existing identical keys before it finds the next free spot, which is very slow and gets slower the more keys you add.

My hash table is the same and also doesn't support multiple entries with the same key (lookup can only return a single entry for a given key), but it doesn't have the performance issue Luke's does because `HashTableAdd&` doesn't have to loop over all the existing entries in a bucket. It still doesn't work though, when you add an entry with the same key the old ones are still in the bucket but can't be accessed via the lookup function.

Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.

That said, I'd probably call this the user's problem Big Grin A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.

Additionally, for a speed comparison, I can add a similar check in my `HashTableAdd&` that blocks adding entries with the same key. Technically in typical usage this actually makes it slower, because now the `Add` function has to loop over all the entries. For your particular test though it actually makes it faster, because it means avoiding some ReDim's on the `HashEntries()` array:

[Image: hashtable2.png]

It's not particularly consistent on my machine though (you can see everything is a bit slower today).
Reply
#24
(04-11-2023, 08:12 PM)madscijr Wrote:
(04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All)
' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given.  rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
    curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub

I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).

Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array. 
I've been using Luke's forever, but preserving lbound might be a nice feature. 
I tried testing both versions of split, but yours is blowing up on me. 
If you get a minute, could you give this a try? 

Code: (Select All)
ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' /////////////////////////////////////////////////////////////////////////////

' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929

' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!

' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though

' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.

' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27

Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long
    Dim arrpos As Long
    Dim LD As Long
    Dim dpos As Long
    curpos = 1
    arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
    LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then
            ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        End If
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit

' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().

' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.

' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().

' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.

' Modified to handle multi-character delimiters

Sub LukeSplit (in$, delimiter$, result$())
    Dim start As Integer
    Dim finish As Integer
    Dim iDelimLen As Integer
    ReDim result$(-1)

    iDelimLen = Len(delimiter$)

    start = 1
    Do
        While Mid$(in$, start, iDelimLen) = delimiter$
            start = start + iDelimLen
            If start > Len(in$) Then
                Exit Sub
            End If
        Wend
        finish = InStr(start, in$, delimiter$)
        If finish = 0 Then
            finish = Len(in$) + 1
        End If

        ReDim _Preserve result$(0 To UBound(result$) + 1)

        result$(UBound(result$)) = Mid$(in$, start, finish - start)
        start = finish + 1
    Loop While start <= Len(in$)
End Sub ' LukeSplit

Wow how can ubound be -1 and lbound be 0 ???

Change 
ReDim MyArray(-1) as String  ' this messes up my assumptions about Ubound and LBound
to
ReDim MyArray(-1 to -1) as String 
and it works fine.

I've always assumed the Ubound >= Lbound, silly me! ;-))

I even had this note above the Split routine:
Quote: ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
You make same blunder with (-1).
b = b + ...
Reply
#25
(04-11-2023, 11:49 PM)bplus Wrote:
(04-11-2023, 08:12 PM)madscijr Wrote:
(04-02-2023, 07:50 PM)bplus Wrote: OK you guys made me look at what I did with Dictionary, no hash but I did get multiple values for a given key:
Code: (Select All)
' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
'notes: REDIM the array(0) to be loaded before calling Split '<<<< IMPORTANT dynamic array and empty, can use any lbound though
'This SUB will take a given N delimited string, and delimiter$ and create an array of N+1 strings using the LBOUND of the given dynamic array to load.
'notes: the loadMeArray() needs to be dynamic string array and will not change the LBOUND of the array it is given.  rev 2019-08-27
Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
    curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
End Sub

I did use Split but for removing or modifying multiple items under a key. K stands for key and V stands for value(s).

Hey bplus, dictionary aside, I see you have an updated split function that preserves the lbound of the array. 
I've been using Luke's forever, but preserving lbound might be a nice feature. 
I tried testing both versions of split, but yours is blowing up on me. 
If you get a minute, could you give this a try? 

Code: (Select All)
ReDim MyArray(-1) As String
Dim MyString As String
Dim MyDelim As String
Dim iLoop As Integer

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #1-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before LukeSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

LukeSplit MyString, ",", MyArray()
Print "After LukeSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-A"
Print

MyString = "a,b,c,d,e,f"
MyDelim = ","
ReDim MyArray(-1) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' =============================================================================
Cls
Print "Compare array split functions, test #2-B"
Print

MyString = "g,h,i,j,k,l"
MyDelim = ","
ReDim MyArray(10 To 12) As String

Print "Before BplusSplit:"
Print "  ReDim MyArray(-1) As String:"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
Print "  MyString       =" + Chr$(34) + MyString + Chr$(34)
Print "  MyDelim        =" + Chr$(34) + MyDelim + Chr$(34)
Print

BplusSplit MyString, ",", MyArray()
Print "After BplusSplit MyString, " + Chr$(34) + "," + Chr$(34) + ", MyArray():"
Print "  lbound(MyArray)=" + _Trim$(Str$(LBound(MyArray)))
Print "  ubound(MyArray)=" + _Trim$(Str$(UBound(MyArray)))
For iLoop = LBound(MyArray) To UBound(MyArray)
    Print "  MyArray(" + _Trim$(Str$(iLoop)) + ")=" + Chr$(34) + MyArray(iLoop) + Chr$(34)
Next iLoop
Print

Print "Press any key to continue.": Sleep: _KeyClear: '_Delay 1

' /////////////////////////////////////////////////////////////////////////////

' FROM:
' An hash array dictonary step by step (reply #14)
' https://qb64phoenix.com/forum/showthread.php?tid=1547&pid=14929#pid14929

' note: I buggered this twice now,
' FOR base 1 array REDIM MyArray (1 to 1) AS ...
' the (1 to 1) is not same as (1) which was the Blunder!!!

' notes: REDIM the array(0) to be loaded before calling Split
' <<<< IMPORTANT dynamic array and empty, can use any lbound though

' This SUB will take a given N delimited string, and delimiter$
' and create an array of N+1 strings using the LBOUND of the given dynamic array to load.

' notes: the loadMeArray() needs to be dynamic string array
' and will not change the LBOUND of the array it is given.
' rev 2019-08-27

Sub BplusSplit (SplitMeString As String, delim As String, loadMeArray() As String)
    Dim curpos As Long
    Dim arrpos As Long
    Dim LD As Long
    Dim dpos As Long
    curpos = 1
    arrpos = LBound(loadMeArray) ' fix use the Lbound the array already has
    LD = Len(delim)
    dpos = InStr(curpos, SplitMeString, delim)
    Do Until dpos = 0
        loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
        arrpos = arrpos + 1
        If arrpos > UBound(loadMeArray) Then
            ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
        End If
        curpos = dpos + LD
        dpos = InStr(curpos, SplitMeString, delim)
    Loop
    loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
    ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String ' get the ubound correct
End Sub ' BplusSplit

' /////////////////////////////////////////////////////////////////////////////
' Splits a string in$ by delimeter delimiter$
' into an array result$().

' Split and join strings
' https://www.qb64.org/forum/index.php?topic=1073.0
'
' FROM luke, QB64 Developer
' Date: February 15, 2019, 04:11:07 AM
'
' Given a string of words separated by spaces (or any other character),
' splits it into an array of the words. I've no doubt many people have
' written a version of this over the years and no doubt there's a million
' ways to do it, but I thought I'd put mine here so we have at least one
' version. There's also a join function that does the opposite
' array -> single string.
'
' Code is hopefully reasonably self explanatory with comments and a little demo.
' Note, this is akin to Python/JavaScript split/join, PHP explode/implode.

' Split in$ into pieces, chopping at every occurrence of delimiter$.
' Multiple consecutive occurrences of delimiter$ are treated as a single instance.
' The chopped pieces are stored in result$().

' delimiter$ must be one character long.
' result$() must have been REDIMmed previously.

' Modified to handle multi-character delimiters

Sub LukeSplit (in$, delimiter$, result$())
    Dim start As Integer
    Dim finish As Integer
    Dim iDelimLen As Integer
    ReDim result$(-1)

    iDelimLen = Len(delimiter$)

    start = 1
    Do
        While Mid$(in$, start, iDelimLen) = delimiter$
            start = start + iDelimLen
            If start > Len(in$) Then
                Exit Sub
            End If
        Wend
        finish = InStr(start, in$, delimiter$)
        If finish = 0 Then
            finish = Len(in$) + 1
        End If

        ReDim _Preserve result$(0 To UBound(result$) + 1)

        result$(UBound(result$)) = Mid$(in$, start, finish - start)
        start = finish + 1
    Loop While start <= Len(in$)
End Sub ' LukeSplit

Wow how can ubound be -1 and lbound be 0 ???

Change 
ReDim MyArray(-1) as String  ' this messes up my assumptions about Ubound and LBound
to
ReDim MyArray(-1 to -1) as String 
and it works fine.

I've always assumed the Ubound >= Lbound, silly me! ;-))

I even had this note above the Split routine:
Quote: ' note: I buggered this twice now, FOR base 1 array REDIM MyArray (1 to 1) AS ... the (1 to 1) is not same as (1) which was the Blunder!!!
You make same blunder with (-1).

When I was first learning QB64 back in 2020, I read that the way to initialize
an array was to redim it to (-1) so that's where that came from. 
Thanks for getting to the bottom of the problem!
Reply
#26
(04-11-2023, 09:17 PM)DSMan195276 Wrote:
(04-11-2023, 05:36 PM)TempodiBasic Wrote: I must understand ,thinking about your post, that the very better result of Luke's algorithm is linked to a different way to store data.

Kinda, it's more that Luke's hash table doesn't support multiple values with the same key. I should have included the code for my "fix":

Code: (Select All)
SUB InsertLookup (key$, lngIndex&)
    DIM hIndex~&

    hIndex~& = GetHash~&(key$, UBOUND(m_arrLookup))
    DO
        IF m_arrLookup(hIndex~&) = 0 THEN EXIT DO
        IF m_arrDict(m_arrLookup(hIndex~&)).key = key$ THEN EXIT DO ' I added this line
        hIndex~& = (hIndex~& + 1) MOD (UBOUND(m_arrLookup) + 1)
    LOOP
    m_arrLookup(hIndex~&) = lngIndex&
END SUB ' InsertLookup

The change causes the insert to return an existing entry if the same key is already in the hash table. Otherwise, the loop in that function has to loop over all the existing identical keys before it finds the next free spot, which is very slow and gets slower the more keys you add.

My hash table is the same and also doesn't support multiple entries with the same key (lookup can only return a single entry for a given key), but it doesn't have the performance issue Luke's does because `HashTableAdd&` doesn't have to loop over all the existing entries in a bucket. It still doesn't work though, when you add an entry with the same key the old ones are still in the bucket but can't be accessed via the lookup function.

Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.

That said, I'd probably call this the user's problem Big Grin A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.

Additionally, for a speed comparison, I can add a similar check in my `HashTableAdd&` that blocks adding entries with the same key. Technically in typical usage this actually makes it slower, because now the `Add` function has to loop over all the entries. For your particular test though it actually makes it faster, because it means avoiding some ReDim's on the `HashEntries()` array:

[Image: hashtable2.png]

It's not particularly consistent on my machine though (you can see everything is a bit slower today).

Here's how I tend to do this type of little thing:

Code: (Select All)
Dim Shared dict(1000003) As String
Print "One moment, loading dictionary..."
ImportDict "370099 Word List.txt" 'change to your favorite local dictionary

Screen _NewImage(800, 600, 32)
Do
    Print "Enter a word to check for => ";
    Input word$
    If word$ = "" Then System
    If DictLookup(word$) Then
        Print word$; " is found in the loaded dictionary."
    Else
        Print word$; " is not found in the loaded dictionary. Sorry."
    End If
Loop


Sub ImportDict (dict$)
    f = FreeFile
    Open dict$ For Binary As #f
    Do Until EOF(1)
        Line Input #1, word$
        DictAdd word$
    Loop
    Close #f
End Sub


Function DictLookup (word$)
    v = WordValue(word$)
    If InStr(dict(v), Chr$(0) + word$ + Chr$(0)) Then DictLookup = -1
End Function

Sub DictAdd (word$)
    v = WordValue(word$)
    dict(v) = dict(v) + Chr$(0) + word$ + Chr$(0)
End Sub

Function WordValue (word$)
    For i = 1 To Len(word$)
        a = Asc(word$, i)
        If i < 7 Then m = 10 ^ (7 - i)
        a = a * m
        v = v + a
    Next
    WordValue = v Mod 1000003
End Function


.7z   370099 Word List.7z (Size: 880.44 KB / Downloads: 44)   << Get the word list here.  


Now, with a dictionary of 370000 words, this generates a dictionary which contains a quick lookup table with a max of 15 entries per value, and in most cases has a 1 to 1 lookup value.  (A binary search would have a max search of 19 passes to find our word, just for comparison...)

Quick.  Simple.  Efficient.  What's not to love?
Reply
#27
What's throwing me off is that the way I learned a Dictionary Structure it has 2 parts for each item, a Key and a Value for that Key.
1) Key is lookup Identifier like names of variables or colors or words 
2) the associated value(s) would be qualities of the name, values for the variable, or colors in _RGB32 say for the color name or definitions for word keys. 

So Steve's example is too simple. It is a regular dictionary function you would use for checking if words existed in a given dictionary. You aren't even getting the associated definition for a word. It's very nice for what it is though. Smile

ref:
https://en.wikibooks.org/wiki/A-level_Co...ctionaries
b = b + ...
Reply
#28
Yeah we've been talking about the CS data structure "dictionary", which maps between a key and a value. Steve's data structure is a set, you can add entries to it and then ask it whether a particular entry is in the set (DictLookup).
Reply
#29
(04-14-2023, 05:18 PM)bplus Wrote: What's throwing me off is that the way I learned a Dictionary Structure it has 2 parts for each item,  a Key and a Value for that Key.
1) Key is lookup Identifier like names of variables or colors or words 
2) the associated value(s) would be qualities of the name, values for the variable, or colors in _RGB32 say for the color name or definitions for word keys. 

So Steve's example is too simple. It is a regular dictionary function you would use for checking if words existed in a given dictionary. You aren't even getting the associated definition for a word. It's very nice for what it is though. Smile

ref:
https://en.wikibooks.org/wiki/A-level_Co...ctionaries

Aye.  I was just showing how I'd tend to link more items onto the same hash entry, so I wanted to keep it nice and simple to just highlight that type of usage.  For the associative values, I'd tend to do the same thing, except I'd structure my data so that the associative data came after the dictionary data.

Foe example, let's say I was working with color values, such as you mention above.  The DictValue would stay the same, but when I wrote my data, I'd write it as:

CHR$(0) + color$ + chr$(0) + color_value + chr$(1)

Thus when I did my lookup, I'd simply look for the CHR$(0) + color$ + CHR$(0) value as I'm doing now, to find my dictionary key, and then I'd grab the color value from that last CHR$(0) until the CHR$(1).

It's basically the combination of a linked list with a hash table, allowing for a quick search and lookup for the items inside them.  

My philosophy tends to be:  When showing a demo or an example of something, try to focus *only* on what you're wanting to demonstrate as much as possible and leave anything extra for later.  Wink  What I was showing here was what Matt was talking about with:

Quote:Now if you _want_ to be able to add multiple entries with the same key my bucket approach is better but neither are really suited for it. What you'd probably want is to create a linked list for each set of entries with the same key. For my approach, that means adding another link next to `nxt` that points to entries with the same key, making each bucket a bit of a "list of lists". You could also do this with Luke's approach but I think it would require a separate array to store the links for the list of entries with the same key. It's definitely doable though.

That said, I'd probably call this the user's problem [Image: biggrin.png] A hash table doesn't really need to support this natively, you can always modify the `Value` Type to hold a list and get the same thing.
Reply
#30
QB64PE needs a native dictionary Smile

*gets his coat*
grymmjack (gj!)
GitHubYouTube | Soundcloud | 16colo.rs
Reply




Users browsing this thread: 28 Guest(s)