Alphabetical sort of characters within a string.
#31
Oh crap! I messed up my tests. I need a separate counter for the array index, because I wanted to run the test 25 times, opening and closing the file each time.

Screwed-up day.
b = b + ...
#32
peteCode still edging out anaCode even when run first:
Code: (Select All)
_Title "anaCode$ versus peteCode$ with dictionary" ' b+ 2024-04-25 speed test
Dim a$(25000), p$(25000)

' check i made steve 2nd optimization correctly
'test$(0) = "grmaana"
'test$(1) = "angiogram"
'test$(2) = "naagrma"
'test$(3) = "telgram"
'test$(4) = "gramana"
'test$(5) = "gram"
'test$(6) = "nag"
'test$(7) = "tag"
'test$(8) = "am"
'test$(9) = "grip"

'For i = 0 To 9
'    Print test$(i), AnaCode$(UCase$(test$(i))), peteCode$(UCase$(test$(i)))
'Next
' ok looks right


Dim i As Integer, j As Integer

start = Timer(.01)
For i = 1 To 25
    j = 0
    Open "WORDS.txt" For Input As #1
    While Not EOF(1)
        Input #1, w$
        j = j + 1
        p$(j) = peteCode$(w$)
    Wend
    Close
Next
petetime = Timer(.01) - start

start = Timer(.01)
For i = 1 To 25
    j = 0
    Open "WORDS.txt" For Input As #1
    While Not EOF(1)
        Input #1, w$
        j = j + 1
        a$(j) = AnaCode$(w$)
    Wend
    Close
Next
anatime = Timer(.01) - start


Print "PeteTime ="; petetime, "AnaTime ="; anatime
For i = 1 To 25000
    If a$(i) <> p$(i) Then Print a$(i), p$(i)
Next
Print "END OF DIFF CHECK"

Print: Print " Check tail end of arrays:"
For i = j - 15 To j
    Print a$(i), p$(i)
Next

' return sorted anagram code string for any word, call ucase$(wrd$) for all caps
Function AnaCode$ (wrd$) ' anaCode$ converts word to an Anagram pattern
    ' wrd$ is assumed to be in all capitals!!!
    ' L(65) counts the A's, L(66) the B's, L(67) the C's, and so on
    Dim As Integer L(65 To 90), i, p
    Dim rtn$
    For i = 1 To Len(wrd$)
        p = Asc(wrd$, i) ' A=65, B=66...
        L(p) = L(p) + 1
    Next
    For i = 65 To 90
        rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea
    Next
    AnaCode$ = rtn$
End Function

Function peteCode$ (a$) 'converts word to an Anagram pattern
    ' a$ is assumed to be all caps call ucase$(a$) if not
    Dim i%, seed%, rtn$
    For i% = 65 To 90
        seed% = InStr(a$, Chr$(i%))
        While seed%
            rtn$ = rtn$ + Chr$(i%)
            seed% = InStr(seed% + 1, a$, Chr$(i%))
        Wend
    Next
    peteCode$ = rtn$
End Function

   

Check this out carefully, I've been making mistakes left and right today!
b = b + ...
#33
One thing for certain: don't do the string work on blank strings.

For i = 65 To 90
IF L(i) > 0 THEN
rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea
END IF
Next

And drop the CHR$:
rtn$ = rtn$ + String$(L(i), i)
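
For anyone who has not seen both forms of String$ side by side, here is a minimal sketch. It assumes QB64's documented behavior that the second argument may be either a string (only its first character is used) or an ASCII code. The variable names byChar$ and byCode$ are made up for illustration.

```basic
' Both calls build the same three-character string.
byChar$ = String$(3, Chr$(65)) ' second argument given as a string
byCode$ = String$(3, 65) ' second argument given as an ASCII code
Print byChar$, byCode$
If byChar$ = byCode$ Then Print "same result"
```

Passing the code directly skips one Chr$ call per letter, which is presumably where the saving comes from.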
#34
(04-25-2024, 08:03 PM)SMcNeill Wrote: One thing for certain: don't do the string work on blank strings.

For i = 65 To 90
IF L(i) > 0 THEN
rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea
END IF
Next

And drop the CHR$:
rtn$ = rtn$ + String$(L(i), i)

+1 OK @SMcNeill, that made a significant difference! I did not know String$ could take either a Chr$ string or an ASC code.
Code: (Select All)
_Title "anaCode$ versus peteCode$ with dictionary" ' b+ 2024-04-25 speed test

' anaCode$ much improved now

Dim a$(25000), p$(25000)

' check i made steve 2nd optimization correctly
'test$(0) = "grmaana"
'test$(1) = "angiogram"
'test$(2) = "naagrma"
'test$(3) = "telgram"
'test$(4) = "gramana"
'test$(5) = "gram"
'test$(6) = "nag"
'test$(7) = "tag"
'test$(8) = "am"
'test$(9) = "grip"

'For i = 0 To 9
'    Print test$(i), AnaCode$(UCase$(test$(i))), peteCode$(UCase$(test$(i)))
'Next
' ok looks right


Dim i As Integer, j As Integer

start = Timer(.01)
For i = 1 To 25
    j = 0
    Open "WORDS.txt" For Input As #1
    While Not EOF(1)
        Input #1, w$
        j = j + 1
        a$(j) = AnaCode$(w$)
    Wend
    Close
Next
anatime = Timer(.01) - start

start = Timer(.01)
For i = 1 To 25
    j = 0
    Open "WORDS.txt" For Input As #1
    While Not EOF(1)
        Input #1, w$
        j = j + 1
        p$(j) = peteCode$(w$)
    Wend
    Close
Next
petetime = Timer(.01) - start

Print "AnaTime ="; anatime, "PeteTime ="; petetime

For i = 1 To 25000 ' checking for differences between array values
    If a$(i) <> p$(i) Then Print a$(i), p$(i)
Next
Print "END OF DIFF CHECK"

Print: Print " Check tail end of arrays:"
For i = j - 15 To j
    Print a$(i), p$(i)
Next

' return sorted anagram code string for any word, call ucase$(wrd$) for all caps
Function AnaCode$ (wrd$) ' anaCode$ converts word to an Anagram pattern
    ' wrd$ is assumed to be in all capitals!!!
    ' L(65) counts the A's, L(66) the B's, L(67) the C's, and so on
    Dim As Integer L(65 To 90), i, p
    Dim rtn$
    For i = 1 To Len(wrd$)
        p = Asc(wrd$, i) ' A=65, B=66...
        L(p) = L(p) + 1
    Next
    For i = 65 To 90
        If L(i) Then rtn$ = rtn$ + String$(L(i), i) ' thanks steve for whole line here!!!
    Next
    AnaCode$ = rtn$
End Function

Function peteCode$ (a$) 'converts word to an Anagram pattern
    ' a$ is assumed to be all caps call ucase$(a$) if not
    Dim i%, seed%, rtn$
    For i% = 65 To 90
        seed% = InStr(a$, Chr$(i%))
        While seed%
            rtn$ = rtn$ + Chr$(i%)
            seed% = InStr(seed% + 1, a$, Chr$(i%))
        Wend
    Next
    peteCode$ = rtn$
End Function

   
b = b + ...
#35
@bplus As for why I swapped over to using mid$ as I did, take a look at my newest post on speed optimizations here: https://qb64phoenix.com/forum/showthread...1#pid24751

In this case, it may not make such a large difference, as the strings are all just single words in size. I'm just trying to incorporate the new method into my standard coding habits as much as possible, as it certainly makes a sizable difference in times for larger strings. :)
#36
@bplus

I think it was some five or six years ago that I started working on a C/C++ keyboard input routine. I finished it, and from a few of the articles I read, I discovered that it is supposed to be faster to manipulate an existing string than to add to a string. For the functions you are using, I modified both INSTR() approaches to work in that manner. The first does unique letters; the second displays multiple instances of any duplicated letters, which I think was the one you were testing with. It may be faster, but to achieve the same results, another counting variable had to be added and the initial string that mid$() is applied to had to be created. So whether there is any actual speed improvement is uncertain.

If you'd like to test it, just replace whichever one is appropriate in your timed system and see how it performs...

Code: (Select All)
a$ = "uncopyrightable"
Print ">"; petecode1$(a$); "<"
Function petecode1$ (a$)
    rtn$ = Space$(26)
    For i% = 97 To 122
        seed% = InStr(a$, Chr$(i%))
        If seed% Then
            j% = j% + 1
            Mid$(rtn$, j%) = Chr$(i%)
        End If
    Next
    petecode1$ = RTrim$(rtn$)
End Function

Code: (Select All)
a$ = "uncopyrightaabbble"
Print ">"; petecode2$(a$); "<"
Function petecode2$ (a$)
    rtn$ = Space$(Len(a$))
    For i% = 97 To 122
        seed% = InStr(a$, Chr$(i%))
        While seed%
            j% = j% + 1
            Mid$(rtn$, j%) = Chr$(i%)
            seed% = InStr(seed% + 1, a$, Chr$(i%))
        Wend
    Next
    petecode2$ = RTrim$(rtn$)
End Function

Pete
#37
Pete wrote:

a$ = "uncopyrightable"
For i = 1 To 26
    If InStr(LCase$(a$), Chr$(96 + i)) Then Print Chr$(96 + i);
Next


Ok, very short.
But, if the word has the same letter twice (or more), some of them are lost if the final result must have the same length.
Why not yes ?
#38
(04-26-2024, 07:23 AM)euklides Wrote: Pete wrote:

a$ = "uncopyrightable"
For i = 1 To 26
    If InStr(LCase$(a$), Chr$(96 + i)) Then Print Chr$(96 + i);
Next


Ok, very short.
But, if the word has the same letter twice (or more), some of them are lost if the final result must have the same length.
Disregarding the lost letters mentioned above, what if we adjust the for/next values so we can eliminate two additions that repeat every loop? Would that speed things up a little, or are the calculation results cached somehow so that recalcs are not needed?

a$ = "uncopyrightable"
For i = 97 To 122
    If InStr(LCase$(a$), Chr$(i)) Then Print Chr$(i);
Next
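
A related sketch along the same lines, offered as an assumption rather than a measured result: LCase$(a$) and Chr$(i) are also recomputed on every pass, so they can be hoisted or computed once per iteration. The names low$ and c$ are hypothetical, and whether any of this is measurably faster would need a timing run.

```basic
a$ = "uncopyrightable"
low$ = LCase$(a$) ' convert the word once instead of 26 times
For i = 97 To 122
    c$ = Chr$(i) ' build the letter once per pass and reuse it
    If InStr(low$, c$) Then Print c$;
Next
```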
#39
(04-26-2024, 12:52 AM)SMcNeill Wrote: @bplus As for why I swapped over to using mid$ as I did, take a look at my newest post on speed optimizations here: https://qb64phoenix.com/forum/showthread...1#pid24751

In this case, it may not make such a large difference, as the strings are all just single words in size. I'm just trying to incorporate the new method into my standard coding habits as much as possible, as it certainly makes a sizable difference in times for larger strings. :)

OK, ha, I tried a version with mid$ already. Too bad it was cluttered with a bunch of other string functions at the time.

Another damn speed test, but judging by your latest post and the OP tests, it should be worth the trouble.
b = b + ...
#40
(04-26-2024, 01:38 AM)Pete Wrote: @bplus


OK, looks like I will be comparing AnaCode with the mid$ mod to peteCode with the mid$ mod.
Sounds interesting: how much improvement for a 25,000-word file (almost), done 25 times, with mid$?

Good experiment alright!
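
A minimal sketch of what the mid$ mod of AnaCode$ might look like, assuming all-caps A-Z input as before. The name AnaCodeMid$ is made up for this sketch, and it has not been timed against the dictionary file. The idea is to size the result string once with Space$ and then write each run of letters into place instead of concatenating.

```basic
Print ">"; AnaCodeMid$("GRAMANA"); "<" ' prints >AAAGMNR<

Function AnaCodeMid$ (wrd$) ' hypothetical name; wrd$ assumed all caps A-Z
    Dim As Integer L(65 To 90), i, p
    Dim rtn$
    For i = 1 To Len(wrd$)
        p = Asc(wrd$, i) ' A=65, B=66...
        L(p) = L(p) + 1 ' count each letter
    Next
    rtn$ = Space$(Len(wrd$)) ' result has the same length as the word
    p = 1 ' next write position in rtn$
    For i = 65 To 90
        If L(i) Then
            Mid$(rtn$, p, L(i)) = String$(L(i), i) ' overwrite in place
            p = p + L(i)
        End If
    Next
    AnaCodeMid$ = rtn$
End Function
```

Because every position of rtn$ is overwritten, no RTrim$ is needed at the end. String$ still builds a small temporary string for each letter, so the gain over plain concatenation on single words may be small.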
b = b + ...