bplus · 04-25-2024, 06:52 PM

Oh crap! I messed up my tests: I need a separate counter for the array index, because I wanted to run the test 25 times, opening and closing the file each time.

screwed up day

bplus · (This post was last modified: 04-25-2024, 07:44 PM by bplus.)

peteCode still edging out anaCode even when run first:

Code: (Select All)
_Title "anaCode$ versus peteCode$ with dictionary" ' b+ 2024-04-25 speed test

Dim a$(25000), p$(25000)

' check I made Steve's 2nd optimization correctly

'test$(0) = "grmaana"

'test$(1) = "angiogram"

'test$(2) = "naagrma"

'test$(3) = "telgram"

'test$(4) = "gramana"

'test$(5) = "gram"

'test$(6) = "nag"

'test$(7) = "tag"

'test$(8) = "am"

'test$(9) = "grip"

'For i = 0 To 9

'    Print test$(i), AnaCode$(UCase$(test$(i))), peteCode$(UCase$(test$(i)))

'Next

' ok looks right

Dim i As Integer, j As Integer

start = Timer(.01)

For i = 1 To 25

    j = 0

    Open "WORDS.txt" For Input As #1

    While Not EOF(1)

        Input #1, w$

        j = j + 1

        p$(j) = peteCode$(w$)

    Wend

    Close

Next

petetime = Timer(.01) - start

start = Timer(.01)

For i = 1 To 25

    j = 0

    Open "WORDS.txt" For Input As #1

    While Not EOF(1)

        Input #1, w$

        j = j + 1

        a$(j) = AnaCode$(w$)

    Wend

    Close

Next

anatime = Timer(.01) - start

Print "PeteTime ="; petetime, "AnaTime ="; anatime

For i = 1 To 25000

    If a$(i) <> p$(i) Then Print a$(i), p$(i)

Next

Print "END OF DIFF CHECK"

Print: Print " Check tail end of arrays:"

For i = j - 15 To j

    Print a$(i), p$(i)

Next

' return sorted anagram code string for any word, call ucase$(wrd$) for all caps

Function AnaCode$ (wrd$) ' anaCode$ converts word to an Anagram pattern

    ' wrd$ is assumed to be in all capitals!!!

    ' number of A's in first, number of B's in 2nd, number of C's in third

    Dim As Integer L(65 To 90), i, p

    Dim rtn$

    For i = 1 To Len(wrd$)

        p = Asc(wrd$, i) ' ASCII code of the letter: A=65, B=66...

        L(p) = L(p) + 1

    Next

    For i = 65 To 90

        rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea

    Next

    AnaCode$ = rtn$

End Function

Function peteCode$ (a$) 'converts word to an Anagram pattern

    ' a$ is assumed to be all caps call ucase$(a$) if not

    Dim i%, seed%, rtn$

    For i% = 65 To 90

        seed% = InStr(a$, Chr$(i%))

        While seed%

            rtn$ = rtn$ + Chr$(i%)

            seed% = InStr(seed% + 1, a$, Chr$(i%))

        Wend

    Next

    peteCode$ = rtn$

End Function

Filename: petecode still edging out anacode.PNG Size: 48.61 KB 04-25-2024, 07:41 PM

Check this out carefully; I've been making mistakes left and right today!

**SMcNeill** · 04-25-2024, 08:03 PM

One thing for certain: don't do the string work on blank strings.

For i = 65 To 90
IF L(i) > 0 THEN
rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea
END IF
Next

And drop the CHR$:
rtn$ = rtn$ + String$(L(i), i)
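
A quick way to confirm the second tip is the small check below. It is only a sketch: the variable `n` is made up for illustration, and it shows nothing beyond the fact that String$ accepts either a one-character string or a numeric ASCII code as its second argument.

```basic
' String$ takes a count and either a character or an ASCII code.
Dim n As Integer
n = 3

Print String$(n, Chr$(65)) ' character form, as in the original loop
Print String$(n, 65) ' ASCII-code form, no Chr$ call needed

' Both forms build the same string.
If String$(n, Chr$(65)) = String$(n, 65) Then Print "same result"

' A zero count gives an empty string, which is the work the
' IF L(i) > 0 test skips.
Print "["; String$(0, 65); "]"
```

Passing the code directly saves one Chr$ call per letter, and the IF test saves up to 26 empty concatenations per word.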

bplus · (This post was last modified: 04-25-2024, 09:50 PM by bplus.)

(04-25-2024, 08:03 PM)SMcNeill Wrote: One thing for certain: don't do the string work on blank strings.

For i = 65 To 90
IF L(i) > 0 THEN
rtn$ = rtn$ + String$(L(i), Chr$(i)) ' thanks steve for string$ idea
END IF
Next

And drop the CHR$:
rtn$ = rtn$ + String$(L(i), i)

+1 OK @SMcNeill, that made a significant difference! I did not know String$ could take either a character (Chr$) or an ASCII code.

Code: (Select All)
_Title "anaCode$ versus peteCode$ with dictionary" ' b+ 2024-04-25 speed test

' anaCode$ much improved now

Dim a$(25000), p$(25000)

' check I made Steve's 2nd optimization correctly

'test$(0) = "grmaana"

'test$(1) = "angiogram"

'test$(2) = "naagrma"

'test$(3) = "telgram"

'test$(4) = "gramana"

'test$(5) = "gram"

'test$(6) = "nag"

'test$(7) = "tag"

'test$(8) = "am"

'test$(9) = "grip"

'For i = 0 To 9

'    Print test$(i), AnaCode$(UCase$(test$(i))), peteCode$(UCase$(test$(i)))

'Next

' ok looks right

Dim i As Integer, j As Integer

start = Timer(.01)

For i = 1 To 25

    j = 0

    Open "WORDS.txt" For Input As #1

    While Not EOF(1)

        Input #1, w$

        j = j + 1

        a$(j) = AnaCode$(w$)

    Wend

    Close

Next

anatime = Timer(.01) - start

start = Timer(.01)

For i = 1 To 25

    j = 0

    Open "WORDS.txt" For Input As #1

    While Not EOF(1)

        Input #1, w$

        j = j + 1

        p$(j) = peteCode$(w$)

    Wend

    Close

Next

petetime = Timer(.01) - start

Print "AnaTime ="; anatime, "PeteTime ="; petetime

For i = 1 To 25000 ' checking for differences between array values

    If a$(i) <> p$(i) Then Print a$(i), p$(i)

Next

Print "END OF DIFF CHECK"

Print: Print " Check tail end of arrays:"

For i = j - 15 To j

    Print a$(i), p$(i)

Next

' return sorted anagram code string for any word, call ucase$(wrd$) for all caps

Function AnaCode$ (wrd$) ' anaCode$ converts word to an Anagram pattern

    ' wrd$ is assumed to be in all capitals!!!

    ' number of A's in first, number of B's in 2nd, number of C's in third

    Dim As Integer L(65 To 90), i, p

    Dim rtn$

    For i = 1 To Len(wrd$)

        p = Asc(wrd$, i) ' ASCII code of the letter: A=65, B=66...

        L(p) = L(p) + 1

    Next

    For i = 65 To 90

        If L(i) Then rtn$ = rtn$ + String$(L(i), i) ' thanks steve for whole line here!!!

    Next

    AnaCode$ = rtn$

End Function

Function peteCode$ (a$) 'converts word to an Anagram pattern

    ' a$ is assumed to be all caps call ucase$(a$) if not

    Dim i%, seed%, rtn$

    For i% = 65 To 90

        seed% = InStr(a$, Chr$(i%))

        While seed%

            rtn$ = rtn$ + Chr$(i%)

            seed% = InStr(seed% + 1, a$, Chr$(i%))

        Wend

    Next

    peteCode$ = rtn$

End Function

Filename: speed test with revised anacode$.PNG Size: 11.19 KB 04-25-2024, 09:48 PM

**SMcNeill** · 04-26-2024, 12:52 AM

@bplus As for why I swapped over to using mid$ as I did, take a look at my newest post on speed optimizations here: https://qb64phoenix.com/forum/showthread...1#pid24751

In this case, it may not make such a large difference, as the strings are all just single words in size. I'm just trying to incorporate the new method into my standard coding habits as much as possible, as it certainly makes a sizable difference in times for larger strings.
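
For readers who have not followed the link, here is a rough sketch of the general idea as described in this thread: growing a string by concatenation versus writing into a pre-sized string with the Mid$ statement. It is not taken from the linked post; the length `N` and the variable names are assumptions chosen for illustration, and the timings will vary by machine.

```basic
' Build the same long string two ways and time each one.
Const N = 200000 ' assumed size, large enough for the difference to show
Dim i As Long, t As Double
Dim s1 As String, s2 As String

t = Timer(.001)
For i = 1 To N
    s1 = s1 + Chr$(65 + i Mod 26) ' grow the string one character at a time
Next
Print "concatenation:"; Timer(.001) - t

t = Timer(.001)
s2 = Space$(N) ' allocate the full length once
For i = 1 To N
    Mid$(s2, i, 1) = Chr$(65 + i Mod 26) ' overwrite one character in place
Next
Print "Mid$ fill:    "; Timer(.001) - t

If s1 = s2 Then Print "results match"
```

With single words of a dozen letters the gap is likely to be much smaller, which is the caveat made above.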

***Pete*** · (This post was last modified: 04-26-2024, 01:41 AM by Pete.)

@bplus

I think it was some five or six years ago that I started working on a C/C++ keyboard input routine. I finished it, and from a few of the articles I read I discovered that it is supposed to be faster to manipulate an existing string than to add to a string. So, for the functions you are using, I modified both INSTR() approaches to work in that manner. The first does unique letters; the second displays multiple instances of any duplicated letters, which I think was the one you were testing with. It may be faster, but to achieve the same results another counting variable had to be added and the initial string that Mid$() writes into had to be created, so whether there is any actual speed improvement is uncertain.

If you'd like to test it, just replace whichever one is appropriate in your timed system and see how it performs...

Code: (Select All)

a$ = "uncopyrightable"

Print ">"; petecode1$(a$); "<"

Function petecode1$ (a$)

    rtn$ = Space$(26)

    For i% = 97 To 122

        seed% = InStr(a$, Chr$(i%))

        If seed% Then

            j% = j% + 1

            Mid$(rtn$, j%) = Chr$(i%)

        End If

    Next

    petecode1$ = RTrim$(rtn$)

End Function

Code: (Select All)

a$ = "uncopyrightaabbble"

Print ">"; petecode2$(a$); "<"

Function petecode2$ (a$)

    rtn$ = Space$(Len(a$))

    For i% = 97 To 122

        seed% = InStr(a$, Chr$(i%))

        While seed%

            j% = j% + 1

            Mid$(rtn$, j%) = Chr$(i%)

            seed% = InStr(seed% + 1, a$, Chr$(i%))

        Wend

    Next

    petecode2$ = RTrim$(rtn$)

End Function

Pete

euklides · 04-26-2024, 07:23 AM

Pete wrote:

a$ = "uncopyrightable"
For i = 1 To 26
If InStr(LCase$(a$), Chr$(96 + i)) Then Print Chr$(96 + i);
Next

Ok, very short.
But if the word contains the same letter twice (or more), the repeats are lost, which matters if the final result must have the same length as the word.
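
The difference is easy to see with a word that repeats letters. The sketch below uses "banana" as an assumed test word; the first loop is the presence-only test from the snippet above, the second walks every occurrence with InStr the way peteCode$ does.

```basic
Dim i As Integer, p As Integer
a$ = "banana"

' Presence test only: each letter is printed at most once.
For i = 97 To 122
    If InStr(a$, Chr$(i)) Then Print Chr$(i);
Next
Print ' prints abn (3 letters)

' Walk every occurrence: repeated letters are kept.
For i = 97 To 122
    p = InStr(a$, Chr$(i))
    While p
        Print Chr$(i);
        p = InStr(p + 1, a$, Chr$(i)) ' continue searching after the last hit
    Wend
Next
Print ' prints aaabnn (6 letters, same length as the word)
```

Only the second form is usable as an anagram key, since "banana" and "ban" would otherwise produce the same code.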

Circlotron · (This post was last modified: 04-26-2024, 01:09 PM by Circlotron.)

(04-26-2024, 07:23 AM)euklides Wrote: Pete wrote:

a$ = "uncopyrightable"
For i = 1 To 26
If InStr(LCase$(a$), Chr$(96 + i)) Then Print Chr$(96 + i);
Next

Ok, very short.
But if the word contains the same letter twice (or more), the repeats are lost, which matters if the final result must have the same length as the word.

Disregarding the lost letters mentioned above, what if we adjust the for/next values so we can eliminate two additions that repeat every loop? Would that speed things up a little, or are the calculation results cached somehow so that recalcs are not needed?

a$ = "uncopyrightable"
For i = 97 To 122
If InStr(LCase$(a$), Chr$(i)) Then Print Chr$(i);
Next
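
Taking the same idea one step further, LCase$(a$) and Chr$(i) also appear inside the loop as written, so they can be moved out or computed once per pass too. This is only a sketch: `low$` and `c$` are names invented here, and whether any of it is measurable for a 26-pass loop is something only a timing run would show.

```basic
Dim i As Integer
a$ = "Uncopyrightable"

low$ = LCase$(a$) ' convert once, before the loop
For i = 97 To 122
    c$ = Chr$(i) ' build the letter once per pass instead of twice
    If InStr(low$, c$) Then Print c$;
Next
Print ' prints abceghilnoprtuy
```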

bplus · 04-26-2024, 01:42 PM

(04-26-2024, 12:52 AM)SMcNeill Wrote: @bplus As for why I swapped over to using mid$ as I did, take a look at my newest post on speed optimizations here: https://qb64phoenix.com/forum/showthread...1#pid24751

In this case, it may not make such a large difference, as the strings are all just single words in size. I'm just trying to incorporate the new method into my standard coding habits as much as possible, as it certainly makes a sizable difference in times for larger strings.

OK, ha! I tried a version with Mid$ already; too bad it was cluttered with a bunch of other string functions at the time.

Another damn speed test, but judging by your latest post and the tests in its opening post, it should be worth the trouble.

bplus · 04-26-2024, 01:49 PM

(04-26-2024, 01:38 AM)Pete Wrote: @bplus

I think it was some five or six years ago that I started working on a C/C++ keyboard input routine. I finished it, and from a few of the articles I read I discovered that it is supposed to be faster to manipulate an existing string than to add to a string. So, for the functions you are using, I modified both INSTR() approaches to work in that manner. The first does unique letters; the second displays multiple instances of any duplicated letters, which I think was the one you were testing with. It may be faster, but to achieve the same results another counting variable had to be added and the initial string that Mid$() writes into had to be created, so whether there is any actual speed improvement is uncertain.

If you'd like to test it, just replace whichever one is appropriate in your timed system and see how it performs...

OK, looks like I will be comparing anaCode with the Mid$ mod to peteCode with the Mid$ mod.
Sounds interesting: how much improvement for a 25,000-word file (almost), done 25 times, with Mid$?

good experiment alright!
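
One possible shape for the "anaCode with the Mid$ mod" side of that comparison is sketched below. It is not code posted in this thread: the name `AnaCodeMid$` and the extra counter `n` are hypothetical, and like AnaCode$ it assumes the word contains only capital letters A to Z.

```basic
Print AnaCodeMid$("ANGIOGRAM") ' prints AAGGIMNOR

Function AnaCodeMid$ (wrd$) ' hypothetical Mid$ variant of AnaCode$
    ' wrd$ is assumed to be all capitals, A-Z only
    Dim As Integer L(65 To 90), i, p, n
    Dim rtn$

    For i = 1 To Len(wrd$)
        p = Asc(wrd$, i) ' ASCII code of the letter: A=65, B=66...
        L(p) = L(p) + 1 ' count this letter
    Next

    rtn$ = Space$(Len(wrd$)) ' allocate the result once
    p = 1 ' next write position in rtn$
    For i = 65 To 90
        n = L(i)
        If n Then
            Mid$(rtn$, p, n) = String$(n, i) ' write the run of letters in place
            p = p + n
        End If
    Next

    AnaCodeMid$ = rtn$
End Function
```

Because the result always has exactly Len(wrd$) characters, no RTrim$ is needed here. It can be dropped into the timing harness above in place of AnaCode$.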
