04-26-2024, 03:21 AM
(04-26-2024, 02:19 AM)TerryRitchie Wrote: 20.75 addstrings
0.007 midstrings
Holy cow. Can anything be done to improve QB64's string manipulation?
The second method is definitely thinking outside the box like you said, but well worth utilizing for large string concatenation.
If so, I wouldn't know how -- it'd have to be some sort of change by someone much more knowledgeable than I am about the inner workings of C and its memory.
To explain the difference in the speed here, let me go through what both of these processes are actually doing for us.
First, let's go with adding strings together. c$ = c$ + a$
1) First we have to check the length of both strings. len(c$), len(a$)
2) Add those lengths together and make certain we have that much memory available for use. Create a temp mem-block of the proper size.
3) Add those two strings to that temp mem-block, in order. (Put c$ in that block, then put a$ there.)
4) Free the old c$ from memory.
5) Point c$ to that temp mem-block, so it now contains the full data of both strings put together.
Make a block of memory. Merge the strings to that block. Free the old block of memory. Point c$ to the new block... <-- That's the basic process in a nutshell.
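For anyone who wants to see those five steps spelled out, here's a rough sketch of them in C. To be clear, this is NOT QB64's actual runtime code; append_by_fresh_block is a name I made up for illustration, and it uses plain null-terminated C strings where QB64 keeps its own string records.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an illustration, not QB64 internals): appends a to *c
   by following the five steps listed above. *c must be a malloc'd string.
   Returns 0 on success, -1 if the allocation fails (*c is left untouched). */
int append_by_fresh_block(char **c, const char *a)
{
    assert(c != NULL && *c != NULL && a != NULL);

    /* 1) Check the length of both strings. */
    size_t len_c = strlen(*c);
    size_t len_a = strlen(a);

    /* 2) Create a temp block big enough for both (+1 for C's terminator). */
    char *temp = malloc(len_c + len_a + 1);
    if (temp == NULL)
        return -1;

    /* 3) Put c in that block, then put a there. */
    memcpy(temp, *c, len_c);
    memcpy(temp + len_c, a, len_a + 1); /* the +1 copies the terminator too */

    /* 4) Free the old c from memory. */
    free(*c);

    /* 5) Point c at the temp block. */
    *c = temp;
    return 0;
}
```

Every single call pays for one malloc, one free, and a copy of everything c already held, even when a is only a byte or two long.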
And let's compare it to the mid$ method:
1) First we have to check the lengths involved: the total length c$ is going to need, and len(a$).
2) Make certain we have that much memory available for use. Create a mem-block of the proper size, once, up front.
3) Put the data into that block, in order, at the position where it belongs.
Make a block large enough. Put the data in that block. Done. <--- And that's all we do with mid$.
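Here's the same kind of rough C sketch for the "make a block, put the data in it" approach. Again, these are stand-ins I invented for illustration (make_block and write_at are hypothetical names, not QB64 internals), and write_at uses 0-based positions and insists the data fits, which is a simplification on my part.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for pre-sizing a string: one block, allocated once
   and filled with spaces. Returns NULL if the allocation fails. */
char *make_block(size_t total)
{
    char *block = malloc(total + 1);
    if (block == NULL)
        return NULL;
    memset(block, ' ', total);
    block[total] = '\0';
    return block;
}

/* Hypothetical stand-in for writing into the string with mid$: overwrite
   the bytes in place at pos (0-based in this sketch). No allocation and no
   free happen here. */
void write_at(char *block, size_t block_len, size_t pos, const char *a)
{
    size_t len_a = strlen(a);
    assert(block != NULL && pos + len_a <= block_len);
    memcpy(block + pos, a, len_a); /* terminator is left where it was */
}
```

Calling make_block(5) and then write_at twice ("ab" at 0, "cde" at 2) leaves "abcde" in the block, and the only trip to the allocator was the very first one.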
Now, where we REALLY see a difference in speed is when we repeat this process over and over in a loop.
Adding to a string repetitively means creating a new block over and over, freeing the old block over and over, and pointing c$ at the new block each time.
On the other hand, if we pre-allocate all the memory for that new string at once, we skip that free/redirect over and over -- and the larger the memory block that we're allocating, freeing, and such, the longer it takes for the OS to make that manipulation.
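To put some numbers on that, here's a small C sketch that builds the same string both ways and simply counts the work. The Cost struct and both function names are made up for this illustration, and the counts describe this sketch only; they aren't measurements of QB64 itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bookkeeping for this sketch only. */
typedef struct {
    size_t allocations;
    size_t frees;
    size_t bytes_copied;
} Cost;

/* Build n copies of piece the "c = c + a" way. *out is NULL on failure. */
Cost cost_of_adding(const char *piece, size_t n, char **out)
{
    assert(piece != NULL && out != NULL);
    Cost cost = {0, 0, 0};
    size_t len_a = strlen(piece), len_c = 0;
    char *c = malloc(1); /* start with an empty string */
    *out = NULL;
    if (c == NULL)
        return cost;
    c[0] = '\0';
    cost.allocations = 1;
    for (size_t i = 0; i < n; i++) {
        char *temp = malloc(len_c + len_a + 1); /* new block every pass */
        if (temp == NULL) {
            free(c);
            return cost;
        }
        cost.allocations++;
        memcpy(temp, c, len_c);             /* recopy everything so far */
        memcpy(temp + len_c, piece, len_a); /* then the new piece */
        temp[len_c + len_a] = '\0';
        cost.bytes_copied += len_c + len_a;
        free(c);                            /* old block goes away */
        cost.frees++;
        c = temp;
        len_c += len_a;
    }
    *out = c;
    return cost;
}

/* Build the same string with one up-front block and in-place writes. */
Cost cost_of_preallocating(const char *piece, size_t n, char **out)
{
    assert(piece != NULL && out != NULL);
    Cost cost = {0, 0, 0};
    size_t len_a = strlen(piece);
    char *c = malloc(n * len_a + 1); /* the only allocation */
    *out = NULL;
    if (c == NULL)
        return cost;
    cost.allocations = 1;
    for (size_t i = 0; i < n; i++) {
        memcpy(c + i * len_a, piece, len_a); /* write in place */
        cost.bytes_copied += len_a;
    }
    c[n * len_a] = '\0';
    *out = c;
    return cost;
}
```

With piece = "ab" and n = 1000, the adding version in this sketch makes 1001 allocations and 1000 frees and copies 1,001,000 bytes, while the pre-allocated version makes 1 allocation, 0 frees, and copies 2,000 bytes. Both end up with the exact same 2000-character string.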
See why mid$ ends up being faster for us, no matter which OS/compiler options we end up setting? It basically skips the most intensive part of the process -- the repeated allocating/freeing that comes from making certain we have enough memory to add those two strings together and assign them to one.