Possible CLS improvement
#11
(10-10-2023, 05:30 PM)Jack Wrote: on my PC running Windows 11 the difference is 0.001
the time varies too much between runs to get a good estimate
On my system running Windows 7 Pro x64 I usually see a difference of between .02 and .03. Windows 11 has a lot going on under the hood (spying, tracking, telemetry, etc.) that is probably skewing the results. :(
New to QB64pe? Visit the QB64 tutorial to get started.
QB64 Tutorial
#12
(10-10-2023, 06:10 PM)TerryRitchie Wrote:
(10-10-2023, 05:30 PM)Jack Wrote: on my PC running Windows 11 the difference is 0.001
the time varies too much between runs to get a good estimate
On my system running Windows 7 Pro x64 I usually see a difference of between .02 and .03. Windows 11 has a lot going on under the hood (spying, tracking, telemetry, etc.) that is probably skewing the results. :(
In all seriousness, I rewrote the timing to count how many frames complete in one second, because on your system a single pass probably finishes faster than the .001 second resolution of TIMER.

However, counting frames shows that CLS has the slight advantage??

I dunno. The two routines come out so close with either timing method that I doubt it matters much.

Code: (Select All)
'
' CLS Image
'
' There is a very slight increase in performance using CLSI over CLS (not using this timing method).
' The reason I wrote CLS Image was to avoid having to change the
' destination to the image and then back again to the original image.
' The subroutine now handles that cleanly.
'
' Note that CLSI does not support CLS' methods (0 through 2).
'

CONST RED~& = _RGB32(255, 0, 0) '   define a few colors
CONST CYAN~& = _RGB32(0, 255, 255)

DIM Image AS LONG '                 test image
DIM c AS LONG '                     counter
DIM t1 AS DOUBLE '                  time start
DIM t2 AS DOUBLE '                  time end
DIM CLSTime AS DOUBLE '             CLS total frames
DIM CLSITime AS DOUBLE '            CLSI total frames

Image = _NEWIMAGE(320, 200, 32) '   create test image
SCREEN _NEWIMAGE(640, 480, 32) '    create graphics screen

'+------------------------+
'| CLS 1 second time test |
'+------------------------+

c = 0 '                             reset counter
t1 = TIMER(.001) '                  start time
DO '                                begin counted loop
    c = c + 1 '                     increment counter
    CLS , RED '                     clear main screen red
    _DEST Image '                   change write image to Image
    CLS , CYAN '                    clear image cyan
    _DEST 0 '                       change write image to SCREEN
LOOP UNTIL TIMER(.001) - t1 >= 1 '  leave after 1 second
CLSTime = c '                       total frames

'+-------------------------+
'| CLSI 1 second time test |
'+-------------------------+

c = 0 '                             reset counter
t1 = TIMER(.001) '                  start time
DO '                                begin counted loop
    c = c + 1 '                     increment counter
    CLSI RED, _DEST '               clear main screen red
    CLSI CYAN, Image '              clear image cyan
LOOP UNTIL TIMER(.001) - t1 >= 1 '  leave after 1 second
CLSITime = c '                      total frames

'+-----------------+
'| Display results |
'+-----------------+

_PUTIMAGE (159, 139), Image '       place test image onto graphics screen
PRINT "1 Second Test"
PRINT "-------------"
PRINT "CLS Frames :"; CLSTime '     total frames for CLS
PRINT "CLSI Frames:"; CLSITime '    total frames for CLSI
PRINT "Difference :"; CLSTime - CLSITime ' difference
SLEEP '                             wait for key press
SYSTEM '                            return to OS

' _____________________________________________________________________________
'/                                                                             \
SUB CLSI (bgColor AS _UNSIGNED LONG, ImageHandle AS LONG) '               CLSI |
    ' _________________________________________________________________________|____
    '/                                                                              \
    '| CLS Image                                                                    |
    '| Note: does not support CLS methods                                           |
    '|                                                                              |
    '| bgColor    : color used to clear the image                                   |
    '| ImageHandle: image to clear                                                  |
    '\______________________________________________________________________________/

    DIM oDest AS LONG ' calling destination

    oDest = _DEST '                                      save calling destination
    _DEST ImageHandle '                                  change write image
    LINE (0, 0)-(_WIDTH - 1, _HEIGHT - 1), bgColor, BF ' draw a filled box over the whole image
    _DEST oDest '                                        return to calling destination

END SUB
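
The frame-counting idea used above is not specific to QB64. Here is a minimal C++ sketch of the same technique: run the work repeatedly until a fixed time window has elapsed and report how many passes finished. The name CountIterations is made up for this illustration, and std::chrono::steady_clock is simply one reasonable clock choice, not anything QB64 itself uses.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: counts how many times `work` completes inside a
// fixed time window. This sidesteps the timer-resolution problem, because
// the clock only has to resolve the window, not a single pass.
template <typename Work>
std::uint64_t CountIterations(std::chrono::milliseconds window, Work work) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point deadline = Clock::now() + window;
    std::uint64_t count = 0;
    do {
        work();   // one "frame" of the routine being measured
        ++count;  // count it once it has finished
    } while (Clock::now() < deadline);
    return count;
}
```

Calling it as `CountIterations(std::chrono::seconds(1), [&] { /* clear a buffer */ })` would mirror the one second test above. The counts will still vary from run to run, so a few repeats are worth comparing before drawing conclusions.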
#13
Try this:

First, save the following as "SetMemory.h"
Code: (Select All)
#include <algorithm>
#include <cstdint>
#include <cstdio>

inline void SetMemoryByte(uintptr_t dst, uint32_t elements, uint8_t value) {
    std::fill(reinterpret_cast<uint8_t *>(dst),
              reinterpret_cast<uint8_t *>(dst) + elements, value);
}

inline void SetMemoryInteger(uintptr_t dst, uint32_t elements, uint16_t value) {
    std::fill(reinterpret_cast<uint16_t *>(dst),
              reinterpret_cast<uint16_t *>(dst) + elements, value);
}

inline void SetMemoryLong(uintptr_t dst, uint32_t elements, uint32_t value) {
    std::fill(reinterpret_cast<uint32_t *>(dst),
              reinterpret_cast<uint32_t *>(dst) + elements, value);
}

And then give this a test run:

Code: (Select All)
'
' CLS Image
'
' There is a very slight increase in performance using CLSI over CLS (not using this timing method).
' The reason I wrote CLS Image was to avoid having to change the
' destination to the image and then back again to the original image.
' The subroutine now handles that cleanly.
'
' Note that CLSI does not support CLS' methods (0 through 2).
'

CONST RED~& = _RGB32(255, 0, 0) ' define a few colors
CONST CYAN~& = _RGB32(0, 255, 255)
CONST Display = 0

DIM Image AS LONG ' test image
DIM c AS LONG ' counter
DIM t1 AS DOUBLE ' time start
DIM t2 AS DOUBLE ' time end
DIM CLSTime AS DOUBLE ' CLS total frames
DIM CLSITime AS DOUBLE ' CLSI total frames
DIM CLSMTime AS DOUBLE ' CLSM total frames

Image = _NEWIMAGE(320, 200, 32) ' create test image
SCREEN _NEWIMAGE(640, 480, 32) ' create graphics screen

'+------------------------+
'| CLS 3 second time test |
'+------------------------+

c = 0 ' reset counter
t1 = TIMER(.001) ' start time
DO ' begin counted loop
c = c + 1 ' increment counter
CLS , RED ' clear main screen red
_DEST Image ' change write image to Image
CLS , CYAN ' clear image cyan
_DEST 0 ' change write image to SCREEN
IF Display THEN _DISPLAY
LOOP UNTIL TIMER(.001) - t1 >= 3 ' leave after 3 seconds
CLSTime = c ' total frames

'+-------------------------+
'| CLSI 3 second time test |
'+-------------------------+

c = 0 ' reset counter
t1 = TIMER(.001) ' start time
DO ' begin counted loop
c = c + 1 ' increment counter
CLSI RED, _DEST ' clear main screen red
CLSI CYAN, Image ' clear image cyan
IF Display THEN _DISPLAY
LOOP UNTIL TIMER(.001) - t1 >= 3 ' leave after 3 seconds
CLSITime = c ' total frames

'+-------------------------+
'| CLSM 3 second time test |
'+-------------------------+

c = 0 ' reset counter
t1 = TIMER(.001) ' start time
DO ' begin counted loop
c = c + 1 ' increment counter
CLSM RED, _DEST ' clear main screen red
CLSM CYAN, Image ' clear image cyan
IF Display THEN _DISPLAY
LOOP UNTIL TIMER(.001) - t1 >= 3 ' leave after 3 seconds
CLSMTime = c ' total frames




'+-----------------+
'| Display results |
'+-----------------+
_AUTODISPLAY
_PUTIMAGE (159, 139), Image ' place test image onto graphics screen
PRINT "3 Second Test"
PRINT "-------------"
PRINT "CLS Frames :"; CLSTime ' total frames for CLS
PRINT "CLSI Frames:"; CLSITime ' total frames for CLSI
PRINT "CLSM Frames:"; CLSMTime ' total frames for CLSM
DO: LOOP UNTIL _KEYDOWN(27)
SYSTEM ' return to OS

' _____________________________________________________________________________
'/ \
SUB CLSI (bgColor AS _UNSIGNED LONG, ImageHandle AS LONG) ' CLSI |
' _________________________________________________________________________|____
'/ \
'| CLS Image |
'| Note: does not support CLS methods |
'| |
'| bgColor : color used to clear the image |
'| ImageHandle: image to clear |
'\______________________________________________________________________________/

DIM oDest AS LONG ' calling destination

oDest = _DEST ' save calling destination
_DEST ImageHandle ' change write image
LINE (0, 0)-(_WIDTH - 1, _HEIGHT - 1), bgColor, BF ' draw a filled box over the whole image
_DEST oDest ' return to calling destination
END SUB



' _____________________________________________________________________________
'/ \
SUB CLSM (bgColor AS _UNSIGNED LONG, ImageHandle AS LONG) ' CLSM |
' _________________________________________________________________________|____
'/ \
'| CLS Mem |
'| Note: does not support CLS methods |
'| |
'| bgColor : color used to clear the image |
'| ImageHandle: image to clear |
'\______________________________________________________________________________/
DECLARE LIBRARY "SetMemory"
SUB SetMemoryByte (BYVAL dst AS _UNSIGNED _OFFSET, BYVAL elements AS _UNSIGNED LONG, BYVAL value AS _UNSIGNED _BYTE)
SUB SetMemoryInteger (BYVAL dst AS _UNSIGNED _OFFSET, BYVAL elements AS _UNSIGNED LONG, BYVAL value AS _UNSIGNED INTEGER)
SUB SetMemoryLong (BYVAL dst AS _UNSIGNED _OFFSET, BYVAL elements AS _UNSIGNED LONG, BYVAL value AS _UNSIGNED LONG)
END DECLARE
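DIM oDest AS _MEM ' memory block describing the image's pixel data
oDest = _MEMIMAGE(ImageHandle) ' get direct access to the image memory
SetMemoryLong oDest.OFFSET, (_WIDTH(ImageHandle) * _HEIGHT(ImageHandle)), bgColor ' fill every 32-bit pixel
_MEMFREE oDest ' release the _MEM block so repeated calls do not leak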
END SUB


I added a CONST (Display) up top so you can compare speeds with and without _DISPLAY. :)
(Note: I ran this with the compiler optimization setting checked under the OPTIONS menu.)
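
If you'd like to check the header on its own before involving QB64, here's a minimal C++ sketch. SetMemoryLong is copied from SetMemory.h above so the snippet compiles by itself. FillAndVerify is a made-up helper for this illustration: it fills a plain buffer of 32-bit pixels the same way CLSM fills the image memory, then confirms every pixel holds the requested color.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copied from SetMemory.h above so this sketch stands alone.
inline void SetMemoryLong(uintptr_t dst, uint32_t elements, uint32_t value) {
    std::fill(reinterpret_cast<uint32_t *>(dst),
              reinterpret_cast<uint32_t *>(dst) + elements, value);
}

// Hypothetical helper: allocates a width x height buffer of 32-bit pixels
// (all zero), fills it through SetMemoryLong, and returns true when every
// pixel ended up equal to `color`.
inline bool FillAndVerify(uint32_t width, uint32_t height, uint32_t color) {
    std::vector<uint32_t> pixels(static_cast<std::size_t>(width) * height, 0u);
    SetMemoryLong(reinterpret_cast<uintptr_t>(pixels.data()), width * height, color);
    return std::all_of(pixels.begin(), pixels.end(),
                       [color](uint32_t p) { return p == color; });
}
```

For example, `FillAndVerify(320, 200, 0xFF00FFFFu)` fills a buffer the size of the test image with the same 32-bit value as CYAN, which is _RGB32(0, 255, 255).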
#14
A couple interesting notes here:

1) There's no reason to argue over which is faster, CLS or LINE. If you look in libqb.cpp, you'll see this little snippet of code:
Code: (Select All)
            } else { // 32-bit
                i = write_page->alpha_disabled;
                write_page->alpha_disabled = 1;
                if (write_page->clipping_or_scaling) {
                    qb32_boxfill(write_page->window_x1, write_page->window_y1, write_page->window_x2, write_page->window_y2, use_color);
                } else { // fast method (no clipping/scaling)
                    fast_boxfill(0, 0, write_page->width - 1, write_page->height - 1, use_color);
                }
                write_page->alpha_disabled = i;

That's for CLS, and it's where we basically do the screen clear for 32-bit screens. Notice that what it calls here is a simple routine called "fast_boxfill". That's the *exact* same routine that LINE calls when we add the BF tag (for Box Filled) to the end of it.

There's not going to be any real difference in the two routines, as they both call the same helper routine to do the exact same job.

2) As amazing as it is, CLS and LINE ..., BF are both faster than _MEMFILL and other memory filling routines that come with C. (Such as the std::fill which I was making use of above for testing, and which a470g was so nice as to provide for us.)

HOW is CLS so much faster than _MEMFILL and such???

It's all in the number of operations the routines end up doing! Let me explain each one, and you'll quickly see the difference in performance.

For _MEMFILL, we can basically set a 4 byte color, point it at our image, and then fill *each and every* pixel with that color, using our mem commands.

For CLS (and LINE ..., BF), what Galleon has the code doing is:

a) First we basically do a _MEMFILL to create a single complete line of colors.
b) We then take that completed line and _MEMCOPY it to fill up all the other lines with that same data.

End result is:

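_MEMFILL does _WIDTH * _HEIGHT fills of our color.

CLS does _WIDTH fills of our color for one line, plus roughly _HEIGHT copies of that line to fill in all the rest.

For the 640 x 480 screen in the test above, that's 307,200 color fills versus 640 fills plus about 480 line copies.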

See the difference in the number of operations we're making here??
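
To make the two approaches concrete, here's a minimal C++ sketch of the idea as I've described it. It is not the actual libqb source: ClearPerPixel and ClearByRowCopy are made-up names, and a std::vector stands in for the image memory. Both functions end up with the same pixels; only the number of separate operations differs.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Straightforward clear: one store per pixel (width * height stores).
inline void ClearPerPixel(std::vector<std::uint32_t> &pixels, std::uint32_t color) {
    std::fill(pixels.begin(), pixels.end(), color);
}

// Row-copy clear: fill the first row pixel by pixel, then copy that
// finished row into every remaining row with one block copy per row.
// Assumes pixels.size() >= width * height.
inline void ClearByRowCopy(std::vector<std::uint32_t> &pixels, std::size_t width,
                           std::size_t height, std::uint32_t color) {
    if (width == 0 || height == 0) return;     // nothing to clear
    std::uint32_t *first = pixels.data();
    std::fill(first, first + width, color);    // `width` pixel stores
    for (std::size_t y = 1; y < height; ++y) { // `height - 1` block copies
        std::memcpy(first + y * width, first, width * sizeof(std::uint32_t));
    }
}
```

How much this wins on a given machine is something to measure rather than assume, since the compiler and the C library may already optimize a plain fill quite heavily.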

Hats off to Galleon, who really took some time to work out how to optimize what he was doing with the box_fill routines! :D
#15
The "SetMemory.h" might be redundant but it's a keeper. Just a reminder: on Linux, make sure the filename on disk uses the same letter case as the name in the code. I didn't do that at first and therefore got a "file not found" error message from QB64. There are many people using Linux that prefer lowercase letters in their writing. :)