Draw that Circle
#11
(08-26-2022, 06:05 PM)SMcNeill Wrote:
(08-26-2022, 05:59 PM)bplus Wrote: Wow BF makes that much a difference? Seems like something is wrong about Line?

BF is highly optimized and is *much* faster than line, just as DO is faster than FOR...

A FOR loop has to track Start, Stop, Step, counting up or down...   A DO loop is just DO.. LOOP and the user has to deal with the exit conditions.  There's a lot less code to process for a DO-LOOP than there is a FOR-Next...

It's the same with LINE vs LINE with BF.   A plain line has slope.  You calculate rise/run, do a loop, plot the necessary pixels, increment to the next pixel...

BF is just:
FOR y = start to stop
   memfill x, x.start, x.stop, kolor
NEXT

Care to guess which is going to be faster, once you think about the basic premise behind them?  

LINE with BF is always quite a bit faster than LINE without it.
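The claim is easy to check with a rough timing sketch. This is only an illustration: the image size, span coordinates, and loop count are arbitrary assumptions, and the numbers will vary by machine and QB64 version.

```basic
' Rough timing sketch: the same horizontal span drawn with LINE and with LINE ..., BF.
' The 800x500 image, the span from x=100 to x=700 and the loop count are assumptions.
SCREEN _NEWIMAGE(800, 500, 32)
CONST spans = 200000
DIM AS LONG i, y
DIM AS DOUBLE t0, tLine, tBF

t0 = TIMER(.001)
FOR i = 1 TO spans
    y = i MOD 500
    LINE (100, y)-(700, y), _RGB32(255, 0, 0) ' plain line
NEXT
tLine = TIMER(.001) - t0

t0 = TIMER(.001)
FOR i = 1 TO spans
    y = i MOD 500
    LINE (100, y)-(700, y), _RGB32(0, 255, 0), BF ' one-pixel-high filled box
NEXT
tBF = TIMER(.001) - t0

CLS
PRINT "LINE:     "; tLine; "sec"
PRINT "LINE, BF: "; tBF; "sec"
```

Both loops cover the same pixels, so any difference in the two times comes from the drawing path rather than the amount of work requested.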


Oh yes, got it! Thanks for explaining.
b = b + ...
#12
Excellent!

Here's what I got on my machine.  It varies from run to run. I'm assuming that's because of how offscreen writes are dealt with.


[Image: image.png]

#13
(08-26-2022, 06:33 PM)James D Jarvis Wrote: Excellent!

Here's what I got on my machine.  It varies from run to run. I'm assuming that's because of how offscreen writes are dealt with.


[Image: image.png]

Thank you for proposing the image-sharing site to me! I had to make this post to discover where to click to see the full image.
#14
I was curious whether drawing with _MEM commands would improve performance, so I used your test as inspiration for my own.

I'm using the Bresenham circle-drawing algorithm, just as you did, but I get very different results from your implementation: mostly slower, except when using 1 byte per pixel.
  • Optimization flag is set.
  • I tried using _SHL in place of multiplying by 2, 4, and 8 and it slowed things down, which is surprising, but I guess the compiler does a better job with multiplication.
  • I did not implement clipping in the MEM routines, assuming it would just slow down the test.
  • I've noticed that the LINE command seems to be faster at drawing a horizontal line than _MEMFILL with an unsigned long. (Surprising!)
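The _SHL observation can be isolated from the circle code with a small micro-benchmark. This is a sketch under assumptions: the loop count is arbitrary, and the C++ compiler behind QB64 may optimize either loop in ways that make the comparison unrepresentative, so treat the result as a hint only.

```basic
' Rough micro-benchmark: multiply by 8 versus _SHL by 3 bits.
' The loop count is an arbitrary assumption; i * 8 stays well inside LONG range.
CONST reps = 50000000
DIM AS LONG i, a, b
DIM AS DOUBLE t0, tMul, tShl

t0 = TIMER(.001)
FOR i = 1 TO reps
    a = i * 8
NEXT
tMul = TIMER(.001) - t0

t0 = TIMER(.001)
FOR i = 1 TO reps
    b = _SHL(i, 3)
NEXT
tShl = TIMER(.001) - t0

PRINT "i * 8:      "; tMul; "sec"
PRINT "_SHL(i, 3): "; tShl; "sec"
PRINT "Last values (should match):"; a; b
```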

Overall my times are pretty sad. I do have a slow computer.  
  1. Bresenham normal  - 12.68 sec
  2. Bresenham MEM 1bpp  - .99 sec
  3. Bresenham MEM 4bpp  - 16.09 sec

Can one of you guys point out my inefficiency?  Perhaps I'm doing something unnecessary or dumb?
 
Code:
_TITLE "Fast Circle Test"

DIM AS LONG scrn
DIM AS LONG count
DIM AS SINGLE t0, t1
DIM AS STRING en

TYPE tRESULTS
  AS SINGLE time
  AS STRING test
END TYPE

DIM AS tRESULTS res(10)

scrn = _NEWIMAGE(800, 500, 256)
SCREEN scrn
CONST iterations = 640000
'____________________________________________________________________________________________________________________________________

res(0).test = "Bresenham Normal Test"
LOCATE 20, 1
PRINT res(0).test
INPUT "Press Enter to start ..."; en

count = 0
t0 = TIMER
DO
  CircleBresenham INT(RND * 800), INT(RND * 500), 30, INT(RND * 255)
  count = count + 1
LOOP WHILE count < iterations
t1 = TIMER
res(0).time = t1 - t0
'____________________________________________________________________________________________________________________________________
res(1).test = "Bresenham MEM 1bpp (no clip)"
LOCATE 20, 1
PRINT res(1).test
INPUT "Press Enter to start ..."; en

count = 0
t0 = TIMER
DIM AS _MEM scr
scr = _MEMIMAGE(scrn)
DO
  CircleBresenham1bpp scr, 30 + INT(RND * 740), 30 + INT(RND * 440), 30, INT(RND * 255)
  count = count + 1
LOOP WHILE count < iterations
_MEMFREE scr
t1 = TIMER
res(1).time = t1 - t0
'____________________________________________________________________________________________________________________________________
res(2).test = "Bresenham MEM 4bpp (no clip)"
LOCATE 20, 1
PRINT res(2).test
INPUT "Press Enter to start ..."; en

_TITLE "Fast Circle Test"
scrn = _NEWIMAGE(800, 500, 32)
SCREEN scrn
LOCATE 20, 1
count = 0
t0 = TIMER
scr = _MEMIMAGE(scrn)
DO
  CircleBresenham4bpp scr, 30 + INT(RND * 740), 30 + INT(RND * 440), 30, _RGB32(INT(RND * 255), INT(RND * 255), INT(RND * 255))
  count = count + 1
LOOP WHILE count < iterations
_MEMFREE scr
t1 = TIMER
res(2).time = t1 - t0
'____________________________________________________________________________________________________________________________________

PRINT "Circle count:"; iterations
FOR count = 0 TO 2
  PRINT res(count).test; "  Time:"; res(count).time
NEXT


'____________________________________________________________________________________________________________________________________
SUB CircleBresenham (xc AS LONG, yc AS LONG, r AS LONG, c AS LONG)
  DIM AS LONG e, x, y, w
  DIM AS LONG l0, l1
  w = _WIDTH(0) * 4
  x = r
  y = 0
  e = 0
  $CHECKING:OFF
  DO
    l0 = x * 2
    l1 = y * 2

    LINE (xc - x, yc - y)-(xc - x + l0, yc - y), c
    LINE (xc - x, yc + y)-(xc - x + l0, yc + y), c
    LINE (xc - y, yc - x)-(xc - y + l1, yc - x), c
    LINE (xc - y, yc + x)-(xc - y + l1, yc + x), c

    IF x <= y THEN EXIT DO
    e = e + y * 2 + 1
    y = y + 1
    IF e > x THEN
      e = e + 1 - x * 2
      x = x - 1
    END IF
  LOOP
  $CHECKING:ON
END SUB


SUB CircleBresenham1bpp (scr AS _MEM, xc AS LONG, yc AS LONG, r AS LONG, c AS _UNSIGNED _BYTE)
  DIM AS LONG e, x, y, w
  DIM AS LONG xof0, xof1, xof2, xof3, l0, l1
  DIM AS LONG yof0, yof1, yof2, yof3
  DIM AS LONG xq0, yq0, xq1, yq1, xq2, yq2, xq3, yq3
  w = _WIDTH(0)
  x = r
  y = 0
  e = 0
  $CHECKING:OFF
  DO
    l0 = x * 2
    l1 = y * 2

    xq0 = xc - x
    yq0 = yc - y
    xof0 = xq0
    yof0 = yq0 * w
    _MEMFILL scr, scr.OFFSET + xof0 + yof0, l0, c AS _UNSIGNED _BYTE
    xq1 = xc - x
    yq1 = yc + y
    xof1 = xq1
    yof1 = yq1 * w
    _MEMFILL scr, scr.OFFSET + xof1 + yof1, l0, c AS _UNSIGNED _BYTE
    xq2 = xc - y
    yq2 = yc - x
    xof2 = xq2
    yof2 = yq2 * w
    _MEMFILL scr, scr.OFFSET + xof2 + yof2, l1, c AS _UNSIGNED _BYTE
    xq3 = xc - y
    yq3 = yc + x
    xof3 = xq3
    yof3 = yq3 * w
    _MEMFILL scr, scr.OFFSET + xof3 + yof3, l1, c AS _UNSIGNED _BYTE

    IF x <= y THEN EXIT DO
    e = e + y * 2 + 1
    y = y + 1
    IF e > x THEN
      e = e + 1 - x * 2
      x = x - 1
    END IF
  LOOP
  $CHECKING:ON
END SUB

SUB CircleBresenham4bpp (scr AS _MEM, xc AS LONG, yc AS LONG, r AS LONG, c AS LONG)
  DIM AS LONG e, x, y, w
  DIM AS LONG xof0, xof1, xof2, xof3, l0, l1
  DIM AS LONG yof0, yof1, yof2, yof3
  DIM AS LONG xq0, yq0, xq1, yq1, xq2, yq2, xq3, yq3
  w = _WIDTH(0) * 4
  x = r
  y = 0
  e = 0
  $CHECKING:OFF
  DO
    l0 = x * 8
    l1 = y * 8
    xq0 = xc - x
    yq0 = yc - y
    xof0 = xq0 * 4
    yof0 = yq0 * w
    _MEMFILL scr, scr.OFFSET + xof0 + yof0, l0, c AS _UNSIGNED LONG
    xq1 = xc - x
    yq1 = yc + y
    xof1 = xq1 * 4
    yof1 = yq1 * w
    _MEMFILL scr, scr.OFFSET + xof1 + yof1, l0, c AS _UNSIGNED LONG
    xq2 = xc - y
    yq2 = yc - x
    xof2 = xq2 * 4
    yof2 = yq2 * w
    _MEMFILL scr, scr.OFFSET + xof2 + yof2, l1, c AS _UNSIGNED LONG
    xq3 = xc - y
    yq3 = yc + x
    xof3 = xq3 * 4
    yof3 = yq3 * w
    _MEMFILL scr, scr.OFFSET + xof3 + yof3, l1, c AS _UNSIGNED LONG

    IF x <= y THEN EXIT DO
    e = e + y * 2 + 1
    y = y + 1
    IF e > x THEN
      e = e + 1 - x * 2
      x = x - 1
    END IF
  LOOP
  $CHECKING:ON
END SUB
#15
For starters, why calculate the same values multiple times in your loop?

xof1 = xof0
xof3 = xof2

That's several math operations removed completely and easily from the loop.
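As a sketch of what that looks like in the 1 byte per pixel routine: each pair of spans shares one x offset, so it can be computed once per pair. CircleSpans1bpp is a hypothetical name for illustration, and like the posted routine it does no clipping.

```basic
' Sketch only: the 1bpp routine with the shared x offsets computed once per pair.
' CircleSpans1bpp is a hypothetical name; the circle must lie fully inside the image.
DIM AS LONG img
DIM AS _MEM m
img = _NEWIMAGE(800, 500, 256)
SCREEN img
m = _MEMIMAGE(img)
CircleSpans1bpp m, 400, 250, 30, 14
_MEMFREE m

SUB CircleSpans1bpp (scr AS _MEM, xc AS LONG, yc AS LONG, r AS LONG, c AS _UNSIGNED _BYTE)
    DIM AS LONG e, x, y, w
    DIM AS _OFFSET colA, colB
    w = _WIDTH(0) ' bytes per row at 1 byte per pixel
    x = r
    y = 0
    e = 0
    DO
        ' The two wide spans start at the same column; only the row differs.
        colA = scr.OFFSET + (xc - x)
        _MEMFILL scr, colA + (yc - y) * w, x * 2, c
        _MEMFILL scr, colA + (yc + y) * w, x * 2, c

        ' Same for the two narrow spans; skip them while their length is zero.
        IF y > 0 THEN
            colB = scr.OFFSET + (xc - y)
            _MEMFILL scr, colB + (yc - x) * w, y * 2, c
            _MEMFILL scr, colB + (yc + x) * w, y * 2, c
        END IF

        IF x <= y THEN EXIT DO
        e = e + y * 2 + 1
        y = y + 1
        IF e > x THEN
            e = e + 1 - x * 2
            x = x - 1
        END IF
    LOOP
END SUB
```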
#16
(08-28-2022, 03:32 AM)SMcNeill Wrote: For starters, why calculate the same values multiple times in your loop?

xof1 = xof0
xof3 = xof2

That's several math operations removed completely and easily from the loop.

Yeah, that was an artifact of the code used for the 4-bytes-per-pixel version. It's actually the fastest version of the circle routine, so I didn't go back and optimize it.
#17
One big point of efficiency is to swap to _MEMPUT over _MEMFILL. It's hard to beat a simple _MEMPUT when it comes to working with shoving values into memory.

Code:
_Title "Fast Circle Test"

Dim As Long scrn
Dim As Long count
Dim As Single t0, t1
Dim As String en

Type tRESULTS
    As Single time
    As String test
End Type

Dim As tRESULTS res(10)

scrn = _NewImage(800, 500, 256)
Screen scrn
Const iterations = 640000
'____________________________________________________________________________________________________________________________________

res(0).test = "Bresenham Normal Test"
Locate 20, 1
Print res(0).test
Input "Press Enter to start ..."; en

count = 0
t0 = Timer
Do
    '   CircleBresenham Int(Rnd * 800), Int(Rnd * 500), 30, Int(Rnd * 255)
    count = count + 1
Loop While count < iterations
t1 = Timer
res(0).time = t1 - t0
'____________________________________________________________________________________________________________________________________
res(1).test = "Bresenham MEM 1bpp (no clip)"
Locate 20, 1
Print res(1).test
Input "Press Enter to start ..."; en

count = 0
t0 = Timer
Dim As _MEM scr
scr = _MemImage(scrn)
Do
    CircleBresenham1bpp scr, 30 + Int(Rnd * 740), 30 + Int(Rnd * 440), 30, Int(Rnd * 255)
    count = count + 1
Loop While count < iterations
_MemFree scr
t1 = Timer
res(1).time = t1 - t0
'____________________________________________________________________________________________________________________________________
res(2).test = "Bresenham MEM 4bpp (no clip)"
Locate 20, 1
Print res(2).test
Input "Press Enter to start ..."; en

_Title "Fast Circle Test"
scrn = _NewImage(800, 500, 32)
Screen scrn
Locate 20, 1
count = 0
t0 = Timer
scr = _MemImage(scrn)
Do
    CircleBresenham4bpp scr, 30 + Int(Rnd * 740), 30 + Int(Rnd * 440), 30, _RGB32(Int(Rnd * 255), Int(Rnd * 255), Int(Rnd * 255))
    count = count + 1
Loop While count < iterations
_MemFree scr
t1 = Timer
res(2).time = t1 - t0
'____________________________________________________________________________________________________________________________________

Print "Circle count:"; iterations
For count = 0 To 2
    Print res(count).test; "  Time:"; res(count).time
Next


'____________________________________________________________________________________________________________________________________
Sub CircleBresenham (xc As Long, yc As Long, r As Long, c As Long)
    Dim As Long e, x, y, w
    Dim As Long l0, l1
    w = _Width(0) * 4
    x = r
    y = 0
    e = 0
    $Checking:Off
    Do
        l0 = x * 2
        l1 = y * 2

        Line (xc - x, yc - y)-(xc - x + l0, yc - y), c
        Line (xc - x, yc + y)-(xc - x + l0, yc + y), c
        Line (xc - y, yc - x)-(xc - y + l1, yc - x), c
        Line (xc - y, yc + x)-(xc - y + l1, yc + x), c

        If x <= y Then Exit Do
        e = e + y * 2 + 1
        y = y + 1
        If e > x Then
            e = e + 1 - x * 2
            x = x - 1
        End If
    Loop
    $Checking:On
End Sub


Sub CircleBresenham1bpp (scr As _MEM, xc As Long, yc As Long, r As Long, c As _Unsigned _Byte)
    Dim As Long e, x, y, w
    Dim As Long xof0, xof1, xof2, xof3, l0, l1
    Dim As Long yof0, yof1, yof2, yof3
    Dim As Long xq0, yq0, xq1, yq1, xq2, yq2, xq3, yq3
    w = _Width(0)
    x = r
    y = 0
    e = 0
    $Checking:Off
    Do
        l0 = x * 2
        l1 = y * 2

        xq0 = xc - x
        yq0 = yc - y
        xof0 = xq0
        yof0 = yq0 * w
        _MemFill scr, scr.OFFSET + xof0 + yof0, l0, c As _UNSIGNED _BYTE
        xq1 = xc - x
        yq1 = yc + y
        xof1 = xq1
        yof1 = yq1 * w
        _MemFill scr, scr.OFFSET + xof1 + yof1, l0, c As _UNSIGNED _BYTE
        xq2 = xc - y
        yq2 = yc - x
        xof2 = xq2
        yof2 = yq2 * w
        _MemFill scr, scr.OFFSET + xof2 + yof2, l1, c As _UNSIGNED _BYTE
        xq3 = xc - y
        yq3 = yc + x
        xof3 = xq3
        yof3 = yq3 * w
        _MemFill scr, scr.OFFSET + xof3 + yof3, l1, c As _UNSIGNED _BYTE

        If x <= y Then Exit Do
        e = e + y * 2 + 1
        y = y + 1
        If e > x Then
            e = e + 1 - x * 2
            x = x - 1
        End If
    Loop
    $Checking:On
End Sub

Sub CircleBresenham4bpp (scr As _MEM, xc As Long, yc As Long, r As Long, c As _Unsigned Long)
    Dim As _Offset e, x, y, w
    Dim As _Offset xof0, xof1, xof2, xof3, l0, l1
    Dim As _Offset yof0, yof1, yof2, yof3
    Dim As _Offset xq0, yq0, xq1, yq1, xq2, yq2, xq3, yq3
    w = _Width(0) * 4
    x = r
    y = 0
    e = 0
    $Checking:Off
    'start time of 7.03 seconds
    'by swapping to memput, the time is now 1.9 seconds.
    Dim As _Offset start, finish
    Do
        l0 = x * 8
        l1 = y * 8
        xq0 = scr.OFFSET + (xc - x) * 4
        yq0 = (yc - y) * w
        ' _MemFill scr, xq0 + yq0, l0, c
        start = xq0 + yq0
        finish = start + l0
        Do
            _MemPut scr, start, c
            start = start + 4
        Loop Until start > finish
        yq1 = (yc + y) * w
        '_MemFill scr, xq0 + yq1, l0, c
        start = xq0 + yq1
        finish = start + l0
        Do
            _MemPut scr, start, c
            start = start + 4
        Loop Until start > finish


        xq2 = scr.OFFSET + (xc - y) * 4
        yq2 = (yc - x) * w
        '_MemFill scr, xq2 + yq2, l1, c
        start = xq2 + yq2
        finish = start + l1
        Do
            _MemPut scr, start, c
            start = start + 4
        Loop Until start > finish

        yq3 = (yc + x) * w
        '_MemFill scr, xq2 + yq3, l1, c
        start = xq2 + yq3
        finish = start + l1
        Do
            _MemPut scr, start, c
            start = start + 4
        Loop Until start > finish
        If x <= y Then Exit Do
        e = e + y + y + 1
        y = y + 1
        If e > x Then
            e = e + 1 - x - x
            x = x - 1
        End If
    Loop
    $Checking:On
End Sub

This went from 7 seconds down to less than 2 seconds on my PC. All you'd need to do is comment out the _MEMPUT and uncomment the _MEMFILL statements, and you can see the difference at play.

I think you'd really need to optimize the math itself to reduce times much more. For example, instead of counting x = x + 1, count x = x + 4 (move by 4 bytes instead of 1 pixel coordinate). Same with y. Instead of y = y + 1, y = y + w. (move a row of 4 byte pixels instead of by a coordinate) Then you can get rid of the * 4 and * w operators, simplifying the number of processes which your loop has to make before finishing.

But your biggest change is going to be _MEMPUT over _MEMFILL (a 300%+ speed improvement!).
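The math suggestion above can be sketched like this: keep running byte offsets next to x and y, so the inner loop only adds and subtracts. CircleOffsets32 and PutSpan32 are hypothetical names, no clipping is done, and the helper SUB is there for readability, so inlining it as in the code above may well be faster.

```basic
' Sketch only: 4 bytes-per-pixel circle fill with incrementally tracked byte offsets.
' CircleOffsets32 and PutSpan32 are hypothetical names; the circle must fit in the image.
DIM AS LONG img
DIM AS _MEM m
img = _NEWIMAGE(800, 500, 32)
SCREEN img
m = _MEMIMAGE(img)
CircleOffsets32 m, 400, 250, 30, _RGB32(255, 255, 0)
_MEMFREE m

SUB CircleOffsets32 (scr AS _MEM, xc AS LONG, yc AS LONG, r AS LONG, c AS _UNSIGNED LONG)
    DIM AS LONG e, x, y
    DIM AS _OFFSET w, center, x4, y4, xw, yw
    w = _WIDTH(0) * 4 ' bytes per row
    center = scr.OFFSET + yc * w + xc * 4 ' address of the centre pixel
    x = r: y = 0: e = 0
    x4 = r * 4: xw = r * w ' x as a byte count and as a row distance
    y4 = 0: yw = 0 ' y as a byte count and as a row distance
    DO
        PutSpan32 scr, center - yw - x4, x4 + x4, c
        PutSpan32 scr, center + yw - x4, x4 + x4, c
        PutSpan32 scr, center - xw - y4, y4 + y4, c
        PutSpan32 scr, center + xw - y4, y4 + y4, c

        IF x <= y THEN EXIT DO
        e = e + y + y + 1
        y = y + 1: y4 = y4 + 4: yw = yw + w ' one pixel down = 4 bytes, one row
        IF e > x THEN
            e = e + 1 - x - x
            x = x - 1: x4 = x4 - 4: xw = xw - w
        END IF
    LOOP
END SUB

' Writes pixels from spanStart through spanStart + byteLen inclusive.
SUB PutSpan32 (scr AS _MEM, spanStart AS _OFFSET, byteLen AS _OFFSET, c AS _UNSIGNED LONG)
    DIM AS _OFFSET p, spanEnd
    p = spanStart
    spanEnd = spanStart + byteLen
    DO
        _MEMPUT scr, p, c
        p = p + 4
    LOOP UNTIL p > spanEnd
END SUB
```

The * 4 and * w multiplications now happen once per circle instead of once per span.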
#18
Thank you!

That is very interesting and puzzling. I guess I assumed that _MEMFILL would be more efficient for filling an area of memory. It seems counterintuitive that it would be slower than a hand-rolled routine using _MEMPUT.

You are correct, my math needed cleaning up. It seems there were a lot of simple optimizations that could have been done to make it faster. I would have thought bit shifting would speed up my multiplying by powers of 2, but it appears to be slower than just multiplying.

Again thanks for your help!