mImage.OFFSET + (y * sWidth + x) * 4 <-- this right here can be simplified by a ton with just a little logic.
First, distribute the 4 to outside the loop:
x = x + 4 (instead of x = x + 1) and then multiply by 4.
y = y + 4 (instead of the y = y + 1 that it currently is. This would save one calculation per cyctle (mImge.OFFSET + (y * sWidth + x) without that * 4 being needed.)
....
but then that can be simplified down even more:
x = x + 4 <-- this would move x position by 4 bytes, or one pixel at a time
y = y + (sWidth * 4) <--- seems to me that this should increase y by the whole row of pixels in one go.
now that internal math looks like: (mImage.OFFSET + y + x)
And, if you start y somewhere with the offset as the starting point:
y = mImage.OFFSET
Then you can make that whole _MEMPUT statement simplify down to:
Code: (Select All)
_MEMPUT mImage, y + x, Points(Nearest).c
^That'll shave off several internal processing cycles from your routine.
So basically the flow would be:
y = mImage.OFFSET
DO
DO
_MEMPUT mImage, y + x, Points(Nearest).c
x = x + 4
LOOP
y = y + sWidth * 4
LOOP until y + x >= m.SIZE
Anytime you can strip math out of a main loop like that, you can speed things up considerably.
This code is fast! I'm picking it apart to make sense of it.
I _think_ the idea is that if you look at the drawing as a series of horizontal (or vertical) lines, any particular point cannot have more than one segment of it's color on the line. IE. If you have a green point and a blue point, you'll never get a sequence of 'green-blue-green' pixels in a line, it will always have a single segment of green and a single segment of blue (assuming both colors appear on the line).
You can use that fact to throw out many of the points without calculating their distance - if you find that pixel Y=49 is colored by a point but Y=50 is too far away (because there's already another point that is closer), then you don't need to look at the pixels past Y=50. None of those pixels can be colored by your current point because you already finished its segment in the line, the color will never appear again until you start the next line.
There's probably some extra logic you can add to make it even faster - if you have lots of points, most lines are probably only colored by a small portion of them. You could process the points by Y distance to the line (when using horizontal lines) and stop looking at points if every pixel on the line is already colored by a closer point than any of the points left (the Y direction to the line is the absolute minimum distance a point can have to the line).
That said I believe my shader approach is still faster it's the same approach you did, but the GPU can parallelize the process so much that it doesn't matter.
Dim Shared As Long pax(pnt), pay(pnt), indx(px, py), dx(pnt), dy(pnt)
Dim Shared As Long dSqr(px, py)
Dim Shared As _Unsigned Long col(pnt)
For i = 1 To pnt
pax(i) = Int(Rnd * px)
pay(i) = Int(Rnd * py)
col(i) = _RGB32(Rnd * 180, (Rnd < .5) * -255, Rnd * 150, Rnd * 255)
dx(i) = Int(Rnd * 15) + 3
dy(i) = Int(Rnd * 11) - 5
Next
While _KeyDown(27) = 0
For x = 0 To px - 1
For y = 0 To py - 1
dSqr(x, y) = (pax(1) - x) * (pax(1) - x) + (pay(1) - y) * (pay(1) - y)
indx(x, y) = 1
Next
Next
For i = 2 To pnt
ly = py - 1
For x = pax(i) To 0 Step -1
If (scan(i, x, ly)) = 0 Then Exit For
Next x
For x = pax(i) + 1 To px - 1
If (scan(i, x, ly)) = 0 Then Exit For
Next
Next
For x = 0 To px - 1
For y = 0 To py - 1
sy = y
adjct = indx(x, y)
For y = y + 1 To py
If indx(x, y) <> adjct Then y = y - 1: Exit For
Next
Line (x, sy)-(x, y + 1), col(adjct) ' BF ' bplus added BF
Next
Next
For i = 1 To pnt
pax(i) = pax(i) + dx(i)
If pax(i) < 0 Then pax(i) = 0: dx(i) = -dx(i)
If pax(i) > px - 1 Then newPnt i
pay(i) = pay(i) + dy(i)
If pay(i) < 0 Then pay(i) = 0: dy(i) = -dy(i)
If pay(i) > py - 1 Then pay(i) = py - 1: dy(i) = -dy(i)
Next
Wend
Function scan (site As Integer, x As Integer, ly As Integer)
Dim As Integer ty
Dim As Long delt2, dsq
delt2 = (pax(site) - x) * (pax(site) - x)
For ty = 0 To ly
dsq = (pay(site) - ty) * (pay(site) - ty) + delt2
If dsq <= dSqr(x, ty) Then
dSqr(x, ty) = dsq
indx(x, ty) = site
scan = 1
End If
Next
End Function
mImage.OFFSET + (y * sWidth + x) * 4 <-- this right here can be simplified by a ton with just a little logic.
First, distribute the 4 to outside the loop:
x = x + 4 (instead of x = x + 1) and then multiply by 4.
y = y + 4 (instead of the y = y + 1 that it currently is. This would save one calculation per cyctle (mImge.OFFSET + (y * sWidth + x) without that * 4 being needed.)
....
but then that can be simplified down even more:
x = x + 4 <-- this would move x position by 4 bytes, or one pixel at a time
y = y + (sWidth * 4) <--- seems to me that this should increase y by the whole row of pixels in one go.
now that internal math looks like: (mImage.OFFSET + y + x)
And, if you start y somewhere with the offset as the starting point:
y = mImage.OFFSET
Then you can make that whole _MEMPUT statement simplify down to:
Code: (Select All)
_MEMPUT mImage, y + x, Points(Nearest).c
^That'll shave off several internal processing cycles from your routine.
So basically the flow would be:
y = mImage.OFFSET
DO
DO
_MEMPUT mImage, y + x, Points(Nearest).c
x = x + 4
LOOP
y = y + sWidth * 4
LOOP until y + x >= m.SIZE
Anytime you can strip math out of a main loop like that, you can speed things up considerably.
You are so much better at refactoring than I am. Thank you for the suggestions. They are so obvious to me now.
(09-13-2024, 10:42 PM)DSMan195276 Wrote: Something a bit different, I changed your rendering code into a shader. The code is a mess but it does work for me and I can get 500 points rendering at 60FPS. YMMV, I pass the points in as a uniform and that limits how many you can use, you should be able to do at least 100 though. There are better ways to do it that could support more points but this was the easiest way since I was reusing some code I already had.
The code is below, and it requires a header file to be placed next to it so it can access some functions:
Code: (Select All)
Option _Explicit
Const TOTAL = 100
Type vec2
x As Single
y As Single
End Type
Dim Shared points(TOTAL) As vec2
Dim Shared cols(TOTAL) As vec3
Dim Shared vectors(TOTAL) As vec2
Randomize Timer ' seed RND generator
Dim p As Long
For p = 0 To TOTAL - 1 ' cycle through pane center points
points(p).x = Rnd * 1 ' random x location
points(p).y = Rnd * 1 ' random y location
cols(p).x = Rnd * .5 + .5
cols(p).y = Rnd * .5 + .5
cols(p).z = Rnd * .5 + .5
'cols(p) = _RGB32(Rnd * 128 + 128, Rnd * 128 + 128, Rnd * 128 + 128) ' random color above 128, 128, 128 vectors(p).x = (Rnd - Rnd) * .005 ' random x velocity
vectors(p).y = (Rnd - Rnd) * .005 ' random y velocity
Next p
Type vec3
x As Single
y As Single
z As Single
End Type
Type tri
p1 As vec3
p2 As vec3
p3 As vec3
End Type
Type model
bufId As _Unsigned Long
vertArrId As _Unsigned Long
pointsId As _Unsigned Long
colorsId As _Unsigned Long
prog As _Unsigned Long
tris As _MEM
triCount As _Unsigned Long
End Type
Declare Library
Function glCreateProgram& ()
Function glCreateShader& (ByVal flags As _Unsigned Long)
Sub glCompileShader (ByVal shader As _Unsigned Long)
Sub glAttachShader (ByVal prog As _Unsigned Long, Byval shader As _Unsigned Long)
Sub glLinkProgram (ByVal prog As _Unsigned Long)
Sub glUseProgram (ByVal prog As _Unsigned Long)
' Sub glGenBuffers (ByVal size As _Unsigned Long, Byval buffers As _Offset)
Sub glBindBuffer (ByVal target As _Unsigned Long, Byval buffer As _Unsigned Long)
' Sub glGenVertexArrays (ByVal size As _Unsigned Long, Byval buffers As _Offset)
Sub glBindVertexArray (ByVal array As _Unsigned Long)
Sub glEnableVertexAttribArray (ByVal index As _Unsigned Long)
Sub glVertexAttribPointer (ByVal index As _Unsigned Long, _
Byval size As _Unsigned Long, _
Byval typ As _Unsigned Long, _
Byval normal As _Unsigned _Byte, _
Byval stride As _Unsigned Long, _
Byval pointer As _Offset)
Function glGetUniformLocation& (ByVal prog As _Unsigned Long, nam As String)
Sub glEnable (ByVal cap As _Unsigned Long)
Sub glDisable (ByVal cap As _Unsigned Long)
Sub glDepthFunc (ByVal cap As _Unsigned Long)
End Declare
Declare Library "stained_glass_help"
Sub glShaderSource Alias "myShaderSource" (ByVal shader As _Unsigned Long, Byval count As _Unsigned Long, Byval text As _Offset, Byval length As _Offset)
Sub glGenBuffers Alias "myGenBuffers" (ByVal size As _Unsigned Long, Byval buffers As _Offset)
Sub glGenVertexArrays Alias "myGenVertexArrays" (ByVal size As _Unsigned Long, Byval buffers As _Offset)
Sub glBufferData ALIAS "myBufferData" (byval target As _Unsigned Long, _
byval size As _offset, _
byval dat As _Offset, _
byval usage As _Unsigned Long)
Sub glDrawArrays Alias "myDrawArrays" (ByVal mode As _Unsigned Long, Byval first As _Unsigned Long, Byval count As _Unsigned Long)
Sub glUniformMatrix4fv ALIAS "myUniformMatrix4fv" (ByVal location As _Unsigned Long, _
ByVal count As _Unsigned Long, _
ByVal transpose As _Unsigned _Byte, _
ByVal floats As _Offset)
sub glUniform2fv ALIAS "myUniform2fv" (BYVal location As _unsigned long, _
byval count as _unsigned long, _
byval floats as _Offset)
sub glUniform3fv ALIAS "myUniform3fv" (BYVal location As _unsigned long, _
byval count as _unsigned long, _
byval ints as _Offset)
End Declare
Dim p As Long
p = 0
Dim dx As Single, dy As Single
Do ' begin point update loop
dx = points(p).x + vectors(p).x ' calculate new look ahead point x location
dy = points(p).y + vectors(p).y ' calculate new look ahead point y location
If dx < 0 Or dx > 1 Then vectors(p).x = -vectors(p).x ' reverse vector if left/right side of image reached
If dy < 0 Or dy > 1 Then vectors(p).y = -vectors(p).y ' reverse vector if top/bottom side of image reached
points(p).x = points(p).x + vectors(p).x ' calculate new point x location
points(p).y = points(p).y + vectors(p).y ' calculate new point y location
p = p + 1 ' increment point counter
Loop Until p = TOTAL ' leave when all points updated
End Sub
Sub putTriVec (m As _MEM, TriIndex As _Unsigned Long, VecIndex As _Unsigned Long, x As Single, y As Single, z As Single)
Dim v As vec3, t As tri
v.x = x: v.y = y: v.z = z
_MemPut m, m.OFFSET + TriIndex * Len(t) + Len(v) * VecIndex, v
End Sub
Sub setupModel (m As model, vshad As String, fshad As String)
Dim t As tri
Dim vs As _Unsigned Long, fs As _Unsigned Long
Dim o As _Offset
vs = glCreateShader&(GL_VERTEX_SHADER)
o = _Offset(vshad$)
glShaderSource vs, 1, _Offset(o), 0
Dim Shared As Long pax(pnt), pay(pnt), indx(px, py), dx(pnt), dy(pnt)
Dim Shared As Long dSqr(px, py), counter
Dim Shared As _Unsigned Long col(pnt)
For i = 1 To pnt
pax(i) = Int(Rnd * px)
pay(i) = Int(Rnd * py)
col(i) = _RGB32(Rnd * 180, (Rnd < .5) * -255, Rnd * 150, Rnd * 255)
dx(i) = Int(Rnd * 15) + 3
dy(i) = Int(Rnd * 11) - 5
Next
While _KeyDown(27) = 0
For x = 0 To px - 1
For y = 0 To py - 1
dSqr(x, y) = (pax(1) - x) * (pax(1) - x) + (pay(1) - y) * (pay(1) - y)
indx(x, y) = 1
Next
Next
For i = 2 To pnt
ly = py - 1
For x = pax(i) To 0 Step -1
If (scan(i, x, ly)) = 0 Then Exit For
Next x
For x = pax(i) + 1 To px - 1
If (scan(i, x, ly)) = 0 Then Exit For
Next
Next
For x = 0 To px - 1
For y = 0 To py - 1
sy = y
adjct = indx(x, y)
For y = y + 1 To py
If indx(x, y) <> adjct Then y = y - 1: Exit For
Next
Line (x, sy)-(x, y + 1), col(adjct), BF ' bplus added BF
Next
Next
For i = 1 To pnt
pax(i) = pax(i) + dx(i)
If pax(i) < 0 Then pax(i) = 0: dx(i) = -dx(i)
If pax(i) > px - 1 Then newPnt i
pay(i) = pay(i) + dy(i)
If pay(i) < 0 Then pay(i) = 0: dy(i) = -dy(i)
If pay(i) > py - 1 Then pay(i) = py - 1: dy(i) = -dy(i)
Next
Wend
Sub newPnt (i)
Dim a As Long
counter = counter + 1
pax(i) = 0
pay(i) = Int(Rnd * py)
a = Rnd * (255 - counter / 2)
If a < 4 Then a = 4 ' fix
col(i) = _RGB32(Rnd * 180, (Rnd < .5) * -255, Rnd * 150, a)
dx(i) = Int(Rnd * 15) + 3
dy(i) = Int(Rnd * 11) - 5
End Sub
Function scan (site As Integer, x As Integer, ly As Integer)
Dim As Integer ty
Dim As Long delt2, dsq
delt2 = (pax(site) - x) * (pax(site) - x)
For ty = 0 To ly
dsq = (pay(site) - ty) * (pay(site) - ty) + delt2
If dsq <= dSqr(x, ty) Then
dSqr(x, ty) = dsq
indx(x, ty) = site
scan = 1
End If
Next
End Function
Sorry Terry -- The refactoring I mentioned won't quite work in this case as you're making use of those X/Y variables in other places than just that _MEMPUT. You can still simplify the math somewhat though:
sWidth = 640' set screen width
sHeight = 480' set screen height
MaxDist = sWidth * sWidth + sHeight * sHeight ' maximum possible distance
Image = _NewImage(sWidth, sHeight, 32) ' output image
mImage = _MemImage(Image) ' output image memory
ScreenImage = _CopyImage(Image, 32) ' view screen
mScreen = _MemImage(ScreenImage) ' view screen memory Screen ScreenImage ' create view screen RandomizeTimer' seed RND generator For p = 0To TOTAL - 1' cycle through pane center points
Points(p).x = Rnd * sWidth ' random x location
Points(p).y = Rnd * sHeight ' random y location
Points(p).c = _RGB32(Rnd * 128 + 128, Rnd * 128 + 128, Rnd * 128 + 128) ' random color above 128, 128, 128
Points(p).xv = (Rnd - Rnd) * 3' random x velocity
Points(p).yv = (Rnd - Rnd) * 3' random y velocity Next p Do' begin animation loop $Checking:Off
y = 0' reset vertical counter Do' begin vertical loop
x = 0' reset horizontal counter
ty = mImage.OFFSET + y * sWidth * 4 Do' begin horizontal loop
NearestDist = MaxDist ' reset nearest distance seen
p = 0' reset point counter Do' begin point loop
dx = Points(p).x - x ' calculate distance from x to point x
dy = Points(p).y - y ' calculate distance from y to point y
Dist = dx * dx + dy * dy ' calculate hypotenuse distance If Dist < NearestDist Then' is this the nearest distance seen?
Nearest = p ' yes, mark this point as nearest
NearestDist = Dist ' set new nearest distance seen End If
p = p + 1' increment point counter Loop Until p = TOTAL ' leave when all points checked _MemPut mImage, ty + x * 4, Points(Nearest).c ' draw pixel on output image
x = x + 1' increment horizontal counter Loop Until x = sWidth ' leave when width of image reached
y = y + 1' increment vertical counter Loop Until y = sHeight ' leave when height of image reached _MemCopy mImage, mImage.OFFSET, mImage.SIZE To mScreen, mScreen.OFFSET ' copy image to view screen
p = 0' reset point counter Do' begin point update loop
dx = Points(p).x + Points(p).xv ' calculate new look ahead point x location
dy = Points(p).y + Points(p).yv ' calculate new look ahead point y location If dx < 0Or dx > sWidth - 1Then Points(p).xv = -Points(p).xv ' reverse vector if left/right side of image reached If dy < 0Or dy > sHeight - 1Then Points(p).yv = -Points(p).yv ' reverse vector if top/bottom side of image reached
Points(p).x = Points(p).x + Points(p).xv ' calculate new point x location
Points(p).y = Points(p).y + Points(p).yv ' calculate new point y location
p = p + 1' increment point counter Loop Until p = TOTAL ' leave when all points updated $Checking:On Loop Until_KeyDown(27) ' leave when ESC key pressed System' return to operating system
Const sWidth = 640' set screen width Const sHeight = 480' set screen height
MaxDist = sWidth * sWidth + sHeight * sHeight ' maximum possible distance
Image = _NewImage(sWidth, sHeight, 32) ' output image
mImage = _MemImage(Image) ' output image memory 'ScreenImage = _CopyImage(Image, 32) ' view screen 'mScreen = _MemImage(ScreenImage) ' view screen memory Screen Image ' create view screen RandomizeTimer' seed RND generator For p = 0To TOTAL - 1' cycle through pane center points
Points(p).x = Rnd * sWidth ' random x location
Points(p).y = Rnd * sHeight ' random y location
Points(p).c = _RGB32(Rnd * 128 + 128, Rnd * 128 + 128, Rnd * 128 + 128) ' random color above 128, 128, 128
Points(p).xv = (Rnd - Rnd) * 3' random x velocity
Points(p).yv = (Rnd - Rnd) * 3' random y velocity Next p Do' begin animation loop $Checking:Off
y = 0' reset vertical counter Do' begin vertical loop
x = 0' reset horizontal counter
ty = mImage.OFFSET + y * sWidth * 4 Do' begin horizontal loop
NearestDist = MaxDist ' reset nearest distance seen
p = 0' reset point counter Do' begin point loop
dx = Points(p).x - x ' calculate distance from x to point x
dy = Points(p).y - y ' calculate distance from y to point y
Dist = dx * dx + dy * dy ' calculate hypotenuse distance If Dist < NearestDist Then' is this the nearest distance seen?
Nearest = p ' yes, mark this point as nearest
NearestDist = Dist ' set new nearest distance seen End If
p = p + 1' increment point counter Loop Until p = TOTAL ' leave when all points checked _MemPut mImage, ty + x * 4, Points(Nearest).c ' draw pixel on output image
x = x + 1' increment horizontal counter Loop Until x = sWidth ' leave when width of image reached
y = y + 1' increment vertical counter Loop Until y = sHeight ' leave when height of image reached ' _MemCopy mImage, mImage.OFFSET, mImage.SIZE To mScreen, mScreen.OFFSET ' copy image to view screen
p = 0' reset point counter Do' begin point update loop
dx = Points(p).x + Points(p).xv ' calculate new look ahead point x location
dy = Points(p).y + Points(p).yv ' calculate new look ahead point y location If dx < 0_Orelse dx >= sWidth Then Points(p).xv = -Points(p).xv ' reverse vector if left/right side of image reached If dy < 0_Orelse dy >= sHeight Then Points(p).yv = -Points(p).yv ' reverse vector if top/bottom side of image reached
Points(p).x = Points(p).x + Points(p).xv ' calculate new point x location
Points(p).y = Points(p).y + Points(p).yv ' calculate new point y location
p = p + 1' increment point counter Loop Until p = TOTAL ' leave when all points updated $Checking:On Loop Until_KeyDown(27) ' leave when ESC key pressed System' return to operating system
Use of CONST for SWidth and SHeight, as those don't change and it saves the compiler from having to do a variable look up.
Use of _ORELSE instead of OR in those IF statements, so we can short circuit out it one side has been met, without checking the other.
Change of comparision from > sWidth -1 to >= sWidth (It's not a big change, but it all adds up one bit at a time).
Removal of second screen. What's that for? Just use the single screen and save that copy of the memory every cycle.
Refactor of the math, as per the post above to (ty + x * 4) as the offset for that _MEMPUT
If there's anything else out there to tweak, it's not showing up as something obvious to me, at the moment. Maybe it's enough to see a change on slower end computers. I dunno. Both with and without the changes, it performs reasonably well for me on my laptop.