Posts: 1,272
Threads: 119
Joined: Apr 2022
Reputation:
100
I'm writing a few image manipulation routines and need them to be as fast as possible. I'm using bit manipulation to extract pixel color information like so:
(all of the variables have been declared properly beforehand)
_MEMGET m, o, p ' get pixel at offset within image memory block
a = 4278190080 AND p ' extract alpha
r = 16711680 AND p ' extract red
g = 65280 AND p ' extract green
b = 255 AND p ' get blue level (0 to 255)
a = _SHR(a, 24) ' get alpha level (0 to 255) (divide by 16777216)
r = _SHR(r, 16) ' get red level (0 to 255) (divide by 65536)
g = _SHR(g, 8) ' get green level (0 to 255) (divide by 256)
I need some insight from the developers regarding _ALPHA32(), _RED32(), _GREEN32(), and _BLUE32(). Is my bit manipulation faster than using these statements or are they handled internally using the same method (or an even faster method I'm unaware of)?
Loving the memory manipulation statements by the way. Man, have I been missing out by not using them regularly.
New to QB64pe? Visit the QB64 tutorial to get started.
QB64 Tutorial
Posts: 303
Threads: 10
Joined: Apr 2022
Reputation:
44
The functions alpha/red/green/blue work the same way as what you're doing and will be equivalent in speed. The only minor difference is that your `AND` for the alpha value is unnecessary as all the values you're `AND`ing away will be shifted away by the `_SHR(a, 24)`, so because of that `_ALPHA32()` only does a shift.
That said, the speed is only equivalent if you inline that logic everywhere you use it. If you plan to put the logic into a QB64 `FUNCTION`, then there is overhead to calling QB64 functions that the built-in functions do not have. In absolute time it could easily be 100x slower, simply because the `FUNCTION` would do so little actual work.
I would also note that the types of the variables matter for this question. Your code only works if the types are unsigned, because `_SHR` will do 64-bit sign extension if the original value is a signed negative. In contrast, the QB64 functions will work correctly regardless of the signed/unsigned nature of the original value (really this only matters for `_ALPHA32()`, since the other colors are always positive values after the `AND`).
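As a minimal sketch of the points above (not part of the original posts), the inline extraction can be written with unsigned variables throughout and compared against the built-in functions. The colour value and the variable names are arbitrary examples.

```qb64
' Sketch: inline channel extraction versus the built-in functions.
' Every variable is unsigned so that _SHR never sees a negative value.
DIM p AS _UNSIGNED LONG
DIM a AS _UNSIGNED LONG, r AS _UNSIGNED LONG
DIM g AS _UNSIGNED LONG, b AS _UNSIGNED LONG

p = _RGBA32(10, 20, 30, 200) ' red 10, green 20, blue 30, alpha 200 (&HC80A141E)

a = _SHR(p, 24) ' no AND needed: the shift discards the lower 24 bits
r = _SHR(p, 16) AND 255 ' shift first, then mask off the alpha bits
g = _SHR(p, 8) AND 255
b = p AND 255

PRINT "inline:   "; a; r; g; b
PRINT "built-in: "; _ALPHA32(p); _RED32(p); _GREEN32(p); _BLUE32(p)
```

Both PRINT lines should show the same four values. If `p` were declared as a signed `LONG` instead, the alpha shift is the one to watch, for the sign-extension reason described above.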
Posts: 1,272
Threads: 119
Joined: Apr 2022
Reputation:
100
(01-28-2024, 06:51 AM)DSMan195276 Wrote: The functions alpha/red/green/blue work the same way as what you're doing and will be equivalent in speed. [...]
Excellent, thank you for the quick feedback. I'll use the built-in functions then, since they are already as fast as can be and, as a bonus, handle types better than I was.
Posts: 2,698
Threads: 327
Joined: Apr 2022
Reputation:
217
A better way may be to just get values directly.
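$CHECKING:OFF
Alpha = _MEMGET(m, o + 3, _UNSIGNED _BYTE) ' 4th byte of the pixel
Red = _MEMGET(m, o + 2, _UNSIGNED _BYTE) ' 3rd byte
Green = _MEMGET(m, o + 1, _UNSIGNED _BYTE) ' 2nd byte
Blue = _MEMGET(m, o, _UNSIGNED _BYTE) ' 1st byte
$CHECKING:ON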
No shifting, no calculations. Just a direct value read.
(Which then goes to the speed question: is it faster to just read 4 values directly, or to only read one and then math it?)
I'd *think* a simple direct read would be faster, but with caching and all nowadays... I'd swear to nothing!
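One way to settle the question on a given machine is to time both approaches. The following is only a rough sketch under stated assumptions: the buffer size, the fill colour, and every variable name are made up for illustration, and no particular result is claimed.

```qb64
' Rough timing sketch: one 32-bit read plus the built-in functions
' versus four direct byte reads. PIXELS and the colour are arbitrary.
CONST PIXELS = 1000000
DIM m AS _MEM
DIM o AS _OFFSET
DIM n AS LONG
DIM p AS _UNSIGNED LONG
DIM a AS _UNSIGNED _BYTE, r AS _UNSIGNED _BYTE
DIM g AS _UNSIGNED _BYTE, b AS _UNSIGNED _BYTE
DIM sum1 AS _UNSIGNED _INTEGER64, sum2 AS _UNSIGNED _INTEGER64
DIM t AS DOUBLE

m = _MEMNEW(PIXELS * 4) ' stand-in for an image memory block
p = _RGBA32(10, 20, 30, 200)
_MEMFILL m, m.OFFSET, m.SIZE, p ' fill the block with one colour

$CHECKING:OFF
t = TIMER(.001)
o = m.OFFSET
FOR n = 1 TO PIXELS
    _MEMGET m, o, p ' one read of the whole pixel
    sum1 = sum1 + _ALPHA32(p) + _RED32(p) + _GREEN32(p) + _BLUE32(p)
    o = o + 4
NEXT n
PRINT "one read + built-ins:"; TIMER(.001) - t; "seconds"

t = TIMER(.001)
o = m.OFFSET
FOR n = 1 TO PIXELS
    _MEMGET m, o + 3, a ' four separate byte reads
    _MEMGET m, o + 2, r
    _MEMGET m, o + 1, g
    _MEMGET m, o, b
    sum2 = sum2 + a + r + g + b
    o = o + 4
NEXT n
PRINT "four byte reads:     "; TIMER(.001) - t; "seconds"
$CHECKING:ON

_MEMFREE m
PRINT sum1; sum2 ' the sums only exist so the loops do real work
```

The timings will vary with the compiler settings and hardware, so it is worth running each loop several times before drawing a conclusion.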
Posts: 1,272
Threads: 119
Joined: Apr 2022
Reputation:
100
(01-28-2024, 07:16 AM)SMcNeill Wrote: A better way may be to just get values directly. [...]
Leave it to Steve to think even deeper. Very interesting, and of course so simple that I completely overlooked it.
One thing I noticed here is that, because of the way you are reading offset values, it appears that the bytes are ordered as BGRA. Is that correct, Steve? Were my bit manipulation efforts reading in the data backwards?
Posts: 2,698
Threads: 327
Joined: Apr 2022
Reputation:
217
BGRA is correct. If you haven't seen my videos on _MEM, they cover all that for us in the Little Endians and Big Chiefs video.
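A small sketch (not from the original posts) can make the byte order visible. It assumes a little-endian host, and the colour and variable names are arbitrary. It writes one 32-bit pixel into a scratch memory block and then reads it back both ways.

```qb64
' Sketch: how one 32-bit colour value is laid out in memory.
' Assumes a little-endian machine.
DIM m AS _MEM
DIM p AS _UNSIGNED LONG
DIM i AS LONG

m = _MEMNEW(4) ' room for a single 32-bit pixel
p = _RGBA32(10, 20, 30, 200) ' red 10, green 20, blue 30, alpha 200
_MEMPUT m, m.OFFSET, p

' Byte by byte the order in memory is blue, green, red, alpha.
FOR i = 0 TO 3
    PRINT "byte"; i; "="; _MEMGET(m, m.OFFSET + i, _UNSIGNED _BYTE)
NEXT i

' Read back as one _UNSIGNED LONG, the value is &HAARRGGBB again,
' so red still sits in bits 16 to 23.
_MEMGET m, m.OFFSET, p
PRINT "red from the whole value ="; _SHR(p, 16) AND 255

_MEMFREE m
```

In other words, shifting and masking a whole `_UNSIGNED LONG` and reading individual bytes are two views of the same data. The BGRA order only matters once the bytes are addressed one at a time.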
Posts: 1,272
Threads: 119
Joined: Apr 2022
Reputation:
100
(01-28-2024, 05:32 PM)SMcNeill Wrote: BGRA is correct. If you haven't seen my videos on _MEM, they cover all that for us in the Little Endians and Big Chiefs video.
Ok, I need to do some viewing then. That explains some anomalies I was getting, LOL.
Posts: 2,698
Threads: 327
Joined: Apr 2022
Reputation:
217
It all breaks down to basically two ways to write a number: Little Endian vs Big Endian.
In binary, what value is: 00000001?
It's *either* 1 *OR* 128!
Little endian stores values in ascending bit power: 2 ^ 0, 2 ^ 1, 2 ^ 2... to 2 ^ 7.
Big endian stores values in descending bit power: 2 ^ 7, 2 ^ 6, 2 ^ 5... to 2 ^ 0.
It's all about how you read/write the data.
In English, we read left to right, top to bottom.
In Arabic, it's read right to left, top to bottom.
In Japanese, they read top to bottom, right to left.
None is any better than the other -- all those languages end up storing the same data and information-- but it's important to know how to read and get that information successfully.
Little Endian and Big Endian is that same basic concept: is it left to right, or right to left? (Strictly speaking, endianness describes the order of the bytes within a multi-byte value rather than the bits within a byte, but the idea carries over.)
In this case, it's BGRA. In some other apps/programs, you may find ARGB, which is why you always need a manual, a reference, or the old barefoot hobo senior programmer around to tell you which format to use, while he sips on his coffee, lounges lazily, eats a doughnut, and earns three times what everyone else in the office does.
Posts: 2,698
Threads: 327
Joined: Apr 2022
Reputation:
217
01-28-2024, 06:30 PM
(This post was last modified: 01-28-2024, 06:32 PM by SMcNeill.)
The best non-technical way I've come up with over the years to help illustrate the difference between Little Endian and Big Endian involves a simple classroom exercise. If you're still teaching (I don't know if you've retired yet, or semi-retired and just substitute, or whatnot), the next time you go into a computer classroom, ask your students to stand up and count to 5 on their fingers.
The vast majority will start with their right index finger, then their right middle finger, and so on...
But there'll be two kids in the class who start with the left index finger, then left middle finger, and so on...
And then there's always that one kid in class who holds up his leg, then points his ankle towards the ceiling....
Right-handed people vs left-handed people vs proprietary software.
None are particularly *wrong*, but if you want your video recognition software to understand what they're counting, then you have to set it up to read and understand it in the format that those users provide the data. (And that's why APPLE's stuff is such a PITA to work with. Steve Jobs was that oddball leg-raising guy who had his own proprietary way of trying to do everything.)
Posts: 1,272
Threads: 119
Joined: Apr 2022
Reputation:
100
Thanks for explaining this. I understand little vs big endian from my assembler dabbling days with the Z80, 6809e, 6502, and x86, with each processor handling bit presentation differently. I assumed RGB was laid out in memory in the same order that the QB64PE commands use. I don't know why this didn't cross my mind when I was getting some strange results in final images. Age, I guess.
Your classroom exercise would have been great to use. I left teaching back in 2018 unfortunately.
I incorporated your suggestion into the following subroutine that converts an image to gray scale. Works well.
Code:
SUB __GrayScale (i AS LONG)
    ' Converts an image to gray scale without affecting original alpha values.
    ' Uses ITU-R 601-2 Luma Transformation (L = R * 299/1000 + G * 587/1000 + B * 114/1000)
    '
    ' i - the image to work with

    DIM m AS _MEM ' memory block holding image data
    DIM o AS _OFFSET ' 4 byte pixel location within memory block
    DIM t AS _OFFSET ' total number of pixels in image
    DIM a AS _UNSIGNED _BYTE ' alpha level of each pixel
    DIM r AS _UNSIGNED _BYTE ' red value of each pixel
    DIM g AS _UNSIGNED _BYTE ' green value of each pixel
    DIM b AS _UNSIGNED _BYTE ' blue value of each pixel
    DIM p AS _UNSIGNED LONG ' new grayscale pixel value

    $CHECKING:OFF
    m = _MEMIMAGE(i) ' set memory block to image data
    t = m.SIZE \ 4 ' calculate number of pixels in image
    o = m.OFFSET ' start address of memory block
    DO ' begin pixel count
        _MEMGET m, o + 3, a ' get alpha (0 to 255)
        _MEMGET m, o + 2, r ' get red (0 to 255)
        _MEMGET m, o + 1, g ' get green (0 to 255)
        _MEMGET m, o, b ' get blue (0 to 255)
        p = INT(r * .299 + g * .587 + b * .114) ' calculate gray level
        p = _RGBA32(p, p, p, a) ' create new pixel
        _MEMPUT m, o, p ' update pixel in memory block
        o = o + 4 ' next 4 byte pixel location in memory block
        t = t - 1 ' decrement pixels remaining
    LOOP UNTIL t = 0 ' leave when no pixels remain
    _MEMFREE m ' free memory block
    $CHECKING:ON
END SUB