Integer memory storage - Printable Version
QB64 Phoenix Edition (https://qb64phoenix.com/forum) - General Discussion
Thread: Integer memory storage (/showthread.php?tid=2152)
Integer memory storage - bobalooie - 11-11-2023

How does QB64PE handle RAM storage of the various integer data types on a 64-bit machine? Obviously _INTEGER64 is the natural integer size, but I am curious about the smaller integers.

RE: Integer memory storage - bplus - 11-11-2023

I was curious too, so:

Code: (Select All)
a% = 1

RE: Integer memory storage - SMcNeill - 11-12-2023

https://qb64phoenix.com/qb64wiki/index.php/TYPE <-- There's what you want, more or less.

An INTEGER is 2 bytes, so an array of DIM foo(10) AS INTEGER is 22 bytes. (From 0 to 10 is 11 elements, with each being 2 bytes in size. Don't forget that 0 element!)

_BITs are 1 byte. (Unless you're dealing with arrays, in which case they're bit-packed as tightly as possible, so DIM foo(100) AS _BIT is only _CEIL(101 / 8) bytes in size.)
_BYTEs are 1 byte.
INTEGERs are 2 bytes.
LONGs are 4 bytes.
_INTEGER64s are 8 bytes.

With that said, there's some background overhead associated with every variable in QB64, which tracks its offset in memory, what type it is, whether it's available or freed, etc., etc. From what I recall, without looking it up, I *believe* that behind-the-scenes data is 22 bytes per variable/array. So, if you're trying to micromanage memory usage down to the last byte your program uses, you'd have to account for that +22 per variable still in scope and valid. (Remember, SUBs and FUNCTIONs clean up after themselves and free those variables and all that background data as well, so those are just temporary memory users.)
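Those per-type sizes are easy to confirm from inside a program, since LEN reports the byte size of a variable. A minimal sketch (the variable names are just illustrative; the 22-byte bookkeeping overhead mentioned above is internal and not visible to LEN):

Code: (Select All)
' Minimal sketch: LEN returns the size in bytes of a variable's type
DIM b AS _BYTE, i AS INTEGER, l AS LONG, i64 AS _INTEGER64
PRINT "_BYTE:", LEN(b) '       1
PRINT "INTEGER:", LEN(i) '     2
PRINT "LONG:", LEN(l) '        4
PRINT "_INTEGER64:", LEN(i64) '8

DIM foo(10) AS INTEGER
' For an array, LEN on one element gives the element size; the data size is
' elements * element size (11 * 2 = 22 bytes for foo, matching the post above)
PRINT "foo() element:", LEN(foo(0)), "total:", (UBOUND(foo) - LBOUND(foo) + 1) * LEN(foo(0))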
RE: Integer memory storage - mnrvovrfc - 11-12-2023

Array of bits. I gather an array of 2-byte integers could be handled the same way, as long as the size is a multiple of 32. I don't know if things still depend on a single data item being dropped directly into a CPU register. They should; if not, then why did FreeBASIC's developers redefine the INTEGER keyword as a 64-bit integer? (shrugs)

I learned that when declaring UDTs on a 16-bit system, things run "more efficiently" if the total byte size is a multiple of eight. So it had me declaring something like this as the final element:

Code: (Select All)
FILLER AS STRING * 77

I don't know, but today that looks foolish to me.

RE: Integer memory storage - SMcNeill - 11-12-2023

(11-12-2023, 01:05 AM)mnrvovrfc Wrote: Array of bits. I gather an array of 2-byte integers could be handled the same way, as long as the size is a multiple of 32. I don't know if things still depend on a single data item being dropped directly into a CPU register. They should; if not, then why did FreeBASIC's developers redefine the INTEGER keyword as a 64-bit integer? (shrugs)

It often has to do with your OS and data packing/alignment. On 32-bit OSes, data aligns on 4-byte intervals. On 64-bit OSes, data aligns on 8-byte intervals. The reason is simple -- that's how many bytes one register holds, so the CPU just grabs your data and uses it, rather than having to split it up into multiple reads.

For example:

TYPE Foo
    x AS INTEGER
    y AS LONG
    z AS STRING * 3
END TYPE

Now, on a 32-bit OS, that data tends to be stored as:

TYPE Foo
    x AS INTEGER
    FILLER AS STRING * 2
    y AS LONG
    z AS STRING * 3
    FILLER AS STRING * 1
END TYPE

Now, with one read from memory, you can get the data for x, y, or z by reading a 4-byte chunk of information and making use of the portion you want. Without that padding, you'd need multiple reads to get your data: if you wanted info on y, you'd have to read the 4-byte chunk holding x plus the first half of y, read the next chunk holding the rest of y plus part of z, and then stitch those bytes together to make y... and that's a lot more reading and assembling and processing, and it's just a mess.

On a 64-bit OS, that type tends to be aligned on 8-byte boundaries, such as:

TYPE Foo
    x AS INTEGER
    y AS LONG
    FILLER AS STRING * 2
    z AS STRING * 3
    FILLER AS STRING * 5
END TYPE

One pass gets the element in question, without any of that grab-a-chunk-here, grab-a-chunk-there mess involved.

*****************************

Note: The reason INTEGER is 2 bytes is that DOS -- which QB45 and BASIC were written for back in the day -- was a 16-bit operating system. The FreeBASIC devs decided they didn't need to keep backwards compatibility so much, as they wanted to move forward with modern features, so they apparently redefined INTEGER to be the size of what we'd consider a modern integer on our systems -- 64-bit for most modern OSes. QB64's goal has always been backwards compatibility as much as possible, so we still keep the INTEGER type at 2 bytes, just as QB45 did.

RE: Integer memory storage - bobalooie - 11-12-2023

(11-12-2023, 12:49 AM)SMcNeill Wrote: So, if you're trying to micromanage memory usage down to the last byte your program uses, you'd have to account for that +22 per variable still in scope and valid.

I'm not interested in micromanaging memory; nothing that I will ever write with PE will challenge the capacity of today's computers. I'm working on an electrical analysis program that needs some internal flags, and I got to wondering how PE internally handles bits and bytes when the CPU is 64 bits wide. My programming history goes all the way back to the IBM 1130 (16-bit words, 16K of core memory, FORTRAN on punch cards) and 8-bit micros, so I suppose I am always a bit conscious of data storage concerns.

RE: Integer memory storage - DSMan195276 - 11-13-2023

(11-12-2023, 01:19 AM)SMcNeill Wrote: On 32-bit OSes, data aligns on 4-byte intervals. On 64-bit OSes, data aligns on 8-byte intervals.

The key is not the CPU bitness but rather the cacheline size, which is typically 32 or 64 bytes (on modern-ish CPUs). The CPU reads memory from RAM in cacheline-sized chunks, so if your data is within a single cacheline then it only requires one read. Additionally (on x86), as long as the data is within a single cacheline it doesn't matter what the alignment of the data is; that doesn't impact the speed. Point being, it's good to align your data since it ensures it will always start on a cacheline, but you don't actually need to get more granular than 32 bytes to achieve that.

For inserting padding, the question is more about the alignment requirements of the individual types, which are the same on 32-bit vs. 64-bit and typically just the size of the type. Like @mnrvovrfc mentioned, though, UDTs don't actually get any padding added, so you can make them significantly more efficient by ensuring their size is a multiple of 32 bytes.
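Putting the last two posts together, here is a minimal sketch of checking a UDT's packed size with LEN and padding it out by hand. It assumes QB64PE adds no padding of its own, as described above; PaddedFoo and the 23-byte FILLER are just illustrative choices aiming at a 32-byte multiple:

Code: (Select All)
' Minimal sketch: check a UDT's packed size with LEN, then pad it manually.
' Assumes QB64PE adds no padding of its own, as described above.
TYPE Foo
    x AS INTEGER '    2 bytes
    y AS LONG '       4 bytes
    z AS STRING * 3 ' 3 bytes
END TYPE

TYPE PaddedFoo
    x AS INTEGER
    y AS LONG
    z AS STRING * 3
    FILLER AS STRING * 23 ' 9 + 23 = 32 bytes, one full cacheline-sized multiple
END TYPE

DIM f AS Foo, p AS PaddedFoo
PRINT "Foo:"; LEN(f) '       expected 9 if no padding is added
PRINT "PaddedFoo:"; LEN(p) ' expected 32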
RE: Integer memory storage - bobalooie - 11-14-2023

(11-13-2023, 05:05 AM)DSMan195276 Wrote: The key is not the CPU bitness but rather the cacheline size, which is typically 32 or 64 bytes (on modern-ish CPUs). The CPU reads memory from RAM in cacheline-sized chunks, so if your data is within a single cacheline then it only requires one read.

So, if I have a program that uses two _BIT variables as flags, I suppose each variable would occupy its own cacheline in memory.

RE: Integer memory storage - DSMan195276 - 11-14-2023

(11-14-2023, 04:13 PM)bobalooie Wrote: So, if I have a program that uses two _BIT variables as flags, I suppose each variable would occupy its own cacheline in memory.

No, likely the opposite. Very generally speaking, your variables get packed into memory one after another (subject to any alignment requirements). If you declare 8 LONG variables in a row, they will likely end up as one 32-byte chunk of data in memory (for arrays this is always the case). It's not quite that simple due to some things QB64 does, but that's roughly accurate.

The CPU then reads from RAM in cacheline-sized chunks, so all the variables within those 32-byte or 64-byte ranges get read together. If your LONGs are next to each other, then it will read 8 or 16 of them in one go (each LONG is 4 bytes, so 8 of them is 32 bytes). Whether that actually happens depends on which cachelines those individual variables fall into, though, as the cachelines start and end at defined intervals. So while every LONG will be in only one cacheline (due to its alignment), you might end up with 3 LONGs at the end of one cacheline and then 5 LONGs at the start of the next cacheline.

Ex. The cachelines are bytes 0-31, 32-63, 64-95, 96-127, etc. If you declare 2 LONG variables, they might get placed at bytes 8-11 and 12-15, which means they're both in the first cacheline and get read from RAM together. It's also possible they will get placed at 60-63 and 64-67; in that situation they're still next to each other but in different cachelines, and thus wouldn't be read from memory at the same time.
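To see where QB64PE actually placed two variables, one can read their offsets through _MEM and check whether they land in the same cacheline. A minimal sketch, assuming a 64-byte cacheline; the variable names and the line size are illustrative, and the _OFFSET values are copied into _UNSIGNED _INTEGER64 so they can be divided and printed:

Code: (Select All)
' Minimal sketch: print which (assumed 64-byte) cacheline two LONGs fall into.
DIM a AS LONG, b AS LONG
DIM ma AS _MEM, mb AS _MEM
ma = _MEM(a): mb = _MEM(b) ' _MEM blocks pointing at each variable

DIM offA AS _UNSIGNED _INTEGER64, offB AS _UNSIGNED _INTEGER64
offA = ma.OFFSET: offB = mb.OFFSET ' copy the _OFFSET values into plain integers

PRINT "a at"; offA; "-> cacheline"; offA \ 64
PRINT "b at"; offB; "-> cacheline"; offB \ 64
IF offA \ 64 = offB \ 64 THEN PRINT "Same cacheline" ELSE PRINT "Different cachelines"

_MEMFREE ma: _MEMFREE mb

Whether the two report the same line simply depends on where the 64-byte boundaries happen to fall relative to where QB64 placed them, which is the point being made above.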