11-13-2023, 05:05 AM
(11-12-2023, 01:19 AM)SMcNeill Wrote: On 32-bit OSes, data aligns on 4-byte intervals. On 64-bit OSes, data aligns on 8-byte intervals.
The key is not the CPU bitness but rather the cacheline size, which is typically 32 or 64 bytes (on modernish CPUs). The CPU reads memory from RAM in cacheline-sized chunks, so if your data sits within a single cacheline, it only requires one read. Additionally (on x86), as long as the data is within a single cacheline, its alignment doesn't impact the speed. Point being, it's good to align your data, since that ensures it never straddles a cacheline boundary, but you don't need to align to anything larger than 32 bytes to achieve that.
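To make the "within a single cacheline" point concrete, here is a rough sketch in Python (not QB64) that counts how many cachelines a value overlaps. The function name `cachelines_touched` and the 64-byte default line size are my own assumptions for illustration; real CPUs may use a different line size.

```python
def cachelines_touched(address: int, size: int, line_size: int = 64) -> int:
    """Count the cachelines overlapped by `size` bytes starting at `address`.

    `line_size` defaults to 64 bytes, which is an assumption, not a given.
    """
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return last_line - first_line + 1


# An 8-byte value at an odd (unaligned) address still sits inside one line.
print(cachelines_touched(3, 8))    # 1

# The same value placed near the end of a line spills into the next one.
print(cachelines_touched(60, 8))   # 2

# An 8-byte value aligned to 8 bytes can never straddle a 64-byte line.
print(all(cachelines_touched(a, 8) == 1 for a in range(0, 4096, 8)))  # True
```

The last check is why aligning to the size of the type is enough: a naturally aligned value always lands inside one line, so it costs one read.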
For inserting padding, the question is more about the alignment requirements of the individual types, which are largely the same on 32-bit and 64-bit and are typically just the size of the type. Like @mnrvovrfc mentioned though, UDTs don't actually get any padding added. Because of that, you can make arrays of them significantly more efficient by ensuring the UDT's size is a multiple of 32 bytes, so that no element is split across two cachelines.
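As a quick way to see the effect of the UDT size, the sketch below (Python again, not QB64) counts how many elements of an array cross a cacheline boundary. It assumes the array itself starts on a cacheline boundary and that lines are 64 bytes; the helper names `pad_to_multiple` and `straddling_elements` and the 24-byte record size are made up for the example.

```python
def pad_to_multiple(size: int, multiple: int = 32) -> int:
    """Round `size` up to the next multiple of `multiple`."""
    return -(-size // multiple) * multiple


def straddling_elements(elem_size: int, count: int, line_size: int = 64) -> int:
    """Count array elements that cross a cacheline boundary.

    Assumes the array base address is cacheline-aligned (offset 0).
    """
    crossing = 0
    for i in range(count):
        start = i * elem_size
        end = start + elem_size - 1
        if start // line_size != end // line_size:
            crossing += 1
    return crossing


unpadded = 24                          # hypothetical UDT size with no padding
padded = pad_to_multiple(unpadded)     # rounded up to 32 bytes

print(padded)                              # 32
print(straddling_elements(unpadded, 64))   # 16
print(straddling_elements(padded, 64))     # 0
```

So with a 24-byte record, a quarter of the elements need two reads, while the 32-byte version needs one read for every element. In a QB64 TYPE the padding would presumably be a filler field (for instance a fixed-length STRING) that brings the record up to 32 bytes, at the cost of some extra memory.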