03-11-2024, 06:21 AM
Exactly what @a740g said. The likely issue is that no optimization is applied to the generated source (unless you enable it in the `Compiler Settings`), so the compiler does not inline the `_SHL` function. That makes it significantly slower than `* 4`, which is simply turned into a single shift instruction. You can see it in the disassembly: the first highlighted line is the shift generated for `y = x * 4`, and the second highlighted line is the call to the `_SHL` function for `_SHL(x, 2)`.
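If you want to reproduce the effect outside of QB64, here is a minimal C++ sketch of the same situation. Note that `shl_helper`, `times_four` and `shift_by_two` are made-up names for illustration, not the actual QB64 runtime source, and the `noinline` attribute (GCC/Clang syntax) is only there to imitate what an unoptimized build does with a small helper function.

```cpp
#include <cstdint>

// Hypothetical stand-in for a runtime shift helper (NOT QB64's real code).
// noinline forces a real call, which is roughly what happens to any small
// function when the compiler runs without optimization.
__attribute__((noinline)) std::int64_t shl_helper(std::int64_t value, int count) {
    // Shift in the unsigned domain to avoid undefined behaviour on negatives.
    return static_cast<std::int64_t>(static_cast<std::uint64_t>(value) << count);
}

// Multiplying by a constant power of two: compilers typically emit a single
// shift (or lea) here, even without optimization.
std::int64_t times_four(std::int64_t x) {
    return x * 4;
}

// Same result, but routed through the helper: with inlining disabled this
// costs argument setup, a call and a return on top of the shift itself.
std::int64_t shift_by_two(std::int64_t x) {
    return shl_helper(x, 2);
}
```

Compiling this with something like `g++ -O0 -S` and then again with `-O2` (after removing the `noinline` attribute) should show the call in `shift_by_two` disappearing once optimization is on, at which point both functions are likely to end up as the same shift.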