mirror of
https://github.com/boostorg/uuid.git
synced 2026-01-19 04:42:16 +00:00
* Re-format the test code for MSVC bug 981648.

* Improved generated x86 code for SSE4.1 and later targets.

  Prefer movdqu to lddqu on CPUs supporting SSE4.1 and later. lddqu has
  one extra cycle of latency on Skylake and later Intel CPUs, and with AVX,
  vlddqu is not merged into the following instructions as a memory operand,
  which makes the code slightly larger.

  Legacy SSE3 lddqu is still preferred when SSE4.1 is not enabled because
  it is faster on Prescott and the same as movdqu on AMD CPUs. It also
  doesn't affect code size, because in legacy SSE movdqu cannot be folded
  into a memory operand: SSE memory operands are required to be aligned.

  Closes https://github.com/boostorg/uuid/issues/137.

* Use movdqu universally for loading UUIDs.

  This effectively drops the optimization for NetBurst CPUs and instead
  prefers code that is slightly better on Skylake and later Intel CPUs,
  even when the code is compiled for SSE3 and not SSE4.1.
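The load-selection policy described in the commit message can be sketched roughly as follows. This is an illustration only, not the library's actual code: `sketch_load_uuid`, `sketch_uuid_equal`, and the `SKETCH_PREFER_LDDQU` macro are hypothetical names, and the sketch assumes an x86 target with at least SSE2 enabled. Only the intrinsics `_mm_loadu_si128` (movdqu) and `_mm_lddqu_si128` (lddqu) are real.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>   // SSE2: _mm_loadu_si128 (movdqu), _mm_cmpeq_epi8, _mm_movemask_epi8

// SKETCH_PREFER_LDDQU is a hypothetical opt-in macro used only in this sketch
// to show the intermediate policy (lddqu for SSE3-only builds).
#if defined(__SSE3__) && !defined(__SSE4_1__) && defined(SKETCH_PREFER_LDDQU)
#include <pmmintrin.h>   // SSE3: _mm_lddqu_si128 (lddqu)
#define SKETCH_USE_LDDQU 1
#endif

// Hypothetical helper: load 16 possibly unaligned UUID bytes into a register.
inline __m128i sketch_load_uuid(const std::uint8_t* data) noexcept
{
    assert(data != nullptr);
#if defined(SKETCH_USE_LDDQU)
    // Intermediate policy: lddqu when only SSE3 is enabled.
    return _mm_lddqu_si128(reinterpret_cast<const __m128i*>(data));
#else
    // Final policy from the last bullet: movdqu on every target.
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(data));
#endif
}

// Hypothetical helper: compare two 16-byte UUIDs byte-wise.
inline bool sketch_uuid_equal(const std::uint8_t* a, const std::uint8_t* b) noexcept
{
    const __m128i eq = _mm_cmpeq_epi8(sketch_load_uuid(a), sketch_load_uuid(b));
    // All 16 byte lanes equal means all 16 mask bits are set.
    return _mm_movemask_epi8(eq) == 0xFFFF;
}
```

Without the opt-in macro, the sketch always takes the `_mm_loadu_si128` path, which mirrors the "movdqu universally" choice. Both intrinsics accept unaligned addresses, so the caller does not need to align the UUID bytes.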
1.2 KiB