mirror of https://github.com/boostorg/uuid.git synced 2026-01-19 04:42:16 +00:00
Commit Graph

715 Commits

Author SHA1 Message Date
Peter Dimov
095a47fe04 Change the type of pos to std::ptrdiff_t to avoid casts 2026-01-08 22:57:00 +02:00
Peter Dimov
517bfe6972 Include BoostTestJamfile in test/CMakeLists.txt 2026-01-08 18:58:47 +02:00
Peter Dimov
ea77c34803 Make uuid_from_string constexpr 2026-01-08 18:52:56 +02:00
Peter Dimov
d9d396a666 Add test_uuid_from_string_cx.cpp 2026-01-08 18:52:29 +02:00
Peter Dimov
6d5420b09b Add boost_test_jamfile to CMakeLists.txt 2026-01-08 18:40:18 +02:00
Peter Dimov
29c5f57c95 Define BOOST_UUID_REPORT_IMPLEMENTATION in test .cpp files instead of Jamfile 2026-01-08 18:19:15 +02:00
Peter Dimov
d493f92dcd Add uuid_from_string 2026-01-08 16:59:10 +02:00
Peter Dimov
c9f5a8f028 Merge pull request #188 from Lastique/feature/fix_conversion_warning
Fix `-Wsign-conversion` warning with gcc 13.3
2026-01-07 20:49:44 +02:00
Andrey Semashev
663ffa1287 Make cast formatting consistent with the rest of the file. 2026-01-07 20:13:18 +03:00
Andrey Semashev
286b66b385 Fix -Wsign-conversion warning with gcc 13.3. 2026-01-07 20:08:43 +03:00
Peter Dimov
347258c6c8 When BOOST_UUID_NO_SIMD is defined, undef all other SIMD macros, because otherwise the configuration becomes inconsistent 2026-01-06 16:32:16 +02:00
Peter Dimov
e567490082 Fix BOOST_UUID_REPORT_IMPLEMENTATION messages 2026-01-06 16:26:35 +02:00
Peter Dimov
974f7f7387 Add a job with /arch:SSE4.2 that defines BOOST_UUID_USE_SSE41 to ci.yml 2026-01-06 15:48:08 +02:00
Peter Dimov
029527c109 Disable -Wconversion for GCC 5 in test_hash_value.cpp 2026-01-05 20:27:38 +02:00
Peter Dimov
bb5f471431 Switch ARM64 and S390x jobs to 22.04 2026-01-05 19:02:03 +02:00
Peter Dimov
22660ac3ca Cosmetic fixes to configuration.adoc 2026-01-05 18:53:49 +02:00
Peter Dimov
c7caa5b94e Update revision history 2026-01-05 18:52:42 +02:00
Peter Dimov
2b387b4638 Avoid -Wsign-conversion warnings in from_chars_x86.hpp 2026-01-05 17:56:10 +02:00
Peter Dimov
f852d61bee Avoid -Wsign-conversion warnings in to_chars_x86.hpp 2026-01-05 17:44:32 +02:00
Peter Dimov
2b5d078c00 Disable -Wshadow in test_bench_random.cpp for GCC 4.x because of Boost.Timer 2026-01-05 16:54:55 +02:00
Peter Dimov
0b86240e0d Avoid -Wconversion warning in time_generator_v6.hpp under GCC 5 and below 2026-01-05 16:54:55 +02:00
Peter Dimov
eabd000a54 Avoid -Wconversion, -Wsign-conversion warnings in detail/basic_name_generator.hpp under GCC 9 and below 2026-01-05 16:54:55 +02:00
Peter Dimov
520b3632f3 Avoid -Wsign-conversion warning in detail/basic_name_generator.hpp when wchar_t is int32_t 2026-01-05 16:54:55 +02:00
Peter Dimov
da2cb7cc02 Avoid -Wshadow warning in test_tagging.cpp 2026-01-05 16:54:55 +02:00
Peter Dimov
09a6a81b6b Avoid -Wconversion warning in detail/md5.hpp 2026-01-05 16:54:55 +02:00
Peter Dimov
7340079ffd Avoid -Wsign-conversion warnings in test_time_generator_v7_2.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
b92abf895f Avoid -Wsign-conversion warnings in test_to_chars.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
40b12ae256 Disable -Wsign-conversion in test_random_generator.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
2ce9519afc Avoid -Wsign-conversion warning in detail/sha1.hpp 2026-01-05 16:54:54 +02:00
Peter Dimov
fd167bba0d Avoid -Wsign-conversion warning in time_generator_v1.hpp 2026-01-05 16:54:54 +02:00
Peter Dimov
a835ddff90 Avoid -Wsign-conversion warning in time_generator_v7.hpp 2026-01-05 16:54:54 +02:00
Peter Dimov
91ffab27d2 Enable stricter warnings (matching Unordered) in test/Jamfile.v2 2026-01-05 16:54:54 +02:00
Peter Dimov
db92124922 Reorder includes 2026-01-05 16:02:11 +02:00
Peter Dimov
0038762216 Merge pull request #186 from Lastique/feature/from_chars_simd
Add SIMD implementation of `from_chars`
2026-01-05 15:53:07 +02:00
Andrey Semashev
0e23b235fc Use load/store helpers from endian.hpp in to/from_chars_x86.hpp.
The load/store helpers use memcpy internally, which is a more correct
way to load and store integers from/to unaligned memory and with
potential type punning. In particular, it should silence UBSAN errors
about unaligned memory accesses in SIMD algorithms.
2026-01-05 14:13:10 +03:00
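
The memcpy-based load/store idea described in the commit above can be illustrated with a minimal sketch. The helper names `load_u32_native` and `store_u32_native` are hypothetical and are not the functions defined in the library's endian.hpp; the sketch only shows why copying through memcpy avoids dereferencing a misaligned or type-punned pointer.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not the library's endian.hpp API.

// Reads a 32-bit value in native byte order from a possibly unaligned address.
// memcpy copies bytes, so no misaligned or type-punned pointer is dereferenced.
inline std::uint32_t load_u32_native(const void* p) noexcept
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Writes a 32-bit value in native byte order to a possibly unaligned address.
inline void store_u32_native(void* p, std::uint32_t v) noexcept
{
    std::memcpy(p, &v, sizeof(v));
}
```

Optimizing compilers typically turn such fixed-size memcpy calls into a single load or store instruction, so the safer form usually costs nothing at run time.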
Andrey Semashev
3698f8df2c Added a missing include. 2026-01-05 14:13:10 +03:00
Andrey Semashev
9dde4978fd Use memcpy/memset/memcmp functions from cstring.hpp in endian.hpp.
This lets integer reads/writes benefit from compiler intrinsics,
when possible.
2026-01-05 14:13:10 +03:00
Andrey Semashev
d358c39a67 Use __builtin_memcpy/memcmp in cstring.hpp.
The builtins are sometimes more strongly optimized than the libc function
calls. They also don't need the <cstring> include.

Added unqualified memcpy function that simply calls either the builtin or
the libc function. This function is intended to be a drop-in replacement
for the libc memcpy calls, where constexpr friendliness is not important.
It is still marked as constexpr to allow mentioning it in other constexpr
functions. To avoid early checks whether its body can be evaluated in the
context of a constant expression, it is defined as a dummy template.

Marked all functions as noexcept.
2026-01-05 14:13:10 +03:00
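
The "dummy template" technique mentioned in the commit above can be sketched as follows. The names `plain_memcpy` and `copy_or_zero` are hypothetical and chosen for illustration; the sketch assumes the libc `std::memcpy` as the underlying call rather than the builtin used by the library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical name; illustrates the "dummy template" trick only.
// Because this is a function template, the compiler does not reject the
// definition up front for calling a non-constexpr function. The
// specialization is still a constexpr function, just not usable in a
// constant expression.
template<class Dummy = void>
constexpr void* plain_memcpy(void* dest, const void* src, std::size_t n) noexcept
{
    return std::memcpy(dest, src, n);
}

// A constexpr function may mention plain_memcpy on a path that is not
// taken during constant evaluation.
constexpr int copy_or_zero(bool do_copy, int* dest, const int* src)
{
    return do_copy ? (plain_memcpy(dest, src, sizeof(int)), 1) : 0;
}

// Constant evaluation succeeds as long as the memcpy branch is not reached.
static_assert(copy_or_zero(false, nullptr, nullptr) == 0,
              "usable in a constant expression when no copy is made");
```

At run time the same function performs the copy normally, which is the drop-in behavior the commit message describes.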
Andrey Semashev
02574368fc Added GitHub Actions job on Rocketlake ISA. 2026-01-05 14:13:10 +03:00
Andrey Semashev
831a9e6eab Added more tests for from_chars verifying unexpected end of input.
The added tests check unexpected end of input on even and odd character
positions, since these are handled separately in SIMD.
2026-01-05 14:13:10 +03:00
Andrey Semashev
7e50b1aaa7 Added running from_chars tests with SIMD disabled.
Also added to/from_chars tests to CMakeLists.txt.
2026-01-05 14:13:10 +03:00
Andrey Semashev
b7535347ec Allow users to enable 512-bit vectors in to_chars_x86.hpp.
Following from_chars_x86.hpp, allow users to explicitly enable 512-bit
vectors in to_chars by defining BOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM.
This is primarily to allow for experimenting and tuning performance on
newer CPU microarchitectures.
2026-01-05 14:13:10 +03:00
Andrey Semashev
3920cc584c Added x86 SIMD implementation of from_chars.
This adds SSE4.1, AVX2, AVX-512v1 and AVX10.1 implementations of the
from_chars algorithm. The generic implementation is moved to its own
header, and constexpr is relaxed to be enabled only when is_constant_evaluated
is supported.

The performance effect on Intel Golden Cove (Core i7-12700K), gcc 13.3,
in millions of successful from_chars() calls per second:

Char     | Generic | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
=========+=========+=================+=================+=================+================
char     |  38.571 | 560.645 (14.5x) | 501.505 (13.0x) | 540.038 (14.0x) | 480.778 (12.5x)
char16_t |  37.998 | 479.308 (12.6x) | 425.728 (11.2x) | 416.379 (11.0x) | 392.326 (10.3x)
char32_t |  50.327 | 391.313 (7.78x) | 359.312 (7.14x) | 349.849 (6.95x) | 333.979 (6.64x)

The AVX2 version is slightly slower than SSE4.1 because on Intel
microarchitectures the VEX-coded vpblendvb instruction is slower than
the legacy SSE4.1 pblendvb. The code contains workarounds for this, which
have slight performance overhead compared to SSE4.1 version, but are still
faster than using vpblendvb. Alternatively, the performance could be
improved by using asm blocks to force using pblendvb in AVX2 code, but this
may potentially cause SSE/AVX transition penalties if the target vector
register happens to have "dirty" upper bits. There's no way to ensure this
doesn't happen, so this is not implemented. AVX512v1 claws back some
performance and uses fewer instructions (i.e. smaller code size).

The AVX10.1 version is slower as it uses vpermi2b instruction from AVX512_VBMI,
which is relatively slow on Intel. It allows for reducing the number of
instructions even further and the number of vector constants as well. The
instruction is faster on AMD Zen 4 and should offer better performance compared
to AVX512v1 code path, although it wasn't tested. This code path is disabled
by default, unless BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B is defined, which
can be used to test and tune performance on AMD and newer Intel CPU
microarchitectures. Thus, by default, AVX10.1 performance should be roughly
equivalent to AVX512v1, barring compiler (mis)optimizations.

The unsuccessful parsing case depends on where the error happens, as the
generic version may terminate sooner if the error is detected at the
beginning of the input string, while the SIMD version performs roughly
the same amount of work but faster. Here are some examples for 8-bit
character types (for larger types the numbers are more or less comparable):

Error              | Generic  | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
===================+==========+=================+=================+=================+================
EOI at 35 chars    |   43.629 | 356.562 (8.17x) | 326.311 (7.48x) | 322.377 (7.39x) | 308.155 (7.06x)
EOI at 1 char      | 2645.783 | 444.769 (0.17x) | 400.275 (0.15x) | 404.826 (0.15x) | 403.730 (0.15x)
Missing dash at 23 |   73.878 | 514.303 (6.96x) | 474.694 (6.43x) | 507.949 (6.88x) | 474.077 (6.42x)
Missing dash at 8  |  223.921 | 516.641 (2.31x) | 472.737 (2.11x) | 506.242 (2.26x) | 473.718 (2.12x)
Illegal char at 35 |   47.373 | 368.002 (7.77x) | 333.233 (7.03x) | 318.242 (6.72x) | 301.659 (6.37x)
Illegal char at 0  | 1729.087 | 421.511 (0.24x) | 385.217 (0.22x) | 374.047 (0.22x) | 351.944 (0.20x)

The above table is collected with BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B
defined.

In general, only the very early errors tend to perform worse in the SIMD
version and the majority of cases are still faster.

Besides BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B, the implementation also has
BOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM control macro, which, if defined, enables
usage of 512-bit registers for converting from 32-bit character types to 8-bit
integers. This code path is also slower than the 256-bit path on Golden Cove,
and therefore is disabled. The macro is provided primarily to allow for tuning
and experimentation with newer CPU microarchitectures, where the 512-bit path
may become beneficial. All of the above performance numbers were produced
without it.
2026-01-05 14:13:10 +03:00
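
The commit above notes that a generic parser can stop as soon as it sees an error, which is why very early errors favor it over the SIMD paths. The following is a minimal sketch of such an early-exit scalar parser for the 36-character "8-4-4-4-12" form. The names `hex_value` and `parse_uuid_scalar` and the return convention are assumptions made for illustration; this is not the library's from_chars implementation.

```cpp
#include <cstddef>

// Hypothetical scalar parser for illustration; not the library's from_chars.

// Returns the value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value(char c) noexcept
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parses "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" into 16 bytes.
// Returns the position of the first error, or 36 on success.
// The loop returns at the first problem, so an error at position 0
// costs almost nothing, while an error at position 35 costs a full parse.
inline std::size_t parse_uuid_scalar(const char* first, const char* last,
                                     unsigned char (&out)[16]) noexcept
{
    const std::size_t length = static_cast<std::size_t>(last - first);
    std::size_t byte = 0;
    bool high = true;

    for (std::size_t pos = 0; pos < 36; ++pos)
    {
        if (pos >= length) return pos;          // unexpected end of input

        const char c = first[pos];

        if (pos == 8 || pos == 13 || pos == 18 || pos == 23)
        {
            if (c != '-') return pos;           // missing dash
            continue;
        }

        const int v = hex_value(c);
        if (v < 0) return pos;                  // illegal character

        if (high)
        {
            out[byte] = static_cast<unsigned char>(v << 4);
        }
        else
        {
            out[byte] = static_cast<unsigned char>(out[byte] | v);
            ++byte;
        }
        high = !high;
    }

    return 36;
}
```

A SIMD version, by contrast, loads and validates the whole input in a fixed number of steps, so its cost is roughly the same wherever the error sits.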
Andrey Semashev
d0c74979a9 Separated Skylake-X level of AVX-512 to a new config macro.
The new BOOST_UUID_USE_AVX512_V1 config macro indicates presence of
AVX-512F, VL, CD, BW and DQ extensions, which are supported by Intel
Skylake-X and similar processors. BOOST_UUID_USE_AVX10_1 is still
retained and indicates support for full AVX10.1 set. For now, it only
adds support for VBMI, but this list may grow in the future as new
extensions are being utilized.
2026-01-05 14:13:10 +03:00
Andrey Semashev
f7718fd7cc Removed load_unaligned_si128 helper function.
This helper was used to simplify support for older CPUs, to select
between _mm_loadu_si128 and _mm_lddqu_si128 intrinsics. That code
has long been removed, and we now always use _mm_loadu_si128 to load
data. Use the intrinsic directly everywhere.
2026-01-05 14:13:10 +03:00
Andrey Semashev
1f7875c97e Added a simd_vector utility to simplify SIMD constants definition.
The simd_vector template is a wrapper around an array of elements that
can automatically read that array as a SIMD vector. This reduces the number
of reinterpret_casts in SIMD code that uses constants.
2026-01-05 14:13:10 +03:00
Andrey Semashev
d3b72c2b71 Extracted from_chars_result to a separate header. 2026-01-05 14:13:10 +03:00
Andrey Semashev
79ecd9f563 Suppress 'conditional expression is constant' warning on MSVC. 2026-01-05 14:13:10 +03:00
Peter Dimov
9684559de7 Add Windows jobs to ci.yml using different /arch: values 2026-01-04 20:04:54 +02:00
Peter Dimov
83ab39d277 Update documentation 2026-01-04 04:10:34 +02:00