mirror of https://github.com/boostorg/uuid.git synced 2026-01-19 04:42:16 +00:00

283 Commits

Author SHA1 Message Date
Peter Dimov
b2c72679ba Update test_uuid_from_string_2 2026-01-14 19:59:07 +02:00
Peter Dimov
bee48932a6 Add test_uuid_from_string_3.cpp 2026-01-14 19:48:35 +02:00
Peter Dimov
aa6c1a7985 std::back_insert_iterator's value_type is void smh 2026-01-13 17:46:30 +02:00
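
The observation in this commit subject can be checked directly. The sketch below only illustrates the standard library behavior; the `output_char_type` helper is a hypothetical workaround invented for this example and is not necessarily what the commit does.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>

using inserter = std::back_insert_iterator<std::string>;

// Output iterators report void as their value_type, so the character type
// cannot be deduced from iterator_traits alone.
static_assert(std::is_void<std::iterator_traits<inserter>::value_type>::value,
              "back_insert_iterator's value_type is void");

// The container type still carries the element type.
static_assert(std::is_same<inserter::container_type::value_type, char>::value,
              "the character type is available through container_type");

// Hypothetical helper (an assumption of this sketch): use value_type by
// default, and look through back_insert_iterator to the container.
template<class OutputIt>
struct output_char_type
{
    using type = typename std::iterator_traits<OutputIt>::value_type;
};

template<class Container>
struct output_char_type<std::back_insert_iterator<Container>>
{
    using type = typename Container::value_type;
};
```

With this helper, `output_char_type<char*>::type` is `char`, and `output_char_type<std::back_insert_iterator<std::wstring>>::type` is `wchar_t`.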
Peter Dimov
6b0dc88eef Add test_to_chars_3.cpp 2026-01-13 15:44:22 +02:00
Andrey Semashev
b135a5d816 Compile tests without running in CI if the CPU lacks required features.
This allows for testing that the ISA-specific code at least compiles,
even if running the tests isn't possible.

The support is only added to b2, CMake still always compiles and runs
the tests to keep using boost_test_jamfile for easier maintenance. In
the future, similar support can be added to CMake as well.
2026-01-10 12:15:59 +03:00
Peter Dimov
8febad40ed Make string_generator support string-like types like uuid_from_string does 2026-01-09 21:44:26 +02:00
Peter Dimov
517bfe6972 Include BoostTestJamfile in test/CMakeLists.txt 2026-01-08 18:58:47 +02:00
Peter Dimov
d9d396a666 Add test_uuid_from_string_cx.cpp 2026-01-08 18:52:29 +02:00
Peter Dimov
6d5420b09b Add boost_test_jamfile to CMakeLists.txt 2026-01-08 18:40:18 +02:00
Peter Dimov
29c5f57c95 Define BOOST_UUID_REPORT_IMPLEMENTATION in test .cpp files instead of Jamfile 2026-01-08 18:19:15 +02:00
Peter Dimov
d493f92dcd Add uuid_from_string 2026-01-08 16:59:10 +02:00
Peter Dimov
029527c109 Disable -Wconversion for GCC 5 in test_hash_value.cpp 2026-01-05 20:27:38 +02:00
Peter Dimov
2b5d078c00 Disable -Wshadow in test_bench_random.cpp for GCC 4.x because of Boost.Timer 2026-01-05 16:54:55 +02:00
Peter Dimov
da2cb7cc02 Avoid -Wshadow warning in test_tagging.cpp 2026-01-05 16:54:55 +02:00
Peter Dimov
7340079ffd Avoid -Wsign-conversion warnings in test_time_generator_v7_2.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
b92abf895f Avoid -Wsign-conversion warnings in test_to_chars.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
40b12ae256 Disable -Wsign-conversion in test_random_generator.cpp 2026-01-05 16:54:54 +02:00
Peter Dimov
91ffab27d2 Enable stricter warnings (matching Unordered) in test/Jamfile.v2 2026-01-05 16:54:54 +02:00
Andrey Semashev
831a9e6eab Added more tests for from_chars verifying unexpected end of input.
The added tests check unexpected end of input on even and odd character
positions, since these are handled separately in SIMD.
2026-01-05 14:13:10 +03:00
Andrey Semashev
7e50b1aaa7 Added running from_chars tests with SIMD disabled.
Also added to/from_chars tests to CMakeLists.txt.
2026-01-05 14:13:10 +03:00
Andrey Semashev
3920cc584c Added x86 SIMD implementation of from_chars.
This adds SSE4.1, AVX2, AVX-512v1 and AVX10.1 implementations of the
from_chars algorithm. The generic implementation is moved to its own
header and constexpr is relaxed so that it is only enabled when
is_constant_evaluated is supported.

The performance effect on Intel Golden Cove (Core i7-12700K), gcc 13.3,
in millions of successful from_chars() calls per second:

Char     | Generic | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
=========+=========+=================+=================+=================+================
char     |  38.571 | 560.645 (14.5x) | 501.505 (13.0x) | 540.038 (14.0x) | 480.778 (12.5x)
char16_t |  37.998 | 479.308 (12.6x) | 425.728 (11.2x) | 416.379 (11.0x) | 392.326 (10.3x)
char32_t |  50.327 | 391.313 (7.78x) | 359.312 (7.14x) | 349.849 (6.95x) | 333.979 (6.64x)

The AVX2 version is slightly slower than SSE4.1 because on Intel
microarchitectures the VEX-coded vpblendvb instruction is slower than
the legacy SSE4.1 pblendvb. The code contains workarounds for this, which
have slight performance overhead compared to SSE4.1 version, but are still
faster than using vpblendvb. Alternatively, the performance could be
improved by using asm blocks to force using pblendvb in AVX2 code, but this
may potentially cause SSE/AVX transition penalties if the target vector
register happens to have "dirty" upper bits. There's no way to ensure this
doesn't happen, so this is not implemented. AVX512v1 claws back some
performance and uses fewer instructions (i.e. smaller code size).

The AVX10.1 version is slower as it uses the vpermi2b instruction from AVX512_VBMI,
which is relatively slow on Intel. It allows for reducing the number of
instructions even further and the number of vector constants as well. The
instruction is faster on AMD Zen 4 and should offer better performance compared
to AVX512v1 code path, although it wasn't tested. This code path is disabled
by default, unless BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B is defined, which
can be used to test and tune performance on AMD and newer Intel CPU
microarchitectures. Thus, by default, AVX10.1 performance should be roughly
equivalent to AVX512v1, barring compiler (mis)optimizations.

The unsuccessful parsing case depends on where the error happens, as the
generic version may terminate sooner if the error is detected at the
beginning of the input string, while the SIMD version performs roughly
the same amount of work but faster. Here are some examples for 8-bit
character types (for larger types the numbers are more or less comparable):

Error              | Generic  | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
===================+==========+=================+=================+=================+================
EOI at 35 chars    |   43.629 | 356.562 (8.17x) | 326.311 (7.48x) | 322.377 (7.39x) | 308.155 (7.06x)
EOI at 1 char      | 2645.783 | 444.769 (0.17x) | 400.275 (0.15x) | 404.826 (0.15x) | 403.730 (0.15x)
Missing dash at 23 |   73.878 | 514.303 (6.96x) | 474.694 (6.43x) | 507.949 (6.88x) | 474.077 (6.42x)
Missing dash at 8  |  223.921 | 516.641 (2.31x) | 472.737 (2.11x) | 506.242 (2.26x) | 473.718 (2.12x)
Illegal char at 35 |   47.373 | 368.002 (7.77x) | 333.233 (7.03x) | 318.242 (6.72x) | 301.659 (6.37x)
Illegal char at 0  | 1729.087 | 421.511 (0.24x) | 385.217 (0.22x) | 374.047 (0.22x) | 351.944 (0.20x)

The above table is collected with BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B
defined.

In general, only the very early errors tend to perform worse in the SIMD
version and the majority of cases are still faster.
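
To illustrate why very early errors favor a scalar parser, here is a minimal sketch of a character-by-character UUID parser that returns at the first offending position. It is not the library's generic implementation; `parse_error`, `parse_result` and `parse_uuid_scalar` are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical error codes for this sketch; not Boost.UUID's actual enum.
enum class parse_error
{
    none,
    unexpected_end_of_input,
    hyphen_expected,
    invalid_character
};

// Hypothetical result type: the position where parsing stopped and why.
struct parse_result
{
    std::size_t pos;
    parse_error ec;
};

// Returns the value of a hex digit, or -1 if ch is not a hex digit.
inline int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Parses "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" (36 characters) into 16 bytes.
// Every check returns immediately, so an error at position 0 costs almost
// nothing, while an error at position 35 costs nearly a full parse.
inline parse_result parse_uuid_scalar(const char* first, const char* last,
                                      std::uint8_t (&out)[16])
{
    std::size_t const n = static_cast<std::size_t>(last - first);
    std::size_t byte_index = 0;

    for (std::size_t i = 0; i < 36; ++i)
    {
        if (i >= n) return { i, parse_error::unexpected_end_of_input };

        // Dashes are expected at these fixed positions.
        if (i == 8 || i == 13 || i == 18 || i == 23)
        {
            if (first[i] != '-') return { i, parse_error::hyphen_expected };
            continue;
        }

        // High nibble of the current byte.
        int const hi = hex_value(first[i]);
        if (hi < 0) return { i, parse_error::invalid_character };

        // Low nibble of the current byte.
        ++i;
        if (i >= n) return { i, parse_error::unexpected_end_of_input };
        int const lo = hex_value(first[i]);
        if (lo < 0) return { i, parse_error::invalid_character };

        out[byte_index++] = static_cast<std::uint8_t>(hi * 16 + lo);
    }

    return { 36, parse_error::none };
}
```

A vectorized parser, by contrast, typically loads and validates the whole input at once, which matches the roughly constant cost across error positions reported in the table above.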

Besides BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B, the implementation also has the
BOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM control macro, which, if defined, enables
usage of 512-bit registers for converting from 32-bit character types to 8-bit
integers. This code path is also slower than the 256-bit path on Golden Cove,
and therefore is disabled. The macro is provided primarily to allow for tuning
and experimentation with newer CPU microarchitectures, where the 512-bit path
may become beneficial. All of the above performance numbers were produced
without it.
2026-01-05 14:13:10 +03:00
Peter Dimov
326e5db863 Merge pull request #185 from Lastique/feature/from_chars_result_op_bool
Add `from_chars_result::operator bool()`
2025-12-29 13:08:27 +02:00
Andrey Semashev
2508c5434e Added running IO and to_chars tests with SIMD disabled. 2025-12-27 23:53:21 +03:00
Andrey Semashev
839c431152 Added x86 SIMD implementation of to_chars.
Moved the generic to_chars implementation to a separate header and made
to_chars.hpp select the implementation based on the enabled SIMD ISA
extensions. Added an x86 implementation leveraging SSSE3 and later
vector extensions. Added detection of the said extensions to config.hpp.

The performance effect on Intel Golden Cove (Core i7-12700K), gcc 13.3,
in millions of to_chars() calls per second with a 16-byte aligned output buffer:

Char     | Generic | SSE4.1           | AVX2             | AVX-512
=========+=========+==================+==================+=================
char     | 203.190 | 1059.322 (5.21x) | 1053.352 (5.18x) | 1058.089 (5.21x)
char16_t | 184.003 |  848.356 (4.61x) | 1009.489 (5.49x) | 1011.122 (5.50x)
char32_t | 202.425 |  484.801 (2.39x) |  676.338 (3.34x) |  462.770 (2.29x)

The core of the SIMD implementation is using 128-bit vectors, larger vectors
are only used to convert to the target character types. This means that for
1-byte character types all vector implementations are basically the same
(barring the extra ISA flexibility added by AVX) and for 2-byte character
types AVX2 and AVX-512 are basically the same.

For 4-byte character types, AVX-512 showed worse performance than SSE4.1 and
AVX2 on the test system. It isn't clear why that is, but it is possible that
the CPU throttles 512-bit instructions so much that the performance drops
below a 256-bit equivalent. Perhaps, there are just not enough 512-bit
instructions for the CPU to power up the full 512-bit pipeline. Therefore,
the AVX-512 code path for 4-byte character types is currently disabled and
the AVX2 path is used instead (which makes AVX2 and AVX-512 versions basically
equivalent). The AVX-512 path can be enabled again if new CPU microarchitectures
appear that will benefit from it.

Higher alignment values of the output buffer were also tested, but they did not
meaningfully improve performance.
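
For reference, the two steps described above (turning 16 bytes into hex digits, then widening to the target character type) can be sketched in scalar form. This is an illustration only, not the library's generic implementation; `uuid_to_chars_scalar` is a hypothetical name and lowercase output is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Writes the 36-character textual form of a 16-byte UUID to out and returns
// the end of the written range. Ch may be char, char16_t, char32_t, etc.
template<class Ch>
Ch* uuid_to_chars_scalar(const std::uint8_t (&bytes)[16], Ch* out)
{
    static const char digits[] = "0123456789abcdef";

    for (std::size_t i = 0; i < 16; ++i)
    {
        // Dashes separate the 4-2-2-2-6 byte groups.
        if (i == 4 || i == 6 || i == 8 || i == 10)
        {
            *out++ = static_cast<Ch>('-');
        }

        // Step 1: split the byte into nibbles and map each to a hex digit.
        // Step 2: widen the 8-bit digit to the target character type.
        *out++ = static_cast<Ch>(digits[bytes[i] >> 4]);
        *out++ = static_cast<Ch>(digits[bytes[i] & 0x0F]);
    }

    return out;
}
```

In this form the widening is a per-character `static_cast`; per the description above, it is the step for which the SIMD code uses vectors wider than 128 bits.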
2025-12-27 23:52:15 +03:00
Andrey Semashev
31bc20e30e Added from_chars_result::operator bool(). 2025-12-26 21:58:49 +03:00
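
A boolean conversion on a parse result lets callers test for success directly in a condition. The sketch below shows the general pattern, modeled on the `explicit operator bool` that C++26 adds to `std::from_chars_result`; the struct layout, the `sketch_error` enum and `parse_hex_digit` are assumptions made for this example, not Boost.UUID's actual definitions.

```cpp
#include <cassert>

// Hypothetical error enum for this sketch.
enum class sketch_error
{
    none,
    invalid_character
};

// Hypothetical result type: where parsing stopped, and the error code.
struct sketch_from_chars_result
{
    const char* ptr;
    sketch_error ec;

    // True when parsing succeeded. Marked explicit so that the result does
    // not silently convert to an integer.
    constexpr explicit operator bool() const noexcept
    {
        return ec == sketch_error::none;
    }
};

// Stand-in parser for the example: accepts one lowercase hex digit.
inline sketch_from_chars_result parse_hex_digit(const char* first,
                                                const char* last, int& value)
{
    if (first == last) return { first, sketch_error::invalid_character };

    char const ch = *first;

    if (ch >= '0' && ch <= '9') value = ch - '0';
    else if (ch >= 'a' && ch <= 'f') value = ch - 'a' + 10;
    else return { first, sketch_error::invalid_character };

    return { first + 1, sketch_error::none };
}
```

A caller can then write `if (auto r = parse_hex_digit(first, last, v)) { /* use v */ }` instead of comparing `r.ec` against the success value by hand.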
Peter Dimov
c8f97785a4 Prefer SIMD/runtime performance over constexpr-ness when __builtin_is_constant_evaluated is not available 2025-12-26 19:27:20 +02:00
Peter Dimov
c963533f73 Make to_chars constexpr 2025-12-25 19:20:58 +02:00
Peter Dimov
e78661b1e2 Suppress msvc-14.1 warnings in test_hash_value_cx 2025-12-25 13:48:11 +02:00
Peter Dimov
09dd0dd608 Make hash_value constexpr 2025-12-25 12:41:38 +02:00
Peter Dimov
bd765b558c Make uuid::is_nil, swap, and relational operators constexpr 2025-12-25 06:44:07 +02:00
Peter Dimov
5a5797d465 Add more constexpr to class uuid 2025-12-24 17:45:11 +02:00
Peter Dimov
da0ce8406a Disable test_string_generator_cx2 under GCC 5 2025-12-23 11:37:33 +02:00
Peter Dimov
83d3da399c Add test_string_generator_cx2.cpp 2025-12-23 04:21:49 +02:00
Peter Dimov
a851cf33ca Add test_string_generator_2.cpp 2025-12-23 04:15:37 +02:00
Peter Dimov
6622230ae1 Disable test_string_generator_cx for GCC 5 2025-12-23 02:07:39 +02:00
Peter Dimov
5e2b31baa4 Update test_string_generator_cx.cpp 2025-12-22 22:01:13 +02:00
Peter Dimov
fc1de844d7 Add test_from_chars_cx2.cpp 2025-12-22 17:20:29 +02:00
Peter Dimov
96bc0067bc Add test_from_chars_cx.cpp 2025-12-22 17:01:37 +02:00
Peter Dimov
d6c401e62e Add boost::uuids::from_chars 2025-12-21 21:35:23 +02:00
Peter Dimov
6180c2600d Add missing includes 2025-12-21 16:34:25 +02:00
Peter Dimov
bbbb20427c Link test_random_generator.cpp to Boost::move 2025-09-19 14:40:10 +03:00
Peter Dimov
2b160eb689 Merge pull request #179 from ivanpanch/patch-1
Fix mistakes
2025-09-09 16:52:45 +03:00
Peter Dimov
c6d8a9a1f0 Make constants constexpr under C++14 2025-09-09 02:59:29 +03:00
Peter Dimov
5496dd3bbe Add boost/uuid/constants.hpp 2025-09-09 02:59:29 +03:00
Peter Dimov
a33efc09a2 Disable test_string_generator_cx for GCC 6 as well 2025-09-01 03:15:26 +03:00
Peter Dimov
284f847b7c Add test_string_generator_cx.cpp 2025-09-01 02:51:33 +03:00
Peter Dimov
1c2cc1fae2 Revert "Add RFC-9562 compliant Max UUID (section 5.10)"
This reverts commit bf16d95746.
2025-08-16 20:18:51 +03:00
ivanpanch
7fad3a42f0 Update test_msvc_simd_bug981648_main.cpp 2025-08-09 20:44:05 +02:00
ivanpanch
c44c8b4f7e Update quick.cpp 2025-08-09 20:28:38 +02:00
James E. King III
bf16d95746 Add RFC-9562 compliant Max UUID (section 5.10) 2025-08-09 14:28:05 -04:00