The load/store helpers now use memcpy internally, which is the well-defined
way to load and store integers from/to unaligned memory and to perform
type punning. In particular, it should silence UBSan errors about
unaligned memory accesses in SIMD algorithms.
The builtins are sometimes optimized more aggressively than the libc function
calls. They also don't require the <cstring> include.
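A minimal sketch of such a load helper, assuming GCC/Clang-style builtins
(names are illustrative, not the library's actual helpers):

    #include <cstdint>
    #include <cstring>

    inline std::uint32_t load_u32(const void* p) noexcept
    {
        std::uint32_t v;
    #if defined(__GNUC__)
        __builtin_memcpy(&v, p, sizeof(v)); // builtin: no <cstring> needed
    #else
        std::memcpy(&v, p, sizeof(v)); // well-defined unaligned, type-punned read
    #endif
        return v;
    }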
Added an unqualified memcpy function that simply calls either the builtin or
the libc function. This function is intended as a drop-in replacement
for libc memcpy calls where constexpr friendliness is not important.
It is still marked constexpr so that it can be mentioned in other constexpr
functions. To avoid early checking of whether its body can be evaluated in
the context of a constant expression, it is defined as a dummy template.
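A sketch of the dummy-template trick described above (illustrative, not the
exact library code):

    #include <cstddef>

    namespace detail {

    // Making this a template defers checking whether the body is a valid
    // constant expression until the function is instantiated, so marking
    // it constexpr compiles even though memcpy itself cannot actually be
    // evaluated at compile time.
    template< typename = void >
    constexpr void* memcpy(void* dst, const void* src, std::size_t n) noexcept
    {
        return __builtin_memcpy(dst, src, n); // or std::memcpy via libc
    }

    } // namespace detail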
Marked all functions as noexcept.
Following from_chars_x86.hpp, allow users to explicitly enable 512-bit
vectors in to_chars by defining BOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM.
This is primarily to allow for experimenting and tuning performance on
newer CPU microarchitectures.
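For example (an illustrative compiler invocation; -march=x86-64-v4 implies
the relevant AVX-512 extensions):

    g++ -O3 -march=x86-64-v4 -DBOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM benchmark.cpp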
This adds SSE4.1, AVX2, AVX-512v1 and AVX10.1 implementations of the
from_chars algorithm. The generic implementation is moved to its own
header, and its constexpr is relaxed so that it is only enabled when
is_constant_evaluated is supported.
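The dispatch between the generic and SIMD paths looks roughly like this
(a sketch with hypothetical macro and helper names; uuid here stands for
boost::uuids::uuid):

    constexpr bool from_chars(const char* first, const char* last, uuid& u) noexcept
    {
    #if defined(BOOST_UUID_HAS_IS_CONSTANT_EVALUATED) // assumed feature macro
        if (!detail::is_constant_evaluated())
            return from_chars_simd(first, last, u); // runtime-only SIMD path
    #endif
        return from_chars_generic(first, last, u); // constexpr-friendly path
    }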
The performance effect on Intel Golden Cove (Core i7-12700K), gcc 13.3,
in millions of successful from_chars() calls per second:
Char     | Generic | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
=========+=========+=================+=================+=================+================
char     | 38.571  | 560.645 (14.5x) | 501.505 (13.0x) | 540.038 (14.0x) | 480.778 (12.5x)
char16_t | 37.998  | 479.308 (12.6x) | 425.728 (11.2x) | 416.379 (11.0x) | 392.326 (10.3x)
char32_t | 50.327  | 391.313 (7.78x) | 359.312 (7.14x) | 349.849 (6.95x) | 333.979 (6.64x)
The AVX2 version is slightly slower than SSE4.1 because on Intel
microarchitectures the VEX-coded vpblendvb instruction is slower than
the legacy SSE4.1 pblendvb. The code contains workarounds for this, which
have a slight performance overhead compared to the SSE4.1 version but are
still faster than using vpblendvb. Alternatively, performance could be
improved by using asm blocks to force pblendvb in AVX2 code, but this
may cause SSE/AVX transition penalties if the target vector register
happens to have "dirty" upper bits. There is no way to ensure this
doesn't happen, so this is not implemented. AVX512v1 claws back some of
the performance and uses fewer instructions (i.e. smaller code size).
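One workaround of this kind replaces the blend with a bitwise select when
the mask bytes are known to be all-ones or all-zeros (a sketch, not
necessarily the library's exact code):

    #include <immintrin.h>

    // (b & mask) | (a & ~mask): selects bytes of b where the mask byte is
    // 0xFF. Unlike vpblendvb, which only tests the top bit of each byte,
    // this needs full-byte masks, but compiles to cheap and/andn/or ops.
    inline __m256i blend_bytes(__m256i a, __m256i b, __m256i mask)
    {
        return _mm256_or_si256(_mm256_and_si256(mask, b),
                               _mm256_andnot_si256(mask, a));
    }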
The AVX10.1 version is slower, as it uses the vpermi2b instruction from
AVX512_VBMI, which is relatively slow on Intel. The instruction allows
reducing the number of instructions even further, as well as the number
of vector constants. It is faster on AMD Zen 4 and should offer better
performance than the AVX512v1 code path there, although this wasn't
tested. This code path is disabled by default, unless
BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B is defined, which can be used to
test and tune performance on AMD and newer Intel CPU microarchitectures.
Thus, by default, AVX10.1 performance should be roughly equivalent to
AVX512v1, barring compiler (mis)optimizations.
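For reference, vpermi2b performs a byte-granular lookup across two 64-byte
tables in a single instruction (an assumed usage sketch, not the library's
exact code):

    #include <immintrin.h>

    // Each byte of idx (0..127) selects one byte from the 128-byte
    // concatenation of table_lo and table_hi (requires AVX512_VBMI).
    inline __m512i lookup128(__m512i idx, __m512i table_lo, __m512i table_hi)
    {
        return _mm512_permutex2var_epi8(table_lo, idx, table_hi);
    }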
Performance in the unsuccessful parsing case depends on where the error
happens: the generic version may terminate sooner if the error is detected
at the beginning of the input string, while the SIMD versions perform
roughly the same amount of work regardless, only faster. Here are some
examples for 8-bit character types (for larger character types the numbers
are more or less comparable):
Error              | Generic  | SSE4.1          | AVX2            | AVX512v1        | AVX10.1
===================+==========+=================+=================+=================+================
EOI at 35 chars    | 43.629   | 356.562 (8.17x) | 326.311 (7.48x) | 322.377 (7.39x) | 308.155 (7.06x)
EOI at 1 char      | 2645.783 | 444.769 (0.17x) | 400.275 (0.15x) | 404.826 (0.15x) | 403.730 (0.15x)
Missing dash at 23 | 73.878   | 514.303 (6.96x) | 474.694 (6.43x) | 507.949 (6.88x) | 474.077 (6.42x)
Missing dash at 8  | 223.921  | 516.641 (2.31x) | 472.737 (2.11x) | 506.242 (2.26x) | 473.718 (2.12x)
Illegal char at 35 | 47.373   | 368.002 (7.77x) | 333.233 (7.03x) | 318.242 (6.72x) | 301.659 (6.37x)
Illegal char at 0  | 1729.087 | 421.511 (0.24x) | 385.217 (0.22x) | 374.047 (0.22x) | 351.944 (0.20x)
The above table was collected with BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B
defined.
In general, only very early errors tend to perform worse in the SIMD
version; the majority of cases are still faster.
Besides BOOST_UUID_FROM_CHARS_X86_USE_VPERMI2B, the implementation also has
the BOOST_UUID_TO_FROM_CHARS_X86_USE_ZMM control macro, which, if defined,
enables the use of 512-bit registers for converting 32-bit character types
to 8-bit integers. This code path is also slower than the 256-bit path on
Golden Cove and is therefore disabled by default. The macro is provided
primarily to allow tuning and experimentation on newer CPU
microarchitectures, where the 512-bit path may become beneficial. All of
the above performance numbers were produced without it.
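A sketch of the 512-bit narrowing step the macro enables (assumed shape;
the actual code may differ):

    #include <immintrin.h>

    // Load 16 32-bit code units into one ZMM register and truncate them
    // to 16 bytes with a single vpmovdb instruction.
    inline __m128i narrow16(const char32_t* p)
    {
        return _mm512_cvtepi32_epi8(_mm512_loadu_si512(p));
    }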
The new BOOST_UUID_USE_AVX512_V1 config macro indicates the presence of
the AVX-512 F, VL, CD, BW and DQ extensions, which are supported by Intel
Skylake-X and similar processors. BOOST_UUID_USE_AVX10_1 is still
retained and indicates support for the full AVX10.1 set. For now, it only
adds support for VBMI, but this list may grow in the future as new
extensions are utilized.
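In terms of compiler-defined macros, the new config macro roughly
corresponds to the following (an illustrative sketch, not the exact
config code):

    #if defined(__AVX512F__) && defined(__AVX512VL__) && defined(__AVX512CD__) && \
        defined(__AVX512BW__) && defined(__AVX512DQ__)
    #define BOOST_UUID_USE_AVX512_V1
    #endif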
This helper was used to simplify support for older CPUs by selecting
between the _mm_loadu_si128 and _mm_lddqu_si128 intrinsics. That code
has long been removed, and we now always use _mm_loadu_si128 to load
data. Use the intrinsic directly everywhere.
The simd_vector template is a wrapper around an array of elements that
allows automatically reading that array as a SIMD vector. This reduces the
number of reinterpret_casts in SIMD code that uses constants.
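A minimal sketch of such a wrapper (illustrative; the actual simd_vector
template may differ):

    #include <emmintrin.h>
    #include <cstddef>

    template< typename T, std::size_t N >
    struct simd_vector
    {
        alignas(16) T data[N];

        // Read the array as a SIMD vector, keeping the reinterpret_cast
        // in one place instead of at every use site.
        __m128i load() const noexcept
        {
            return _mm_load_si128(reinterpret_cast< const __m128i* >(data));
        }
    };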
This avoids potential character code conversion in ostream and instead
produces the native character type directly in to_chars, which is likely
much faster.
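The pattern is roughly the following (a sketch with a hypothetical helper
signature; uuid here stands for boost::uuids::uuid):

    #include <ostream>

    template< typename Ch, typename Traits >
    std::basic_ostream< Ch, Traits >&
    operator<< (std::basic_ostream< Ch, Traits >& os, uuid const& u)
    {
        Ch buf[36]; // canonical UUID string length
        detail::to_chars(u, buf); // formats directly as Ch, no codecvt involved
        return os.write(buf, 36);
    }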
This removes code duplication with from_chars and allows reusing the
faster implementation of from_chars in operator>>.
Also, align the input character buffer for more efficient memory
accesses.
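A sketch of the resulting operator>> shape (hypothetical helper name; the
actual detail-level API may differ):

    #include <istream>
    #include <ios>

    template< typename Ch, typename Traits >
    std::basic_istream< Ch, Traits >&
    operator>> (std::basic_istream< Ch, Traits >& is, uuid& u)
    {
        alignas(16) Ch buf[36]; // aligned for efficient SIMD loads
        if (!(is.read(buf, 36) && detail::from_chars(buf, buf + 36, u)))
            is.setstate(std::ios_base::failbit);
        return is;
    }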
Also clarified the meaning of BOOST_UUID_USE_AVX10_1 in the docs, as the
previous wording could be read as indicating support for the subset of
AVX-512 that is supported by Skylake-X.