Commit Graph

244 Commits

Author SHA1 Message Date
Andrey Semashev
cc763cb48e Reworked absolute() to fix appending root directory.
Because of the changed semantics of appending operations in v4, path
composition in absolute() would produce incorrect results because at some
point it would append root directory and therefore discard root name
that was potentially added before. The updated implementation fixes that,
and also fixes the case when the input path is already absolute and
starts with a root directory, and the base path has a root name.
Previously, the returned path would contain the root name from the
base path, while the correct thing to do is to return the input path
as is.
2021-11-05 23:41:31 +03:00
Andrey Semashev
df972e9a5d Remove unused constants on Windows to silence clang warnings. 2021-10-26 19:10:01 +03:00
Andrey Semashev
b4c39093cc Reimplemented create_directories for compatibility with v4 paths.
The new implementation is prepared for the removal of the implicit
trailing dots in v4 path. It also no longer uses recursion
internally and therefore is better protected against stack overflows.

As a side effect of this rewrite, create_directories no longer reports
an error if the input path consists entirely of dot and dot-dot elements.
This is in line with C++20 std::filesystem behavior.
2021-10-17 21:38:28 +03:00
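
The non-recursive approach described in the commit above can be sketched as follows. This is a minimal illustration written against C++17 std::filesystem rather than Boost.Filesystem, and the helper name create_directories_iterative is hypothetical; it is not the code from this commit. It walks up from the target to collect the missing ancestors, then creates them top-down in a loop.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not the Boost.Filesystem implementation): creates p and
// all of its missing parents without recursion. Returns true if at least one
// directory was created; on failure returns false and sets ec.
bool create_directories_iterative(const fs::path& p, std::error_code& ec)
{
    ec.clear();
    std::vector<fs::path> missing;

    // Walk up from p, remembering every path that does not exist yet.
    fs::path cur = p.lexically_normal();
    while (!cur.empty())
    {
        const fs::file_status st = fs::status(cur, ec);
        if (st.type() == fs::file_type::none)
            return false;           // status() itself failed, ec is set
        if (st.type() != fs::file_type::not_found)
            break;                  // found an existing ancestor
        ec.clear();                 // "not found" is expected here
        missing.push_back(cur);

        fs::path parent = cur.parent_path();
        if (parent == cur)
            break;                  // reached the root
        cur = parent;
    }
    ec.clear();

    // Create the missing directories, outermost first.
    bool created = false;
    for (auto it = missing.rbegin(); it != missing.rend(); ++it)
    {
        if (fs::create_directory(*it, ec))
            created = true;
        if (ec)
            return false;
    }
    return created;
}
```

Because the pending paths live in a vector rather than on the call stack, the depth of the input path does not affect stack usage. A path such as ".." already exists, so the sketch returns false without setting an error.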
Andrey Semashev
9794725bda Added a workaround for Linux headers older than 2.6.19.
linux/magic.h was introduced in Linux kernel 2.6.19, so building Boost.Filesystem
with older kernel headers would fail. Only include the header when it is found
and fall back to our local constant definitions when it's not.
2021-08-03 14:45:46 +03:00
Andrey Semashev
87d3c1fd8a Fix weakly_canonical on Windows if the path contains non-existing elements.
Windows APIs such as GetFileAttributesW perform lexical path normalization
internally, which means e.g. "C:\a\.." resolves to an existing path
even if "C:\a" doesn't. This breaks detection of the longest sequence
of existing path elements in weakly_canonical and results in an error
in canonical that is called on that sequence.

As a workaround, perform forward iteration on Windows, so that we
stop on the first path element that doesn't exist.

Also, while at it, corrected error code reported from weakly_canonical
when status fails with an error.

Closes https://github.com/boostorg/filesystem/issues/201.
2021-07-28 20:05:17 +03:00
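
The forward iteration mentioned in the commit above can be illustrated roughly like this. The sketch uses C++17 std::filesystem and a hypothetical helper name, split_existing_prefix; it is an assumption about the general shape of the technique, not the library's actual code. It stops at the first element that does not exist, so later ".." elements are never given a chance to mask a missing directory.

```cpp
#include <filesystem>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical helper: splits p into {longest existing prefix, remaining tail}
// by testing elements front to back and stopping at the first missing one.
std::pair<fs::path, fs::path> split_existing_prefix(const fs::path& p)
{
    fs::path head, tail;
    auto it = p.begin();
    const auto end = p.end();

    for (; it != end; ++it)
    {
        fs::path candidate = head / *it;
        std::error_code ec;
        if (!fs::exists(candidate, ec))
            break;                  // first element that does not exist
        head = std::move(candidate);
    }

    // Everything after the existing prefix is kept as is.
    for (; it != end; ++it)
        tail /= *it;

    return {head, tail};
}
```

A caller in the spirit of weakly_canonical would then canonicalize only the first part and append the lexically normalized second part.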
Andrey Semashev
003f002399 Disable posix_fadvise for Android API < 21.
Closes https://github.com/boostorg/filesystem/issues/198.
2021-07-12 15:50:18 +03:00
Andrey Semashev
2dda038306 Reworked function pointers use and definitions.
Instead of using atomic<> to access global function pointers, use raw
pointers and atomic_ref to access them safely in multi-threaded builds.
This makes it possible to ensure constant initialization of the function pointers,
even in C++03. This also solves the problem of undefined dynamic
initialization order that we previously tried to solve with the
init_priority attribute. The attribute turns out to not work if the
pointers were raw pointers (in single-threaded builds). It is also
not supported by Intel Compiler and possibly others, which required
us to avoid using the function pointer for fill_random.

The resulting code should be simpler and more portable. In order to
check for C++20 std::atomic_ref availability, the older check for <atomic>
header was replaced with a check for std::atomic_ref. If not available,
we're using Boost.Atomic, as before.
2021-06-14 22:09:15 +03:00
Andrey Semashev
08e7a20785 Added runtime detection of getrandom Linux system call.
Fall back to reading /dev/(u)random if getrandom fails with ENOSYS.

Also, extracted the portability macros for atomics to a separate header
to be able to use them in unique_path.cpp. Rearranged function pointers
initialization to decouple the initializer object from the particular
system calls.

For getrandom, the ENOSYS failure is only cached if the compiler supports
specifying global object initialization priority, which is needed to ensure
that the function pointer is initialized before the syscall initializer
in a different TU. If the compiler does not support this feature, just
always attempt getrandom first.
2021-06-14 03:39:03 +03:00
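
The fallback described in the commit above might look roughly like the following on Linux with glibc 2.25 or later. The function names are hypothetical and the sketch does not cache the ENOSYS result; it corresponds to the "always attempt getrandom first" variant mentioned in the message and is not the library's code.

```cpp
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/random.h>
#include <unistd.h>

// Hypothetical helper: fills buf with len bytes read from /dev/urandom.
bool fill_from_dev_urandom(void* buf, std::size_t len)
{
    const int fd = ::open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return false;

    unsigned char* p = static_cast<unsigned char*>(buf);
    bool ok = true;
    while (len > 0)
    {
        const ssize_t n = ::read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;               // interrupted, retry
        if (n <= 0)
        {
            ok = false;             // read error or unexpected end of file
            break;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    ::close(fd);
    return ok;
}

// Hypothetical helper: tries getrandom first and falls back to /dev/urandom
// if the kernel reports that the system call is not implemented.
bool fill_random(void* buf, std::size_t len)
{
    unsigned char* p = static_cast<unsigned char*>(buf);
    while (len > 0)
    {
        const ssize_t n = ::getrandom(p, len, 0u);
        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            if (errno == ENOSYS)
                return fill_from_dev_urandom(p, len);
            return false;
        }
        p += n;
        len -= static_cast<std::size_t>(n);
    }
    return true;
}
```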
Andrey Semashev
3e8c8b15f9 Added runtime detection of the statx system call on Linux.
This can be useful if the syscall is present at compile time but fails with
ENOSYS at run time (for example, in Docker containers that restrict the syscall,
even if available on the host).

Additionally, marked statx syscall wrappers with attributes to disable MSAN
for them. It was reported that MSAN on clang 10 is showing errors
accessing uninitialized data in stx_mask, which must be initialized by the
syscall.

Related to https://github.com/boostorg/filesystem/issues/172
Related to https://github.com/boostorg/filesystem/issues/185
2021-06-13 21:10:10 +03:00
Andrey Semashev
26a07aad53 Allow creating symlinks on Windows in non-elevated mode.
If Windows is running in Developer mode, it is possible to specify
SYMBOLIC_LINK_FLAG_ALLOW_UNPRIVILEGED_CREATE flag to CreateSymbolicLinkW
so that the call doesn't require elevated privileges.

While at it, explicitly separated implementation of create_symlink and
create_directory_symlink for POSIX and Windows.
2021-06-09 19:09:24 +03:00
Andrey Semashev
4b5023c94b Use preferred separator for root directory in (weakly_)canonical.
Using preferred separators in paths on Windows works around "file not found"
errors returned by GetFileAttributesW, when a forward slash is used in some paths.
Specifically, this can happen with UNC paths and paths starting with the Win32
filesystem prefix ("\\?\").

Closes https://github.com/boostorg/filesystem/issues/87.
Closes https://github.com/boostorg/filesystem/issues/187.
2021-06-06 22:56:16 +03:00
Andrey Semashev
62a598e3dd Reduced absolute_path_max limit.
The new value is closer to the max path size limits that are defined on various
systems.
2021-06-06 17:27:53 +03:00
Andrey Semashev
a252f15f06 Use substitute names to obtain the target of a reparse point.
The print name can be empty for some reparse points (e.g. mount points
created by Box cloud storage driver and directory junctions created by
junction.exe). It is supposed to be mostly used for presenting a "simple"
path to the user and not to actually locate the file.

The substitute name is the actionable replacement path, but it is in
NT path format and can potentially point to unmounted volumes and
UNC resources. The implementation attempts to convert the NT path
to Win32 path by matching commonly known patterns against the NT path.
If no pattern matches, we create a Win32 path by converting the NT path
prefix to "\\?\".

Related to https://github.com/boostorg/filesystem/issues/187.
2021-06-06 04:20:24 +03:00
Andrey Semashev
af6ac28b57 Added ERROR_BAD_NET_NAME to the list of errors indicating "file not found".
ERROR_BAD_NET_NAME is returned on Windows 10 21H1 x64 when a non-existent
share is accessed: "\\no-host\no-share".
2021-05-30 22:27:04 +03:00
Andrey Semashev
0eb5290401 Added weakly_canonical overloads taking base path as an argument.
This can be useful when current_path is not supported by the system.
2021-05-29 18:40:51 +03:00
Andrey Semashev
0cdb5a7d87 Use a safer check for dot and dot-dot paths in weakly_canonical.
Also, renamed a few variables to avoid name clashes and improve code
consistency.
2021-05-29 04:34:54 +03:00
Andrey Semashev
a7ff5b43f3 Implemented a limit on the number of symlinks resolved in canonical().
This protects against an infinite loop if symlinks form a cycle.

The limit is currently system-dependent, with a lower bound of 40.
2021-05-29 03:09:29 +03:00
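
A bounded symlink walk of the kind described in the commit above could be sketched like this. It uses C++17 std::filesystem, a hypothetical helper follow_symlinks, and a fixed limit of 40 (the commit only states 40 as the lower bound of a system-dependent value). It only follows the final path element, so it is much simpler than canonical() itself.

```cpp
#include <cstddef>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Assumed limit for illustration; the real limit is system-dependent.
constexpr std::size_t symlink_limit = 40;

// Hypothetical helper: follows the chain of symlinks starting at p.
// Returns the first non-symlink path, or an empty path with ec set if
// the chain is longer than symlink_limit (for example, a cycle).
fs::path follow_symlinks(fs::path p, std::error_code& ec)
{
    ec.clear();
    for (std::size_t resolved = 0;; ++resolved)
    {
        const fs::file_status st = fs::symlink_status(p, ec);
        if (st.type() == fs::file_type::none)
            return fs::path();      // status query failed, ec is set
        if (!fs::is_symlink(st))
        {
            ec.clear();             // includes the "not found" case
            return p;
        }
        if (resolved >= symlink_limit)
        {
            ec = std::make_error_code(std::errc::too_many_symbolic_link_levels);
            return fs::path();
        }

        const fs::path target = fs::read_symlink(p, ec);
        if (ec)
            return fs::path();
        // Relative targets are interpreted relative to the link's directory.
        p = target.is_absolute() ? target : p.parent_path() / target;
    }
}
```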
Andrey Semashev
60ceb77b0f Extracted small path buffer size to a global constant. 2021-05-29 01:46:23 +03:00
Andrey Semashev
267b945993 Fail current_path on Windows CE with ERROR_NOT_SUPPORTED.
Windows CE does not support a current directory.
2021-05-28 17:48:43 +03:00
Andrey Semashev
a12c413adf Start with double the small buffer in current_path fallback.
When current_path on POSIX falls back to the dynamically allocated
buffer for the resulting path, start with double the size of the
small stack buffer that was used initially.
2021-05-28 17:32:51 +03:00
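
The buffer growth strategy in the commit above can be sketched for POSIX as follows. The constant values and the function name current_directory are assumptions made for illustration; only the pattern (small stack buffer first, then a heap buffer starting at double that size) comes from the commit message.

```cpp
#include <cerrno>
#include <cstddef>
#include <memory>
#include <string>
#include <unistd.h>

// Assumed sizes for illustration only.
constexpr std::size_t small_path_size = 1024;
constexpr std::size_t absolute_path_max = 32 * 1024;

// Hypothetical helper: returns the current directory, or an empty string
// on failure (errno describes the error).
std::string current_directory()
{
    // Fast path: most paths fit in a small stack buffer.
    char small_buf[small_path_size];
    if (::getcwd(small_buf, sizeof(small_buf)))
        return small_buf;
    if (errno != ERANGE)
        return std::string();

    // Slow path: start with double the small buffer and keep doubling.
    for (std::size_t size = small_path_size * 2; size <= absolute_path_max; size *= 2)
    {
        std::unique_ptr<char[]> buf(new char[size]);
        if (::getcwd(buf.get(), size))
            return buf.get();
        if (errno != ERANGE)
            return std::string();
    }

    errno = ENAMETOOLONG;
    return std::string();
}
```

Starting the heap buffer at twice the stack size avoids a wasted first attempt, since a buffer of the same size is already known to be too small.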
Andrey Semashev
a42613369f Optimized canonical() wrt. symlinks containing dot elements. 2021-05-28 16:10:25 +03:00
Andrey Semashev
616dab9b8c Update root in canonical if resolved link is absolute with a different root.
When canonical() resolves a symlink, it is possible that the symlink resolves
to an absolute path with a different root. We need to update the root
path so that when we restart symlink resolution the check for the
root path still works.

Also, slightly refactored the canonical() implementation to reduce code
size and possibly optimize the generated code.
2021-05-28 16:03:28 +03:00
Andrey Semashev
84440dd46f Prefer a buffer larger than the file in the read/write loop.
This avoids an extra read syscall to detect the end of the file
if the file fits in the buffer exactly.
2021-05-24 13:37:15 +03:00
Andrey Semashev
a3745c8ba1 Removed unused macro definition. 2021-05-21 00:56:47 +03:00
Andrey Semashev
3744ed73d4 Fixed compilation on 32-bit Windows and added support for multi-stream files.
There was a missing calling convention specification in the CopyFileEx
callback. This was not a problem in 64-bit builds since there is no
stdcall convention in 64-bit x86.

When CopyFileEx copies multi-stream files, the callback is executed for each
stream separately. Each stream is represented with a separate file handle,
so we have to flush buffers when the stream is fully copied, rather than the
whole file.
2021-05-20 23:27:29 +03:00
Andrey Semashev
26955d8a9f Changed handling of copy_options::synchronize(_data) on Windows.
Use FlushFileBuffers to force any buffered data to be written to permanent
storage. The previously used COPY_FILE_NO_BUFFERING flag only guarantees
that no data is left in the OS filesystem cache, but does not ensure
that any device buffers are flushed.
2021-05-20 20:30:14 +03:00
Andrey Semashev
7651a8e90c Check for EINPROGRESS on closing the target file descriptor in copy_file.
This error code will be allowed in future POSIX revisions, according to
https://www.austingroupbugs.net/view.php?id=529#c1200.
2021-05-20 13:18:08 +03:00
Andrey Semashev
0bbc79b884 Remove Boost.Atomic dependency in single-threaded builds.
In single-threaded builds we can assume no thread synchronization
is necessary and avoid the dependency on Boost.Atomic. The dependency
caused single-threaded build failures because Boost.Atomic requires
multithreading to be enabled.

The CMake build currently does not support single-threaded builds, so
the dependency is left present there.

Closes https://github.com/boostorg/filesystem/issues/188.
2021-05-20 02:35:34 +03:00
Andrey Semashev
e01ae41298 Renamed max_send_size to max_batch_size for code clarity. 2021-05-19 13:57:17 +03:00
Andrey Semashev
dc2a162e5e Use a small stack buffer if heap memory allocation fails in read/write copy_file.
This allows the copy_file operation to succeed if the memory allocator reports
failure.
2021-05-19 13:43:54 +03:00
Andrey Semashev
d44b4ce865 Use a variable buffer size for read/write loop.
The buffer size is now selected based on the file size and filesystem block
size and is clamped to a minimum and a maximum. This can reduce memory
consumption and possibly increase performance when copying smaller files.
2021-05-19 10:54:03 +03:00
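
One way to pick such a buffer size is shown below. The minimum and maximum values, the rounding rule and the function name pick_buffer_size are all assumptions for illustration; the commit message only says that the size depends on the file size and block size and is bounded on both sides.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Assumed bounds for illustration only.
constexpr std::size_t min_buf_size = 8u * 1024u;
constexpr std::size_t max_buf_size = 256u * 1024u;

// Hypothetical helper: chooses a copy buffer size from the file size and
// the filesystem block size (pass 0 if the block size is unknown).
std::size_t pick_buffer_size(std::uintmax_t file_size, std::size_t block_size)
{
    // Never consider more than the maximum, which also avoids overflow below.
    std::uintmax_t size = std::min<std::uintmax_t>(file_size, max_buf_size);

    // Round up to a whole number of filesystem blocks.
    if (block_size > 0)
    {
        const std::uintmax_t rem = size % block_size;
        if (rem != 0)
            size += block_size - rem;
    }

    // Clamp to the allowed range.
    size = std::max<std::uintmax_t>(size, min_buf_size);
    size = std::min<std::uintmax_t>(size, max_buf_size);
    return static_cast<std::size_t>(size);
}
```

With these assumed bounds, a 100-byte file gets the 8 KiB minimum, while a multi-gigabyte file is capped at 256 KiB.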
Andrey Semashev
a59bce0708 Removed unused include. 2021-05-19 03:19:08 +03:00
Andrey Semashev
dc65ed5213 Added definitions of filesystem type magic constants. 2021-05-19 03:19:08 +03:00
Andrey Semashev
129d847f8f Added a link to LKML discussion re copy_file_range and procfs/sysfs/etc. 2021-05-19 02:37:28 +03:00
Andrey Semashev
88c2a2df8c Check the source filesystem type before using sendfile/copy_file_range.
Some filesystems have regular files with generated content. Such files have
arbitrary size, including zero, but have actual content. Linux system calls
sendfile or copy_file_range will not copy contents of such files, so we must
use a read/write loop to handle them.

Check the type of the source filesystem before using sendfile or
copy_file_range and fall back to the read/write loop if it matches one of
the blacklisted filesystems: procfs, sysfs, tracefs or debugfs.

Also, added a test to verify that copy_file works on procfs.
2021-05-19 01:43:22 +03:00
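
The filesystem type check in the commit above can be approximated on Linux with fstatfs. In this sketch the magic numbers are the values published in linux/magic.h, defined locally under hypothetical names, and both helper functions are illustrative rather than taken from the library.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/vfs.h>
#include <unistd.h>

// Filesystem magic numbers as listed in linux/magic.h, defined locally
// under hypothetical names so the sketch does not depend on that header.
constexpr unsigned long proc_fs_magic = 0x9fa0ul;
constexpr unsigned long sys_fs_magic = 0x62656572ul;
constexpr unsigned long trace_fs_magic = 0x74726163ul;
constexpr unsigned long debug_fs_magic = 0x64626720ul;

// Hypothetical helper: true if fd refers to a file on a filesystem whose
// regular files have generated content (procfs, sysfs, tracefs, debugfs).
bool has_generated_content(int fd)
{
    struct statfs sfs;
    int res;
    do
    {
        res = ::fstatfs(fd, &sfs);
    } while (res < 0 && errno == EINTR);
    if (res < 0)
        return false;               // unknown type: assume a normal filesystem

    const unsigned long type = static_cast<unsigned long>(sfs.f_type);
    return type == proc_fs_magic || type == sys_fs_magic ||
           type == trace_fs_magic || type == debug_fs_magic;
}

// Convenience wrapper that opens the path, checks it and closes it again.
bool path_has_generated_content(const char* path)
{
    const int fd = ::open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return false;
    const bool res = has_generated_content(fd);
    ::close(fd);
    return res;
}
```

A copy routine would call has_generated_content on the source descriptor and go straight to the read/write loop when it returns true.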
Andrey Semashev
9a35774ede Call posix_fadvise to indicate that source file will be read sequentially. 2021-05-19 00:48:03 +03:00
Andrey Semashev
b27ad65326 Increased the minimum buffer size in read/write loop in copy_file.
Also, take into account the target filesystem block size, if available.
2021-05-19 00:22:31 +03:00
Andrey Semashev
4b9052f1e0 Fall back to read/write loop if sendfile/copy_file_range fail.
Since sendfile and copy_file_range can fail for some filesystems
(e.g. eCryptFS), we have to fall back to the read/write loop in the copy_file
implementation. Additionally, since we implement the fallback now,
fall back to sendfile if copy_file_range fails with EXDEV and use
copy_file_range on older kernels that don't implement it for
cross-filesystem copying. This may be beneficial if copy_file_range
is used within a filesystem and is performed on a remote server (NFS or CIFS).

Also, it was discovered that copy_file_range can also fail with EOPNOTSUPP
when it is performed on an NFSv4 filesystem and the remote server does
not support COPY operation. This happens on some patched kernels in RHEL/CentOS.

Lastly, to make sure the copy_file_data pointer is accessed atomically,
it is now declared as an atomic value. If std::atomic is unavailable,
Boost.Atomic is used.

Fixes https://github.com/boostorg/filesystem/issues/184.
2021-05-18 23:16:02 +03:00
Andrey Semashev
59e3644803 Added definition of COPY_FILE_NO_BUFFERING for Cygwin, MinGW and MinGW-w64. 2021-05-17 21:35:48 +03:00
Andrey Semashev
f5ebcfcd49 Don't indicate error in copy_file if close fails with EINTR. 2021-05-17 21:26:37 +03:00
Andrey Semashev
3c8408995f Added copy_options::synchronize_data and copy_options::synchronize.
These options allow synchronizing the copied data and attributes with
the permanent storage. Note that by default on POSIX systems copy_file
used to synchronize data in previous releases, and this commit changes
this. The caller now has to explicitly request syncing, as it has
significant performance implications.

Closes https://github.com/boostorg/filesystem/issues/186.
2021-05-17 20:33:57 +03:00
Andrey Semashev
be900df3e6 Added EINTR handling on close(2).
At least HP-UX is known to leave the file descriptor open if close() returns
EINTR. On other systems (Linux, BSD, Solaris, AIX) the file descriptor
is closed in the same situation, and closing it again may potentially close
the wrong descriptor if it is reused by another thread. We introduce
close_fd internal helper to abstract away these platform differences.
2021-05-17 18:39:46 +03:00
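
A helper along the lines of close_fd could look like the sketch below. The name close_fd comes from the commit message, but the body is an assumption: it retries on HP-UX (detected here via the __hpux macro) and, on other systems, treats EINTR as a completed close instead of retrying.

```cpp
#include <cerrno>
#include <unistd.h>

// Sketch of a close() wrapper that hides platform differences in EINTR
// handling. Returns 0 on success and -1 on failure with errno set.
int close_fd(int fd)
{
#if defined(__hpux)
    // The descriptor stays open after EINTR, so closing must be retried.
    int res;
    do
    {
        res = ::close(fd);
    } while (res < 0 && errno == EINTR);
    return res;
#else
    // The descriptor is already released even if close() reports EINTR.
    // Retrying could close an unrelated descriptor that another thread
    // has opened in the meantime, so treat EINTR as a completed close.
    int res = ::close(fd);
    if (res < 0 && errno == EINTR)
        res = 0;
    return res;
#endif
}
```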
Andrey Semashev
8c676eaf8f Avoid comparing pointers to a literal zero. 2021-05-17 17:40:20 +03:00
Andrey Semashev
92262db736 Added EINTR handling for fsync/fdatasync. 2021-05-17 17:22:00 +03:00
Andrey Semashev
9dadc8c90f Minor code cleanup. 2021-05-16 23:41:31 +03:00
Andrey Semashev
05de74a000 Added config macros for disabling use of some system APIs.
By defining these new config macros the user can configure the library
to avoid using some system APIs even if they are detected as available
by the library build scripts. This can be useful if the API
is known to consistently fail at runtime on the target system.

Related to https://github.com/boostorg/filesystem/issues/172.
2021-05-16 20:44:09 +03:00
Andrey Semashev
c03249c375 Reformatted code for more consistent look and better readability. 2021-04-24 22:37:57 +03:00
Andrey Semashev
83429c9bfd Check file status for status_error in create_directories.
create_directories used to ignore errors returned by status()
calls issued internally. The operation would likely fail anyway,
but the error codes returned by create_directories would be incorrect.
Also, it is better to terminate the operation as early as possible
when an error is encountered.

Reported in https://github.com/boostorg/filesystem/issues/182.
2021-03-29 20:20:34 +03:00
Andrey Semashev
b4d606cdd0 Reduced preprocessor conditions. 2020-12-23 11:10:50 +03:00
whitequark
c6e5bdafce Update WASI platform support. 2020-12-23 11:10:50 +03:00