filesystem/doc/faq.htm

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>
      Boost Filesystem FAQ
    </title>
<style type="text/css">
 body {
  background-color: #FFFFFF;
 }
 p.c1 {font-weight: bold}
</style>
  </head>
  <body>
    <h1>
      <img border="0" src="../../../boost.png" align="middle" alt=
      "Boost C++ libraries logo" width="277" height="86"> Filesystem
      FAQ
    </h1>

    <p class="c1">
      Why base the generic-path string format on POSIX?
    </p>
    <p>
      <a href="design.htm#POSIX-01">[POSIX-01]</a> is an ISO Standard.
      It is the basis for the most familiar path-string formats,
      including the URL portion of URI's and the native Windows format.
      It is ubiquitous and familiar.&nbsp; On many systems, it is very
      easy to implement because it is either the native operating
      system format (Unix and Windows) or via a operating system
      supplied POSIX library (z/OS, OS/390, and many more.)
    </p>
    <p class="c1">
      Why not use a full URI (Universal Resource Identifier) based
      path?
    </p>

    <p>
      <a href="design.htm#URI">URI's</a> would promise more than the
      Filesystem Library can actually deliver, since URI's extend far
      beyond what most operating systems consider a file or a
      directory.&nbsp; Thus for the primary "portable script-style file
      system operations" requirement of the Filesystem Library, full
      URI's appear to be over-specification.
    </p>
    <p class="c1">
      Why isn't <i>path</i> a base class with derived
      <i>directory_path</i> and <i>file_path</i> classes?
    </p>

    <p>
      Why bother?&nbsp; The behavior of all three classes is
      essentially identical. Several early versions did require users
      to identify each path as a file or directory path, and this
      seemed to increase errors and decrease code readability. There
      was no apparent upside benefit.
    </p>
    <p class="c1">
      Why are fully specified paths called <i>complete</i> rather than
      <i><a name="absolute" id="absolute">absolute</a></i>?
    </p>

    <p>
      To avoid long-held assumptions (what do you mean, <i>"/foo"</i>
      isn't absolute on some systems?) by programmers used to
      single-rooted filesystems. Using an unfamiliar name for the
      concept and related functions causes programmers to read the
      specs rather than just assuming the meaning is known.
    </p>
    <p class="c1">
      Why not support a concept of specific kinds of file systems, such
      as posix_file_system or windows_file_system.
    </p>
    <p>
      Portability is one of the most important requirements for the
      library.&nbsp;Gaining some advantage by using features specific
      to particular operating systems is not a requirement. There
      doesn't appear to be much need for the ability to manipulate,
      say, a classic Mac OS path while running on an OpenVMS machine.
    </p>

    <p>
      Furthermore, concepts like "file system" are very slippery. What
      happens when a NTFS or FAT file system is mounted in directory on
      a machine running a POSIX-like operating system, for example?
      Some of the POSIX API's may return very un-POSIX like results.
    </p>
    <p class="c1">
      Why not supply a 'handle' type, and let the file and directory
      operations traffic in it?
    </p>
    <p>
      It isn't clear there is any feasible way to meet the "portable
      script-style file system operations" requirement with such a
      system. File systems exist where operations are usually performed
      on some non-string handle type. The classic Mac OS has been
      mentioned explicitly as a case where trafficking in paths isn't
      always natural.&nbsp;&nbsp;&nbsp;
    </p>

    <p>
      The case for the "handle" (opaque data type to identify a file)
      style may be strongest for directory iterator value type.&nbsp;
      (See Jesse Jones' Jan 28, 2002, Boost postings). However, as
      class path has evolved, it seems sufficient even as the directory
      iterator value type.
    </p>
    <p class="c1">
      Why are the operations.hpp non-member functions so low-level?
    </p>
    <p>
      To provide a toolkit from which higher-level functionality can be
      created.
    </p>

    <p>
      An extended attempt to add convenience functions on top of, or as
      a replacement for, the low-level functionality failed because
      there is no widely acceptable set of simple semantics for most
      convenience functions considered.&nbsp; Attempts to provide
      alternate semantics via either run-time options or compile-time
      polices became overly complicated in relation to the value
      delivered, or became contentious.&nbsp; OTOH, the specific
      functionality needed for several trial applications was very easy
      for the user to construct from the lower-level toolkit
      functions.&nbsp; See <a href=
      "design.htm#Abandoned_Designs">Failed Attempts</a>.
    </p>
    <p class="c1">
      Isn't it inconsistent then to provide a few convenience
      functions?
    </p>

    <p>
      Yes, but experience with both this library, POSIX, and Windows
      indicates the utility of certain convenience functions, and that
      it is possible to provide simple, yet widely acceptable,
      semantics for them. For example, remove_all.
    </p>
    <p class="c1">
      Why are there basic_directory_iterator&lt;&gt; overloads for
      operations.hpp predicate functions? Isn't two ways to do the same
      thing poor design?
    </p>
    <p>
      Yes, two ways to do the same thing is often a poor design
      practice. But the iterator versions are often much more
      efficient. Calling status() during iteration over a directory
      containing 15,000 files took 6 seconds for the path overload, and
      1 second for the iterator overload, for tests on a freshly booted
      machine. Times were .90 seconds and .30 seconds, for tests after
      prior use of the directory. This performance gain is large enough
      to justify deviating from preferred design practices. Neither
      overload alone meets all needs.
    </p>

    <p class="c1">
      Why are library functions so picky about errors?
    </p>
    <p>
      Safety. The default is to be safe rather than sorry. This is
      particularly important given the reality that on many computer
      systems files and directories are <a href="#global">globally
      shared</a> resources, and thus subject to unexpected errors.
    </p>
    <p class="c1">
      Why are errors reported by exception rather than return code or
      error notification variable?
    </p>

    <p>
      Safety.&nbsp;Return codes or error notification variables are
      often ignored by programmers.&nbsp; Exceptions are much harder to
      ignore, provided desired default behavior (program termination)
      if not caught, yet allow error recovery if desired. Non-throwing
      versions of functions are provided where experience indicates the
      need.
    </p>
    <p class="c1">
      Why are attributes accessed via named functions rather than
      property maps?
    </p>
    <p>
      For commonly used attributes (existence, directory or file,
      emptiness), simple syntax and guaranteed presence outweigh other
      considerations. Because access to many other attributes is
      inherently system dependent, property maps are viewed as the best
      hope for access and modification, but it is better design to
      provide such functionality in a separate library. (Historical
      note: even the apparently simple attribute "read-only" turned out
      to be so system depend as to be disqualified as a "guaranteed
      presence" operation.)
    </p>

    <p class="c1">
      Why isn't there a set_current_directory function?
    </p>
    <p>
      Global variables are considered harmful [<a href=
      "design.htm#Wulf-Shaw-73">wulf-shaw-73</a>]. While we can't
      prevent people from shooting themselves in the foot, we aren't
      about to hand them a loaded gun pointed right at their big toe.
    </p>
    <p class="c1">
      Why aren't <a name="wide-character_names" id=
      "wide-character_names">wide-character names</a> supported? Why
      not std::wstring or even a templated type?
    </p>

    <p>
      They <u>are</u> supported, starting with version 1.33. See
      <a href="i18n.html">Internationalization</a>.
    </p>
    <p class="c1">
      Why isn't automatic name portability error detection provided?
    </p>
    <p>

      A number (at least six) of designs for name validity error
      detection were evaluated, including at least four complete
      implementations.&nbsp; While the details for rejection differed,
      all of the more powerful name validity checking designs distorted
      other otherwise simple aspects of the library. Even the simple
      name checking provided in prior library versions was a constant
      source of user complaints. While name checking can be helpful, it
      isn't important enough to justify added a lot of additional
      complexity.
    </p>
    <p class="c1">
      Why are paths sometimes manipulated by member functions and
      sometimes by non-member functions?
    </p>
    <p>
      The design rule is that purely lexical operations are supplied as
      <i>class basic_path</i> member functions, while operations
      performed by the operating system are provided as free functions.
    </p>

    <p class="c1">
      Why is path <i>normalized form</i> different from <i>canonical
      form</i>?
    </p>
    <p>
      On operating systems such as POSIX which allow symbolic links to
      directories, the normalized form of a path can represent a
      different location than the canonical form. See <a href=
      "design.htm#symbolic-link-use-case">use case</a> from Walter
      Landry.
    </p>

    <hr>
    <p>
      Revised
      <!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->
			06 February, 2006
      <!--webbot bot="Timestamp" endspan i-checksum="40411" -->
    </p>
    <p>
      © Copyright Beman Dawes, 2002
    </p>
    <p>

      Use, modification, and distribution are subject to the Boost
      Software License, Version 1.0. (See accompanying file <a href=
      "../../../LICENSE_1_0.txt">LICENSE_1_0.txt</a> or copy at
      <a href="http://www.boost.org/LICENSE_1_0.txt">www.boost.org/LICENSE_1_0.txt</a>)
    </p>
  </body>
</html>