Boost.Build v2 architecture

This document is a work in progress. Don't expect much from it yet.

Targets

There are two user-visible kinds of targets in Boost.Build. The first kind is "abstract" targets; they correspond to things declared by the user, for example, projects and executable files. The primary thing about an abstract target is that it's possible to request that it be built with particular values of some properties. Each combination of properties may yield a different set of real files, so abstract targets do not have a direct correspondence with files.

File targets, on the contrary, are associated with concrete files. Dependency graphs for abstract targets with specific properties are constructed from file targets. The user has no way to create file targets directly, but can specify rules that detect the file type of sources, and also rules for transforming between file targets of different types. That information is used in constructing the dependency graph, as described in the next section.

Note: File targets are not the same as targets in the Jam sense; the latter are created from file targets at the latest possible moment.

Note: "File target" is a proposed name for what we call virtual targets. It is more understandable to users, but has one problem: virtual targets can potentially be "phony", and not correspond to any file.

Dependency scanning

Dependency scanning is the process of finding implicit dependencies due to "include" statements and similar things. It has to take into account two things:

1. Whether includes in a particular file need to be taken into account depends on the actions that use that file. For example, if the action is "copy file", then includes should be ignored. Another example is when a file is compiled with two different include paths on different toolsets.
2. It is possible to include a generated header, in which case it may not yet exist at the time we scan dependencies.

Dependency scanning is implemented by objects called scanners. See the documentation for the "scanner" module for details.

Regarding the first problem, we really have no choice. We can't treat the same actual target differently depending on where it is used from. Therefore, when the handling of includes differs depending on actions, we have to duplicate targets and assign different properties to the duplicates.

For this reason, when actualizing a virtual target we optionally pass the needed scanner to the "virtual-target.actualize" method. When no scanner is passed, a new actual target is created, with its dependencies and updating actions set accordingly. When a particular scanner is specified, a new actual target is also created, and that target depends on the target created without a scanner. In effect, this allows different scanners to be used for the same file.
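The duplication described above can be pictured with a small Python model. This is only a sketch, not the code of the virtual-target module: the class names, the caching of results, and the "<scanner>name" naming (borrowed from the diagrams later in this document) are assumptions made for illustration.

```python
class ActualTarget:
    """Hypothetical stand-in for a Jam-level target."""
    def __init__(self, name, dependencies=(), action=None, scanner=None):
        self.name = name
        self.dependencies = list(dependencies)
        self.action = action
        self.scanner = scanner


class VirtualTarget:
    """Hypothetical model of a virtual (file) target."""
    def __init__(self, name, sources=(), action=None):
        self.name = name
        self.sources = list(sources)
        self.action = action
        self._actual = {}  # scanner (or None) -> ActualTarget; caching is an assumption

    def actualize(self, scanner=None):
        if scanner in self._actual:
            return self._actual[scanner]
        if scanner is None:
            # Plain actual target: carries the dependencies and the updating action.
            target = ActualTarget(self.name, self.sources, self.action)
        else:
            # Scanner-specific duplicate: it only depends on the plain target.
            plain = self.actualize()
            target = ActualTarget(f"<{scanner}>{self.name}", [plain], scanner=scanner)
        self._actual[scanner] = target
        return target


header = VirtualTarget("header.h", sources=["header.y"], action="generate")
plain = header.actualize()
for_compile = header.actualize("scanner1")
for_other_toolset = header.actualize("scanner2")
print(for_compile.name, "depends on", for_compile.dependencies[0].name)
```

The two scanner-specific targets are distinct objects, yet both depend on the single plain target, so the file itself is only built once.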

Generated sources

Let me explain what I consider the right semantics, first without any subvariants. Suppose we have a target "a.cpp" which includes "a_parser.h". We have to search through all include directories, checking:

1. If there's such a file there, or
2. If there's a target of the same name, bound to that directory via the LOCATE variable.

Jam allows doing the first check via the SEARCH variable, but that's not enough. Why can't we do something simpler and first check whether there's a target of the same name? That is, why shouldn't including "a_parser.h" pick the generated "a_parser.h" regardless of search paths? Simply because there's no reason to assume that. For example, one can have an action which generates some "dummy" header for systems which don't have the native one. Naturally, we don't want to depend on that generated header. To implement the proposed semantics we'd need builtin support.
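The search just described can be sketched in Python as follows. This is an illustration of the proposed semantics only, not of anything Jam does today; the function name, the "located_targets" mapping (standing in for targets bound to a directory via LOCATE), and the injectable "file_exists" check are all hypothetical.

```python
import os
import posixpath


def resolve_include(name, include_dirs, located_targets, file_exists=os.path.isfile):
    """Walk the include directories in order and return the first match.

    Returns ("file", path) if the file already exists in a directory,
    ("target", path) if a target of that name will be generated there,
    or None if neither is found in any search directory.
    """
    for directory in include_dirs:
        candidate = posixpath.join(directory, name)
        if file_exists(candidate):
            return ("file", candidate)
        if located_targets.get(name) == directory:
            return ("target", candidate)
    return None


# Pretend file system and generated targets (hypothetical example data).
existing_files = {"d1/config.h"}
located = {"a_parser.h": "d2", "dummy.h": "gen"}
search_path = ["d1", "d2", "d3"]

print(resolve_include("config.h", search_path, located, existing_files.__contains__))
print(resolve_include("a_parser.h", search_path, located, existing_files.__contains__))
print(resolve_include("dummy.h", search_path, located, existing_files.__contains__))
```

The last call finds nothing: "dummy.h" is generated in a directory that is not on the search path, so an include of that name does not pick it up. That is the difference from simply checking whether any target of the same name exists.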

There are two design choices. Suppose we have files a.cpp and b.cpp, and each one includes header.h, generated by some action. The dependency graph created by classic jam would look like:

    a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]


                      <d2>header.h  --------> header.y
                      [generated in d2]
               
    b.cpp -----> <scanner2>header.h [ search path: d1, d2, d4]

In this case, Jam thinks the header.h targets are all unrelated. The right dependency graph might be:

    a.cpp ---- 
              \
               \     
                >---->  <d2>header.h  --------> header.y
               /       [generated in d2]
              / 
    b.cpp ----

or

    a.cpp -----> <scanner1>header.h  [search path: d1, d2, d3]
                              |
                           (includes)
                              V
                      <d2>header.h  --------> header.y
                      [generated in d2]
                              ^
                          (includes)  
                              |
    b.cpp -----> <scanner2>header.h [ search path: d1, d2, d4]

The first alternative was used for some time. The problem, however, is: what include paths should be used when scanning header.h? Originally, two different sets of include paths were used. The second alternative does not have this problem, so it is the one implemented now.

Includes between generated sources

Suppose file "a.cpp" includes "a.h" and both are generated by some action. Initially, neither file exists, so when classic jam constructs the dependency graph, the include is not found. As a result, jam might attempt to compile a.cpp before creating a.h, and compilation will fail.

The solution in Boost.Jam is to perform additional dependency scans after targets are updated. This breaks the separation between build stages in jam, which some people consider a good thing, but I'm not aware of any better solution.
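The idea of rescanning after an update can be shown with a toy build loop in Python. This is a simplified model and not Boost.Jam's algorithm: the function, the data layout, and the assumption of an acyclic graph are all hypothetical. The "includes" dictionary records what each rescan found, so a parent that is visited later can still pick up the new dependency.

```python
def update(target, deps, scan, includes, order):
    """Update `target` depth-first, rescanning every target once it is updated.

    deps:     target -> list of dependencies (grows as includes are discovered)
    scan:     function returning the headers a (now existing) target includes
    includes: target -> headers found by the rescan; also marks "already updated"
    order:    list recording the order in which actions would run
    """
    if target in includes:
        return
    children = deps.setdefault(target, [])
    i = 0
    while i < len(children):  # the list may grow while we walk it
        child = children[i]
        update(child, deps, scan, includes, order)
        # Headers included by the child become dependencies of this parent.
        for header in includes[child]:
            if header not in children:
                children.append(header)
        i += 1
    order.append(target)              # "run the action" for this target
    includes[target] = scan(target)   # rescan now that the file exists


# a.cpp and a.h are both generated from a.y; the include is only
# discoverable once a.cpp has actually been created.
deps = {"a.o": ["a.cpp"], "b.o": ["a.cpp"], "a.cpp": ["a.y"], "a.h": ["a.y"]}
found_after_update = {"a.cpp": ["a.h"]}
scan = lambda target: found_after_update.get(target, [])

includes, order = {}, []
update("a.o", deps, scan, includes, order)
update("b.o", deps, scan, includes, order)
print(order)
print(deps["b.o"])
```

In this run a.h is created before a.o is compiled, even though the initial graph did not connect them, and b.o gains the same dependency when it is visited afterwards.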

To understand the rest of this section, you should first read some details about jam dependency scanning, available at this link.

Whenever a target is updated, Boost.Jam rescans it for includes. Consider this graph, created before any actions are run.

        A -------> C ----> C.pro
             /
        B --/         C-includes   ---> D

Both A and B depend on C and C-includes (the latter dependency is not shown). Say that during the build we've tried to create A, then tried to create C, and successfully created C. The B node hasn't been seen yet. The C target is rescanned, which creates a new internal node, C-includes-2. If we had known about those includes from the start, we'd have added this node to the dependencies of both A and B. As it stands, we need to add it now.

We determine what should be done with C-includes-2, add C-includes-2 to A's dependencies, and build the target. Unfortunately, we cannot do the same with B, since we don't know that B is a parent of C until we visit B. So we add a special flag to C telling that it was rescanned. When we process B, we'll add the new dependency node to B's dependencies.

this point of time the target is requested by some parents. So parents were not yet visited. Both visited and unvisited parents have

What shall we do when using subvariants? For the user, subvariants must be more or less transparent. If a header was generated to a certain directory without subvariants, everything must still work. Suppose that file a.cpp belongs to the dependency graph of main target a. Include paths are

     "/usr/include" "/home/t" "."

We start by finding all places where headers that are part of a's dependency graph are generated. We insert those places to the include paths, immediately after ".". For example, we might end with:

     "/usr/include" "/home/t" "." "build"

As a result:

File "a.cpp" will be correctly compiled. Note that it's already necessary to adjust paths to ensure this. We'll have to add target paths for all generated headers, because determining the exact set of additional include paths for each source (i.e., the set of headers that it uses) will be hard.
With the proposed SEARCH_FOR_TARGET rule, the dependency on the generated header will work magically: it would find the "a_parser.h" target bound via LOCATE_TARGET to "build", and we'll call INCLUDE on that found target instead of creating a completely unrelated one.
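The include path adjustment described above amounts to a small list operation, sketched here in Python. The function name and its parameters are hypothetical; only the placement rule (generated header directories go immediately after ".") comes from the text.

```python
def add_generated_header_dirs(include_paths, generated_dirs, anchor="."):
    """Return include_paths with generated_dirs inserted right after `anchor`.

    Directories already on the path are not added twice. If the anchor is
    missing, the new directories are appended at the end (an assumption;
    the text does not cover that case).
    """
    result = list(include_paths)
    position = result.index(anchor) + 1 if anchor in result else len(result)
    for directory in generated_dirs:
        if directory not in result:
            result.insert(position, directory)
            position += 1
    return result


paths = ["/usr/include", "/home/t", "."]
print(add_generated_header_dirs(paths, ["build"]))
```

With the example paths above, this yields the same four-element path shown earlier, with "build" placed after ".".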

The remainder of this document is not intended to be read at all. It will be rearranged in the future.

File targets

As described above, file targets correspond to files that Boost.Build manages. Users may be concerned with file targets in three ways: when declaring file target types, when declaring transformations between types, and when determining where a file target will be placed. File targets can also be connected with actions that determine how the target is created. Both file targets and actions are implemented in the virtual-target module.

Types

A file target can be given a type, which determines what transformations can be applied to the file. The type.register rule declares new types. A file type can also be assigned a scanner, which is used to find implicit dependencies. See the "Dependency scanning" section.

Target paths

To distinguish targets built with different properties, they are put in different directories. Rules for determining target paths are given below:

All targets are placed under the directory corresponding to the project where they are defined.
Each non-free, non-incidental property causes an additional element to be added to the target path. That element has the form <feature-name>-<feature-value> for ordinary features and <feature-value> for implicit ones. [Note about composite features].
If the set of free, non-incidental properties is different from the set of free, non-incidental properties for the project in which the main target that uses the target is defined, a part of the form main_target-<name> is added to the target path. Note: It would be nice to completely track free features also, but this appears to be complex and not really needed.

For example, we might have these paths:

    debug/optimization-off
    debug/main-target-a
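The rules above can be turned into a short Python sketch. It is a model written for this document, not the Boost.Build implementation: the Property class and the function are hypothetical, composite features are ignored, and the main target element is spelled "main-target-<name>" to match the example paths (the rule text writes it as main_target-<name>).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """Hypothetical description of one property and its feature attributes."""
    feature: str
    value: str
    free: bool = False
    incidental: bool = False
    implicit: bool = False


def free_set(properties):
    """The free, non-incidental properties, as a comparable set."""
    return {(p.feature, p.value) for p in properties if p.free and not p.incidental}


def target_path(project_dir, properties, project_properties, main_target):
    elements = [project_dir]
    for p in properties:
        if p.free or p.incidental:
            continue  # only non-free, non-incidental properties add an element
        elements.append(p.value if p.implicit else f"{p.feature}-{p.value}")
    if free_set(properties) != free_set(project_properties):
        elements.append(f"main-target-{main_target}")
    return "/".join(elements)


debug = Property("variant", "debug", implicit=True)
no_opt = Property("optimization", "off")
define = Property("define", "FOO", free=True)

print(target_path("proj", [debug, no_opt], [], "a"))
print(target_path("proj", [debug, define], [], "a"))
```

Relative to the project directory, the two calls give paths of the same shape as the two examples above.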

Last modified: June 30, 2003

© Copyright Vladimir Prus 2002-2003. Permission to copy, use, modify, sell and distribute this document is granted provided this copyright notice appears in all copies. This document is provided ``as is'' without express or implied warranty, and with no claim as to its suitability for any purpose.