mirror of
https://github.com/boostorg/ublas.git
synced 2026-02-14 13:12:14 +00:00
1605 lines
58 KiB
HTML
1605 lines
58 KiB
HTML
<html>
|
||
|
||
<head>
|
||
<meta http-equiv="Content-Type"
|
||
content="text/html; charset=iso-8859-1">
|
||
<meta name="GENERATOR" content="Microsoft FrontPage Express 2.0">
|
||
<title>uBLAS Overview</title>
|
||
</head>
|
||
|
||
<body bgcolor="#FFFFFF">
|
||
|
||
<h1><img src="c++boost.gif" alt="c++boost.gif" align="center"
|
||
width="277" height="86"> uBLAS Overview</h1>
|
||
|
||
<h2>Rationale</h2>
|
||
|
||
<p><cite>It would be nice if every kind of numeric software could
|
||
be written in C++ without loss of efficiency, but unless
|
||
something can be found that achieves this without compromising
|
||
the C++ type system it may be preferable to rely on Fortran,
|
||
assembler or architecture-specific extensions (Bjarne
|
||
Stroustrup).</cite></p>
|
||
|
||
<p>This C++ library is directed towards scientific computing on
|
||
the level of basic linear algebra constructions with matrices and
|
||
vectors and their corresponding abstract operations. The primary
|
||
design goals were:</p>
|
||
|
||
<ul type="disc">
|
||
<li>mathematical notation</li>
|
||
<li>efficiency</li>
|
||
<li>functionality</li>
|
||
<li>compatibility</li>
|
||
</ul>
|
||
|
||
<p>Another intention was to evaluate, if the abstraction penalty
|
||
resulting from the use of such matrix and vector classes is
|
||
acceptable.</p>
|
||
|
||
<h2>Resources</h2>
|
||
|
||
<p>The development of this library was guided by a couple of
|
||
similar efforts:</p>
|
||
|
||
<ul type="disc">
|
||
<li><a href="http://www.netlib.org/blas/index.html">BLAS</a>
|
||
by Jack Dongarra et al.</li>
|
||
<li><a href="http://www.oonumerics.org/blitz/">Blitz++</a> by
|
||
Todd Veldhuizen</li>
|
||
<li><a href="http://www.acl.lanl.gov/pooma/">POOMA</a> by
|
||
Scott Haney et al.</li>
|
||
<li><a href="http://www.lsc.nd.edu/research/mtl/">MTL</a> by
|
||
Jeremy Siek et al.</li>
|
||
</ul>
|
||
|
||
<p>BLAS seems to be the most widely used library for basic linear
|
||
algebra constructions, so it could be called a de-facto standard.
|
||
Its interface is procedural, the individual functions are
|
||
somewhat abstracted from simple linear algebra operations. Due to
|
||
the fact that is has been implemented using Fortran and its
|
||
optimizations, it also seems to be one of the fastest libraries
|
||
available. As we decided to design and implement our library in
|
||
an object-oriented way, the technical approaches are distinct.
|
||
However anyone should be able to express BLAS abstractions in
|
||
terms of our library operators and to compare the efficiency of
|
||
the implementations.</p>
|
||
|
||
<p>Blitz++ is an impressive library implemented in C++. Its main
|
||
design seems to be oriented towards multidimensional arrays and
|
||
their associated operators including tensors. The author of
|
||
Blitz++ states, that the library achieves performance on par or
|
||
better than corresponding Fortran code due to his implementation
|
||
technique using expression templates and template metaprograms.
|
||
However we see some reasons, to develop an own design and
|
||
implementation approach. We do not know whether anybody tries to
|
||
implement traditional linear algebra and other numerical
|
||
algorithms using Blitz++. We also presume that even today Blitz++
|
||
needs the most advanced C++ compiler technology due to its
|
||
implementation idioms. On the other hand, Blitz++ convinced us,
|
||
that the use of expression templates is mandatory to reduce the
|
||
abstraction penalty to an acceptable limit.</p>
|
||
|
||
<p>POOMA's design goals seem to parallel Blitz++'s in many parts
|
||
. It extends Blitz++'s concepts with classes from the domains of
|
||
partial differential equations and theoretical physics. The
|
||
implementation supports even parallel architectures.</p>
|
||
|
||
<p>MTL is another approach supporting basic linear algebra
|
||
operations in C++. Its design mainly seems to be influenced by
|
||
BLAS and the C++ Standard Template Library. We share the insight
|
||
that a linear algebra library has to provide functionality
|
||
comparable to BLAS. On the other hand we think, that the concepts
|
||
of the C++ standard library have not yet been proven to support
|
||
numerical computations as needed. As another difference MTL
|
||
currently does not seem to use expression templates. This may
|
||
result in one of two consequences: a possible loss of
|
||
expressiveness or a possible loss of performance.</p>
|
||
|
||
<h2>Concepts</h2>
|
||
|
||
<h3>Mathematical Notation</h3>
|
||
|
||
<p>The usage of mathematical notation may ease the development of
|
||
scientific algorithms. So a C++ library implementing basic linear
|
||
algebra concepts carefully should overload selected C++ operators
|
||
on matrix and vector classes.</p>
|
||
|
||
<p>We decided to use operator overloading for the following
|
||
primitives:</p>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">Description</th>
|
||
<th align="left">Operator</th>
|
||
</tr>
|
||
<tr>
|
||
<td>Indexing of vectors and matrices</td>
|
||
<td><code>vector::operator(size_t i);<br>
|
||
matrix::operator(size_t i, size_t j);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Assignment of vectors and matrices</td>
|
||
<td><code>vector::operator = (const vector_expression
|
||
&);<br>
|
||
vector::operator += (const vector_expression &);<br>
|
||
vector::operator -= (const vector_expression &);<br>
|
||
vector::operator *= (const scalar_expression &);<br>
|
||
matrix::operator = (const matrix_expression &);<br>
|
||
matrix::operator += (const matrix_expression &);<br>
|
||
matrix::operator -= (const matrix_expression &);<br>
|
||
matrix::operator *= (const scalar_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Unary operations on vectors and matrices</td>
|
||
<td><code>vector_expression operator - (const
|
||
vector_expression &);<br>
|
||
matrix_expression operator - (const matrix_expression
|
||
&);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Binary operations on vectors and matrices</td>
|
||
<td><code>vector_expression operator + (const
|
||
vector_expression &, const vector_expression &);<br>
|
||
vector_expression operator - (const vector_expression
|
||
&, const vector_expression &);<br>
|
||
matrix_expression operator + (const matrix_expression
|
||
&, const matrix_expression &);<br>
|
||
matrix_expression operator - (const matrix_expression
|
||
&, const matrix_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Multiplication of vectors and matrices with a scalar</td>
|
||
<td><code>vector_expression operator * (const
|
||
scalar_expression &, const vector_expression &);<br>
|
||
vector_expression operator * (const vector_expression
|
||
&, const scalar_expression &);<br>
|
||
matrix_expression operator * (const scalar_expression
|
||
&, const matrix_expression &);<br>
|
||
matrix_expression operator * (const matrix_expression
|
||
&, const scalar_expression &);</code></td>
|
||
</tr>
|
||
</table>
|
||
|
||
<p>We decided to use no operator overloading for the following
|
||
other primitives:</p>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">Description</th>
|
||
<th align="left">Function</th>
|
||
</tr>
|
||
<tr>
|
||
<td>Left multiplication of vectors with a matrix</td>
|
||
<td><code>vector_expression prod<</code><code><em>vector_type</em></code><code>>
|
||
(const matrix_expression &, const vector_expression
|
||
&);<br>
|
||
vector_expression prod (const matrix_expression &,
|
||
const vector_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Right multiplication of vectors with a matrix</td>
|
||
<td><code>vector_expression prod<</code><code><em>vector_type</em></code><code>>
|
||
(const vector_expression &, const matrix_expression
|
||
&);<br>
|
||
vector_expression prod (const vector_expression &,
|
||
const matrix_expression &);<br>
|
||
</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Multiplication of matrices</td>
|
||
<td><code>matrix_expression prod<</code><code><em>matrix_type</em></code><code>>
|
||
(const matrix_expression &, const matrix_expression
|
||
&);<br>
|
||
matrix_expression prod (const matrix_expression &,
|
||
const matrix_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Inner product of vectors</td>
|
||
<td><code>scalar_expression inner_prod (const
|
||
vector_expression &, const vector_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Outer product of vectors</td>
|
||
<td><code>matrix_expression outer_prod (const
|
||
vector_expression &, const vector_expression &);</code></td>
|
||
</tr>
|
||
<tr>
|
||
<td>Transpose of a matrix</td>
|
||
<td><code>matrix_expression trans (const
|
||
matrix_expression &);</code></td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h3>Efficiency</h3>
|
||
|
||
<p>To achieve the goal of efficiency for numerical computing, one
|
||
has to overcome two difficulties in formulating abstractions with
|
||
C++, namely temporaries and virtual function calls. Expression
|
||
templates solve these problems, but tend to slow down compilation
|
||
times.</p>
|
||
|
||
<h4>Eliminating Temporaries</h4>
|
||
|
||
<p>Abstract formulas on vectors and matrices normally compose a
|
||
couple of unary and binary operations. The conventional way of
|
||
evaluating such a formula is first to evaluate every leaf
|
||
operation of a composition into a temporary and next to evaluate
|
||
the composite resulting in another temporary. This method is
|
||
expensive in terms of time especially for small and space
|
||
especially for large vectors and matrices. The approach to solve
|
||
this problem is to use lazy evaluation as known from modern
|
||
functional programming languages. The principle of this approach
|
||
is to evaluate a complex expression element wise and to assign it
|
||
directly to the target.</p>
|
||
|
||
<p>Two interesting and dangerous facts result.</p>
|
||
|
||
<p>First one may get serious side effects using element wise
|
||
evaluation on vectors or matrices. Consider the matrix vector
|
||
product <em>x = A x</em>. Evaluation of <em>A</em><sub><em>1</em></sub><em>x</em>
|
||
and assignment to <em>x</em><sub><em>1</em></sub> changes the
|
||
right hand side, so that the evaluation of <em>A</em><sub><em>2</em></sub><em>x
|
||
</em>returns a wrong result. Our solution for this problem is to
|
||
evaluate the right hand side of an assignment into a temporary
|
||
and then to assign this temporary to the left hand side. To allow
|
||
further optimizations, we provide a corresponding member function
|
||
for every assignment operator. By using this member function a
|
||
programmer can confirm, that the left and right hand sides of an
|
||
assignment are independent, so that element wise evaluation and
|
||
direct assignment to the target is safe.</p>
|
||
|
||
<p>Next one can get worse computational complexity under certain
|
||
cirumstances. Consider the chained matrix vector product <em>A (B
|
||
x)</em>. Conventional evaluation of <em>A (B x)</em> is
|
||
quadratic. Deferred evaluation of <em>B x</em><sub><em>i</em></sub>
|
||
is linear. As every element <em>B x</em><sub><em>i</em></sub> is
|
||
needed linearly depending of the size, a completely deferred
|
||
evaluation of the chained matrix vector product <em>A (B x)</em>
|
||
is cubic. In such cases one needs to reintroduce temporaries in
|
||
the expression.</p>
|
||
|
||
<h4>Eliminating Virtual Function Calls</h4>
|
||
|
||
<p>Lazy expression evaluation normally leads to the definition of
|
||
a class hierarchy of terms. This results in the usage of dynamic
|
||
polymorphism to access single elements of vectors and matrices,
|
||
which is also known to be expensive in terms of time. A solution
|
||
was found a couple of years ago independently by David
|
||
Vandervoorde and Todd Veldhuizen and is commonly called
|
||
expression templates. Expression templates contain lazy
|
||
evaluation and replace dynamic polymorphism with static, i.e.
|
||
compile time polymorphism. Expression templates heavily depend on
|
||
the famous Barton-Nackman trick, also coined 'curiously defined
|
||
recursive templates' by Jim Coplien.</p>
|
||
|
||
<p>Expression templates form the base of our implementation.</p>
|
||
|
||
<h4>Compilation times</h4>
|
||
|
||
<p>It is also a well known fact, that expression templates
|
||
challenge currently available compilers. We were able to
|
||
significantly reduce the amount of needed expression templates
|
||
using the Barton-Nackman trick consequently.</p>
|
||
|
||
<p>We also decided to support a dual conventional implementation
|
||
(i.e. not using expression templates) with extensive bounds and
|
||
type checking of vector and matrix operations to support the
|
||
development cycle. Switching from debug mode to release mode is
|
||
controlled by the <code>NDEBUG</code> preprocessor symbol of <code><cassert></code>.</p>
|
||
|
||
<h3>Functionality</h3>
|
||
|
||
<p>Every C++ library supporting linear algebra will be measured
|
||
against the long-standing Fortran package BLAS. We now describe
|
||
how BLAS calls may be mapped onto our classes. </p>
|
||
|
||
<h4>Blas Level 1</h4>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">BLAS Call</th>
|
||
<th align="left">Mapped Library Expression</th>
|
||
<th align="left">Mathematical Description</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_asum</code></td>
|
||
<td><code>norm_1 (x)</code> </td>
|
||
<td><em>sum |x</em><sub><em>i</em></sub><em>|</em></td>
|
||
<td>Computes the sum norm of a vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_nrm2</code></td>
|
||
<td><code>norm_2 (x)</code></td>
|
||
<td><em>sqrt (sum |x</em><sub><em>i</em></sub>|<sup><em>2</em></sup><em>)</em></td>
|
||
<td>Computes the euclidean norm of a vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>i_amax</code></td>
|
||
<td><code>norm_inf (x)<br>
|
||
norm_inf_index (x)</code></td>
|
||
<td><em>max |x</em><sub><em>i</em></sub><em>|</em></td>
|
||
<td>Computes the maximum norm of a vector.<br>
|
||
BLAS computes the index of the first element having this
|
||
value.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_dot<br>
|
||
_dotu<br>
|
||
_dotc</code></td>
|
||
<td><code>inner_prod (x, y)</code>or<code><br>
|
||
inner_prod (conj (x), y)</code></td>
|
||
<td><em>x</em><sup><em>T</em></sup><em> y</em> or<br>
|
||
<em>x</em><sup><em>H</em></sup><em> y</em></td>
|
||
<td>Computes the inner product of two vectors. <br>
|
||
BLAS implements certain loop unrollment.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>dsdot<br>
|
||
sdsdot</code></td>
|
||
<td><code>a + prec_inner_prod (x, y)</code></td>
|
||
<td><em>a + x</em><sup><em>T</em></sup><em> y</em></td>
|
||
<td>Computes the inner product in double precision. </td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_copy</code></td>
|
||
<td><code>x = y <br>
|
||
y.assign (x)</code></td>
|
||
<td><em>x <- y</em></td>
|
||
<td>Copies one vector to another. <br>
|
||
BLAS implements certain loop unrollment.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_swap</code></td>
|
||
<td><code>swap (x, y)</code></td>
|
||
<td><em>x <-> y</em></td>
|
||
<td>Swaps two vectors. <br>
|
||
BLAS implements certain loop unrollment.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_scal<br>
|
||
csscal<br>
|
||
zdscal</code></td>
|
||
<td><code>x *= a</code></td>
|
||
<td><em>x <- a x</em></td>
|
||
<td>Scales a vector. <br>
|
||
BLAS implements certain loop unrollment.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_axpy</code></td>
|
||
<td><code>y += a * x</code></td>
|
||
<td><em>y <- a x + y</em></td>
|
||
<td>Adds a scaled vector. <br>
|
||
BLAS implements certain loop unrollment.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_rot<br>
|
||
_rotm<br>
|
||
csrot<br>
|
||
zdrot</code></td>
|
||
<td><code>t.assign (a * x + b * y), <br>
|
||
y.assign (- b * x + a * y),<br>
|
||
x.assign (t)</code></td>
|
||
<td><em>(x, y) <- (a x + b y, -b x + a y)</em></td>
|
||
<td>Applies a plane rotation.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_rotg<br>
|
||
_rotmg</code></td>
|
||
<td> </td>
|
||
<td><em>(a, b) <- <br>
|
||
(? a / sqrt (a</em><sup><em>2</em></sup> + <em>b</em><sup><em>2</em></sup><em>),
|
||
<br>
|
||
? b / sqrt (a</em><sup><em>2</em></sup> + <em>b</em><sup><em>2</em></sup><em>))
|
||
</em>or<em><br>
|
||
(1, 0) <- (0, 0)</em></td>
|
||
<td>Constructs a plane rotation.</td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h4>Blas Level 2</h4>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">BLAS Call</th>
|
||
<th align="left">Mapped Library Expression</th>
|
||
<th align="left">Mathematical Description</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_t_mv</code></td>
|
||
<td><code>x = prod (A, x)</code> or<code><br>
|
||
x = prod (trans (A), x)</code> or<code><br>
|
||
x = prod (herm (A), x)</code></td>
|
||
<td><em>x <- A x </em>or<em><br>
|
||
x <- A</em><sup><em>T</em></sup><em> x </em>or<em><br>
|
||
x <- A</em><sup><em>H</em></sup><em> x</em></td>
|
||
<td>Computes the product of a matrix with a vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_t_sv</code></td>
|
||
<td><code>y = solve (A, x, tag) </code>or<br>
|
||
<code>inplace_solve (A, x, tag) </code>or<br>
|
||
<code>y = solve (trans (A), x, tag) </code>or<code><br>
|
||
inplace_solve (trans (A), x, tag) </code>or<font
|
||
face="Courier New"><code><br>
|
||
</code></font><code>y = solve (herm (A), x, tag)</code>or<code><br>
|
||
inplace_solve (herm (A), x, tag)</code></td>
|
||
<td><em>y <- A</em><sup><em>-1</em></sup><em> x </em>or<em><br>
|
||
x <- A</em><sup><em>-1</em></sup><em> x </em>or<em><br>
|
||
y <- A</em><sup><em>T</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
x </em>or<em><br>
|
||
x <- A</em><sup><em>T</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
x </em>or<em><br>
|
||
y <- A</em><sup><em>H</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
x</em> or<em><br>
|
||
x <- A</em><sup><em>H</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
x</em></td>
|
||
<td>Solves a system of linear equations with triangular
|
||
form, i.e. <em>A </em>is triangular.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_g_mv<br>
|
||
_s_mv<br>
|
||
_h_mv</code></td>
|
||
<td><code>y = a * prod (A, x) + b * y </code>or<code><br>
|
||
y = a * prod (trans (A), x) + b * y </code>or<code><br>
|
||
y = a * prod (herm (A), x) + b * y</code></td>
|
||
<td><em>y <- a A x + b y </em>or<em><br>
|
||
y <- a A</em><sup><em>T</em></sup><em> x + b y<br>
|
||
y <- a A</em><sup><em>H</em></sup><em> x + b y</em></td>
|
||
<td>Adds the scaled product of a matrix with a vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_g_r<br>
|
||
_g_ru<br>
|
||
_g_rc</code></td>
|
||
<td><code>A += a * outer_prod (x, y)</code> or<code><br>
|
||
A += a * outer_prod (x, conj (y))</code></td>
|
||
<td><em>A <- a x y</em><sup><em>T</em></sup><em> + A </em>or<em><br>
|
||
A <- a x y</em><sup><em>H</em></sup><em> + A</em></td>
|
||
<td>Performs a rank <em>1</em> update.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_s_r<br>
|
||
_h_r</code></td>
|
||
<td><code>A += a * outer_prod (x, x)</code> or<code><br>
|
||
A += a * outer_prod (x, conj (x))</code></td>
|
||
<td><em>A <- a x x</em><sup><em>T</em></sup><em> + A</em>
|
||
or<em><br>
|
||
A <- a x x</em><sup><em>H</em></sup><em> + A</em></td>
|
||
<td>Performs a symmetric or hermitian rank <em>1</em>
|
||
update.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_s_r2<br>
|
||
_h_r2</code></td>
|
||
<td><code>A += a * outer_prod (x, y) +<br>
|
||
a * outer_prod (y, x)) </code>or<code><br>
|
||
A += a * outer_prod (x, conj (y)) +<br>
|
||
conj (a) * outer_prod (y, conj (x)))</code></td>
|
||
<td><em>A <- a x y</em><sup><em>T</em></sup><em> + a y
|
||
x</em><sup><em>T</em></sup><em> + A</em> or<em><br>
|
||
A <- a x y</em><sup><em>H</em></sup><em> + a</em><sup><em>-</em></sup><em>
|
||
y x</em><sup><em>H</em></sup><em> + A</em> </td>
|
||
<td>Performs a symmetric or hermitian rank <em>2</em>
|
||
update.</td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h4>Blas Level 3</h4>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">BLAS Call</th>
|
||
<th align="left">Mapped Library Expression</th>
|
||
<th align="left">Mathematical Description</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_t_mm</code></td>
|
||
<td><code>B = a * prod (A, B) </code>or<br>
|
||
<code>B = a * prod (trans (A), B) </code>or<br>
|
||
<code>B = a * prod (A, trans (B)) </code>or<br>
|
||
<code>B = a * prod (trans (A), trans (B)) </code>or<br>
|
||
<code>B = a * prod (herm (A), B) </code>or<br>
|
||
<code>B = a * prod (A, herm (B)) </code>or<br>
|
||
<code>B = a * prod (herm (A), trans (B)) </code>or<br>
|
||
<code>B = a * prod (trans (A), herm (B)) </code>or<br>
|
||
<code>B = a * prod (herm (A), herm (B))</code></td>
|
||
<td><em>B <- a op (A) op (B) </em>with<br>
|
||
<em>op (X) = X </em>or<br>
|
||
<em>op (X) = X</em><sup><em>T</em></sup><em> </em>or<br>
|
||
<em>op (X) = X</em><sup><em>H</em></sup></td>
|
||
<td>Computes the scaled product of two matrices.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_t_sm</code></td>
|
||
<td><code>C = solve (A, B, tag) </code>or<br>
|
||
<code>inplace_solve (A, B, tag) </code>or<br>
|
||
<code>C = solve (trans (A), B, tag) </code>or<code><br>
|
||
inplace_solve (trans (A), B, tag) </code>or<code><br>
|
||
C = solve (herm (A), B, tag)</code> or<code><br>
|
||
inplace_solve (herm (A), B, tag)</code></td>
|
||
<td><em>C <- A</em><sup><em>-1</em></sup><em> B </em>or<em><br>
|
||
B <- A</em><sup><em>-1</em></sup><em> B </em>or<em><br>
|
||
C <- A</em><sup><em>T</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
B </em>or<em><br>
|
||
B <- A</em><sup><em>-1</em></sup><em> B </em>or<em><br>
|
||
C <- A</em><sup><em>H</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
B</em> or<em><br>
|
||
B <- A</em><sup><em>H</em></sup><sup><sup><em>-1</em></sup></sup><em>
|
||
B</em></td>
|
||
<td>Solves a system of linear equations with triangular
|
||
form, i.e. <em>A </em>is triangular.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_g_mm<br>
|
||
_s_mm<br>
|
||
_h_mm</code></td>
|
||
<td><code>C = a * prod (A, B) + b * C </code>or<br>
|
||
<code>C = a * prod (trans (A), B) + b * C </code>or<br>
|
||
<code>C = a * prod (A, trans (B)) + b * C </code>or<br>
|
||
<code>C = a * prod (trans (A), trans (B)) + b * C </code>or<br>
|
||
<code>C = a * prod (herm (A), B) + b * C </code>or<br>
|
||
<code>C = a * prod (A, herm (B)) + b * C </code>or<br>
|
||
<code>C = a * prod (herm (A), trans (B)) + b * C </code>or<br>
|
||
<code>C = a * prod (trans (A), herm (B)) + b * C </code>or<br>
|
||
<code>C = a * prod (herm (A), herm (B)) + b * C</code></td>
|
||
<td><em>C <- a op (A) op (B) + b C </em>with<br>
|
||
<em>op (X) = X </em>or<br>
|
||
<em>op (X) = X</em><sup><em>T</em></sup><em> </em>or<br>
|
||
<em>op (X) = X</em><sup><em>H</em></sup></td>
|
||
<td>Adds the scaled product of two matrices.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_s_rk<br>
|
||
_h_rk</code></td>
|
||
<td><code>B = a * prod (A, trans (A)) + b * B </code>or<br>
|
||
<code>B = a * prod (trans (A), A) + b * B </code>or<br>
|
||
<code>B = a * prod (A, herm (A)) + b * B </code>or<br>
|
||
<code>B = a * prod (herm (A), A) + b * B</code></td>
|
||
<td><em>B <- a A A</em><sup><em>T</em></sup><em> + b B
|
||
</em>or<em><br>
|
||
B <- a A</em><sup><em>T</em></sup><em> A + b B </em>or<br>
|
||
<em>B <- a A A</em><sup><em>H</em></sup><em> + b B </em>or<em><br>
|
||
B <- a A</em><sup><em>H</em></sup><em> A + b B </em></td>
|
||
<td>Performs a symmetric or hermitian rank <em>k</em>
|
||
update.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>_s_r2k<br>
|
||
_h_r2k</code></td>
|
||
<td><code>C = a * prod (A, trans (B)) +<br>
|
||
a * prod (B, trans (A)) + b * C </code>or<br>
|
||
<code>C = a * prod (trans (A), B) +<br>
|
||
a * prod (trans (B), A) + b * C </code>or<br>
|
||
<code>C = a * prod (A, herm (B)) +<br>
|
||
conj (a) * prod (B, herm (A)) + b * C </code>or<br>
|
||
<code>C = a * prod (herm (A), B) +<br>
|
||
conj (a) * prod (herm (B), A) + b * C</code></td>
|
||
<td><em>C <- a A B</em><sup><em>T</em></sup><em> + a B
|
||
A</em><sup><em>T</em></sup><em> + b C </em>or<em><br>
|
||
C <- a A</em><sup><em>T</em></sup><em> B + a B</em><sup><em>T</em></sup><em>A
|
||
+ b C </em>or<em><br>
|
||
C <- a A B</em><sup><em>H</em></sup><em> + a</em><sup><em>-</em></sup><em>
|
||
B A</em><sup><em>H</em></sup><em> + b C</em> or<em><br>
|
||
C <- a A</em><sup><em>H</em></sup><em> B + a</em><sup><em>-</em></sup><em>
|
||
B</em><sup><em>H</em></sup><em> A + b C</em></td>
|
||
<td>Performs a symmetric or hermitian rank <em>2 k</em>
|
||
update.</td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h4>Storage Layouts</h4>
|
||
|
||
<p>The library supports conventional dense, packed and basic
|
||
sparse vector and matrix storage layouts. The description of the
|
||
most common constructions of vectors and matrices comes next.</p>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">Construction</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td><code>vector<T,<br>
|
||
std::vector<T> > <br>
|
||
v (size)</code></td>
|
||
<td>Constructs a dense vector, storage is provided by a
|
||
standard vector.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>vector<T,<br>
|
||
unbounded_array<T> > <br>
|
||
v (size)</code></td>
|
||
<td>Constructs a dense vector, storage is provided by a
|
||
heap-based array.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>vector<T,<br>
|
||
bounded_array<T, N> > <br>
|
||
v (size)</code></td>
|
||
<td>Constructs a dense vector, storage is provided by a
|
||
stack-based array.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>unit_vector<T> <br>
|
||
v (size, index)</code></td>
|
||
<td>Constructs the <code>index</code>-th canonical unit
|
||
vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>zero_vector<T> <br>
|
||
v (size)</code></td>
|
||
<td>Constructs a zero vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_vector<T,<br>
|
||
std::map<std::size_t, T> > <br>
|
||
v (size, non_zeros)</code></td>
|
||
<td>Constructs a sparse vector, storage is provided by a
|
||
standard map.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_vector<T,<br>
|
||
map_array<std::size_t, T> > <br>
|
||
v (size, non_zeros)</code></td>
|
||
<td>Constructs a sparse vector, storage is provided by a
|
||
map array.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>compressed_vector<T> <br>
|
||
v (size, non_zeros)</code></td>
|
||
<td>Constructs a compressed vector.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>coordinate_vector<T> <br>
|
||
v (size, non_zeros)</code></td>
|
||
<td>Constructs a coordinate vector.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>vector_range<V> <br>
|
||
vr (v, range)</code></td>
|
||
<td>Constructs a sub vector of a dense, packed or sparse
|
||
vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>vector_slice<V> <br>
|
||
vs (v, slice)</code></td>
|
||
<td>Constructs a sub vector of a dense, packed or sparse
|
||
vector.<br>
|
||
This class usually can be used to emulate BLAS vector
|
||
slices.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
row_major,<br>
|
||
std::vector<T> > <br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is row major,
|
||
storage is provided by a standard vector.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
column_major,<br>
|
||
std::vector<T> ><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is column
|
||
major, storage is provided by a standard vector.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
row_major, <br>
|
||
unbounded_array<T> ><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is row major,
|
||
storage is provided by a heap-based array.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
column_major,<br>
|
||
unbounded_array<T> ><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is column
|
||
major, storage is provided by a heap-based array.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
row_major,<br>
|
||
bounded_array<T, N1 * N2> ><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is row major,
|
||
storage is provided by a stack-based array.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix<T,<br>
|
||
column_major,<br>
|
||
bounded_array<T, N1 * N2> ><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a dense matrix, orientation is column
|
||
major, storage is provided by a stack-based array.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>identity_matrix<T><br>
|
||
m (size)</code></td>
|
||
<td>Constructs an identity matrix.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>zero_matrix<T><br>
|
||
m (size1, size2)</code></td>
|
||
<td>Constructs a zero matrix.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>triangular_matrix<T,<br>
|
||
row_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed triangular matrix, orientation is
|
||
row major.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>triangular_matrix<T,<br>
|
||
column_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed triangular matrix, orientation is
|
||
column major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>banded_matrix<T,<br>
|
||
row_major, A><br>
|
||
m (size1, size2, lower, upper)</code></td>
|
||
<td>Constructs a packed banded matrix, orientation is row
|
||
major.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>banded_matrix<T,<br>
|
||
column_major, A><br>
|
||
m (size1, size2, lower, upper)</code></td>
|
||
<td>Constructs a packed banded matrix, orientation is
|
||
column major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>symmetric_matrix<T,<br>
|
||
row_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed symmetric matrix, orientation is
|
||
row major.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>symmetric_matrix<T,<br>
|
||
column_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed symmetric matrix, orientation is
|
||
column major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>hermitian_matrix<T,<br>
|
||
row_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed hermitian matrix, orientation is
|
||
row major.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>hermitian_matrix<T,<br>
|
||
column_major, F, A><br>
|
||
m (size)</code></td>
|
||
<td>Constructs a packed hermitian matrix, orientation is
|
||
column major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_matrix<T, <br>
|
||
row_major,<br>
|
||
std::map<std::size_t, T> ><br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a sparse matrix, orientation is row major,
|
||
storage is provided by a standard map.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_matrix<T, <br>
|
||
column_major,<br>
|
||
std::map<std::size_t, T> ><br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a sparse matrix, orientation is column
|
||
major, storage is provided by a standard map.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_matrix<T, <br>
|
||
row_major,<br>
|
||
map_array<std::size_t, T> > <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a sparse matrix, orientation is row major,
|
||
storage is provided by a map array.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>sparse_matrix<T, <br>
|
||
column_major,<br>
|
||
map_array<std::size_t, T> > <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a sparse matrix, orientation is column
|
||
major, storage is provided by a map array.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>compressed_matrix<T, <br>
|
||
row_major> <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a compressed matrix, orientation is row
|
||
major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>compressed_matrix<T, <br>
|
||
column_major> <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a compressed matrix, orientation is column
|
||
major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>coordinate_matrix<T, <br>
|
||
row_major> <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a coordinate matrix, orientation is row
|
||
major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>coordinate_matrix<T, <br>
|
||
column_major> <br>
|
||
m (size1, size2, non_zeros)</code></td>
|
||
<td>Constructs a coordinate matrix, orientation is column
|
||
major.<br>
|
||
The storage layout usually is BLAS compliant.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix_row<M> <br>
|
||
mr (m, i)</code></td>
|
||
<td>Constructs a sub vector of a dense, packed or sparse
|
||
matrix, containing the i-th row.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix_column<M> <br>
|
||
mc (m, j)</code></td>
|
||
<td>Constructs a sub vector of a dense, packed or sparse
|
||
matrix, containing the j-th column.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix_range<M> <br>
|
||
mr (m, range1, range2)</code></td>
|
||
<td>Constructs a sub matrix of a dense, packed or sparse
|
||
matrix.<br>
|
||
This class usually can be used to emulate BLAS leading
|
||
dimensions.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>matrix_slice<M> <br>
|
||
ms (m, slice1, slice2)</code></td>
|
||
<td>Constructs a sub matrix of a dense, packed or sparse
|
||
matrix.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>triangular_adaptor<M, F><br>
|
||
ta (m)</code></td>
|
||
<td>Constructs a triangular view of a dense, packed or
|
||
sparse matrix.<br>
|
||
This class usually can be used to generate corresponding
|
||
BLAS matrix types.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>banded_adaptor<M><br>
|
||
ba (m, lower, upper)</code></td>
|
||
<td>Constructs a banded view of a dense, packed or sparse
|
||
matrix.<br>
|
||
This class usually can be used to generate corresponding
|
||
BLAS matrix types.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>symmetric_adaptor<M><br>
|
||
sa (m)</code></td>
|
||
<td>Constructs a symmetric view of a dense, packed or
|
||
sparse matrix.<br>
|
||
This class usually can be used to generate corresponding
|
||
BLAS matrix types.</td>
|
||
</tr>
|
||
<tr>
|
||
<td><code>hermitian_adaptor<M><br>
|
||
ha (m)</code></td>
|
||
<td>Constructs a hermitian view of a dense, packed or
|
||
sparse matrix.<br>
|
||
This class usually can be used to generate corresponding
|
||
BLAS matrix types.</td>
|
||
</tr>
|
||
</table>
|
||
|
||
<h3>Compatibility</h3>
|
||
|
||
<p>For compatibility reasons we provide array like indexing for
|
||
vectors and matrices, although this could be expensive for
|
||
matrices due to the needed temporary proxy objects.</p>
|
||
|
||
<p>To support the most widely used C++ compilers our design and
|
||
implementation do not depend on partial template specialization
|
||
essentially.</p>
|
||
|
||
<p>The library presumes standard compliant allocation through <code>operator
|
||
new </code>and <code>operator delete</code>. So programs which
|
||
are intended to run under MSVC 6.0 should set a correct new
|
||
handler throwing a <code>std::bad_alloc</code> exception via <code>_set_new_handler</code>
|
||
to detect out of memory conditions.</p>
|
||
|
||
<p>To get the most performance out of the box with MSVC 6.0, you
|
||
should change the preprocessor definition of <code>BOOST_UBLAS_INLINE
|
||
</code>to <code>__forceinline </code>in the header file
|
||
config.hpp. But we suspect this optimization to be fragile.</p>
|
||
|
||
<h2>Reference</h2>
|
||
|
||
<ul>
|
||
<li><a href="expression.htm">Expression Concepts</a><ul
|
||
type="circle">
|
||
<li><a href="expression.htm#scalar_expression">Scalar
|
||
Expression</a></li>
|
||
<li><a href="expression.htm#vector_expression">Vector
|
||
Expression</a></li>
|
||
<li><a href="expression.htm#matrix_expression">Matrix
|
||
Expression</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="container.htm">Container Concepts</a><ul
|
||
type="circle">
|
||
<li><a href="container.htm#vector">Vector</a></li>
|
||
<li><a href="container.htm#matrix">Matrix</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="iterator.htm">Iterator Concepts</a><ul
|
||
type="circle">
|
||
<li><a
|
||
href="iterator.htm#indexed_bidirectional_iterator">Indexed
|
||
Bidirectional Iterator</a></li>
|
||
<li><a
|
||
href="iterator.htm#indexed_random_access_iterator">Indexed
|
||
Random Access Iterator</a></li>
|
||
<li><a
|
||
href="iterator.htm#indexed_bidirectional_cr_iterator">Indexed
|
||
Bidirectional Column/Row Iterator</a></li>
|
||
<li><a
|
||
href="iterator.htm#indexed_random_access_cr_iterator">Indexed
|
||
Random Access Column/Row Iterator</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="storage.htm">Storage</a><ul type="circle">
|
||
<li><a href="storage.htm#unbounded_array">Unbounded
|
||
Array</a></li>
|
||
<li><a href="storage.htm#bounded_array">Bounded Array</a></li>
|
||
<li><a href="storage.htm#range">Range</a></li>
|
||
<li><a href="storage.htm#slice">Slice</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="storage_sparse.htm">Sparse Storage</a><ul
|
||
type="circle">
|
||
<li><a href="storage_sparse.htm#map_array">Map Array</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="vector.htm">Vector</a><ul type="circle">
|
||
<li><a href="vector.htm#vector">Vector</a></li>
|
||
<li><a href="vector.htm#unit_vector">Unit Vector</a></li>
|
||
<li><a href="vector.htm#zero_vector">Zero Vector</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="vector_sparse.htm">Sparse Vector</a><ul
|
||
type="circle">
|
||
<li><a href="vector_sparse.htm#sparse_vector">Sparse
|
||
Vector</a></li>
|
||
<li><a href="vector_sparse.htm#compressed_vector">Compressed
|
||
Vector</a></li>
|
||
<li><a href="vector_sparse.htm#coordinate_vector">Coordinate
|
||
Vector</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="vector_proxy.htm">Vector Proxies</a><ul
|
||
type="circle">
|
||
<li><a href="vector_proxy.htm#vector_range">Vector
|
||
Range</a></li>
|
||
<li><a href="vector_proxy.htm#vector_slice">Vector
|
||
Slice</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="vector_expression.htm">Vector Expressions</a><ul
|
||
type="circle">
|
||
<li><a href="vector_expression.htm#vector_expression">Vector
|
||
Expression</a></li>
|
||
<li><a href="vector_expression.htm#vector_references">Vector
|
||
References</a></li>
|
||
<li><a href="vector_expression.htm#vector_operations">Vector
|
||
Operations</a></li>
|
||
<li><a href="vector_expression.htm#vector_reductions">Vector
|
||
Reductions</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="matrix.htm">Matrix</a><ul type="circle">
|
||
<li><a href="matrix.htm#matrix">Matrix</a></li>
|
||
<li><a href="matrix.htm#identity_matrix">Identity
|
||
Matrix</a></li>
|
||
<li><a href="matrix.htm#zero_matrix">Zero Matrix</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="triangular.htm">Triangular Matrix</a><ul
|
||
type="circle">
|
||
<li><a href="triangular.htm#triangular_matrix">Triangular
|
||
Matrix</a></li>
|
||
<li><a href="triangular.htm#triangular_adaptor">Triangular
|
||
Adaptor</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="symmetric.htm">Symmetric Matrix</a><ul
|
||
type="circle">
|
||
<li><a href="symmetric.htm#symmetric_matrix">Symmetric
|
||
Matrix</a></li>
|
||
<li><a href="symmetric.htm#symmetric_adaptor">Symmetric
|
||
Adaptor</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="hermitian.htm">Hermitian Matrix</a><ul
|
||
type="circle">
|
||
<li><a href="hermitian.htm#hermitian_matrix">Hermitian
|
||
Matrix</a></li>
|
||
<li><a href="hermitian.htm#hermitian_adaptor">Hermitian
|
||
Adaptor</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="banded.htm">Banded Matrix</a><ul type="circle">
|
||
<li><a href="banded.htm#banded_matrix">Banded Matrix</a></li>
|
||
<li><a href="banded.htm#banded_adaptor">Banded
|
||
Adaptor</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="matrix_sparse.htm">Sparse Matrix</a><ul
|
||
type="circle">
|
||
<li><a href="matrix_sparse.htm#sparse_matrix">Sparse
|
||
Matrix</a></li>
|
||
<li><a href="matrix_sparse.htm#compressed_matrix">Compressed
|
||
Matrix</a></li>
|
||
<li><a href="matrix_sparse.htm#coordinate_matrix">Coordinate
|
||
Matrix</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="matrix_proxy.htm">Matrix Proxies</a><ul
|
||
type="circle">
|
||
<li><a href="matrix_proxy.htm#matrix_row">Matrix Row</a></li>
|
||
<li><a href="matrix_proxy.htm#matrix_column">Matrix
|
||
Column</a></li>
|
||
<li><a href="matrix_proxy.htm#vector_range">Vector
|
||
Range</a></li>
|
||
<li><a href="matrix_proxy.htm#vector_slice">Vector
|
||
Slice</a></li>
|
||
<li><a href="matrix_proxy.htm#matrix_range">Matrix
|
||
Range</a></li>
|
||
<li><a href="matrix_proxy.htm#matrix_slice">Matrix
|
||
Slice</a></li>
|
||
</ul>
|
||
</li>
|
||
<li><a href="matrix_expression.htm">Matrix Expressions</a><ul
|
||
type="circle">
|
||
<li><a href="matrix_expression.htm#matrix_expression">Matrix
|
||
Expression</a></li>
|
||
<li><a href="matrix_expression.htm#matrix_references">Matrix
|
||
References</a></li>
|
||
<li><a href="matrix_expression.htm#matrix_operations">Matrix
|
||
Operations</a></li>
|
||
<li><a href="matrix_expression.htm#matrix_reductions">Matrix
|
||
Reductions</a></li>
|
||
<li><a
|
||
href="matrix_expression.htm#matrix_vector_operations">Matrix
|
||
Vector Operations</a></li>
|
||
<li><a
|
||
href="matrix_expression.htm#matrix_matrix_operations">Matrix
|
||
Matrix Operations</a></li>
|
||
</ul>
|
||
</li>
|
||
</ul>
|
||
|
||
<h2>Benchmark Results</h2>
|
||
|
||
<p>The following tables contain results of one of our benchmarks.
|
||
This benchmark compares a native C implementation ('C array') and
|
||
some library based implementations. The safe variants based on
|
||
the library assume aliasing, the fast variants do not use
|
||
temporaries and are functionally equivalent to the native C
|
||
implementation. Besides the generic vector and matrix classes the
|
||
benchmark utilizes special classes <code>c_vector</code> and <code>c_matrix</code>,
|
||
which are intended to avoid every overhead through genericity.</p>
|
||
|
||
<p>The benchmark program was compiled with MSVC 6.0 and run on an
|
||
Intel Pentium II with 333 Mhz. First we comment the results for
|
||
double vectors and matrices of dimension 3 and 3 x 3,
|
||
respectively.</p>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">Operation</th>
|
||
<th align="left">Implementation</th>
|
||
<th align="left">Elapsed [s]</th>
|
||
<th align="left">MFLOP/s</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="4">inner_prod</td>
|
||
<td>C array</td>
|
||
<td align="right">0.1 </td>
|
||
<td align="right">47.6837 </td>
|
||
<td rowspan="4">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">79.4729 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array></td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">43.3488 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector></td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">43.3488 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">vector + vector</td>
|
||
<td>C array</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">114.441 </td>
|
||
<td rowspan="7">Abstraction penalty: factor 2</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector safe</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">26.0093 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array> safe</td>
|
||
<td align="right">1.05 </td>
|
||
<td align="right">5.44957 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array> fast</td>
|
||
<td align="right">0.16 </td>
|
||
<td align="right">35.7628 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector> safe</td>
|
||
<td align="right">1.16 </td>
|
||
<td align="right">4.9328 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector> fast</td>
|
||
<td align="right">0.16 </td>
|
||
<td align="right">35.7628 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">outer_prod</td>
|
||
<td>C array</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">85.8307 </td>
|
||
<td rowspan="7">Abstraction penalty: factor 2</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector safe</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">23.4084 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">46.8167 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> safe</td>
|
||
<td align="right">0.38 </td>
|
||
<td align="right">13.5522 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> fast</td>
|
||
<td align="right">0.16 </td>
|
||
<td align="right">32.1865 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
safe</td>
|
||
<td align="right">0.5 </td>
|
||
<td align="right">10.2997 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">46.8167 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">prod (matrix, vector)</td>
|
||
<td>C array</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">71.5256 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> safe</td>
|
||
<td align="right">0.33 </td>
|
||
<td align="right">13.0046 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
safe</td>
|
||
<td align="right">0.38 </td>
|
||
<td align="right">11.2935 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
fast</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">85.8307 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">matrix + matrix</td>
|
||
<td>C array</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">46.8167 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix safe</td>
|
||
<td align="right">0.17 </td>
|
||
<td align="right">30.2932 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">46.8167 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> safe</td>
|
||
<td align="right">0.44 </td>
|
||
<td align="right">11.7042 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> fast</td>
|
||
<td align="right">0.16 </td>
|
||
<td align="right">32.1865 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> safe</td>
|
||
<td align="right">0.6 </td>
|
||
<td align="right">8.58307 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> fast</td>
|
||
<td align="right">0.17 </td>
|
||
<td align="right">30.2932 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">prod (matrix, matrix)</td>
|
||
<td>C array</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> safe</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">19.507 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> safe</td>
|
||
<td align="right">0.27 </td>
|
||
<td align="right">15.8946 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">39.0139 </td>
|
||
</tr>
|
||
</table>
|
||
|
||
<p>We notice a twofold performance loss for small vectors and
|
||
matrices: first the general abstraction penalty for using
|
||
classes, and then a small loss when using the generic vector and
|
||
matrix classes. The difference w.r.t. alias assumptions is also
|
||
significant.</p>
|
||
|
||
<p>Next we comment the results for double vectors and matrices of
|
||
dimension 100 and 100 x 100, respectively.</p>
|
||
|
||
<table border="1">
|
||
<tr>
|
||
<th align="left">Operation</th>
|
||
<th align="left">Implementation</th>
|
||
<th align="left">Elapsed [s]</th>
|
||
<th align="left">MFLOP/s</th>
|
||
<th align="left">Comment</th>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="4">inner_prod</td>
|
||
<td>C array</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">113.869 </td>
|
||
<td rowspan="4">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array></td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">113.869 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector></td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">vector + vector</td>
|
||
<td>C array</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">114.441 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_vector fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array> safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<unbounded_array> fast</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">95.3674 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector> safe</td>
|
||
<td align="right">0.17 </td>
|
||
<td align="right">33.6591 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>vector<std::vector> fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">outer_prod</td>
|
||
<td>C array</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">114.441 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector safe</td>
|
||
<td align="right">0.28 </td>
|
||
<td align="right">20.4359 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> safe</td>
|
||
<td align="right">0.27 </td>
|
||
<td align="right">21.1928 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> fast</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">95.3674 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
safe</td>
|
||
<td align="right">0.28 </td>
|
||
<td align="right">20.4359 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
fast</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">52.0186 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">prod (matrix, vector)</td>
|
||
<td>C array</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">51.7585 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">51.7585 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix, c_vector fast</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">113.869 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">51.7585 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array>,
|
||
vector<unbounded_array> fast</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
safe</td>
|
||
<td align="right">0.1 </td>
|
||
<td align="right">56.9344 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector>, vector<std::vector>
|
||
fast</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">matrix + matrix</td>
|
||
<td>C array</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">26.0093 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix safe</td>
|
||
<td align="right">0.49 </td>
|
||
<td align="right">11.6776 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix fast</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">26.0093 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> safe</td>
|
||
<td align="right">0.39 </td>
|
||
<td align="right">14.6719 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> fast</td>
|
||
<td align="right">0.22 </td>
|
||
<td align="right">26.0093 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> safe</td>
|
||
<td align="right">0.44 </td>
|
||
<td align="right">13.0046 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> fast</td>
|
||
<td align="right">0.27 </td>
|
||
<td align="right">21.1928 </td>
|
||
</tr>
|
||
<tr>
|
||
<td rowspan="7">prod (matrix, matrix)</td>
|
||
<td>C array</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
<td rowspan="7">No significant abstraction penalty</td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix safe</td>
|
||
<td align="right">0.06 </td>
|
||
<td align="right">94.8906 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>c_matrix fast</td>
|
||
<td align="right">0.05 </td>
|
||
<td align="right">113.869 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">51.7585 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<unbounded_array> fast</td>
|
||
<td align="right">0.17 </td>
|
||
<td align="right">33.4908 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> safe</td>
|
||
<td align="right">0.11 </td>
|
||
<td align="right">51.7585 </td>
|
||
</tr>
|
||
<tr>
|
||
<td>matrix<std::vector> fast</td>
|
||
<td align="right">0.16 </td>
|
||
<td align="right">35.584 </td>
|
||
</tr>
|
||
</table>
|
||
|
||
<p>For larger vectors and matrices the general abstraction
|
||
penalty for using classes seems to decrease, the small loss when
|
||
using generic vector and matrix classes seems to remain. The
|
||
difference w.r.t. alias assumptions remains visible, too.</p>
|
||
|
||
<hr>
|
||
|
||
<p>Copyright (<28>) 2000-2002 Joerg Walter, Mathias Koch <br>
|
||
Permission to copy, use, modify, sell and distribute this
|
||
document is granted provided this copyright notice appears in all
|
||
copies. This document is provided ``as is'' without express or
|
||
implied warranty, and with no claim as to its suitability for any
|
||
purpose.</p>
|
||
|
||
<p>Last revised: 8/3/2002</p>
|
||
</body>
|
||
</html>
|