mirror of
https://github.com/boostorg/serialization.git
synced 2026-01-24 06:22:08 +00:00
395 lines
18 KiB
HTML
395 lines
18 KiB
HTML
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<!--
|
|
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
|
|
Use, modification and distribution is subject to the Boost Software
|
|
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
|
|
http://www.boost.org/LICENSE_1_0.txt)
|
|
-->
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|
<link rel="stylesheet" type="text/css" href="../../../boost.css">
|
|
<link rel="stylesheet" type="text/css" href="style.css">
|
|
<title>Serialization - Special Considerations</title>
|
|
</head>
|
|
<body link="#0000ff" vlink="#800080">
|
|
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
|
|
<tr>
|
|
<td valign="top" width="300">
|
|
<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
|
|
</td>
|
|
<td valign="top">
|
|
<h1 align="center">Serialization</h1>
|
|
<h2 align="center">Special Considerations</h2>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
<hr>
|
|
<dl class="page-index">
|
|
<dt><a href="#objecttracking">Object Tracking</a>
|
|
<dt><a href="#export">Exporting Class Serialization</a>
|
|
<dt><a href="#classinfo">Class Information</a>
|
|
<dt><a href="#portability">Archive Portability</a>
|
|
<dl class="page-index">
|
|
<dt><a href="#numerics">Numerics</a>
|
|
<dt><a href="#traits">Traits</a>
|
|
</dl>
|
|
<dt><a href="#binary_archives">Binary Archives</a>
|
|
<dt><a href="#xml_archives">XML Archives</a>
|
|
<dt><a href="exceptions.html">Archive Exceptions</a>
|
|
<dt><a href="exception_safety.html">Exception Safety</a>
|
|
</dl>
|
|
|
|
<h3><a name="objecttracking">Object Tracking</a></h3>
|
|
Depending on how the class is used and other factors, serialized objects
|
|
may be tracked by memory address. This prevents the same object from being
|
|
written to or read from an archive multiple times. These stored addresses
|
|
can also be used to delete objects created during a loading process
|
|
that has been interrupted by throwing of an exception.
|
|
<p>
|
|
This could cause problems in
|
|
progams where the copies of different objects are saved from the same address.
|
|
<pre><code>
|
|
template<class Archive>
|
|
void save(boost::basic_oarchive & ar, const unsigned int version) const
|
|
{
|
|
for(int i = 0; i < 10; ++i){
|
|
A x = a[i];
|
|
ar << x;
|
|
}
|
|
}
|
|
</code></pre>
|
|
In this case, the data to be saved exists on the stack. Each iteration
|
|
of the loop updates the value on the stack. So although the data changes
|
|
each iteration, the address of the data doesn't. If a[i] is an array of
|
|
objects being tracked by memory address, the library will skip storing
|
|
objects after the first as it will be assumed that objects at the same address
|
|
are really the same object.
|
|
<p>
|
|
To help detect such cases, output archive operators expect to be passed
|
|
<code style="white-space: normal">const</code> reference arguments.
|
|
<p>
|
|
Given this, the above code will invoke a compile time assertion.
|
|
The obvious fix in this example is to use
|
|
<pre><code>
|
|
template<class Archive>
|
|
void save(boost::basic_oarchive & ar, const unsigned int version) const
|
|
{
|
|
for(int i = 0; i < 10; ++i){
|
|
ar << a[i];
|
|
}
|
|
}
|
|
</code></pre>
|
|
which will compile and run without problem.
|
|
The usage of <code style="white-space: normal">const</code> by the output archive operators
|
|
will ensure that the process of serialization doesn't
|
|
change the state of the objects being serialized. An attempt to do this
|
|
would constitute augmentation of the concept of saving of state with
|
|
some sort of non-obvious side effect. This would almost surely be a mistake
|
|
and a likely source of very subtle bugs.
|
|
<p>
|
|
Unfortunately, implementation issues currently prevent the detection of this kind of
|
|
error when the data item is wrapped as a name-value pair.
|
|
<p>
|
|
A similar problem can occur when different objects are loaded to and address
|
|
which is different from the final location:
|
|
<pre><code>
|
|
template<class Archive>
|
|
void load(boost::basic_oarchive & ar, const unsigned int version) const
|
|
{
|
|
for(int i = 0; i < 10; ++i){
|
|
A x;
|
|
ar >> x;
|
|
std::m_set.insert(x);
|
|
}
|
|
}
|
|
</code></pre>
|
|
In this case, the address of <code>x</code> is the one that is tracked rather than
|
|
the address of the new item added to the set. Left unaddressed
|
|
this will break the features that depend on tracking such as loading object through a pointer.
|
|
Subtle bugs will be introduced into the program. This can be
|
|
addressed by altering the above code thusly:
|
|
|
|
<pre><code>
|
|
template<class Archive>
|
|
void load(boost::basic_iarchive & ar, const unsigned int version) const
|
|
{
|
|
for(int i = 0; i < 10; ++i){
|
|
A x;
|
|
ar >> x;
|
|
std::pair<std::set::const_iterator, bool> result;
|
|
result = std::m_set.insert(x);
|
|
ar.reset_object_address(& (*result.first), &x);
|
|
}
|
|
}
|
|
</code></pre>
|
|
This will adjust the tracking information to reflect the final resting place of
|
|
the moved variable and thereby rectify the above problem.
|
|
<p>
|
|
If it is known a priori that no pointer
|
|
values are duplicated, overhead associated with object tracking can
|
|
be eliminated by setting the object tracking class serialization trait
|
|
appropriately.
|
|
<p>
|
|
By default, data types designated primitive by
|
|
<a target="detail" href="traits.html#level">Implementation Level</a>
|
|
class serialization trait are never tracked. If it is desired to
|
|
track a shared primitive object through a pointer (e.g. a
|
|
<code style="white-space: normal">long</code> used as a reference count), It should be wrapped
|
|
in a class/struct so that it is an identifiable type.
|
|
The alternative of changing the implementation level of a <code style="white-space: normal">long</code>
|
|
would affect all <code style="white-space: normal">long</code>s serialized in the whole
|
|
program - probably not what one would intend.
|
|
<p>
|
|
It is possible that we may want to track addresses even though
|
|
the object is never serialized through a pointer. For example,
|
|
a virtual base class need be saved/loaded only once. By setting
|
|
this serialization trait to <code style="white-space: normal">track_always</code>, we can suppress
|
|
redundant save/load operations.
|
|
<pre><code>
|
|
BOOST_CLASS_TRACKING(my_virtual_base_class, boost::serialization::track_always)
|
|
</code></pre>
|
|
|
|
<h3><a name="export">Exporting Class Serialization</a></h3>
|
|
<a target="detail" href="traits.html#export">Elsewhere</a> in this manual, we have described
|
|
<code style="white-space: normal">BOOST_CLASS_EXPORT</code>. This is used to make the serialization library aware
|
|
that code should be instantiated for serialization of a given class even though the
|
|
class hasn't been otherwise referred to by the program.
|
|
<p>
|
|
There are several ways <code style="white-space: normal">BOOST_CLASS_EXPORT</code> could have been
|
|
implemented.
|
|
<p>
|
|
One approach would be to instantiate serialization code for all archive classes included in the library.
|
|
This would add to each executable a large amount of code that is most likely never called.
|
|
Also it would needlessly slow down compilations of any program that uses the library. Finally,
|
|
the list of archives would be "built-in" to the library which would compilicate the addition of
|
|
new or custom archive classes.
|
|
<p>
|
|
Another approach would be for the library user to somehow explicitly instantiate which archive classes
|
|
code should be instantiated for each class to be serialized. Users would have to include
|
|
header files corresponding the archive classes to be instantiated.
|
|
The list of instantiated archive classes would have to be manually kept in sync with the
|
|
archive class headers actually included. This was considered burdensome and error prone.
|
|
<p>
|
|
This implementation of <code style="white-space: normal">BOOST_CLASS_EXPORT</code> works in the
|
|
following way:
|
|
<ul>
|
|
<li>All header modules of the form <boost/archive/*archive.hpp> are required to precede
|
|
the header module <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>.
|
|
<li>The header <a href="../../../boost/serialization/export.hpp" target="export_hpp">export.hpp</a>
|
|
builds a list of archive classes whose header modules have been previously included.
|
|
It does this by checking to see which inclusion guard constants have been defined.
|
|
<li><code style="white-space: normal">BOOST_CLASS_EXPORT(my_class)</code> explicitly instantiates
|
|
serialization code for <code style="white-space: normal">my_class</code> for each archive in the list.
|
|
</ul>
|
|
Serialization code will be instantiated for a given archive class
|
|
if and only if the module that defines that archive class has been included in the program.
|
|
Given this, our program will contain all necessary code instantiations and no other.
|
|
<p>
|
|
For many styles of code organization this header sequencing requirement presents little problem.
|
|
|
|
Serialization code organized by class headers that are designed to be independent of archive
|
|
implementations will look something like the following:
|
|
<code><pre>
|
|
// A.hpp
|
|
// Note:to preserve independence from any particular archive implementation,
|
|
// no headers from <boost/archive/...> are included.
|
|
// Headers can be included in any order.
|
|
#include <boost/serialization/...>
|
|
#include <boost/serialization/export.hpp>
|
|
... // include other headers that A depends upon
|
|
|
|
class A {
|
|
...
|
|
};
|
|
|
|
BOOST_CLASS_EXPORT(A) // note: the export name of this class
|
|
</pre></code>
|
|
This style:
|
|
<ul>
|
|
<li>permits the header to include all aspects of the serialization implementation.
|
|
<li>permits the header to be included anywhere else as part of some other class declaration.
|
|
<li>reflects the concept of headers as a "library of types" which
|
|
can be used independently in other programs or other parts of the same program.
|
|
<li>reflects a fundamental principle of the serialization library design in that the
|
|
specification of serialization of any class is independent of any archive implementation.
|
|
</ul>
|
|
However, it might not always be possible or convenient to conform to the above style. Something
|
|
like the following might be required or preferred:
|
|
<code><pre>
|
|
// A.hpp
|
|
// headers can be included in any order
|
|
#include <boost/archive/text_oarchive.hpp>
|
|
#include <boost/archive/text_iarchive.hpp>
|
|
...
|
|
#include <boost/serialization/...>
|
|
...
|
|
// can't do the following because then A.hpp couldn't be included somewhere else
|
|
// #include <boost/serialization/export.hpp>
|
|
|
|
class A {
|
|
...
|
|
};
|
|
// can't do the following because export.hpp is not included !!
|
|
//BOOST_CLASS_EXPORT(A) // note: the export name of this class
|
|
</pre></code>
|
|
As noted in the comments, this would work. But
|
|
<code style="white-space: normal">#include <.../export.hpp></code> can't be used
|
|
without conflicting with other modules which use
|
|
<code style="white-space: normal">#include <.../*archive.hpp></code>. In this
|
|
case we can move the export to an implementation file:
|
|
<code><pre>
|
|
// A.cpp
|
|
#include "A.hpp"
|
|
...
|
|
// export.hpp header should be last;
|
|
#include <boost/serialization/export.hpp>
|
|
...
|
|
BOOST_CLASS_EXPORT(A)
|
|
...
|
|
</pre></code>
|
|
|
|
<h3><a name="classinfo">Class Information</a></h3>
|
|
By default, for each class serialized, class information is written to the archive.
|
|
This information includes version number, implementation level and tracking
|
|
behavior. This is necessary so that the archive can be correctly
|
|
deserialized even if a subsequent version of the program changes
|
|
some of the current trait values for a class. The space overhead for
|
|
this data is minimal. There is a little bit of runtime overhead
|
|
since each class has to be checked to see if it has already had its
|
|
class information included in the archive. In some cases, even this
|
|
might be considered too much. This extra overhead can be eliminated
|
|
by setting the
|
|
<a target="detail" href="traits.html#level">implementation level</a>
|
|
class trait to: <code style="white-space: normal">boost::serialization::object_serializable</code>.
|
|
<p>
|
|
<i>Turning off tracking and class information serialization will result
|
|
in pure template inline code that in principle could be optimised down
|
|
to a simple stream write/read.</i> Elimination of all serialization overhead
|
|
in this manner comes at a cost. Once archives are released to users, the
|
|
class serialization traits cannot be changed without invalidating the old
|
|
archives. Including the class information in the archive assures us
|
|
that they will be readable in the future even if the class definition
|
|
is revised. A light weight structure such as display pixel might be
|
|
declared in a header like this:
|
|
|
|
<pre><code>
|
|
#include <boost/serialization/serialization.hpp>
|
|
#include <boost/serialization/level.hpp>
|
|
#include <boost/serialization/tracking.hpp>
|
|
|
|
// a pixel is a light weight struct which is used in great numbers.
|
|
struct pixel
|
|
{
|
|
unsigned char red, green, blue;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int /* version */){
|
|
ar << red << green << blue;
|
|
}
|
|
};
|
|
|
|
// elminate serialization overhead at the cost of
|
|
// never being able to increase the version.
|
|
BOOST_CLASS_IMPLEMENTATION(pixel, boost::serialization::object_serializable);
|
|
|
|
// eliminate object tracking (even if serialized through a pointer)
|
|
// at the risk of a programming error creating duplicate objects.
|
|
BOOST_CLASS_TRACKING(pixel, boost::serialization::track_never)
|
|
</code></pre>
|
|
|
|
<h3><a name="portability">Archive Portability</a></h3>
|
|
Several archive classes create their data in the form of text or portable a binary format.
|
|
It should be possible to save such an of such a class on one platform and load it on another.
|
|
This is subject to a couple of conditions.
|
|
<h4><a name="numerics">Numerics</a></h4>
|
|
The architecture of the machine reading the archive must be able hold the data
|
|
saved. For example, the gcc compiler reserves 4 bytes to store a variable of type
|
|
<code style="white-space: normal">wchar_t</code> while other compilers reserve only 2 bytes.
|
|
So its possible that a value could be written that couldn't be represented by the loading program. This is a
|
|
fairly obvious situation and easily handled by using the numeric types in
|
|
<a target="cstding" href="../../../boost/cstdint.hpp"><boost/cstdint.hpp></a>
|
|
|
|
<h4><a name="traits">Traits</a></h4>
|
|
Another potential problem is illustrated by the following example:
|
|
<pre><code>
|
|
template<class T>
|
|
struct my_wrapper {
|
|
template<class Archive>
|
|
Archive & serialize ...
|
|
};
|
|
|
|
...
|
|
|
|
class my_class {
|
|
wchar_t a;
|
|
short unsigned b;
|
|
template<<class Archive>
|
|
Archive & serialize(Archive & ar, unsigned int version){
|
|
ar & my_wrapper(a);
|
|
ar & my_wrapper(b);
|
|
}
|
|
};
|
|
</code></pre>
|
|
If <code style="white-space: normal">my_wrapper</code> uses default serialization
|
|
traits there could be a problem. With the default traits, each time a new type is
|
|
added to the archive, bookkeeping information is added. So in this example, the
|
|
archive would include such bookkeeping information for
|
|
<code style="white-space: normal">my_wrapper<wchar_t></code> and for
|
|
<code style="white-space: normal">my_wrapper<short_unsigned></code>.
|
|
Or would it? What about compilers that treat
|
|
<code style="white-space: normal">wchar_t</code> as a
|
|
synonym for <code style="white-space: normal">unsigned short</code>?
|
|
In this case there is only one distinct type - not two. If archives are passed between
|
|
programs with compilers that differ in their treatment
|
|
of <code style="white-space: normal">wchar_t</code> the load operation will fail
|
|
in a catastrophic way.
|
|
<p>
|
|
One remedy for this is to assign serialization traits to the template
|
|
<code style="white-space: normal">my_template</code> such that class
|
|
information for instantiations of this template is never serialized. This
|
|
process is described <a target="detail" href="traits.html#templates">above</a> and
|
|
has been used for <a target="detail" href="wrappers.html#nvp"><strong>Name-Value Pairs</strong></a>.
|
|
Wrappers would typically be assigned such traits.
|
|
<p>
|
|
Another way to avoid this problem is to assign serialization traits
|
|
to all specializations of the template <code style="white-space: normal">my_wrapper</code>
|
|
for all primitive types so that class information is never saved. This is what has
|
|
been done for our implementation of serializations for STL collections.
|
|
|
|
<h3><a name="binary_archives">Binary Archives</a></h3>
|
|
Standard stream i/o on some systems will expand linefeed characters to carriage-return/linefeed
|
|
on output. This creates a problem for binary archives. The easiest way to handle this is to
|
|
open streams for binary archives in "binary mode" by using the flag
|
|
<code style="white-space: normal">ios::binary</code>. If this is not done, the archive generated
|
|
will be unreadable.
|
|
<p>
|
|
Unfortunately, no way has been found to detect this error before loading the archive. Debug builds
|
|
will assert when this is detected so that may be helpful in catching this error.
|
|
|
|
<h3><a name="xml_archives">XML Archives</a></h3>
|
|
XML archives present a somewhat special case.
|
|
XML format has a nested structure that maps well to the "recursive class member visitor" pattern
|
|
used by the serialization system. However, XML differs from other formats in that it
|
|
requires a name for each data member. Our goal is to add this information to the
|
|
class serialization specification while still permiting the the serialization code to be
|
|
used with any archive. This is achived by requiring that all data serialized to an XML archive
|
|
be serialized as a <a target="detail" href="wrappers.html#nvp">name-value pair</a>.
|
|
The first member is the name to be used as the XML tag for the
|
|
data item while the second is a reference to the data item itself. Any attempt to serialize data
|
|
not wrapped in a in a <a target="detail" href="wrappers.html#nvp">name-value pair</a> will
|
|
be trapped at compile time. The system is implemented in such a way that for other archive classes,
|
|
just the value portion of the data is serialized. The name portion is discarded during compilation.
|
|
So by always using <a target="detail" href="wrappers.html#nvp">name-value pairs</a>, it will
|
|
be guarenteed that all data can be serialized to all archive classes with maximum efficiency.
|
|
|
|
<h3><a href="exceptions.html">Archive Exceptions</a></h3>
|
|
<h3><a href="exception_safety.html">Exception Safety</a></h3>
|
|
|
|
<hr>
|
|
<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
|
|
Distributed under the Boost Software License, Version 1.0. (See
|
|
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
|
</i></p>
|
|
</body>
|
|
</html>
|