<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Serialization of Classes</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
<tr>
<td valign="top" width="300">
<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
</td>
<td valign="top">
<h1 align="center">Serialization</h1>
<h2 align="center">Class Serialization</h2>
</td>
</tr>
</table>
<hr>
<dl class="page-index">
<dt><a href="#member">Member Function</a>
<dt><a href="#free">Free Function</a>
<dt><a href="#Base">Base Classes</a>
<dt><a href="#Versioning">Versioning</a>
<dt><a href="#Splitting">Splitting <code style="white-space: normal">serialize</code> into
<code style="white-space: normal">save/load</code></a>
<dl class="page-index">
<dt><a href="#splittingmemberfunctions">Member Functions</a>
<dt><a href="#splittingfreefunctions">Free Functions</a>
</dl>
<dt><a href="#const"><code style="white-space: normal">const</code> Members</a>
<dt><a href="#constructors">Non-Default Constructors</a>
<dt><a href="#referencemembers">Reference Members</a>
<dt><a href="#templates">Templates</a>
<dt><a href="traits.html">Class Serialization Traits</a>
<dt><a href="wrappers.html">Serialization Wrappers</a>
<dt><a href="#implementations">Serialization Implementations Included in the Library</a>
</dl>
The header file <a target="serialization_hpp"
href="../../../boost/serialization/serialization.hpp">
<code style="white-space: normal">serialization.hpp</code></a> contains the public interface to the
serialization library. This entire interface consists of three overridable
function templates.
<h3><a name="member">Member Function</a></h3>
The first of these three templates is:
<pre><code>
template&lt;class Archive, class T&gt;
inline void serialize(
Archive &amp; ar,
T &amp; t,
const unsigned long int file_version
){
// invoke member function for class T
t.serialize(ar, file_version);
}
</code></pre>
It is invoked each time the data members of a class instance are to be saved to
or loaded from an archive. The default definition of this template presumes the
existence of a class member function template of the following signature:
<pre><code>
template&lt;class Archive&gt;
void serialize(Archive &amp;ar, const unsigned int version){
...
}
</code></pre>
If no such member function is declared, a compile-time error will occur. For
the library to be able to call the member function generated from this
template, the function must either be public or the class must grant
access to the serialization library by including:
<pre><code>
friend class boost::serialization::access;
</code></pre>
in the class definition. The latter approach is preferred over making the
member function public, because it prevents serialization functions from
being called from outside the library. Such a call is almost certainly an error:
it may appear to work but then fail in a way that is very difficult to track down.
<p>
It may not be immediately obvious how this one template serves for both
saving data to an archive and loading data from the archive.
The key is that the <code style="white-space: normal">&amp;</code> operator is
defined as <code style="white-space: normal">&lt;&lt;</code>
for output archives and as <code style="white-space: normal">&gt;&gt;</code> for input archives. This
"polymorphic" behavior of the <code style="white-space: normal">&amp;</code> operator permits the same template
to be used for both save and load operations. This is very convenient in that it
saves a lot of typing and guarantees that the saving and loading of class
data members are always in sync. This is the key to the whole serialization
system.
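<p>
The mechanism can be illustrated without the library itself. The following is a
minimal sketch, not the library's implementation:
<code style="white-space: normal">toy_oarchive</code>,
<code style="white-space: normal">toy_iarchive</code>,
<code style="white-space: normal">gps_position</code> and
<code style="white-space: normal">round_trip</code> are hypothetical names invented
for this illustration, and the toy archives merely write and read
whitespace-separated text.

```cpp
#include <sstream>

// Toy output archive (NOT a Boost class): operator& forwards to operator<<.
class toy_oarchive {
    std::ostream & os_;
public:
    explicit toy_oarchive(std::ostream & os) : os_(os) {}
    template<class T>
    toy_oarchive & operator<<(const T & t){ os_ << t << ' '; return *this; }
    template<class T>
    toy_oarchive & operator&(const T & t){ return *this << t; }
};

// Toy input archive (NOT a Boost class): operator& forwards to operator>>.
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}
    template<class T>
    toy_iarchive & operator>>(T & t){ is_ >> t; return *this; }
    template<class T>
    toy_iarchive & operator&(T & t){ return *this >> t; }
};

// Hypothetical class with a single serialize template.
struct gps_position {
    int degrees;
    int minutes;
    // The same body describes both saving and loading.
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */){
        ar & degrees;
        ar & minutes;
    }
};

// Saves a value with the toy output archive and loads it back with the
// toy input archive. Here the sketch itself plays the role of the library,
// which is why it calls serialize directly.
inline gps_position round_trip(gps_position in){
    std::stringstream ss;
    toy_oarchive oa(ss);
    in.serialize(oa, 0);
    gps_position out = {0, 0};
    toy_iarchive ia(ss);
    out.serialize(ia, 0);
    return out;
}
```

Because the members are listed only once, the order in which they are written
and the order in which they are read cannot drift apart.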
<h3><a name="free">Free Function</a></h3>
Of course we're not restricted to using the default implementation described
above. We can override the default with our own. Doing this
permits us to implement serialization of a class without altering
the class definition itself. We call this <strong>non-intrusive</strong>
serialization. Suppose our class is named <code style="white-space: normal">my_class</code>. The
override would then be specified as:
<pre><code>
// namespace selection
template&lt;class Archive&gt;
inline void serialize(
Archive &amp; ar,
my_class &amp; t,
const unsigned long int file_version
){
...
}
</code></pre>
Note that we have called this override "non-intrusive". This is slightly
inaccurate. It does not require that the class have special functions, that
it be derived from some common base class, or any other fundamental design changes.
However, it does require access to the class members that are to
be saved and loaded. If these members are <code style="white-space: normal">private</code>, it won't be
possible to serialize them. So in some instances, minor modifications to the
class to be serialized will be necessary even when using this "non-intrusive"
method. In practice this may not be much of a problem, as many libraries
(e.g. the STL) expose enough information to permit implementation of non-intrusive
serialization with absolutely no changes to the library.
<p>
Regardless of which method is used, the body of the serialize function will
specify the data to be saved/loaded by sequential application of the archive
<code style="white-space: normal">operator &amp;</code> to all the data members of the class.
<pre><code>
{
// save/load class member variables
ar &amp; member1;
ar &amp; member2;
}
</code></pre>
<h4><a name="namespaces">Namespaces for Free Function Overrides</a></h4>
The question arises as to which <code>namespace</code> free serialization functions should be part of.
<p>
The options depend on:
<ul>
<li>whether or not the compiler implements Argument Dependent Lookup (ADL),
<li>whether or not the compiler implements Two Phase Lookup, and
<li>whether or not the type to be serialized is a dependent type,
</ul>
according to the following table:
<p>
<table border>
<tr><th align="right">ADL</th><th align="right">Two Phase<br>Lookup</th><th align="right">Dependent<br>Type T?</th><th>Namespace permitted</th></tr>
<tr><td align="right">no</td><td align="right">no</td><td align="right">-</td><td><code>boost::serialization</code></td></tr>
<tr><td align="right">no</td><td align="right">yes</td><td align="right">-</td><td>no compilers do this</td></tr>
<tr><td align="right">yes</td><td align="right">no</td><td align="right">-</td><td><code>boost::serialization</code><br><code>namespace of T</code><br><code>namespace of Archive</code></td></tr>
<tr><td align="right">yes</td><td align="right">yes</td><td align="right">no</td><td><code>namespace of T</code><br><code>namespace of Archive</code></td></tr>
<tr><td align="right">yes</td><td align="right">yes</td><td align="right">yes</td><td><code>boost::serialization</code><br><code>namespace of T</code><br><code>namespace of Archive</code></td></tr>
</table>
<p>
To deal with this while maintaining portability, the test programs use the following
before specifying free function overloads:
<pre><code>
// function specializations must be defined in the appropriate
// namespace - boost::serialization
#ifdef BOOST_NO_ARGUMENT_DEPENDENT_LOOKUP
namespace boost { namespace serialization {
#endif
</code></pre>
which works for all compilers.
<p>
From Vandevoorde and Josuttis' book
"C++ Templates - A Complete Guide"<a href="bibliograph.html#14">[14]</a>,
page 509:
<blockquote>
<strong>dependent name</strong><br>
A name the meaning of which depends on a template parameter.
For example, A&lt;T&gt;::x is a dependent name when A or T is a template parameter.
The name of a function in a function call is also dependent if any of the arguments in the call
has a type that depends on a template parameter.
For example, f in f((T*)0) is dependent if T is a template parameter.
The name of a template parameter is not considered dependent, however.
</blockquote>
and page 515:
<blockquote>
<strong>two-phase lookup</strong><br>
The name lookup mechanism used for names in templates. The "two phases" are
(1) the phase during which a template definition is first encountered by a compiler, and
(2) the instantiation of a template. <i>Nondependent names</i> are looked up only in the first phase,
but during this first phase <i>nondependent</i> base classes are not considered.
<i>Dependent</i> names with a scope qualifier (::) are looked up only in the second phase.
Dependent names without a scope qualifier may be looked up in both phases, but in the
second phase only argument-dependent lookup is performed.
</blockquote>
In this library, the file <code style="white-space: normal">serialization.hpp</code>,
which calls the serialization override,
is included whenever any archive class header is included. This would suggest that serialization
overrides could be placed in any of the three possible namespaces, provided the serialization code is
included before the archives. However, this is not always possible. Our implementation
of "export" functionality requires just the opposite.
<p>
This is considered inelegant to say the least. Hopefully, this may be improved in the future.
<h3><a name="Base">Base Classes</a></h3>
If the class to be serialized is derived from another class, its data
should be serialized with the following syntax:
<pre><code>
{
// invoke serialization of the base class
ar &amp; boost::serialization::base_object&lt;base_class_of_T&gt;(*this);
// save/load class member variables
ar &amp; member1;
ar &amp; member2;
}
</code></pre>
Note that this is <strong>NOT</strong> the same as calling the <code style="white-space: normal">serialize</code>
function of the base class. This might seem to work but will circumvent
certain code used for tracking of objects, and registering base-derived
relationships and other bookkeeping that is required for the serialization
system to function as designed. For this reason, all <code style="white-space: normal">serialize</code>
member functions should be <code style="white-space: normal">private</code>.
<h3><a name="Versioning">Versioning</a></h3>
Class definitions will eventually change after archives have
been created. When a class instance is saved, the current version
is included in the class information stored in the archive. When the class instance
is loaded from the archive, the original version number is passed as an
argument to the loading function. This permits the load function to include
logic to accommodate older definitions of the class and reconcile them
with the latest version. Save functions always save the current version, so
loading an older archive and saving it again automatically converts it to the newest format.
Version numbers are maintained independently for each class. This results in
a simple system for permitting access to older files and conversion of same.
The current version of the class is assigned as a
<a href="traits.html">Class Serialization Trait</a> described later in this manual.
<pre><code>
{
// invoke serialization of the base class
ar &amp; boost::serialization::base_object&lt;base_class_of_T&gt;(*this);
// save/load class member variables
ar &amp; member1;
ar &amp; member2;
// if it's a recent version of the class
if(1 &lt; file_version)
// save/load recently added class members
ar &amp; member3;
}
</code></pre>
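<p>
To make the effect of the version test concrete, here is a minimal sketch that
does not use the library. <code style="white-space: normal">toy_iarchive</code> and
<code style="white-space: normal">load_with_version</code> are hypothetical names
invented for this illustration, the sketch uses its own version numbering
(<code style="white-space: normal">member3</code> is assumed to have been added in
version 1), and the version number is passed by hand, whereas the library reads it
from the archive.

```cpp
#include <sstream>
#include <string>

// Toy input archive over whitespace-separated text (NOT a Boost class).
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}
    template<class T>
    toy_iarchive & operator&(T & t){ is_ >> t; return *this; }
};

class my_class {
public:
    int member1;
    int member2;
    int member3;   // assumed to have been added in version 1 of the class
    my_class() : member1(0), member2(0), member3(-1) {}

    template<class Archive>
    void serialize(Archive & ar, const unsigned int file_version){
        ar & member1;
        ar & member2;
        // only data written by version 1 or later contains member3
        if(file_version > 0)
            ar & member3;
    }
};

// Hypothetical helper: load a my_class from text that was written by the
// given version of the class.
inline my_class load_with_version(const std::string & text, unsigned int file_version){
    std::istringstream is(text);
    toy_iarchive ia(is);
    my_class c;
    c.serialize(ia, file_version);
    return c;
}
```

Loading <code style="white-space: normal">"10 20"</code> as version 0 leaves
<code style="white-space: normal">member3</code> at the value set by the constructor,
while loading <code style="white-space: normal">"10 20 30"</code> as version 1 reads
all three members.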
<h3><a name="Splitting">Splitting <code style="white-space: normal">serialize</code> into Save/Load</a></h3>
There are times when it is inconvenient to use the same
template for both save and load functions. For example, this might occur if versioning
gets complex.
<h4><a name="splittingmemberfunctions">Splitting Member Functions</a></h4>
For member functions this can be addressed by including
the header file <a href="../../../boost/serialization/split_member.hpp" target="split_member_hpp">
boost/serialization/split_member.hpp</a> and adding code like this to the class:
<pre><code>
template&lt;class Archive&gt;
void save(Archive &amp; ar, const unsigned int version) const
{
// invoke serialization of the base class
ar &lt;&lt; boost::serialization::base_object&lt;const base_class_of_T&gt;(*this);
ar &lt;&lt; member1;
ar &lt;&lt; member2;
ar &lt;&lt; member3;
}
template&lt;class Archive&gt;
void load(Archive &amp; ar, const unsigned int version)
{
// invoke serialization of the base class
ar &gt;&gt; boost::serialization::base_object&lt;base_class_of_T&gt;(*this);
ar &gt;&gt; member1;
ar &gt;&gt; member2;
if(version &gt; 0)
ar &gt;&gt; member3;
}
template&lt;class Archive&gt;
void serialize(
Archive &amp; ar,
const unsigned int file_version
){
boost::serialization::split_member(ar, *this, file_version);
}
</code></pre>
This splits the serialization into two separate functions <code style="white-space: normal">save</code>
and <code style="white-space: normal">load</code>. Since the new <code style="white-space: normal">serialize</code> template
is always the same, it can be generated by invoking the macro
<code style="white-space: normal">BOOST_SERIALIZATION_SPLIT_MEMBER()</code> defined in the header file
<a href="../../../boost/serialization/split_member.hpp" target="split_member_hpp">
boost/serialization/split_member.hpp
</a>.
So the entire <code style="white-space: normal">serialize</code> function above can be replaced with:
<pre><code>
BOOST_SERIALIZATION_SPLIT_MEMBER()
</code></pre>
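<p>
How a single <code style="white-space: normal">serialize</code> can forward to either
<code style="white-space: normal">save</code> or <code style="white-space: normal">load</code>
can be sketched with tag dispatch on the archive type. This is an illustration under
stated assumptions, not the library's actual implementation: the toy archives, the
<code style="white-space: normal">is_saving</code> typedef as used here,
<code style="white-space: normal">split_member_sketch</code>,
<code style="white-space: normal">point</code> and
<code style="white-space: normal">round_trip</code> are all hypothetical names.

```cpp
#include <sstream>
#include <type_traits>

// Toy archives (NOT Boost classes). Each carries a tag type that says
// whether it saves or loads.
struct toy_oarchive {
    typedef std::true_type is_saving;
    std::ostream & os;
    explicit toy_oarchive(std::ostream & s) : os(s) {}
    template<class T>
    toy_oarchive & operator<<(const T & t){ os << t << ' '; return *this; }
};
struct toy_iarchive {
    typedef std::false_type is_saving;
    std::istream & is;
    explicit toy_iarchive(std::istream & s) : is(s) {}
    template<class T>
    toy_iarchive & operator>>(T & t){ is >> t; return *this; }
};

// Selected for saving archives: forwards to the member function save.
template<class Archive, class T>
void split_member_sketch(Archive & ar, T & t, const unsigned int version, std::true_type){
    t.save(ar, version);
}
// Selected for loading archives: forwards to the member function load.
template<class Archive, class T>
void split_member_sketch(Archive & ar, T & t, const unsigned int version, std::false_type){
    t.load(ar, version);
}
// Entry point: picks one of the overloads above from the archive's tag.
template<class Archive, class T>
void split_member_sketch(Archive & ar, T & t, const unsigned int version){
    split_member_sketch(ar, t, version, typename Archive::is_saving());
}

struct point {
    int x;
    int y;
    template<class Archive>
    void save(Archive & ar, const unsigned int /* version */) const {
        ar << x;
        ar << y;
    }
    template<class Archive>
    void load(Archive & ar, const unsigned int /* version */){
        ar >> x;
        ar >> y;
    }
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version){
        split_member_sketch(ar, *this, version);
    }
};

// Saves a point with the toy output archive and loads it back.
inline point round_trip(point in){
    std::stringstream ss;
    toy_oarchive oa(ss);
    in.serialize(oa, 0);
    point out = {0, 0};
    toy_iarchive ia(ss);
    out.serialize(ia, 0);
    return out;
}
```

Only the overload that matches the archive's tag is instantiated, so
<code style="white-space: normal">save</code> is never compiled against an input
archive and <code style="white-space: normal">load</code> is never compiled against
an output archive.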
<h4><a name="splittingfreefunctions">Splitting Free Functions</a></h4>
The situation is the same for non-intrusive serialization with the free
<code style="white-space: normal">serialize</code> function template.
<a name="BOOST_SERIALIZATION_SPLIT_FREE"></a>
To use <code style="white-space: normal">save</code> and
<code style="white-space: normal">load</code> function templates rather than
<code style="white-space: normal">serialize</code>, define them:
<pre><code>
namespace boost { namespace serialization {
template&lt;class Archive&gt;
void save(Archive &amp; ar, const my_class &amp; t, unsigned int version)
{
...
}
template&lt;class Archive&gt;
void load(Archive &amp; ar, my_class &amp; t, unsigned int version)
{
...
}
}}
</code></pre>
Then include the header file
<a href="../../../boost/serialization/split_free.hpp" target="split_free_hpp">
boost/serialization/split_free.hpp
</a>
and override the free <code style="white-space: normal">serialize</code> function template:
<pre><code>
namespace boost { namespace serialization {
template&lt;class Archive&gt;
inline void serialize(
Archive &amp; ar,
my_class &amp; t,
const unsigned int file_version
){
split_free(ar, t, file_version);
}
}}
</code></pre>
To shorten typing, the above template can be replaced with
the macro:
<pre><code>
BOOST_SERIALIZATION_SPLIT_FREE(my_class)
</code></pre>
Note that although the functionality to split the <code style="white-space: normal">
serialize</code> function into <code style="white-space: normal">save/load</code>
has been provided, the usage of the <code style="white-space: normal">serialize</code>
function with the corresponding <code style="white-space: normal">&amp;</code> operator
is preferred. The key to the serialization implementation is that objects are saved
and loaded in exactly the same sequence. Using the <code style="white-space: normal">&amp;</code>
operator and <code style="white-space: normal">serialize</code>
function guarantees that this is always the case and minimizes the
occurrence of hard-to-find errors related to synchronization of
<code style="white-space: normal">save</code> and <code style="white-space: normal">load</code>
functions.
<h3><a name="const"><code style="white-space: normal">const</code> Members</a></h3>
Saving <code style="white-space: normal">const</code> members to an archive
requires no special considerations.
Loading <code style="white-space: normal">const</code> members can be addressed by using a
<code style="white-space: normal">const_cast</code>:
<pre><code>
ar &amp; const_cast&lt;T &amp;&gt;(t);
</code></pre>
Note that this violates the spirit and intention of the <code style="white-space: normal">const</code>
keyword: <code style="white-space: normal">const</code> members are initialized when a class instance
is constructed and are not changed thereafter. Nevertheless, this may
be the most appropriate approach in many cases. Ultimately, it comes down to
the question of what <code style="white-space: normal">const</code> means in the context
of serialization.
<h3><a name="constructors">Non-Default Constructors</a></h3>
The general procedure used for serialization of objects
through a pointer has been described in a
<a href="archives.html#pointeroperators">previous section</a>.
This is implemented by code in the serialization library
which is similar to the following:
<pre><code>
// load data required for construction and invoke constructor in place
template&lt;class Archive, class T&gt;
inline void load_construct_data(
Archive &amp; ar, T * t, const unsigned int file_version
){
// default just uses the default constructor to initialize
// previously allocated memory.
::new(t)T();
}
template&lt;class Archive, class T&gt;
void load_object_ptr(
Archive &amp; ar,
T * &amp; t,
const unsigned int file_version
){
t = static_cast&lt;T *&gt;(operator new(sizeof(T)));
load_construct_data(ar, t, file_version);
ar &gt;&gt; * t;
}
</code></pre>
This code
<ol>
<li>allocates memory from the heap large enough to hold the
object.
<li>invokes the overridable <code style="white-space: normal">load_construct_data</code>
to initialize the object. The default <code style="white-space: normal">load_construct_data</code> invokes the
default constructor "in-place" to initialize the memory.
<li>loads the object's data from the archive into the newly constructed object.
</ol>
which effectively creates a new object and returns its pointer.
<p>
If there is no such default constructor, the function templates
<code style="white-space: normal">load_construct_data</code> and
perhaps <code style="white-space: normal">save_construct_data</code>
will have to be overridden. Here is a simple example:
<pre><code>
class my_class {
private:
friend class boost::serialization::access;
int member;
template&lt;class Archive&gt;
void serialize(Archive &amp;ar, const unsigned int file_version){
ar &amp; member;
}
public:
my_class(int m) :
member(m)
{}
};
</code></pre>
The overrides would be:
<pre><code>
namespace boost { namespace serialization {
template&lt;class Archive&gt;
inline void save_construct_data(
Archive &amp; ar, const my_class * t, const unsigned long int file_version
){
// save data required to construct instance
ar &lt;&lt; t-&gt;member;
}
template&lt;class Archive&gt;
inline void load_construct_data(
Archive &amp; ar, my_class * t, const unsigned long int file_version
){
// retrieve data from archive required to construct new instance
int m;
ar &gt;&gt; m;
// invoke inplace constructor to initialize instance of my_class
::new(t)my_class(m);
}
}} // namespace ...
</code></pre>
In addition to the deserialization of pointers, these overrides are used
in the deserialization of STL containers whose element type has no default
constructor.
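<p>
The allocate-then-construct-in-place pattern used above can be tried in isolation
with standard C++ only. The following is a minimal sketch under stated assumptions:
it reads the constructor argument from a plain
<code style="white-space: normal">std::istream</code> instead of an archive, and
<code style="white-space: normal">load_construct_data_sketch</code>,
<code style="white-space: normal">load_object_ptr_sketch</code>,
<code style="white-space: normal">destroy_object</code> and the accessor
<code style="white-space: normal">get</code> are hypothetical names that are not part
of the library.

```cpp
#include <istream>
#include <new>
#include <sstream>

// A class without a default constructor, as in the example above.
class my_class {
    int member;
public:
    explicit my_class(int m) : member(m) {}
    int get() const { return member; }
};

// Stand-in for load_construct_data: reads the constructor argument and
// constructs the object in the memory that t points to.
inline void load_construct_data_sketch(std::istream & is, my_class * t){
    int m = 0;
    is >> m;
    ::new(t) my_class(m);   // placement new: no allocation, only construction
}

// Stand-in for load_object_ptr: allocates raw memory, then constructs in place.
inline my_class * load_object_ptr_sketch(std::istream & is){
    my_class * t = static_cast<my_class *>(operator new(sizeof(my_class)));
    load_construct_data_sketch(is, t);
    return t;
}

// Releases an object created by load_object_ptr_sketch: run the destructor
// explicitly, then free the raw memory.
inline void destroy_object(my_class * t){
    t->~my_class();
    operator delete(t);
}
```

Separating allocation from construction is what allows the constructor arguments
to be read from the input after the memory has already been obtained.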
<h3><a name="referencemembers"></a>Reference Members</h3>
Classes that contain reference members will generally require
non-default constructors, as references can only be set when
an instance is constructed. The example of the previous section
is slightly more complex if the class has reference members.
This raises the question of how and where the objects being
referred to are stored and how they are created. There is also the question of
references to polymorphic base classes. Basically, these
are the same questions that arise regarding pointers. This is
no surprise, as references are really a special kind of pointer.
We address these questions by serializing references as though
they were pointers.
<pre><code>
class object;
class my_class {
private:
friend class boost::serialization::access;
int member1;
object &amp; member2;
template&lt;class Archive&gt;
void serialize(Archive &amp;ar, const unsigned int file_version);
public:
my_class(int m, object &amp; o) :
member1(m),
member2(o)
{}
};
</code></pre>
The overrides would be:
<pre><code>
namespace boost { namespace serialization {
template&lt;class Archive&gt;
inline void save_construct_data(
Archive &amp; ar, const my_class * t, const unsigned int file_version
){
// save data required to construct instance
ar &lt;&lt; t-&gt;member1;
// serialize reference to object as a pointer
ar &lt;&lt; &amp; t-&gt;member2;
}
template&lt;class Archive&gt;
inline void load_construct_data(
Archive &amp; ar, my_class * t, const unsigned int file_version
){
// retrieve data from archive required to construct new instance
int m;
ar &gt;&gt; m;
// create and load data through pointer to object
// tracking handles issues of duplicates.
object * optr;
ar &gt;&gt; optr;
// invoke inplace constructor to initialize instance of my_class
::new(t)my_class(m, *optr);
}
}} // namespace ...
</code></pre>
<h3><a name="templates"></a>Templates</h3>
Implementing serialization for templates is exactly the same process
as for normal classes and requires no additional considerations. Among
other things, this implies that serialization of compositions of templates
is automatically generated when required, provided serialization of the
component templates is defined. For example, this library includes
definitions of serialization for <code style="white-space: normal">boost::shared_ptr&lt;T&gt;</code> and for
<code style="white-space: normal">std::list&lt;T&gt;</code>. If I have defined serialization for my own
class <code style="white-space: normal">my_t</code>, then serialization for
<code style="white-space: normal">std::list&lt; boost::shared_ptr&lt; my_t&gt; &gt;</code> is already available
for use.
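<p>
The way compositions fall out of the component definitions can be sketched with
plain streams and ordinary overloading. This is an analogy rather than the
library's code: the free functions <code style="white-space: normal">save</code> and
<code style="white-space: normal">load</code> below, the
<code style="white-space: normal">round_trip</code> helper and the use of
<code style="white-space: normal">std::vector</code> are assumptions made for this
illustration.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// A user-defined type together with its own save/load functions.
struct my_t { int value; };

inline void save(std::ostream & os, const my_t & t){ os << t.value << ' '; }
inline void load(std::istream & is, my_t & t){ is >> t.value; }

// Generic save/load for std::vector<T>, written once in terms of the
// save/load functions of the element type T.
template<class T>
void save(std::ostream & os, const std::vector<T> & v){
    os << v.size() << ' ';
    for(std::size_t i = 0; i < v.size(); ++i)
        save(os, v[i]);
}
template<class T>
void load(std::istream & is, std::vector<T> & v){
    std::size_t n = 0;
    is >> n;
    v.resize(n);
    for(std::size_t i = 0; i < n; ++i)
        load(is, v[i]);
}

// No code is written for the nested type: the compiler composes the
// vector overload with itself and with the my_t overload.
inline std::vector<std::vector<my_t> > round_trip(
    const std::vector<std::vector<my_t> > & in
){
    std::stringstream ss;
    save(ss, in);
    std::vector<std::vector<my_t> > out;
    load(ss, out);
    return out;
}
```

Nothing in the sketch mentions
<code style="white-space: normal">std::vector&lt; std::vector&lt;my_t&gt; &gt;</code>
explicitly, yet it can be saved and loaded because each layer is handled by an
existing definition.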
<p>
For an example that shows how this idea might be implemented for your own
class templates, see
<a href="../example/demo_auto_ptr.cpp" target="demo_auto_ptr.cpp">
demo_auto_ptr.cpp</a>.
This shows how non-intrusive serialization
for the template <code style="white-space: normal">auto_ptr</code> from the standard library
can be implemented.
<p>
A somewhat trickier addition of serialization to a standard template
can be found in the example
<a href="../../../boost/serialization/shared_ptr.hpp" target="shared_ptr_hpp">
shared_ptr.hpp
</a>.
<!--
Only the most minimal change to
<a href="../../../boost/serialization/shared_count.hpp" target="shared_count_hpp">
shared_count.hpp</a>
(to gain access to some private members) was necessary to achieve this.
This should demonstrate how easy it is to non-intrusively
implement serialization to any data type or template.
-->
<p>
In the specification of serialization for templates, it is common
to split <code style="white-space: normal">serialize</code>
into a <code style="white-space: normal">load/save</code> pair.
Note that the convenience macro described
<a href="#BOOST_SERIALIZATION_SPLIT_FREE">above</a>
isn't helpful in these cases, as the number and kind of
template class arguments won't match those used when splitting
<code style="white-space: normal">serialize</code> for a simple class. Use the override
syntax instead.
<h2><a href="traits.html">Class Serialization Traits</a></h2>
<h2><a href="wrappers.html">Serialization Wrappers</a></h2>
<h2><a name="implementations"></a>Serialization Implementations Included in the Library</h2>
This library includes code to serialize C-style arrays of other
serializable types. That is, if T is a serializable type, then the following
is automatically available and will function as expected:
<pre><code>
T t[4];
ar &lt;&lt; t;
...
ar &gt;&gt; t;
</code></pre>
The facilities described above are sufficient to implement
serialization for all STL containers. In fact, this has been done
and has been included in the library. For example, in order to use
the included serialization code for <code style="white-space: normal">std::list</code>, use:
<pre><code>
#include &lt;boost/serialization/list.hpp&gt;
</code></pre>
rather than
<pre><code>
#include &lt;list&gt;
</code></pre>
Since the former includes the latter, this is all that is necessary.
The same holds true for all STL collections as well as templates
required to support them (e.g. <code style="white-space: normal">std::pair</code>).
<hr>
<p>Revised
<!--webbot bot="Timestamp" S-Type="EDITED" S-Format="%d %B, %Y" startspan -->
24 January, 2004
<!--webbot bot="Timestamp" endspan i-checksum="39359" -->
</p>
<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a>
2002-2004. All Rights Reserved.</i></p>
</body>
</html>