<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0//EN"
"http://www.w3.org/TR/REC-html40/strict.dtd">

<title>Boost.Python Pickle Support</title>

<div>

<img src="../../../c++boost.gif"
     alt="c++boost.gif (8819 bytes)"
     align="center"
     width="277" height="86">

<hr>
<h1>Boost.Python Pickle Support</h1>

Pickle is a Python module for object serialization, also known
as persistence, marshalling, or flattening.

<p>
It is often necessary to save the contents of an object to a file and
to restore them later. One approach to this problem is to write a pair
of functions that write the data to a file and read it back in a special
format. A powerful alternative is to use Python's pickle module.
Exploiting Python's capacity for introspection, the pickle module
recursively converts nearly arbitrary Python objects into a stream of
bytes that can be written to a file.
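<p>
As a minimal illustration of this round trip, the following sketch uses
only Python's standard <tt>pickle</tt> and <tt>io</tt> modules. The
<tt>record</tt> dictionary and the in-memory buffer that stands in for a
file are assumptions made for the example.

```python
import io
import pickle

# A nested structure built only from built-in types.
record = {"name": "world", "values": [1, 2.5, (3, 4)], "nested": {"flag": True}}

# Serialize to a stream of bytes; an in-memory buffer stands in for a file.
buffer = io.BytesIO()
pickle.dump(record, buffer)

# Rewind the buffer and restore an equal but independent copy.
buffer.seek(0)
restored = pickle.load(buffer)
```

The restored object compares equal to the original but is a distinct
object, which is the behavior expected of a save and restore cycle.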

<p>
The Boost Python Library supports the pickle module by emulating the
interface implemented by Jim Fulton's ExtensionClass module, which is
included in the
<a href="http://www.zope.org/"
>ZOPE</a>
distribution.
This interface is similar to that for regular Python classes, which is
described in detail in the
<a href="http://www.python.org/doc/current/lib/module-pickle.html"
>Python Library Reference for pickle</a>.

<hr>
<h2>The Boost.Python Pickle Interface</h2>

At the user level, the Boost.Python pickle interface involves three special
methods:

<dl>
<dt>
<strong><tt>__getinitargs__</tt></strong>
<dd>
When an instance of a Boost.Python extension class is pickled, the
pickler checks whether the instance has a <tt>__getinitargs__</tt> method.
This method must return a Python tuple (it is most convenient to use
a <tt>boost::python::tuple</tt>). When the instance is restored by the
unpickler, the contents of this tuple are used as the arguments for
the class constructor.

<p>
If <tt>__getinitargs__</tt> is not defined, the class constructor
is called without arguments.

<p>
<dt>
<strong><tt>__getstate__</tt></strong>

<dd>
When an instance of a Boost.Python extension class is pickled, the
pickler checks whether the instance has a <tt>__getstate__</tt> method.
This method should return a Python object representing the state of
the instance.

<p>
If <tt>__getstate__</tt> is not defined, the instance's
<tt>__dict__</tt> is pickled (if it is not empty).

<p>
<dt>
<strong><tt>__setstate__</tt></strong>

<dd>
When an instance of a Boost.Python extension class is restored by the
unpickler, it is first constructed using the result of
<tt>__getinitargs__</tt> as arguments (see above). The unpickler then
checks whether the new instance has a <tt>__setstate__</tt>
method. If so, this method is called with the result of
<tt>__getstate__</tt> (a Python object) as the argument.

<p>
If <tt>__setstate__</tt> is not defined, the result of
<tt>__getstate__</tt> must be a Python dictionary. The items of this
dictionary are added to the instance's <tt>__dict__</tt>.

</dl>

If both <tt>__getstate__</tt> and <tt>__setstate__</tt> are defined,
the Python object returned by <tt>__getstate__</tt> need not be a
dictionary. The two methods are free to represent the state in any
form, provided that <tt>__setstate__</tt> accepts what
<tt>__getstate__</tt> returns.
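<p>
The interplay of the three methods can be sketched in pure Python. The
helpers <tt>defines</tt>, <tt>save</tt> and <tt>restore</tt> and the
classes <tt>World</tt> and <tt>Plain</tt> are hypothetical names invented
for this illustration. They emulate the rules stated above and are not the
Boost.Python implementation. <tt>__slots__</tt> is used only so that
<tt>_country</tt> and <tt>_secret</tt> live outside the instance's
<tt>__dict__</tt>, much as C++ member data would.

```python
def defines(cls, name):
    """True if cls or one of its bases (other than object) defines name."""
    return any(name in vars(k) for k in cls.__mro__ if k is not object)


def save(obj):
    """Emulate the pickler: return (class, constructor arguments, state)."""
    cls = type(obj)
    # Without __getinitargs__ the constructor will be called with no arguments.
    initargs = obj.__getinitargs__() if defines(cls, "__getinitargs__") else ()
    if defines(cls, "__getstate__"):
        state = obj.__getstate__()
    else:
        # Fall back to the instance __dict__, but only if it is not empty.
        state = dict(obj.__dict__) or None
    return cls, tuple(initargs), state


def restore(cls, initargs, state):
    """Emulate the unpickler: construct first, then apply the state."""
    obj = cls(*initargs)
    if state is not None:
        if defines(cls, "__setstate__"):
            obj.__setstate__(state)
        else:
            # Without __setstate__ the state must be a dictionary.
            obj.__dict__.update(state)
    return obj


class World:
    """Hypothetical stand-in for a wrapped C++ class."""
    __slots__ = ("_country", "_secret", "__dict__")

    def __init__(self, country):
        self._country = country
        self._secret = 0

    def __getinitargs__(self):
        return (self._country,)

    def __getstate__(self):
        return self._secret          # any Python object will do

    def __setstate__(self, state):
        self._secret = state


class Plain:
    """Defines none of the three methods; only its __dict__ is saved."""


original = World("Hello")
original._secret = 42
copy = restore(*save(original))
```

In this sketch <tt>copy</tt> is built by calling <tt>World("Hello")</tt>
and then <tt>__setstate__(42)</tt>, which mirrors the order of operations
described for the unpickler.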

<hr>
<h2>Pitfalls and Safety Guards</h2>

In Boost.Python extension modules with many extension classes,
providing complete pickle support for all classes would be a
significant overhead. In general, complete pickle support should only be
implemented for extension classes that will eventually be pickled.
However, the author of a Boost.Python extension module might not
correctly anticipate which classes need pickle support.
Unfortunately, the pickle protocol described above has two important
pitfalls that the end user of a Boost.Python extension module might not
be aware of:

<dl>
<dt>
<strong>Pitfall 1:</strong>
Neither <tt>__getinitargs__</tt> nor <tt>__getstate__</tt> is defined.

<dd>
In this situation the unpickler calls the class constructor without
arguments and then adds the <tt>__dict__</tt> that was pickled by
default to that of the new instance.

<p>
However, most C++ classes wrapped with Boost.Python have member
data that are not restored correctly by this procedure. To alert the
user to this problem, a safety guard is provided. If neither
<tt>__getinitargs__</tt> nor <tt>__getstate__</tt> is defined,
Boost.Python checks whether the class has an attribute
<tt>__dict_defines_state__</tt>. An exception is raised if this
attribute is not defined:

<pre>
RuntimeError: Incomplete pickle support (__dict_defines_state__ not set)
</pre>

In the rare cases where this is not the desired behavior, the safety
guard can deliberately be disabled. The corresponding C++ code is,
for example:

<pre>
class_builder&lt;your_class&gt; py_your_class(your_module, "your_class");
py_your_class.dict_defines_state();
</pre>

It is also possible to override the safety guard at the Python level.
For example:

<pre>
import your_bpl_module
class your_class(your_bpl_module.your_class):
    __dict_defines_state__ = 1
</pre>

<p>
<dt>
<strong>Pitfall 2:</strong>
<tt>__getstate__</tt> is defined and the instance's <tt>__dict__</tt> is not empty.

<dd>
The author of a Boost.Python extension class might provide a
<tt>__getstate__</tt> method without considering the possibilities
that:

<p>
<ul>
<li>
the class is used in Python as a base class. Most likely the
<tt>__dict__</tt> of instances of the derived class needs to be
pickled in order to restore the instances correctly.

<p>
<li>
the user adds items to the instance's <tt>__dict__</tt> directly.
Again, the <tt>__dict__</tt> of the instance then needs to be
pickled.

</ul>
<p>

To alert the user to this far from obvious problem, a safety guard is
provided. If <tt>__getstate__</tt> is defined and the instance's
<tt>__dict__</tt> is not empty, Boost.Python checks whether the class has
an attribute <tt>__getstate_manages_dict__</tt>. An exception is
raised if this attribute is not defined:

<pre>
RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
</pre>

To resolve this problem, first establish that the
<tt>__getstate__</tt> and <tt>__setstate__</tt> methods manage the
instance's <tt>__dict__</tt> correctly. Note that this can be done
at either the C++ or the Python level. Then override the safety guard
intentionally. For example, in C++:

<pre>
class_builder&lt;your_class&gt; py_your_class(your_module, "your_class");
py_your_class.getstate_manages_dict();
</pre>

In Python:

<pre>
import your_bpl_module
class your_class(your_bpl_module.your_class):
    __getstate_manages_dict__ = 1
    def __getstate__(self):
        # your code here
        pass
    def __setstate__(self, state):
        # your code here
        pass
</pre>
</dl>
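<p>
The two safety guards can be summarized as a pure-Python sketch. The
function <tt>check_pickle_support</tt>, the helpers <tt>defines</tt> and
<tt>guard_message</tt>, and the four small classes are hypothetical names
used only for this illustration. The checks in Boost.Python itself are
performed in C++, so this is an emulation of the behavior described above
rather than the library's code.

```python
def defines(cls, name):
    """True if cls or one of its bases (other than object) defines name."""
    return any(name in vars(k) for k in cls.__mro__ if k is not object)


def check_pickle_support(obj):
    """Raise RuntimeError in the two situations described as pitfalls."""
    cls = type(obj)
    has_getstate = defines(cls, "__getstate__")
    # Guard 1: neither __getinitargs__ nor __getstate__ is defined.
    if not defines(cls, "__getinitargs__") and not has_getstate:
        if not hasattr(cls, "__dict_defines_state__"):
            raise RuntimeError(
                "Incomplete pickle support (__dict_defines_state__ not set)")
    # Guard 2: __getstate__ is defined and the instance __dict__ is not empty.
    if has_getstate and getattr(obj, "__dict__", None):
        if not hasattr(cls, "__getstate_manages_dict__"):
            raise RuntimeError(
                "Incomplete pickle support (__getstate_manages_dict__ not set)")


class Bare:
    """No pickle methods at all: triggers guard 1."""


class BareWithFlag(Bare):
    __dict_defines_state__ = 1      # guard 1 deliberately disabled


class Stateful:
    """Has __getstate__: triggers guard 2 once __dict__ is not empty."""

    def __getstate__(self):
        return None

    def __setstate__(self, state):
        pass


class StatefulWithFlag(Stateful):
    __getstate_manages_dict__ = 1   # guard 2 deliberately overridden


def guard_message(obj):
    """Return the guard's error message, or None if the object passes."""
    try:
        check_pickle_support(obj)
    except RuntimeError as exc:
        return str(exc)
    return None
```

Under these assumptions a <tt>Stateful</tt> instance passes the check
while its <tt>__dict__</tt> is empty and fails as soon as an attribute is
added from Python, which is the case Pitfall 2 describes.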

<hr>
<h2>Practical Advice</h2>

<ul>
<li>
Avoid using <tt>__getstate__</tt> if the instance can also be
reconstructed by way of <tt>__getinitargs__</tt>. This automatically
avoids Pitfall 2.

<p>
<li>
If <tt>__getstate__</tt> is required, include the instance's
<tt>__dict__</tt> in the Python object that is returned.

</ul>
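<p>
The second recommendation could look like the following pure-Python
sketch. The <tt>Counter</tt> class is hypothetical, and its
<tt>_count</tt> slot merely stands in for C++ member data that does not
live in the instance's <tt>__dict__</tt>. Only the state object is passed
through <tt>pickle</tt> here, so the example does not depend on how the
class itself would be located when unpickling.

```python
import pickle


class Counter:
    """Hypothetical class whose state is partly outside __dict__."""
    __slots__ = ("_count", "__dict__")
    __getstate_manages_dict__ = 1   # __getstate__ below handles __dict__

    def __init__(self):
        self._count = 0

    def __getstate__(self):
        # Return the internal state together with a copy of __dict__.
        return (self._count, dict(self.__dict__))

    def __setstate__(self, state):
        count, extra = state
        self._count = count
        self.__dict__.update(extra)


original = Counter()
original._count = 3
original.label = "added from Python"   # ends up in the instance __dict__

# Round-trip the state through pickle and apply it to a fresh instance.
payload = pickle.dumps(original.__getstate__())
copy = Counter()
copy.__setstate__(pickle.loads(payload))
```

Because the tuple returned by <tt>__getstate__</tt> carries the
<tt>__dict__</tt> along, attributes added from Python survive the round
trip together with the internal state.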

<hr>
<h2>Examples</h2>

There are three files in <tt>boost/libs/python/example</tt> that
show how to provide pickle support.

<h3><a href="../example/pickle1.cpp"><tt>pickle1.cpp</tt></a></h3>

The C++ class in this example can be fully restored by passing the
appropriate argument to the constructor. Therefore it is sufficient
to define the pickle interface method <tt>__getinitargs__</tt>.

<h3><a href="../example/pickle2.cpp"><tt>pickle2.cpp</tt></a></h3>

The C++ class in this example contains member data that cannot be
restored by any of the constructors. Therefore it is necessary to
provide the <tt>__getstate__</tt>/<tt>__setstate__</tt> pair of
pickle interface methods.

<p>
For simplicity, the <tt>__dict__</tt> is not included in the result
of <tt>__getstate__</tt>. This is not generally recommended, but it is a
valid approach if it is anticipated that the object's
<tt>__dict__</tt> will always be empty. Note that the safety guards
will catch the cases where this assumption is violated.

<h3><a href="../example/pickle3.cpp"><tt>pickle3.cpp</tt></a></h3>

This example is similar to <a
href="../example/pickle2.cpp"><tt>pickle2.cpp</tt></a>. However, the
object's <tt>__dict__</tt> is included in the result of
<tt>__getstate__</tt>. This requires more code but is unavoidable
if the object's <tt>__dict__</tt> is not always empty.

<hr>
© Copyright Ralf W. Grosse-Kunstleve 2001. Permission to copy,
use, modify, sell and distribute this document is granted provided this
copyright notice appears in all copies. This document is provided "as
is" without express or implied warranty, and with no claim as to its
suitability for any purpose.

<p>
Updated: March 21, 2001
</div>