diff --git a/doc/pickle.html b/doc/pickle.html
new file mode 100644
index 00000000..0e64d6fc
--- /dev/null
+++ b/doc/pickle.html
@@ -0,0 +1,223 @@
+ + + BPL Pickle Support + + + + + +
+

BPL Pickle Support

+ +Pickle is a Python module for object serialization, also known +as persistence, marshalling, or flattening. + +

+It is often necessary to save the contents of an object to a file +and restore them later. One approach to this problem is to write a pair of functions +that read data from and write data to a file in a special format. A powerful +alternative approach is to use Python's pickle module. Exploiting +Python's capacity for introspection, the pickle module recursively +converts nearly arbitrary Python objects into a stream of bytes that +can be written to a file. + +
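As a minimal sketch of the round trip described above, the following writes an ordinary Python object to a file with pickle and reads it back. The data and the file name are made up for illustration, and a temporary directory is used so that nothing is left behind.

```python
import os
import pickle
import tempfile

# Illustrative data only: a nested structure of built-in types.
data = {"name": "example", "values": [1, 2.5, ("nested", None)]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.pickle")  # hypothetical file name

    # Serialize the object into a stream of bytes written to the file.
    with open(path, "wb") as f:
        pickle.dump(data, f)

    # Read the bytes back and rebuild an equal, but distinct, object.
    with open(path, "rb") as f:
        restored = pickle.load(f)
```

No format-specific read or write functions are needed; pickle walks the nested structure on its own.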

+The Boost Python Library supports the pickle module by emulating the +interface implemented by Jim Fulton's ExtensionClass module that is +included in the ZOPE distribution +(http://www.zope.org/). +This interface is similar to that for regular Python classes as +described in detail in the Python Library Reference for pickle: + +

+ http://www.python.org/doc/current/lib/module-pickle.html +
+ +
+

The BPL Pickle Interface

+ +At the user level, the BPL pickle interface involves three special +methods: + +
+
+__getinitargs__ +
+ When an instance of a BPL extension class is pickled, the pickler + tests if the instance has a __getinitargs__ method. This method must + return a Python tuple. When the instance is restored by the + unpickler, the contents of this tuple are used as the arguments for + the class constructor. + +

+ If __getinitargs__ is not defined, the class constructor will be + called without arguments. + +

+

+__getstate__ + +
+ When an instance of a BPL extension class is pickled, the pickler + tests if the instance has a __getstate__ method. This method should + return a Python object representing the state of the instance. + +

+ If __getstate__ is not defined, the instance's __dict__ is pickled + (if it is not empty). + +

+

+__setstate__ + +
+ When an instance of a BPL extension class is restored by the + unpickler, it is first constructed using the result of + __getinitargs__ as arguments (see above). Subsequently the unpickler + tests if the new instance has a __setstate__ method. If so, this + method is called with the result of __getstate__ (a Python object) as + the argument. + +

+ If __setstate__ is not defined, the result of __getstate__ must be + a Python dictionary. The items of this dictionary are added to + the instance's __dict__. +

+ +If both __getstate__ and __setstate__ are defined, the Python object +returned by __getstate__ need not be a dictionary. The two methods +are free to use any representation, as long as __setstate__ can +interpret what __getstate__ returns. + +
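The interplay of the three methods can be sketched in pure Python. This is an emulation under stated assumptions, not BPL code: the class World and its attributes are hypothetical stand-ins for a wrapped C++ class, and because current Python versions no longer consult __getinitargs__ for ordinary classes, the sketch adds a __reduce__ method that hands the constructor arguments and the state to pickle in the same order the text describes.

```python
import pickle


class World:
    """Hypothetical pure-Python stand-in for a wrapped C++ class."""

    def __init__(self, country):
        self.country = country      # supplied through the constructor
        self.secret_number = 0      # state the constructor does not cover

    def __getinitargs__(self):
        # Must return a tuple; its items become the constructor arguments.
        return (self.country,)

    def __getstate__(self):
        # Need not be a dictionary, because __setstate__ is defined below.
        return self.secret_number

    def __setstate__(self, state):
        # Receives exactly what __getstate__ returned.
        self.secret_number = state

    def __reduce__(self):
        # Assumption: wiring added for modern pickle, which then calls
        # World(*initargs) followed by __setstate__(state) on unpickling.
        return (self.__class__, self.__getinitargs__(), self.__getstate__())


w = World("Atlantis")
w.secret_number = 42
restored = pickle.loads(pickle.dumps(w))
```

On unpickling, the constructor runs first with the saved tuple, and __setstate__ then restores the remaining state.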
+

Pitfalls and Safety Guards

+ +In BPL extension modules with many extension classes, providing +complete pickle support for all classes would be a significant +overhead. In general, complete pickle support should only be implemented +for extension classes that will eventually be pickled. However, the +author of a BPL extension module might not anticipate correctly which +classes need pickle support. Unfortunately, the pickle protocol +described above has two important pitfalls that the end user of a BPL +extension module might not be aware of: + +
+
+Pitfall 1: +Neither __getinitargs__ nor __getstate__ is defined. + +
+ In this situation the unpickler calls the class constructor without + arguments and then adds the __dict__ that was pickled by default to + that of the new instance. + +

+ However, most C++ classes wrapped with the BPL will have member data + that are not restored correctly by this procedure. To alert the user + to this problem, a safety guard is provided. If neither __getinitargs__ + nor __getstate__ is defined, the BPL tests if the class has an + attribute __dict_defines_state__. An exception is raised if this + attribute is not defined: + +

+    RuntimeError: Incomplete pickle support (__dict_defines_state__ not set)
+
+ + In the rare cases where this is not the desired behavior, the safety + guard can deliberately be disabled. For example, in C++: + +
+    class_builder<your_class> py_your_class(your_module, "your_class");
+    py_your_class.dict_defines_state();
+
+ + It is also possible to override the safety guard at the Python level. + E.g.: + +
+    import your_bpl_module
+    class your_class(your_bpl_module.your_class):
+      __dict_defines_state__ = 1
+
+ +
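The behavior of this guard can be illustrated with a pure-Python emulation. This is only a sketch: GuardedBase, Opted, and the helper _defines are hypothetical names, and the check shown here mimics the rule stated in the text rather than reproducing the BPL's actual implementation.

```python
import pickle


def _defines(cls, name):
    # True if any class other than `object` in the MRO defines `name`.
    return any(name in vars(k) for k in cls.__mro__ if k is not object)


class GuardedBase:
    """Hypothetical stand-in for an extension class without pickle support."""

    def __reduce__(self):
        cls = type(self)
        no_support = (not _defines(cls, "__getinitargs__")
                      and not _defines(cls, "__getstate__"))
        if no_support and not hasattr(cls, "__dict_defines_state__"):
            raise RuntimeError(
                "Incomplete pickle support (__dict_defines_state__ not set)")
        # Default path: construct without arguments, then restore __dict__.
        return (cls, (), dict(self.__dict__))


class Opted(GuardedBase):
    # Declares that the instance __dict__ fully describes the state.
    __dict_defines_state__ = 1


try:
    pickle.dumps(GuardedBase())
    guard_fired = False
except RuntimeError:
    guard_fired = True

obj = Opted()
obj.note = "kept in __dict__"
restored = pickle.loads(pickle.dumps(obj))
```

The unmarked class refuses to pickle, while the subclass that sets the attribute round-trips through its __dict__.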

+

+Pitfall 2: +__getstate__ is defined and the instance's __dict__ is not empty. + +
+ The author of a BPL extension class might provide a __getstate__ + method without considering the possibilities that: + +

+

+

+ + To alert the user to this far from obvious problem, a safety guard is + provided. If __getstate__ is defined and the instance's __dict__ is + not empty, the BPL tests if the class has an attribute + __getstate_manages_dict__. An exception is raised if this attribute + is not defined: + +

+    RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
+
+ + To resolve this problem, it should first be established that the + __getstate__ and __setstate__ methods manage the instance's __dict__ + correctly. Note that this can be done both at the C++ and the Python + level. Finally, the safety guard should intentionally be overridden. + For example, in C++: + +
+    class_builder<your_class> py_your_class(your_module, "your_class");
+    py_your_class.getstate_manages_dict();
+
+ + In Python: + +
+    import your_bpl_module
+    class your_class(your_bpl_module.your_class):
+      __getstate_manages_dict__ = 1
+      def __getstate__(self):
+        # your code here
+        pass
+      def __setstate__(self, state):
+        # your code here
+        pass
+
+
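One way to make __getstate__ and __setstate__ manage the instance's __dict__ is to bundle the dictionary with the internal state, as in the following pure-Python sketch. The class Counter is hypothetical; its _count slot is kept outside __dict__ to play the role of C++ member data, and the __getstate_manages_dict__ marker is included only to mirror the text, since plain Python does not check it.

```python
import pickle


class Counter:
    """Hypothetical class whose internal state lives outside __dict__."""

    # _count stands in for C++ member data; "__dict__" keeps instances open
    # to arbitrary user-added attributes.
    __slots__ = ("_count", "__dict__")
    __getstate_manages_dict__ = 1  # marker from the text; unused by plain Python

    def __init__(self):
        self._count = 0

    def __getstate__(self):
        # Bundle the instance __dict__ together with the internal state.
        return (dict(self.__dict__), self._count)

    def __setstate__(self, state):
        attrs, count = state
        self.__dict__.update(attrs)  # restore user-added attributes
        self._count = count          # restore the internal state


c = Counter()
c._count = 3
c.label = "added by the user"  # lands in the instance __dict__
restored = pickle.loads(pickle.dumps(c))
```

Had __getstate__ returned only the counter, the user-added attribute would have been silently lost on unpickling.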
+ +
+

Practical Advice

+ + + +
+
+Author: Ralf W. Grosse-Kunstleve, March 2001 +
+