From a3f822b7d3b24fa9d179e68bfe7cc93093fe368d Mon Sep 17 00:00:00 2001
From: "Ralf W. Grosse-Kunstleve"
Date: Sun, 4 Mar 2001 15:56:07 +0000
Subject: [PATCH] Documentation for pickle support.

[SVN r9417]
---
 doc/index.html  |   2 +-
 doc/pickle.html | 223 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 doc/pickle.html

diff --git a/doc/index.html b/doc/index.html
index 9ddd65d3..4550be5c 100644
--- a/doc/index.html
+++ b/doc/index.html
@@ -116,7 +116,7 @@ among others.
  • Advanced Topics
-    1. Pickling
+    1. Pickle Support
     2. class_builder<>
diff --git a/doc/pickle.html b/doc/pickle.html
new file mode 100644
index 00000000..0e64d6fc
--- /dev/null
+++ b/doc/pickle.html
@@ -0,0 +1,223 @@
+
+
+ BPL Pickle Support
+
+
+
+c++boost.gif (8819 bytes)
+
+
      +

      BPL Pickle Support

      + +Pickle is a Python module for object serialization, also known +as persistence, marshalling, or flattening. + +

      +It is often necessary to save the contents of an object to a file and +restore them later. One approach to this problem is to write a pair of +functions that read data from and write data to a file in a special +format. A powerful alternative approach is to use Python's pickle module. +Exploiting Python's introspection capabilities, the pickle module +recursively converts nearly arbitrary Python objects into a stream of +bytes that can be written to a file. + 
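As a minimal illustration of the paragraph above, the following sketch uses only the standard pickle module. The nested `settings` structure is made-up example data and has nothing to do with the BPL itself.

```python
import pickle

# A nested structure of ordinary Python objects (illustrative data only).
settings = {"name": "lattice", "cell": (5.0, 7.5, 9.0), "tags": ["a", "b"]}

# dumps() walks the structure recursively and returns a byte string;
# dump() and load() do the same against an open binary file.
stream = pickle.dumps(settings)
restored = pickle.loads(stream)
```

The restored object is an independent copy that compares equal to the original.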

      +The Boost Python Library supports the pickle module by emulating the +interface implemented by Jim Fulton's ExtensionClass module that is +included in the ZOPE distribution +(http://www.zope.org/). +This interface is similar to that for regular Python classes as +described in detail in the Python Library Reference for pickle: + +

      + http://www.python.org/doc/current/lib/module-pickle.html +
      + +
      +

      The BPL Pickle Interface

      + +At the user level, the BPL pickle interface involves three special +methods: + +
      +
      +__getinitargs__ +
      + When an instance of a BPL extension class is pickled, the pickler + tests if the instance has a __getinitargs__ method. This method must + return a Python tuple. When the instance is restored by the + unpickler, the contents of this tuple are used as the arguments for + the class constructor. + +

      + If __getinitargs__ is not defined, the class constructor will be + called without arguments. + +

      +

      +__getstate__ + +
      + When an instance of a BPL extension class is pickled, the pickler + tests if the instance has a __getstate__ method. This method should + return a Python object representing the state of the instance. + +

      + If __getstate__ is not defined, the instance's __dict__ is pickled + (if it is not empty). + +

      +

      +__setstate__ + +
      + When an instance of a BPL extension class is restored by the + unpickler, it is first constructed using the result of + __getinitargs__ as arguments (see above). Subsequently the unpickler + tests if the new instance has a __setstate__ method. If so, this + method is called with the result of __getstate__ (a Python object) as + the argument. + +

      + If __setstate__ is not defined, the result of __getstate__ must be + a Python dictionary. The items of this dictionary are added to + the instance's __dict__. +

      + +If both __getstate__ and __setstate__ are defined, the Python object +returned by __getstate__ need not be a dictionary. In that case the two +methods are free to exchange any Python object, as long as __setstate__ +understands what __getstate__ returns. + 
      +
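The interplay of the three methods can be sketched in plain Python. This is only an approximation: `World` is a hypothetical stand-in for a wrapped C++ class, and because Python 3 does not consult __getinitargs__ for ordinary classes, the sketch assumes an explicit `__reduce__` bridge to hand the results of the three methods to the pickler. The BPL itself needs no such bridge.

```python
import pickle


class World:
    """Plain-Python stand-in for a wrapped C++ class (hypothetical)."""

    # __slots__ mimics C++ member data that does not live in a __dict__.
    __slots__ = ("_country", "_secret")

    def __init__(self, country):
        self._country = country
        self._secret = 0

    def set_secret(self, value):
        self._secret = value

    def __getinitargs__(self):
        # Must return a tuple: the constructor arguments used on restore.
        return (self._country,)

    def __getstate__(self):
        # Any Python object will do because __setstate__ is also defined.
        return self._secret

    def __setstate__(self, state):
        self._secret = state

    def __reduce__(self):
        # Assumption: this bridge exists only because plain Python 3
        # classes ignore __getinitargs__. It asks the unpickler to call
        # World(*initargs) and then __setstate__(state).
        return (type(self), self.__getinitargs__(), self.__getstate__())


original = World("California")
original.set_secret(42)
restored = pickle.loads(pickle.dumps(original))
```

After the round trip, `restored` has been rebuilt through the constructor and then updated through `__setstate__`, in the order described above.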

      Pitfalls and Safety Guards

      + +In BPL extension modules with many extension classes, providing +complete pickle support for all classes would be a significant +overhead. In general complete pickle support should only be implemented +for extension classes that will eventually be pickled. However, the +author of a BPL extension module might not anticipate correctly which +classes need support for pickle. Unfortunately, the pickle protocol +described above has two important pitfalls that the end user of a BPL +extension module might not be aware of: + +
      +
      +Pitfall 1: +Neither __getinitargs__ nor __getstate__ is defined. + 
      + In this situation the unpickler calls the class constructor without + arguments and then adds the __dict__ that was pickled by default to + that of the new instance. + +

      + However, most C++ classes wrapped with the BPL will have member data + that are not restored correctly by this procedure. To alert the user + to this problem, a safety guard is provided. If neither __getinitargs__ + nor __getstate__ is defined, the BPL tests if the class has an + attribute __dict_defines_state__. An exception is raised if this + attribute is not defined: + 

      +    RuntimeError: Incomplete pickle support (__dict_defines_state__ not set)
      +
      + + In the rare cases where the instance's __dict__ really does capture + the complete state, the safety guard can be disabled deliberately. + For example, in C++: + 
      +    class_builder<your_class> py_your_class(your_module, "your_class");
      +    py_your_class.dict_defines_state();
      +
      + + It is also possible to override the safety guard at the Python level. + E.g.: + +
      +    import your_bpl_module
      +    class your_class(your_bpl_module.your_class):
      +      __dict_defines_state__ = 1
      +
      + +

      +

      +Pitfall 2: +__getstate__ is defined and the instance's __dict__ is not empty. + +
      + The author of a BPL extension class might provide a __getstate__ + method without considering the possibilities that: + +

      +

        +
      • + the class is used as a base class. Most likely the __dict__ of + instances of the derived class needs to be pickled in order to + restore the instances correctly. + 

        +

      • + the user adds items to the instance's __dict__ directly. Again, + the __dict__ of the instance then needs to be pickled. +
      +

      + + To alert the user to this far from obvious problem, a safety guard is + provided. If __getstate__ is defined and the instance's __dict__ is + not empty, the BPL tests if the class has an attribute + __getstate_manages_dict__. An exception is raised if this attribute + is not defined: + 

      +    RuntimeError: Incomplete pickle support (__getstate_manages_dict__ not set)
      +
      + + To resolve this problem, it should first be established that the + __getstate__ and __setstate__ methods manage the instance's __dict__ + correctly. Note that this can be done both at the C++ and the Python + level. Finally, the safety guard should intentionally be overridden. + E.g. in C++: + 
      +    class_builder<your_class> py_your_class(your_module, "your_class");
      +    py_your_class.getstate_manages_dict();
      +
      + + In Python: + +
      +    import your_bpl_module
      +    class your_class(your_bpl_module.your_class):
      +      __getstate_manages_dict__ = 1
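      +      def __getstate__(self):
      +        # your code here
      +        pass  # placeholder so that the method body is valid Python
      +      def __setstate__(self, state):
      +        # your code here
      +        pass  # placeholder so that the method body is valid Python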
      +
      +
      + +
      +
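The two safety guards can be emulated in plain Python as follows. This is a hypothetical sketch of the checks described above, not the BPL's actual C++ implementation. The helper names `defines` and `check_pickle_support` and the example classes are invented for illustration.

```python
def defines(cls, name):
    """True if cls or one of its bases (other than object) defines name."""
    return any(name in vars(k) for k in cls.__mro__ if k is not object)


def check_pickle_support(instance):
    """Raise RuntimeError in the two situations described above."""
    cls = type(instance)
    has_getstate = defines(cls, "__getstate__")
    if not defines(cls, "__getinitargs__") and not has_getstate:
        # Pitfall 1: only the __dict__ would be pickled.
        if not defines(cls, "__dict_defines_state__"):
            raise RuntimeError(
                "Incomplete pickle support (__dict_defines_state__ not set)")
    if has_getstate and getattr(instance, "__dict__", None):
        # Pitfall 2: __getstate__ might ignore a non-empty __dict__.
        if not defines(cls, "__getstate_manages_dict__"):
            raise RuntimeError(
                "Incomplete pickle support (__getstate_manages_dict__ not set)")


class Bare:
    """Defines neither __getinitargs__ nor __getstate__ (Pitfall 1)."""


class BareOptIn(Bare):
    __dict_defines_state__ = 1  # guard deliberately disabled


class WithState:
    """Defines __getstate__ but does not manage the __dict__ (Pitfall 2)."""

    def __getstate__(self):
        return ()


guarded = WithState()
check_pickle_support(guarded)    # passes: the __dict__ is still empty
guarded.note = "user attribute"  # from now on the second guard applies
```

With these definitions, checking a `Bare` instance or the modified `guarded` instance raises the corresponding RuntimeError, while a `BareOptIn` instance passes.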

      Practical Advice

      + +
        +
      • + Avoid using __getstate__ if the instance can also be reconstructed + by way of __getinitargs__. This automatically avoids Pitfall 2. + +

        +

      • + If __getstate__ is required, include the instance's __dict__ in the + Python object that is returned. +
      + +
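The second piece of advice can be sketched with the standard pickle module on a plain Python class. `Counter` and its attributes are hypothetical. The `_count` slot stands in for C++ member data that lives outside the instance's __dict__.

```python
import pickle


class Counter:
    """Hypothetical class whose main state is kept outside __dict__."""

    # One slot for the hidden state, plus a regular __dict__ for the user.
    __slots__ = ("_count", "__dict__")

    def __init__(self):
        self._count = 0

    def increment(self):
        self._count += 1

    def __getstate__(self):
        # Return the hidden state together with the instance's __dict__.
        return (self._count, self.__dict__)

    def __setstate__(self, state):
        self._count, extra = state
        self.__dict__.update(extra)


counter = Counter()
counter.increment()
counter.label = "added by the user"  # this attribute lands in __dict__
restored = pickle.loads(pickle.dumps(counter))
```

Because the state returned by `__getstate__` carries the __dict__ along, attributes added by the user or by a derived class survive the round trip.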
      +
      +Author: Ralf W. Grosse-Kunstleve, March 2001 +
      +