From 567802fd75828a8d7ac206e2f5ff05dd2898781e Mon Sep 17 00:00:00 2001
From: Dirk Gerrits
Date: Fri, 6 Dec 2002 00:11:13 +0000
Subject: [PATCH] First embedding tutorial draft. Still some TODOs.

[SVN r749]
---
 doc/tutorial/doc/embedding.html | 132 ++++++++++++++++++++++++++++++++
 doc/tutorial/doc/embedding.txt  | 111 +++++++++++++++++++++++++++
 2 files changed, 243 insertions(+)
 create mode 100644 doc/tutorial/doc/embedding.html
 create mode 100644 doc/tutorial/doc/embedding.txt

diff --git a/doc/tutorial/doc/embedding.html b/doc/tutorial/doc/embedding.html
new file mode 100644
index 00000000..7a00519b
--- /dev/null
+++ b/doc/tutorial/doc/embedding.html
@@ -0,0 +1,132 @@
+Embedding

Embedding basics

+By now you should know how to use Boost.Python to call your C++ code from Python. Sometimes, however, you need to do the reverse: call Python code from the C++ side. This requires you to 'embed' the Python interpreter in your C++ program. For this we need to use the Python C API.

+

+The basics of embedding with the API are pretty simple. First you need to make sure that your program links with pythonXY.lib, where X.Y is your Python version number. You'll typically find this library in the libs subdirectory of your Python installation. Now your program can follow these steps:

+
  1. #include "Python.h"
  2. Call Py_Initialize() to start the Python interpreter.
  3. Call other Python C API routines.
  4. Call Py_Finalize() to stop the Python interpreter and release its resources.

+Of course, there can be other C++ code between all of these steps.

+

+Now this looks pretty simple, but we'll see in the next section that step 3 can be tricky.

+

Manual reference counting

+Most things in Python are objects. Therefore it is only natural that many of the Python C API functions operate on Python objects. Because C/C++ can't work with Python objects directly, the API defines a PyObject structure and a lot of functions to operate on PyObject pointers.

+

+An important property of Python objects, and therefore of PyObjects, is that they are reference counted. This has major advantages compared to 'dumb' copying: it requires less memory and it avoids unnecessary copying overhead. However, there is a downside as well. Although the reference counting is transparent from the Python side, it is quite explicit in the C API. In other words, you must increase and decrease the reference counts of PyObjects manually using the Py_XINCREF and Py_XDECREF macros. This is cumbersome, and if you don't do it properly some objects might be released while you still need them, or not be released at all.

+

+I will briefly explain how to update the reference counts correctly, but I'll soon show a better way to do things.

+

+The API functions that return PyObject pointers are listed in the Python C API documentation as either returning a borrowed or a new reference. The difference is in reference ownership.

+

+When a new reference is returned, you own that reference. Therefore you don't need to worry about the object being deallocated while you still need it. You do, however, need to decrease the reference count when you are done with it; otherwise the object will never be deallocated. In other words, you'll have a resource leak.

+

+Here's a simple example:

+

TODO: need different examples because the current ones can be done very naturally with python::tuple which makes them rather unsuitable for explaining python::handle

+    // Create a new tuple with 3 elements
+    PyObject* my_tuple = PyTuple_New(3);
+    ... // Use my_tuple here
+    // We're done with the tuple
+    Py_XDECREF(my_tuple);
+
+

+When a borrowed reference is returned, you do not have ownership of the reference. So if you just want to discard the return value, there is nothing you have to do: you didn't own it anyway. If you want to use it, however, you'll first have to increase its reference count (to prevent the object's deletion). Then later on, when you are done with it, you'll need to decrease the reference count again. Here's another example:

+
+    // Retrieve the first item in the tuple
+    PyObject* first = PyTuple_GetItem(my_tuple, 0);
+    Py_XINCREF(first);
+    ... // Use first here
+    // We're done with the first tuple item
+    Py_XDECREF(first);
+
+

+While this certainly works, it's hardly elegant and it's easy to make mistakes, especially when there are multiple execution paths.
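To make the "multiple execution paths" problem concrete, here is a small sketch. It deliberately does not use the Python C API: ToyObject, toy_incref and toy_decref are invented stand-ins that only model the counting, so the example runs without a Python installation. The shape of the mistake is the same as with Py_XINCREF and Py_XDECREF.

```cpp
#include <cassert>

// Toy stand-in for a reference-counted object. This is NOT PyObject.
struct ToyObject
{
    int refcount;
};

// These play the roles of Py_XINCREF and Py_XDECREF in this model.
void toy_incref(ToyObject* o) { if (o) ++o->refcount; }
void toy_decref(ToyObject* o) { if (o) { assert(o->refcount > 0); --o->refcount; } }

// Uses a borrowed reference, but forgets the decrement on the early return.
bool leaky_use(ToyObject* borrowed_item, bool fail_early)
{
    toy_incref(borrowed_item);
    if (fail_early)
        return false;              // bug: the count stays one too high
    toy_decref(borrowed_item);
    return true;
}

// Same logic with the decrement repeated on every exit path.
bool careful_use(ToyObject* borrowed_item, bool fail_early)
{
    toy_incref(borrowed_item);
    if (fail_early)
    {
        toy_decref(borrowed_item);
        return false;
    }
    toy_decref(borrowed_item);
    return true;
}
```

Every additional return statement or thrown exception adds another place where the decrement has to be repeated by hand, which is exactly the kind of bookkeeping that is easy to get wrong.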

+

Boost.Python to the rescue

+Now we get to the good news. If you don't want to do all the error-prone reference counting yourself, you can let Boost.Python do all the work. First include <boost/python.hpp> instead of "Python.h" and link to boost_python.lib (or boost_python_debug.lib) instead of pythonXY.lib. Then all we really need to do is replace every PyObject* with handle<PyObject> and we can remove all the Py_XINCREFs and Py_XDECREFs! All the reference counting will be done automagically through the power of the Resource Acquisition Is Initialization idiom.

+

+We still need a way to differentiate between new and borrowed references though. Luckily, this is pretty straightforward using the borrowed function. Here is an example using handle that combines the functionality of the above two PyObject* examples. Notice how it is both shorter and cleaner:

+
+    // Create a new tuple with 3 elements
+    handle<PyObject> my_tuple( PyTuple_New(3) );
+    // Retrieve the first item in the tuple
+    handle<PyObject> first( borrowed(PyTuple_GetItem(my_tuple.get(), 0)) );
+    ... // Use first here
+
+

+Note that the handle member function get() returns the raw PyObject* that is held by the handle.
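To show roughly how the Resource Acquisition Is Initialization idiom takes over the counting, here is a toy sketch. It is not Boost.Python's actual handle implementation: ToyObject, ToyHandle, ToyBorrowed and toy_borrowed are names invented for this illustration, and the toy object only tracks a count (it is never really deallocated).

```cpp
#include <cassert>

// Toy stand-in for a reference-counted object. This is NOT PyObject.
struct ToyObject { int refcount; };

void toy_incref(ToyObject* o) { ++o->refcount; }
void toy_decref(ToyObject* o) { assert(o->refcount > 0); --o->refcount; }

// Tag wrapper marking a pointer as a borrowed reference.
struct ToyBorrowed { ToyObject* ptr; };
ToyBorrowed toy_borrowed(ToyObject* o) { return ToyBorrowed{o}; }

class ToyHandle
{
public:
    // New reference: take over the count the caller already owns.
    explicit ToyHandle(ToyObject* owned) : ptr_(owned) {}
    // Borrowed reference: add a count of our own.
    explicit ToyHandle(ToyBorrowed b) : ptr_(b.ptr) { toy_incref(ptr_); }
    // Copying a handle creates one more owner.
    ToyHandle(const ToyHandle& other) : ptr_(other.ptr_) { toy_incref(ptr_); }
    ToyHandle& operator=(const ToyHandle&) = delete;  // kept out for brevity
    // The destructor releases our count on every exit path.
    ~ToyHandle() { toy_decref(ptr_); }

    ToyObject* get() const { return ptr_; }

private:
    ToyObject* ptr_;
};

// Returns the count observed while a borrowed handle is alive. The handle's
// destructor runs when the function exits, after the value has been read.
int count_while_borrowed(ToyObject& obj)
{
    ToyHandle h(toy_borrowed(&obj));
    return obj.refcount;
}
```

Because the decrement lives in the destructor, early returns and exceptions no longer need any special care at the call site. That is the property the real handle provides for PyObject pointers.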

+

TODO: Explain PyRun_... basics, somewhere. Perhaps these functions can be used in the examples?

Boost.Python modules in an embedded program

+Now that we know how to call Python code from C++ and C++ code from Python, how about doing both at the same time? Sometimes you might want to call Python code from C++ and have that Python code call C++ code again. If you have built your Boost.Python module, then you can just use it in your embedded Python code as you would in a standalone Python program: no further changes necessary.

+

+However, you can also define the Boost.Python module in the same program that embeds the Python code which will be using the module. Then you won't have to build the module and place it in the proper directory, and this also prevents others from using it in their own Python code. (Unless they start taking your executable apart, that is. :-))

+

+Doing this is relatively straightforward. You just define your Boost.Python module as usual and use the basic embedding steps described above. However, before calling Py_Initialize you call PyImport_AppendInittab first. This function takes the name and initialization function of your module as parameters and adds the module to the interpreter's list of builtin modules. So when the Python interpreter comes across an import statement, it will find the module in its list of builtin modules instead of (unsuccessfully) searching for it in the Python directory.

+

+Your program will look something like this:

+
+    BOOST_PYTHON_MODULE(my_module)
+    {
+        ...
+    }
+    ...
+    PyImport_AppendInittab("my_module", initmy_module);
+    Py_Initialize();
+    ...
+
+

+There is one catch at the moment though. You must not call Py_Finalize. Boost.Python keeps some PyObject references alive in global data structures, and when those go out of scope after interpreter finalization, Python crashes. This will be fixed in the future.

+

Additional reading

+A more elaborate example showing these techniques is located at libs/python/test/embedding.cpp.

diff --git a/doc/tutorial/doc/embedding.txt b/doc/tutorial/doc/embedding.txt
new file mode 100644
index 00000000..12aea648
--- /dev/null
+++ b/doc/tutorial/doc/embedding.txt
@@ -0,0 +1,111 @@
+[doc Tutorial extension draft]
+
+[/ this probably needs to be merged into quickstart.txt ]
+
+[def :-) [$theme/smiley.gif]]
+[def Py_Initialize [@http://www.python.org/doc/current/api/initialization.html#l2h-652 Py_Initialize]]
+[def Py_Finalize [@http://www.python.org/doc/current/api/initialization.html#l2h-656 Py_Finalize]]
+[def PyRun_String [@http://www.python.org/doc/current/api/veryhigh.html#l2h-55 PyRun_String]]
+[def PyRun_File [@http://www.python.org/doc/current/api/veryhigh.html#l2h-56 PyRun_File]]
+[def Py_eval_input [@http://www.python.org/doc/current/api/veryhigh.html#l2h-58 Py_eval_input]]
+[def Py_file_input [@http://www.python.org/doc/current/api/veryhigh.html#l2h-59 Py_file_input]]
+[def Py_single_input [@http://www.python.org/doc/current/api/veryhigh.html#l2h-60 Py_single_input]]
+[def Py_XINCREF [@http://www.python.org/doc/current/api/countingRefs.html#l2h-65 Py_XINCREF]]
+[def Py_XDECREF [@http://www.python.org/doc/current/api/countingRefs.html#l2h-67 Py_XDECREF]]
+[def PyImport_AppendInittab [@http://www.python.org/doc/current/api/importing.html#l2h-137 PyImport_AppendInittab]]
+[def handle [@../../v2/handle.html handle]]
+
+[page Embedding]
+
+[h2 Embedding basics]
+
+By now you should know how to use Boost.Python to call your C++ code from Python. Sometimes, however, you need to do the reverse: call Python code from the C++ side. This requires you to 'embed' the Python interpreter in your C++ program. For this we need to use the [@http://www.python.org/doc/current/api/api.html Python C API].
+
+The basics of embedding with the API are pretty simple. First you need to make sure that your program links with [^pythonXY.lib], where X.Y is your Python version number. You'll typically find this library in the libs subdirectory of your Python installation. Now your program can follow these steps:
+
+# '''#include''' [^"Python.h"]
+
+# Call Py_Initialize() to start the Python interpreter.
+
+# Call other Python C API routines.
+
+# Call Py_Finalize() to stop the Python interpreter and release its resources.
+
+Of course, there can be other C++ code between all of these steps.
+
+Now this looks pretty simple, but we'll see in the next section that step 3 can be tricky.
+
+[h2 Manual reference counting]
+
+Most things in Python are objects. Therefore it is only natural that many of the Python C API functions operate on Python objects. Because C/C++ can't work with Python objects directly, the API defines a PyObject structure and a lot of functions to operate on PyObject pointers.
+
+An important property of Python objects, and therefore of PyObjects, is that they are reference counted. This has major advantages compared to 'dumb' copying: it requires less memory and it avoids unnecessary copying overhead. However, there is a downside as well. Although the reference counting is transparent from the Python side, it is quite explicit in the C API. In other words, you must increase and decrease the reference counts of PyObjects [*manually] using the Py_XINCREF and Py_XDECREF macros. This is cumbersome, and if you don't do it properly some objects might be released while you still need them, or not be released at all.
+
+I will briefly explain how to update the reference counts correctly, but I'll soon show a better way to do things.
+
+The API functions that return PyObject pointers are listed in the Python C API documentation as either returning a ['borrowed] or a ['new] reference. The difference is in ['reference ownership].
+
+When a ['new] reference is returned, you own that reference. Therefore you don't need to worry about the object being deallocated while you still need it. You do, however, need to decrease the reference count when you are done with it; otherwise the object will never be deallocated. In other words, you'll have a resource leak.
+
+Here's a simple example:
+
+[:[*TODO:] [^need different examples because the current ones can be done very naturally with python::tuple which makes them rather unsuitable for explaining python::handle ]]
+
+    // Create a new tuple with 3 elements
+    PyObject* my_tuple = PyTuple_New(3);
+    ... // Use my_tuple here
+    // We're done with the tuple
+    Py_XDECREF(my_tuple);
+
+When a ['borrowed] reference is returned, you do not have ownership of the reference. So if you just want to discard the return value, there is nothing you have to do: you didn't own it anyway. If you want to use it, however, you'll first have to increase its reference count (to prevent the object's deletion). Then later on, when you are done with it, you'll need to decrease the reference count again. Here's another example:
+
+    // Retrieve the first item in the tuple
+    PyObject* first = PyTuple_GetItem(my_tuple, 0);
+    Py_XINCREF(first);
+    ... // Use first here
+    // We're done with the first tuple item
+    Py_XDECREF(first);
+
+While this certainly works, it's hardly elegant and it's easy to make mistakes, especially when there are multiple execution paths.
+
+[h2 Boost.Python to the rescue]
+
+Now we get to the good news. If you don't want to do all the error-prone reference counting yourself, you can let Boost.Python do all the work. First include [^<boost/python.hpp>] instead of [^"Python.h"] and link to [^boost_python.lib] (or [^boost_python_debug.lib]) instead of [^pythonXY.lib]. Then all we really need to do is replace every PyObject* with handle<PyObject> and we can remove all the Py_XINCREFs and Py_XDECREFs! All the reference counting will be done automagically through the power of the [@http://sourceforge.net/docman/display_doc.php?docid=8673&group_id=9028 Resource Acquisition Is Initialization] idiom.
+
+We still need a way to differentiate between new and borrowed references though. Luckily, this is pretty straightforward using the [@../../v2/handle.html#borrowed-spec borrowed] function. Here is an example using handle that combines the functionality of the above two PyObject* examples. Notice how it is both shorter and cleaner:
+
+    // Create a new tuple with 3 elements
+    handle<PyObject> my_tuple( PyTuple_New(3) );
+    // Retrieve the first item in the tuple
+    handle<PyObject> first( borrowed(PyTuple_GetItem(my_tuple.get(), 0)) );
+    ... // Use first here
+
+Note that the handle member function get() returns the raw PyObject* that is held by the handle.
+
+[:[*TODO:] [^Explain PyRun_... basics, somewhere. Perhaps these functions can be used in the examples?]]
+
+[h2 Boost.Python modules in an embedded program]
+
+Now that we know how to call Python code from C++ and C++ code from Python, how about doing both at the same time? Sometimes you might want to call Python code from C++ and have that Python code call C++ code again. If you have built your Boost.Python module, then you can just use it in your embedded Python code as you would in a standalone Python program: no further changes necessary.
+
+However, you can also define the Boost.Python module in the same program that embeds the Python code which will be using the module. Then you won't have to build the module and place it in the proper directory, and this also prevents others from using it in their own Python code. (Unless they start taking your executable apart, that is. :-))
+
+Doing this is relatively straightforward. You just define your Boost.Python module as usual and use the basic embedding steps described above. However, before calling Py_Initialize you call PyImport_AppendInittab first. This function takes the name and initialization function of your module as parameters and adds the module to the interpreter's list of builtin modules. So when the Python interpreter comes across an import statement, it will find the module in its list of builtin modules instead of (unsuccessfully) searching for it in the Python directory.
+
+Your program will look something like this:
+
+    BOOST_PYTHON_MODULE(my_module)
+    {
+        ...
+    }
+    ...
+    PyImport_AppendInittab("my_module", initmy_module);
+    Py_Initialize();
+    ...
+
+There is one catch at the moment though. You must [*not] call Py_Finalize. Boost.Python keeps some PyObject references alive in global data structures, and when those go out of scope after interpreter finalization, Python crashes. This will be fixed in the future.
+
+[h2 Additional reading]
+
+A more elaborate example showing these techniques is located at [@../../../test/embedding.cpp libs/python/test/embedding.cpp].
+