Monday, June 30, 2008

Python C/API tutorial

At a friend's request, I wrote a small Python C/API tutorial. I don't really recommend using the Python C/API directly, without the help of tools like boost::python, swig, pyrex, ..., unless your code is really performance sensitive or very small. Still, it's good to know what it looks like, for a better understanding of how Python and C interact.

The reader is expected to know
- Python scripting
- Basic C++ programming
- How CPython manages object lifetime with reference counting

This tutorial only covers embedding. Extending (writing a Python module in C) also uses the Python C/API, but it's much more tedious and I don't even want to think about it without a tool. Threading isn't covered either: these are simple example programs that are not thread safe. To make them thread safe, you'll need to read the Python C/API reference manual closely (global interpreter lock etc.).

I wrote this doc on Linux, but everything except compilation is platform independent.

[main1.cpp]



#include <Python.h>

int main()
{
    Py_Initialize();
    PyRun_SimpleString("print 'Hello Python C/API'");
    Py_Finalize();
    return 0;
}


$ gcc main1.cpp -I/usr/include/python2.5 -lpython2.5 -lstdc++
$ ./a.out
Hello Python C/API

This is the first Python C/API program. You can find explanations of it all over the net, so I'll skip that (and it's fully self-explanatory anyway). Just a few notes:

- Don't include Python.h as #include <python2.5/Python.h>, even if you feel it's better.

- When you use the Python C/API, don't define names that start with Py or _Py, because those prefixes are reserved for the Python C/API.

- You will probably want to use the -Wall and -O3 options when you compile.

- For Maya guys: If you are using the Python C/API in a Maya plug-in, it's better to compile it against Maya's Python, with something like
$ gcc main1.cpp -I/usr/autodesk/maya2008/include/python2.5 -L/usr/autodesk/maya2008/lib -lpython2.5 -lstdc++
(I don't have maya2008, so I'm just guessing that the directory structure looks like this.)

- For Maya guys: You don't need Py_Initialize() because Python is already running. You don't need Py_Finalize() either, because you don't want to terminate Maya's Python interpreter.

(Maya is commercial software used for computer graphics.)

[main2.cpp]

#include <Python.h>

int main()
{
    Py_Initialize();

    PyObject* po_main = PyImport_AddModule("__main__");
    PyObject* po_int10000 = PyInt_FromLong(10000);
    PyObject_SetAttrString(po_main, "tamtam", po_int10000);
    Py_XDECREF(po_int10000);
    PyRun_SimpleString("print tamtam");

    Py_Finalize();
    return 0;
}


$ ./a.out
10000

It gets a PyObject* to the main module, creates a Python int object, sets it as the attribute "tamtam" of the main module, and prints it.

- You need to release your reference to po_int10000 (decrement its reference count) after using it, with the Py_XDECREF() macro (or Py_DECREF() if you are sure po_int10000 is non-null).

- PyObject_SetAttrString increments po_int10000's reference count, so after Py_XDECREF the count still hasn't reached zero. That's why the subsequent PyRun_SimpleString("print tamtam") can still print the value.

- Use PyImport_AddModule only when you know Python has already imported the module. (You can call it for a module that has not been imported, but then nothing is imported and it returns a new, empty module.) Otherwise use PyImport_ImportModule(). Unlike PyImport_AddModule, it returns a reference that you need to release after using it. You can usually tell whether you need to release an object by looking at the Python C/API reference manual.

- Python C/API functions that return PyObject* usually return NULL on failure, and you need to check for it. I omitted the checks for simplicity in this tutorial, but you should at least take a look at the Exception Handling chapter of the Python C/API manual.

- Many Python C/API functions whose names end with "String" have a sibling. For example, PyObject_SetAttrString() has a sibling PyObject_SetAttr(), which takes a PyObject* for the attribute name instead of a const char*. Using it looks like this:

PyObject* po_int10000 = PyInt_FromLong(10000);
PyObject* po_tamtam = PyString_FromString("tamtam");
PyObject_SetAttr(po_main, po_tamtam, po_int10000);
Py_DECREF(po_tamtam);    // both objects are new references that we own
Py_DECREF(po_int10000);

- More about Python references later (see [main5.cpp]).

- The "po_" prefix is just my naming convention, indicating that the variable is a PyObject*. Follow it or ignore it, as you like.

[main3.cpp]

#include <Python.h>
#include <iostream>

int main()
{
    Py_Initialize();

    PyObject* po_main = PyImport_AddModule("__main__");
    PyObject* po_int10000 = PyInt_FromLong(10000);
    std::cout << po_int10000->ob_refcnt << std::endl;
    PyObject_SetAttrString(po_main, "tamtam", po_int10000);
    std::cout << po_int10000->ob_refcnt << std::endl;
    Py_XDECREF(po_int10000);
    PyRun_SimpleString("print tamtam");

    Py_Finalize();
    return 0;
}


$ ./a.out
1
2
10000

It demonstrates how to print an object's reference count (here, an int object's). Usually it's enough to look at the manual, but sometimes you will want to confirm it.

- Include Python.h before any standard library headers.

- When you change 10000 to 10, the result changes:

$ ./a.out
12
13
10

That's because Python is already using the 'int 10' object somewhere else (possibly for internal purposes). Also, when the Python interpreter starts, it creates a number of frequently used immutable objects in advance so that they don't have to be recreated each time they are requested.

[main4.cpp]

#include <Python.h>

int main()
{
    Py_Initialize();

    PyObject* po_dict = PyDict_New();
    PyObject* po_bar = PyString_FromString("bar");
    PyDict_SetItemString(po_dict, "foo", po_bar);
    Py_DECREF(po_bar);

    PyObject* po_main = PyImport_AddModule("__main__");
    PyObject_SetAttrString(po_main, "tamtam", po_dict);
    Py_DECREF(po_dict);

    PyRun_SimpleString("print tamtam");

    Py_Finalize();
    return 0;
}


$ ./a.out
{'foo': 'bar'}

A slightly more practical example that uses a Python dictionary.

- Refer to the Python C/API reference manual to see what functions exist for dictionary objects.

- A Python dictionary is a Python object (of course), so the functions in the Object Protocol can be used for a dictionary too. Also, in the manual the documentation for dictionaries is under "Mapping Objects", which means you can use the functions in the "Mapping Protocol" section on a dictionary object.

- As you may have guessed, PyDict_SetItemString() has a sibling PyDict_SetItem().


[main5.cpp]

#include <Python.h>
#include <iostream>
#include <string>

int main()
{
    Py_Initialize();

    PyRun_SimpleString("tamtam = {'foo':'bar'}");
    PyObject* po_main = PyImport_AddModule("__main__");
    PyObject* po_dict = PyObject_GetAttrString(po_main, "tamtam");

    // PyDict_GetItemString returns a borrowed reference,
    // so po_value must not be passed to Py_DECREF.
    PyObject* po_value = PyDict_GetItemString(po_dict, "foo");
    if(po_value != NULL && PyObject_IsInstance(po_value, (PyObject*)&PyString_Type))
    {
        std::string valstr = PyString_AsString(po_value);
        std::cout << valstr << std::endl;
    }

    Py_DECREF(po_dict);

    Py_Finalize();
    return 0;
}


$ ./a.out
bar

How to get a value from a dictionary.

- If you look at the documentation for PyObject_GetAttrString(), you'll see it says

Return value: New reference.

It means you are responsible for releasing it (Py_DECREF) after using it. Sometimes you'll see

Return value: Borrowed reference.

It means you must not release it after using it. The PyImport_AddModule() documentation says so, which is why the code has no Py_DECREF(po_main). PyDict_GetItemString() also returns a borrowed reference, so po_value must not be passed to Py_DECREF either.

Sometimes you will come across the term "steal a reference". It means that when you pass an object whose reference you own to such a Python C/API function, the function takes over that reference, so you no longer have to release it.

- PyObject_IsInstance(po_value, (PyObject*)&PyString_Type) This is how to type check an object. PyString_Type is also a Python object (everything is a Python object in Python), so you can cast it, but I will not go into details. You can find type objects like PyString_Type in the manual.


[main6.cpp]

#include <Python.h>
#include <iostream>
#include <string>

int main()
{
    Py_Initialize();

    PyRun_SimpleString(
        "def myfunc(a, b, c):\n"
        "    print a\n"
        "    return b + c\n"
    );

    PyObject* po_main = PyImport_AddModule("__main__");
    PyObject* po_func = PyObject_GetAttrString(po_main, "myfunc");

    PyObject* po_int = PyInt_FromLong(10000);
    PyObject* po_foo = PyString_FromString("foo");
    PyObject* po_bar = PyString_FromString("bar");

    PyObject* po_result = PyObject_CallFunctionObjArgs(po_func, po_int, po_foo, po_bar, NULL);
    std::string resultstr = PyString_AsString(po_result);
    std::cout << resultstr << std::endl;

    Py_DECREF(po_result);
    Py_DECREF(po_bar);
    Py_DECREF(po_foo);
    Py_DECREF(po_int);
    Py_DECREF(po_func);

    Py_Finalize();
    return 0;
}


$ ./a.out
10000
foobar

Function call example.

- The Object Protocol has a set of API functions for calling functions and methods.
- These functions return a PyObject* that represents the return value of the Python function. If the call fails (either the call itself fails, or the function raises an exception that nobody catches), they return NULL. Note that they don't return NULL when the function returns None; in that case you get a PyObject* that points to the None object.



To be continued in the next post, which shows how you can write these examples using boost::python.

11 comments:

M2ccc said...

Nice intro tutorial, thanks!

hohehohe2 [at] gmail.com said...

That's all right!

Shanti Pothapragada said...

Thank you! This helped me. I've now posted a simple CPython/C extension example, with notes on how to compile it appropriately, at http://rgbdreamer.blogspot.com/2011/04/python-c-extension-example.html

hohehohe2 [at] gmail.com said...

Shanti,
Thanks for sharing the example on your blog!

alex said...

hello, can you help me with my app which uses embedded python...it crashes..can i tell you details?

Shanti Pothapragada said...

I'll take a look if it's small. rgbDreamer@gmail.com .

Arash Ghasemi said...

Finally I found that the reason that my code doesn't compile is that I forgot to include "-lpython -lstdc++" in the command line. This should be documented in Python official document. It just killed my three hours to find the bug. I really appreciate your post.
Arash

Ravi said...

Very useful tutorial....

hohehohe2 [at] gmail.com said...

Cheers ;)

srinya said...

Hi,
This tutorial really helped me a lot;
Can you please share the PyObj library methods to invoke python class from C++ file. I want to invoke python class which has __init() as a subtype and with super class initialization