The Undocumented SWIG

Building High Performance Integrated Python Extensions

If you have never ventured into the dark underworld of the Python C‑Extension API, you may believe that it is as fluid and rewarding as the rest of the Python ecosystem. I regret to inform you that this is not the case. Line 13 of The Zen of Python says:

There should be one – and preferably only one – obvious way to do it.

The C‑Extension API is an excellent example of what happens when one completely ignores that advice. There are two (incompatible) ways to export a module, a half-dozen documented ways to parse an argument list, and no fewer than nine options for calling a method. The documentation for this mess is excellent by C library standards, but falls woefully short of the gold standard set by the rest of the Python docs.

Thankfully there is a better way: the excellent Simplified Wrapper and Interface Generator, better known as SWIG. In this three-part series we’ll take a crash course in typical SWIG usage, discuss some advanced features like typemaps, templates, and ownership semantics, and then do a deep dive into using the SWIG runtime header to allow for tight, seamless integration of C/C++ code written specifically to accelerate Python modules.

Introducing SWIG

SWIG is the 8th Wonder of the Software World: it takes an incredibly complicated job and makes it a transparent part of your build process. SWIG cleanly integrates C and C++ routines into any of a dozen target languages using their native ABIs and foreign function interfaces. For many real-world use cases, not just trivial example code, SWIG can do this out of the box with barely any configuration.

Unlike the C‑Extension API, SWIG has top-notch documentation full of example code and extensive reference material. This post is not a substitute for that documentation; it’s here to get the reader up to speed quickly with the bare minimum required to follow along with the other posts in the series. With that said, let’s begin our journey.

Figure 1 contains two files. The first is a simple C++ header containing a POD struct. All C‑esque code for these examples will be C++, but the exact same principles hold when working with pure C. The second file is called the interface file, and it’s how we’re going to instruct SWIG to build all the necessary code to interact with Python.
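For readers following along without the figure at hand, a header in the spirit of Agent.hpp might look like the sketch below. Only the file name and the AgentUpdate struct name come from this post; the field names and types are invented for illustration and are not the contents of the companion repository. They were chosen so that the struct touches both fixed-width integers and a C++ string, the two kinds of types the interface file later pulls in support for.

```cpp
// Agent.hpp (hypothetical sketch: the field names below are assumptions)
#ifndef AGENT_HPP
#define AGENT_HPP

#include <cstdint>
#include <string>

// A simple struct of public data members with no methods.
struct AgentUpdate {
    std::uint32_t id;        // fixed-width integer type
    std::int16_t  health;    // another fixed-width integer type
    std::string   codename;  // C++ string
};

#endif // AGENT_HPP
```

Nothing in this header mentions Python or SWIG. That is the point: the header stays ordinary C++, and everything wrapper-related lives in the interface file.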

All code examples can also be found in this companion repository, along with build files.

Before we explore the interface file further, try building these files to see what happens. For Python the SWIG command to use is:

swig -c++ -python -py3 Agent.i

The switches here do what you expect, configuring SWIG to accept C++ as input (instead of C) and produce a Python extension (specifically Python 3) as output. The extension consists of two files, CAgent.py and Agent_wrap.cxx.

If you start Python and try to import CAgent right now, you’ll get an import error for a module called _CAgent. CAgent.py is a proxy for the actual extension, Agent_wrap.cxx, which we still need to build. How you choose to build this is up to you and your workflow. If you’re using the CMakeLists.txt included with the companion repo, the following command will build the extension:

cmake . && cmake --build .

Now open a Python REPL in the folder where you built the extension and import CAgent. You can create an AgentUpdate object and play with it, just like a native Python class. Do AgentUpdate objects act the way you expect them to? What are the differences from a normal Python object? Hint: Check out the __dict__.

Interrogating the Interface

Now we’ll explore that interface file in more depth. SWIG interface files are typically quite trivial, and our interface file is barely going to change at all in this entire series, but that doesn’t mean they’re not powerful. Rather, SWIG itself is so powerful that we rarely need to lean on the many capabilities of interface files.

Starting with the first line:

%module CAgent

The module directive gives the resulting Python module its name. I typically prefix SWIG-generated modules with “C” to make them easy to tell apart at a glance and easy to add to .gitignore.

%{
#include "Agent.hpp"
%}

All code between %{ and %} directives is included literally in the generated wrapper. This is typically used to include headers necessary to build the wrapper, which we do here.

%include <stdint.i>
%include <std_string.i>

The %include directive in SWIG works the same way #include does in C/C++: the preprocessor places a copy of the included file into the unit. Here we’re including standard SWIG typemaps for interacting with C++ strings and the standard integer types.

This raises the awkward question of “What is a typemap?” For now, I’m going to quote SWIG’s documentation:

Let’s start with a short disclaimer that “typemaps” are an advanced customization feature that provide direct access to SWIG’s low-level code generator. Not only that, they are an integral part of the SWIG C++ type system (a non-trivial topic of its own).

Suffice to say the concept of typemaps is outside the scope of this crash course. We need these two includes because they allow us to transparently interact with standard integers and C++ strings, but that’s as much as this post is going to explore them.

%include "Agent.hpp"

The final %include takes all the declarations from Agent.hpp and places them in our interface file. SWIG parses these declarations and generates wrapper code based on them.

As mentioned earlier, there are far more powerful directives available than the ones explored here. Additionally, SWIG has a library of support files that build yet more advanced functionality on top of those directives. Rather than trying to learn all of SWIG in one fell swoop, it’s best to just learn on the go.

Getting Classy

Normally this is the part of the tutorial where we add a second layer of complexity to the material introduced in the first couple sections. But thanks to SWIG, there is no additional complexity. Classes, methods, and functions all work identically to the basic POD we’re already familiar with.

Figure 2 creates a proper class with a method; it also adds an implementation file, Agent.cpp, as a matter of good C++ practice but not necessity. SWIG only needs to see declarations, not definitions, so it doesn’t care about this file. The result builds and acts the way you expect it to without any changes to the interface file.
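To make the declaration/definition split concrete, here is a minimal sketch of what such a class could look like. The SecretAgent name comes from this post; the constructor, the introduce() method, and the codename_ member are hypothetical stand-ins, not the code from the companion repository. Both halves are shown in one listing for brevity, with comments marking where each would live.

```cpp
#include <string>
#include <utility>

// --- Agent.hpp (hypothetical): the declarations SWIG would parse ---
class SecretAgent {
public:
    explicit SecretAgent(std::string codename);

    // Returns a short greeting built from the stored codename.
    std::string introduce() const;

private:
    std::string codename_;
};

// --- Agent.cpp (hypothetical): definitions, compiled but never shown to SWIG ---
SecretAgent::SecretAgent(std::string codename)
    : codename_(std::move(codename)) {}

std::string SecretAgent::introduce() const {
    return "The name is " + codename_ + ".";
}
```

SWIG never reads the definitions, but the build still has to compile and link them alongside the generated wrapper, or the extension will fail with unresolved symbols.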

Again I encourage you to play with the resulting CAgent module, or even to modify the SecretAgent’s C++ source code. For code that only wants to call into C/C++, and does not need to call back into Python, this is as complicated as SWIG gets for most use cases.

As a fun exercise, Figure 3 uses these same techniques to add a combat function to the SecretAgent class, with an enum return type.

Of note, the combat_result member enum is translated to a set of member variables for the Python class, which are mapped to globals defined in the underlying shared library. This means they’re accessed almost identically to the enum members in C++.
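As a rough illustration, a member enum and a function returning it might be declared as in the sketch below. The names SecretAgent, combat, and combat_result come from this post; the enumerator names, the skill_ member, and the comparison logic are assumptions made up for the example.

```cpp
class SecretAgent {
public:
    // Member enum used as a return type (enumerator names are hypothetical).
    enum combat_result { DEFEAT, DRAW, VICTORY };

    explicit SecretAgent(int skill) : skill_(skill) {}

    // Decides the outcome from this agent's point of view by comparing skill.
    combat_result combat(const SecretAgent& other) const {
        if (skill_ > other.skill_) return VICTORY;
        if (skill_ < other.skill_) return DEFEAT;
        return DRAW;
    }

private:
    int skill_;  // hypothetical stat used only to pick a winner
};
```

In C++ a result is spelled SecretAgent::VICTORY. If the translation works as described above, the Python side should read almost the same, with a dot in place of the double colon.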

What’s Next

None of the techniques discussed in this post are Python-specific; they can be applied to any of the target languages that SWIG supports. In the next part we’ll talk a little more about typemaps and using them to interact with more complex types than integers and strings. This involves calling Python.h-specific functions, and will begin our descent into the less traveled corners of SWIG usage.

The images used in this post are public domain, made available thanks to the invaluable work of Liam Quin at fromoldbooks.org.