Inline c

The most fully featured quasiquoter, and the reason that this project exists, is the c quasiquoter. The c quasiquoter is designed to be a way to seamlessly use the CPython API while preserving code locality and avoiding boilerplate.

When optimizing python, we often find that only a few functions are hotspots that need to be rewritten in c. Good practice says to start in python and then slowly port the slow functions into c one at a time. We don’t want to rewrite all of it, because then we lose the maintainability of python for a trivial gain. The c quasiquoter gives us even finer control over which parts of our program are in c by allowing us to weave sections of c into our python functions. We can even do things like rewrite a single loop of a function in c.

One of the main benefits of this approach is that we can keep the optimized c code right next to the python that it is supporting. This is a huge benefit for maintainability.

Namespace Management

The c quasiquoter allows us to manipulate the python namespace of the enclosing scope. For example:

>>> a = 1
>>> b = 'test'
>>> with $c:
...     printf("%ld\n%s\n",
...            PyLong_AsLong(a),
...            PyUnicode_AsUTF8(b));
1
test

Here we can see that the variables from the enclosing scope have been passed into our function. All python values will have the standard type of PyObject* and can be used like normal.

We can also change the namespace just like a normal context manager.

>>> a = 1
>>> with $c:
...     printf("%ld\n", PyLong_AsLong(a));
...     a = Py_None;
...     Py_INCREF(a);
1
>>> a is None
True

Here we can see that the enhanced with block can reassign the names in scope. This even works for the locals of a function.

Quoted Expressions

The c quasiquoter also allows for quoted expressions. Just like the enhanced with statement, the quoted expression can use the names from the enclosing scope. For example:

>>> [$c|PyLong_FromLong(2)|] + 2
4
>>> a = 2
>>> [$c|PyLong_FromLong(PyLong_AsLong(a) + 2)|]
4

Quoted expressions are built on statement expressions, a gnu extension to c in which a compound statement wrapped in parentheses can be used as an expression. These look like:

int a = ({
    int b = 1;  /* This is a new block, new declarations are allowed. */
    int c = 2;
    b + c;  /* The final expression is the result of the block. */
});

We need this because most quoted expressions that return to python must incref the return value before handing it back, which takes more than a single expression. For example:

>>> [$c|Py_INCREF(Py_None); Py_None|] is None
True

We need to remember to call Py_INCREF or we will get a segfault somewhere in the garbage collector at interpreter shutdown.

Note

The last semicolon is optional in a c quoted expression.

Type Conversion

Because one intended use case of the c quasiquoter is optimization, there is no implicit object conversion. All names passed from the outside scope will have type PyObject*. This matches the normal CPython API conventions. There are many type specific conversion functions, for example: PyLong_AsLong or PyUnicode_AsUTF8.

This is also true for the quoted expression return value. A quasiquotes.c.CompilationError will be raised if the final expression does not have type PyObject*.

Reference Counting

CPython uses a reference counting garbage collection strategy. This means that every PyObject has an ob_refcnt field (of type Py_ssize_t). This field counts the references that are held to the object. Whenever an object is added to some container, the container will Py_INCREF the object, increasing the reference count by 1. When the object is removed from the container, the container will Py_DECREF the object, reducing the reference count by 1. When an object with exactly 1 reference is Py_DECREFed, it is destroyed immediately by calling ((PyTypeObject*) Py_TYPE(ob))->tp_dealloc(ob), which deallocates the object.
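The counting described above can be observed from plain python, without the c quasiquoter. The following is a minimal sketch that assumes CPython. sys.getrefcount and weakref come from the standard library; Probe, observe_container and observe_dealloc are names invented for this illustration. Only differences between counts are compared, because the absolute number reported by sys.getrefcount includes temporary references held by the call itself.

```python
import sys
import weakref


class Probe:
    """Hypothetical class used only so the instance can be weakly referenced."""


def observe_container():
    obj = Probe()
    before = sys.getrefcount(obj)
    container = [obj]              # the list Py_INCREFs the object
    inside = sys.getrefcount(obj)
    container.pop()                # removing it Py_DECREFs the object again
    after = sys.getrefcount(obj)
    # Report changes relative to the starting count.
    return inside - before, after - before


def observe_dealloc():
    obj = Probe()
    ref = weakref.ref(obj)         # a weak reference does not add to ob_refcnt
    del obj                        # drops the only strong reference
    # On CPython the object is deallocated right away, so the weakref is dead.
    return ref() is None
```

On CPython, observe_container() returns (1, 0) and observe_dealloc() returns True. Inside a quoted c block the same bookkeeping is done by hand with Py_INCREF and Py_DECREF.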

CPython documentation will also refer to the concept of borrowed references. A borrowed reference is a reference to an object that the current scope does not own. This means that the current scope is not responsible for calling Py_DECREF on this object. For example, when arguments are passed to a function, they are passed as borrowed references; if one wishes to hold onto the object, they must Py_INCREF it to take ownership. Some CPython API functions, such as PyList_GetItem, return borrowed references.

Similar to the idea of a borrowed reference is the idea of stealing a reference. This means that a function will not Py_INCREF the object but will Py_DECREF it when it releases ownership. PyTuple_SetItem, for example, steals a reference to the item it is given. It is the job of the caller to ensure that they want to release ownership to the function.

quasiquotes does not help the programmer with reference counting. It is still the user’s responsibility to manage the lifetimes of their objects.

Exceptions

When a function or quoted block raises an exception, the user should call PyErr_SetString, PyErr_Format, or one of the other functions used for setting the exception state. These will mark that a failure has occurred so that the interpreter knows which type of failure happened. This is very similar to the raise keyword in python.

When an exception has been set, the function should return NULL to show that an exception has occurred. After calling most CPython API functions, the user should verify that the return is not NULL. Often the user should bubble the return of NULL up, making sure to Py_DECREF all of the values they had temporary ownership of.
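To see how setting the exception state relates to raise, the same CPython API function can be called from plain python through ctypes. This is only a sketch and does not use quasiquotes. It assumes CPython, where ctypes.pythonapi checks the error indicator after each call and raises whatever exception was set. fail_with and demo are hypothetical names.

```python
import ctypes

# PyErr_SetString(PyObject *type, const char *message) from the CPython API.
_set_string = ctypes.pythonapi.PyErr_SetString
_set_string.argtypes = [ctypes.py_object, ctypes.c_char_p]
_set_string.restype = None


def fail_with(exc_type, message):
    """Hypothetical helper: set the exception state the way c code would."""
    # ctypes.pythonapi notices the pending exception and raises it for us.
    _set_string(exc_type, message.encode("utf-8"))


def demo():
    try:
        fail_with(ValueError, "bad input")
    except ValueError as exc:
        return str(exc)
    return None
```

Here demo() returns 'bad input'. In a quoted c block there is no such automatic check: after calling PyErr_SetString, the code itself has to release anything it owns and return NULL.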

Compilation Caching

Whenever a quoted statement or expression is compiled, it will create a shared object next to the python source file. The name of the shared object will start with _qq_<kind>, where kind can be either stmt or expr. This marks the type of quasiquote that was used. Then it will have the name of the module it is in. After that is an md5 hash of the body of the quoted section. Finally, there is the ABI compat string, like cpython-34m, which says that this was CPython major version 3, minor version 4, compiled with PyMalloc enabled.
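As a rough illustration of how such a name is put together, here is a sketch in plain python. It is not the library's implementation: cache_name is a hypothetical function, and the separators and the .so suffix are assumptions. Only the order of the parts (kind, module name, md5 of the body, ABI string) is taken from the description above.

```python
import hashlib
import sysconfig


def cache_name(kind, module, body, abi=None):
    """Hypothetical sketch of the naming scheme; separators are assumed."""
    if kind not in ("stmt", "expr"):
        raise ValueError("kind must be 'stmt' or 'expr'")
    if abi is None:
        # For example 'cpython-34m'; can be None on platforms without SOABI.
        abi = sysconfig.get_config_var("SOABI")
    # Hash the quoted c source so that each distinct body gets its own file.
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    return "_qq_{}_{}_{}.{}.so".format(kind, module, digest, abi)
```

Because the hash covers the body of the quoted section, editing the c code produces a different file name, and the shared object compiled from the old body is left behind.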

The quasiquoter can also be configured to cache the generated c source code or to not cache the shared objects with the keep_c and keep_so keyword arguments to the c quasiquoter.

Every compiled chunk will be cached in memory after the quasiquote has been executed once.

Every so often you will want to clean up stale compiled shared objects. This can be done with the quasiquotes.c.c.cleanup() method, or by executing python -m quasiquotes.c from the command line. Both of these accept two arguments, path and recurse, defaulting to . and True respectively. These set where the search for cached c sources and shared objects should begin and whether the search should recurse through subdirectories.
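To make the meaning of path and recurse concrete, the following sketch lists the files such a search would visit. It is not the library's cleanup code: find_cached is a hypothetical helper, it only lists files rather than deleting them, and matching on the _qq_ prefix alone is an assumption based on the naming scheme described earlier.

```python
from pathlib import Path


def find_cached(path=".", recurse=True):
    """Hypothetical helper: list files whose names start with '_qq_'."""
    root = Path(path)
    pattern = "_qq_*"
    # rglob descends into subdirectories; glob looks only at the top level.
    matches = root.rglob(pattern) if recurse else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```

Calling find_cached("src", recurse=False) would look only at the top level of a directory named src, while the defaults walk the current directory and everything below it.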

Compilation Options

The c quasiquoter accepts a keyword argument, extra_compile_args, which should be a sequence of strings to pass to gcc. This can be used to add include directories or link against other libraries.