You can (kind of) run Django in subinterpreters

In November of 2023, CPython core developer Anthony Shaw wrote about sub-interpreters in his post Running Python Parallel Applications with Sub Interpreters. In it, he previewed the ongoing development of subinterpreters and highlighted his attempt to run web applications in a subinterpreter.

Anthony concluded his post having adapted hypercorn to run with subinterpreters and managing to run FastAPI and Flask applications. Notably, he couldn't run Django due to the standard library's own datetime module.

It has now been two and a half years and a lot has changed. Both concurrent.interpreters and free-threading have seen official releases, with subinterpreters especially seeing a lot of changes to the Python interface.

Continuing the Work

So I decided to give it a try myself. Since enough had changed since Anthony's implementation, I made most of the code changes from scratch.

This turns out to be surprisingly simple, as Python 3.14 gives us a much better Python interface for interpreters.
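
As a taste of the new interface, here's a minimal sketch: creating an interpreter, running code in it, and tearing it down are each a single call.

from concurrent import interpreters

# Minimal sketch of the Python 3.14 interpreters API (PEP 734).
interp = interpreters.create()
interp.exec("print('hello from a subinterpreter')")
interp.close()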

Sharing Sockets

In the original multiprocessing version, hypercorn shares the same set of sockets between all the worker processes. The processes all consume from this set of sockets, thereby distributing requests.

We need to do the same thing here. Out of the box, sockets can be shared between a parent process and its sub-processes, but that's not the case for sub-interpreters.

In my previous blog post I came to realise that we can rely on pickle to do a lot of the message passing, so long as we understand what information is actually being serialised. We can do the same here; in fact, Python lets you customise what state is passed to the pickle (de)serialiser via __getstate__ and __setstate__:

import socket


class Sockets:
    secure_sockets: list[socket.socket]
    insecure_sockets: list[socket.socket]
    quic_sockets: list[socket.socket]

    def __getstate__(self):
        """Prepares the socket for transport."""
        return {
            "secure_sockets": [
                {
                    "fd": sock.fileno(),
                    "family": sock.family,
                    "type": sock.type,
                    "proto": sock.proto,
                }
                for sock in self.secure_sockets
            ],
            "insecure_sockets": [
                {
                    "fd": sock.fileno(),
                    "family": sock.family,
                    "type": sock.type,
                    "proto": sock.proto,
                }
                for sock in self.insecure_sockets
            ],
            "quic_sockets": [
                {
                    "fd": sock.fileno(),
                    "family": sock.family,
                    "type": sock.type,
                    "proto": sock.proto,
                }
                for sock in self.quic_sockets
            ],
        }

    def __setstate__(self, state):
        """Reconstructs the socket in the new interpreter."""
        for key, socks in state.items():
            setattr(
                self,
                key,
                [
                    socket.fromfd(raw["fd"], raw["family"], raw["type"], raw["proto"])
                    for raw in socks
                ],
            )

Simply put, this extracts enough metadata from each socket (e.g. the file descriptor number) to pass to the sub-interpreter. The sub-interpreter then calls socket.fromfd to rebuild the socket on its side.

I believe this is the same way Anthony shares sockets, the only difference being that we leverage pickle here to minimise code changes.
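
As a quick sanity check, here's a sketch of the round trip with a hypothetical listening socket; it works because socket.fromfd duplicates the underlying file descriptor within the same process:

import pickle
import socket

sockets = Sockets()
sockets.secure_sockets = []
sockets.insecure_sockets = [socket.create_server(("127.0.0.1", 8000))]
sockets.quic_sockets = []

# Pickling captures the fds and metadata; unpickling rebuilds the sockets.
restored = pickle.loads(pickle.dumps(sockets))
assert restored.insecure_sockets[0].getsockname() == ("127.0.0.1", 8000)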

Calling the Sub-Interpreter

When I first started playing around with interpreters, I assumed we had to use the exec method, with queues for passing data.

But over an embarrassingly lengthy period of time, I finally figured out that the higher-level interface works just as well, or maybe even better. For this I used Interpreter.call:

from collections.abc import Callable
from concurrent import interpreters
from threading import Thread
from typing import Any


def run_interp[**P](fn: Callable[P, Any], *args: P.args, **kwargs: P.kwargs) -> None:
    interp = interpreters.create()
    try:
        interp.call(fn, *args, **kwargs)
    finally:
        interp.close()


thread = Thread(
    target=run_interp, args=(worker_func, config, sockets, shutdown_indicator)
)
thread.start()
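
As an aside, if my reading of the 3.14 API is right, Interpreter.call_in_thread wraps up this same call-in-a-thread pattern; a sketch:

interp = interpreters.create()
# call_in_thread runs the callable in the interpreter on a new thread
# and returns the started threading.Thread.
thread = interp.call_in_thread(worker_func, config, sockets, shutdown_indicator)
thread.join()
interp.close()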

Stopping the Interpreters

Finally, we need to consider how the main interpreter can tell the worker interpreters to shut down. Let's first look at how this is done with processes.

The main process passes an Event as an indicator for the subprocess to shut down.

The subprocess periodically checks to see if the event is set:

from collections.abc import Awaitable, Callable
from multiprocessing.synchronize import Event as EventType
from typing import Any


async def check_multiprocess_shutdown_event(
    shutdown_event: EventType, sleep: Callable[[float], Awaitable[Any]]
) -> None:
    while True:
        if shutdown_event.is_set():
            return
        await sleep(0.1)

It's important to note that whilst an event is often waited on to synchronise timing between processes, hypercorn just uses it to conveniently store some global state. The sub-process effectively polls this state for changes.

There is no equivalent to Event for interpreters, but there is a very simple and powerful way to share global mutable state: directly sharing a memoryview.

Let's start by creating a memoryview wrapping an array of one byte. We'll use the value to indicate whether a shutdown has been requested: 0 (the default) means no shutdown and 1 means shutdown.

from array import array

shutdown_indicator = memoryview(array("B", [0]))


async def check_multiprocess_shutdown_event(
    shutdown_indicator: memoryview[int], sleep: Callable[[float], Awaitable[Any]]
) -> None:
    while True:
        if shutdown_indicator[0]:
            return
        await sleep(0.1)

This approach leverages the powerful shared memory mechanisms of interpreters without changing the code too much.
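
On the main interpreter's side, requesting a shutdown is then just a one-byte write. Since memoryview is one of the natively shareable types, the underlying buffer is genuinely shared rather than pickled and copied; a sketch:

# Every worker polling the shared buffer will see this change.
shutdown_indicator[0] = 1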

Testing this out

All the code can be found at hypercorn-subinterpreters; I plan on tidying it up more before publishing to PyPI.

uv add "git+https://github.com/Jamie-Chang/hypercorn-subinterpreters"

You can test this out with a simple ASGI application, for example:

async def app(scope, receive, send):
    if scope["type"] != "http":
        raise Exception("Only the HTTP protocol is supported")

    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", b"5"),
        ],
    })
    await send({
        "type": "http.response.body",
        "body": b"hello",
    })

and then running this with:

hypercorn main:app --workers 10
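
Assuming hypercorn's default bind of 127.0.0.1:8000, a quick check from another shell should print hello:

curl http://127.0.0.1:8000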

This worked very well for me, and notably startup was faster than with the process version. I'll leave the task of benchmarking to my future self or other parties, as performance is not my main focus.

The Compatibility Landmine

It's easy to run a dependency-free web app like the one above; you can even run it on the web.

The bigger question is how well this runs a typical Python web application, which would usually use a framework and connect to a database.

Luckily, I recently went through the trouble of writing the same app using Flask, FastAPI and Django to run some benchmarks. To better simulate a real workload, the apps connect to Postgres using either a pooled connection or a single connection.

So here's what worked and what didn't.

FastAPI

FastAPI is hands down the most popular web framework, and is unfortunately the biggest disappointment.

ImportError: module pydantic_core._pydantic_core does not support loading in subinterpreters

The trouble lies with Pydantic: Pydantic v2 (the current version) is written in Rust using PyO3, and PyO3 does not currently support subinterpreters. I'm pretty sure this also applies to other Rust-based libraries like Polars.

Django

Django is a much better story. Django itself is mostly pure Python, so it works well on its own. I wasn't expecting it to work with Postgres, as the Postgres driver psycopg is a native library. However, I was pleasantly surprised to find it working.

There are unfortunately still caveats here: psycopg won't work when used with a connection pool:

RuntimeError: daemon threads are disabled in this (sub)interpreter

In my prior benchmarking post, I discovered that psycopg's pool actively manages connections using threads. These are daemon threads, meaning they exit only when the process exits. A sub-interpreter's lifetime is shorter than the process's, so it cannot create and own daemon threads.
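
For illustration, this is roughly the kind of configuration that trips the error above; a sketch using Django 5.1+'s pool option for psycopg, with a hypothetical database name:

# settings.py (sketch)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",  # hypothetical database
        # "pool": True switches to psycopg_pool, whose daemon worker
        # threads are disallowed in a subinterpreter.
        "OPTIONS": {"pool": True},
    }
}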

Flask with SQLAlchemy

Like Django, Flask also runs fine, and with Flask we can play around with other database library options. SQLAlchemy is the natural choice here, and unlike psycopg, it supports pooling without background threads.

import os

from asgiref.wsgi import WsgiToAsgi
from flask import Flask
from sqlalchemy import create_engine

app = Flask(__name__)

# Flask is a WSGI app, so it needs to be wrapped for an ASGI server
asgi_app = WsgiToAsgi(app)

# SQLAlchemy engines are pooled by default
engine = create_engine(os.getenv("DSN"))

...
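
A route can then check a connection out of the pool on demand; SQLAlchemy's default QueuePool creates connections lazily at checkout, so no background threads are needed. A sketch with a hypothetical users table:

from sqlalchemy import text

@app.get("/count")
def count():
    # The connection is created (or reused) at checkout, not by a
    # pool-maintenance thread.
    with engine.connect() as conn:
        return {"count": conn.execute(text("SELECT count(*) FROM users")).scalar()}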

Other Compatibility Issues

Outside of web applications, there are issues in a lot of native-code libraries.

Data Science

One might imagine a good use case for interpreters is data science applications. They tend to be CPU intensive, and a lot of existing code uses multiprocessing for parallelism. Interpreters should be a natural replacement; however, most of the core libraries don't support interpreters (a quick probe sketch follows the list below). These libraries include:

  • pyarrow
  • pandas
  • numpy
  • polars
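
You can probe a library's support with a throwaway interpreter; modules that haven't opted in fail at import time. A sketch:

from concurrent import interpreters

interp = interpreters.create()
try:
    # Import errors inside the interpreter surface as ExecutionFailed.
    interp.exec("import numpy")
except interpreters.ExecutionFailed as exc:
    print(f"numpy can't load in a subinterpreter: {exc}")
finally:
    interp.close()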

In contrast, many libraries have prioritised free-threading compatibility, and there's even a tracking page for it.

My understanding is that whilst the work to support free-threading and sub-interpreters is not the same, it shares some similarities: both tend to involve some reorganisation of native memory management. I believe interpreter support will come given more time, but I'm not too familiar with the development process for this.

Serialisation

As mentioned before, Pydantic doesn't currently work, but it's not the only serialisation library with this issue. I also noticed problems with msgspec. I did test protobuf as well, and thankfully that worked! So there is at least one good serialisation option.

Pytest

When 3.14 initially came out, I wanted to experiment with running tests in isolated interpreters. This is also not supported yet, but there seems to be some interest in it at least.

Closing Words

I've been a big proponent of sub-interpreters since their inception, and I do believe they have a place in Python even with better support for free-threading across the wider industry.

I was a bit surprised at the incompatibilities, since I'd always viewed sub-interpreters as being closer to multiprocessing. However, it's understandable: a new concurrency model was never going to be an easy change, and support won't come overnight.

I think it's time for me to roll up my sleeves and learn about C extensions in Python, so that I can better understand this complicated transition process. And perhaps I can help with it.
