Joren Verspeurt

2019-09-02

Performant HTTP with Aiohttp in Python 3

AIO-what?

One of the criticisms regularly lobbed at Python is that it doesn’t have a good concurrency story. If you’re an experienced Python programmer you’ve probably heard of the GIL, or Global Interpreter Lock. This lock protects access to Python objects, so that only one thread can be executing Python bytecode at a time. It’s needed because Python (the standard CPython implementation in particular) doesn’t have thread-safe memory management. Strange things could start happening if we let multiple threads operate on the same memory, and the Python approach to solving this problem is simply to forbid it. It gets worse: the GIL adds overhead from context switches, so executing the same code in 2 different threads can take longer than executing it twice in the same thread.

Source: http://www.dabeaz.com/python/UnderstandingGIL.pdf
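This overhead is easy to reproduce with a small CPU-bound benchmark in the spirit of the slides above. The function name and the value of N are my own illustrative choices, and the exact numbers will vary from machine to machine:

```python
import threading
import time

def countdown(n):
    # Pure-Python, CPU-bound loop: it holds the GIL the whole time it runs
    while n > 0:
        n -= 1

N = 2_000_000  # illustrative size; adjust for your machine

# Run the workload twice in the same thread
start = time.perf_counter()
countdown(N)
countdown(N)
sequential = time.perf_counter() - start

# Run the same workload once in each of two threads
start = time.perf_counter()
threads = [threading.Thread(target=countdown, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f'sequential: {sequential:.2f}s, two threads: {threaded:.2f}s')
```

On CPython the threaded version typically takes at least as long as the sequential one, even with multiple cores available, because the two threads are fighting over the GIL rather than running in parallel.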

So, is concurrency impossible in Python? Not at all! Actually, we have a couple of different tools available to us, each with their own advantages and disadvantages. There’s multiprocessing, which works well in many situations in practice, even though there’s a bigger initial overhead compared to multithreading. Also, even though I mentioned that there are some downsides to multithreading in Python compared to other languages that doesn’t mean they’re not useful as a way to structure how your code runs. There is also a third option, which I will discuss in more detail: asynchronous execution.

There was somewhat of an async boom among programming languages a couple of years ago, and now it seems like every mainstream language has integrated not only a way of enabling asynchronous execution through futures or coroutines or event handlers, but also the async/await syntax. More about that later. So what is it? In Python, async works a lot like threading at first glance, with one big difference: scheduling. If you use threads in Python, your operating system kernel is aware of them and switches between them when it sees fit. This is called pre-emptive multitasking: the code executing in the thread has no idea it is being interrupted and has no control over when that happens. With async, Python itself does the scheduling, which enables what is called cooperative multitasking. Code you write that takes advantage of this approach has to somehow yield control of the interpreter, so that other code can continue executing. In JavaScript control returns to the event loop whenever the currently running callback finishes; in Python you have finer control over exactly where these switch points are.
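To make cooperative scheduling concrete, here is a minimal sketch (the names are my own) in which two coroutines explicitly hand control back to the event loop with await asyncio.sleep(0), and the loop interleaves them:

```python
import asyncio

events = []

async def worker(name):
    for i in range(2):
        events.append((name, i))
        # Awaiting sleep(0) voluntarily yields control to the event loop,
        # which then runs whichever task is next in its queue
        await asyncio.sleep(0)

async def main():
    # Schedule both workers and wait until both are finished
    await asyncio.gather(worker('a'), worker('b'))

asyncio.run(main())
print(events)  # the two workers take turns: a0, b0, a1, b1
```

If you removed the await, each worker would run to completion before the other started; the yield point is what makes the interleaving happen.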

Why would we want to use this? Well, it turns out that there is a class of operations where it makes perfect sense to give up control for a while: I/O! Relative to executing program logic getting something from your hard disk takes a long time, and getting an answer to a network request takes even longer. If we have to wait anyway, why not do something useful in the meantime? This also means that executing multiple I/O tasks can happen almost simultaneously, giving some of the benefits of genuine concurrent execution, as long as the time to start a new task is short enough. In I/O-bound programs where you’re reading and writing a lot of files, doing a lot of database communication, or sending a lot of network requests, this can really speed things up.

A little history

Cooperative multitasking in Python isn’t new. People have been using Python to implement web servers for quite some time, and have been looking for ways to increase the number of requests they can handle per minute for just as long, with many of the solutions based on some kind of event loop with cooperative scheduling. Twisted, an event-driven network programming framework, has existed since 2002, and is still used today. It’s the magic behind projects like Scrapy, and Twitch uses it for their backend. As it stands, there is no lack of libraries and frameworks that implement some form of async workflow, many of them specifically for networking. To name a few:

  • tornado, a web server/framework
  • gevent, a networking library based on greenlet
  • curio, a library for concurrent I/O by Python community heavyweight Dabeaz
  • trio, a library inspired by some of the others, aiming to make async programming more approachable

What was lacking for a long time, however, was a language-integrated way of doing async, with functionality packaged right in the standard library and handy keywords that allow for a straighter, more convenient programming flow, avoiding the callback hell that is so familiar to long-time JavaScript programmers. Recognizing this need, PEP 3156 introduced BDFL Guido van Rossum’s vision for asynchronous I/O in Python, implemented as the asyncio module, which entered the standard library in Python 3.4. PEP 492, implemented in Python 3.5, brought us the async/await keywords, and since then there have been many more additions and improvements, notably in version 3.7.

To be clear, I’m not advocating the use of asyncio over the alternatives I mentioned earlier; they all have their strengths and weaknesses. But as asyncio is part of the Python 3 standard library, it’s a good place to start.

Welcome to the land of asyncio

So, say you’re convinced you’d like to try this async thing, and you’ve decided the built-in asyncio is the way to go, which goodies are available to you? First there’s the asyncio module itself, which provides a few primitives we will need, and on which the other libraries mentioned in this section are built. We’ll look at some examples of these primitives in the practical section below.

The core of asyncio is the event loop. The event loop is what enables all of the cooperative scheduling, and at a high level it’s exactly what it sounds like: a simple loop that keeps track of everything and handles events one by one as they occur. The event loop tracks Tasks, which get put in a queue, from where they can be woken up when whatever they’re waiting for (if anything) has completed. Tasks can be checked to see if they’re done, or cancelled if their result is no longer needed or a timeout has been reached. A Task, in turn, wraps a Coroutine: the code that actually gets executed when the Task is activated again. We’ll see how to use these in the examples. Just remember that because the event loop only runs one task at a time and only switches when that task yields control, if you have one task that keeps on running, the others still in the queue will never get finished.

Source: https://luminousmen.com/post/asynchronous-programming-cooperative-multitasking
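The Task operations mentioned above, checking for completion and cancelling on a timeout, look like this in practice. The coroutine names are my own, and create_task requires Python 3.7+:

```python
import asyncio

async def slow_operation():
    await asyncio.sleep(10)  # stands in for some long-running wait

async def main():
    task = asyncio.create_task(slow_operation())
    print(task.done())  # False: it was only just scheduled
    try:
        # wait_for cancels the task if it misses the deadline
        await asyncio.wait_for(task, timeout=0.1)
    except asyncio.TimeoutError:
        print('timed out')
    print(task.cancelled())  # True: wait_for cancelled it for us
    return task.cancelled()

asyncio.run(main())
```

Cancelling a task raises CancelledError inside the wrapped coroutine at its next suspension point, so tasks get a chance to clean up after themselves.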

A lot of the libraries you’ll probably be using when working with asyncio are contained in the aio-libs group on GitHub. This includes aiohttp, which we’ll discuss in more detail below, but also libraries for a large number of other tasks, such as aioftp, aiopg (for PostgreSQL), aioredis, aioelasticsearch, aiokafka, aiodocker, … There are some other libraries which build on these, notably a couple of web framework-like libraries, but I’ll let you discover those by yourself.

Aiohttp is by far the aio-libs project with the most activity, and is arguably the main use case for asyncio. Aiohttp provides both an HTTP client and server, with support for WebSockets and such niceties as request handling middleware and pluggable routing. The docs provide two minimal examples, one for the client and one for the server, right on the front page, so it’s really easy to quickly try it out.

Show me the code!

Let’s start with a pretty minimal example:

import asyncio

async def a_square(x):
    print(f'Asynchronously squaring {x}!')
    return x ** 2

# This will only work in Python 3.7 and above
asyncio.run(a_square(2))

If you’re not using 3.7 or above you’ll need to replace the run call by something like:

loop = asyncio.get_event_loop()
try:
    loop.run_until_complete(a_square(2))
finally:
    loop.close()

Pretty easy, isn’t it? a_square is a coroutine function: the async in front of the function definition means that calling it doesn’t really start anything, and it’s only when the coroutine is wrapped in a Task and the event loop wakes it up that any computation happens. In most cases we don’t need to do this wrapping explicitly. Here the run function took care of everything: it started a loop for us, wrapped our coroutine object, and scheduled it as a Task, which started running immediately because there were no other tasks.
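You can see this laziness for yourself: calling a coroutine function just produces a coroutine object, and nothing runs until the event loop gets hold of it. A small sketch, reusing a_square from above:

```python
import asyncio

async def a_square(x):
    print(f'Asynchronously squaring {x}!')
    return x ** 2

coro = a_square(3)          # nothing printed yet: no loop has run it
print(type(coro).__name__)  # 'coroutine'
result = asyncio.run(coro)  # only now does the function body execute
print(result)               # 9
```

Forgetting to await or run a coroutine object is a common mistake; Python helpfully emits a “coroutine was never awaited” warning when one gets garbage-collected unrun.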

Let’s look at something a little more complicated: we can chain coroutines by making them wait for each other. For example:

import asyncio

async def sleeper(x):
    await asyncio.sleep(x)
    return x + 1

async def waiter(x):
    sleepy_result = (await sleeper(x)) ** 2
    return sleepy_result

# python >= 3.7
asyncio.run(waiter(2))

Here the waiter coroutine has to wait for the sleeper coroutine to finish. asyncio.sleep() is an async version of time.sleep(), so it’s also a coroutine function, just like waiter and sleeper. The await keyword suspends the current coroutine until the awaitable you pass it has completed, and then hands back its result; while we’re suspended, the event loop is free to run other tasks. Note that awaiting a coroutine directly like this doesn’t add any concurrency: as you might have gathered from the name, the code after that line only starts executing when the result is available. If you want to schedule a Task and continue executing in the meantime, that’s possible too:

import asyncio

async def sleeper(x):
    await asyncio.sleep(x)
    return x + 1

async def main():
    # create_task wraps the coroutine in a Task and schedules it
    # immediately; main keeps running without waiting for it
    task_a = asyncio.create_task(sleeper(0.1))
    task_b = asyncio.create_task(sleeper(0.2))
    print('Tasks scheduled, doing other work...')
    results = await asyncio.gather(task_a, task_b)
    return results  # [1.1, 1.2]

# python >= 3.7
asyncio.run(main())

There are a couple of interesting things here. First there’s the create_task function, which was added in Python 3.7; it takes a coroutine, wraps it in a Task, and schedules it. Before 3.7 the equivalent was ensure_future (which is actually a bit more low-level, but accomplishes the same thing for us here). Then there’s gather, which, as its name suggests, gathers the results of all the tasks and coroutines you pass it and returns them as a list.

Now that we’ve got the basics down, let’s get to what we’re really here for: aiohttp! The following example shows off the client side:

import aiohttp
import asyncio

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def fetch_all(urls):
    async with aiohttp.ClientSession() as session:
        texts = await asyncio.gather(*[
            fetch(session, url)
            for url in urls
        ])
        return texts

years_to_fetch = [f'https://en.wikipedia.org/wiki/{year}' for year in range(1990, 2020)]
asyncio.run(fetch_all(years_to_fetch))

This code is a version of the example on the front page of the aiohttp docs, extended to make multiple requests: it gets the (HTML) text of the Wikipedia pages for the years 1990 to 2019. These GET requests are launched concurrently, so executing all of them takes about as long as just executing the longest one. Of course there are limits to the number of requests you can execute like this, and if you need to do thousands you should probably launch them in chunks, but doing them as shown should work in most cases. As with the regular requests library you can do much more than just GET requests, but I’ll refer you to the docs for all of the different options.
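One common way to do that chunking is an asyncio.Semaphore that caps how many requests are in flight at once. The sketch below is mine and simulates the requests with asyncio.sleep so it runs without a network; in real code the body of fake_fetch would be the session.get call from the example above:

```python
import asyncio

CONCURRENCY_LIMIT = 3  # illustrative; tune to what the server tolerates
in_flight = 0
max_in_flight = 0

async def fake_fetch(sem, url):
    global in_flight, max_in_flight
    async with sem:  # at most CONCURRENCY_LIMIT coroutines get past this line
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        await asyncio.sleep(0.01)  # stands in for the network round trip
        in_flight -= 1
        return url

async def main():
    sem = asyncio.Semaphore(CONCURRENCY_LIMIT)
    urls = [f'url-{i}' for i in range(10)]
    # gather preserves the order of its arguments in the result list
    return await asyncio.gather(*[fake_fetch(sem, u) for u in urls])

results = asyncio.run(main())
print(len(results), 'done; peak concurrency:', max_in_flight)
```

All ten coroutines are scheduled at once, but the semaphore ensures no more than three are ever waiting on the (simulated) network at the same time.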

The aiohttp library also includes a server component, complete with router and all the basics you need for a simple web server or REST API. The following example shows some of the things you can do with it:

from aiohttp import web

async def empty(request):
    return web.Response()

async def get_json(request):
    return web.json_response({
        'path_name_variable': request.match_info.get('name'),
        'query_param_a': request.rel_url.query.get('a'),
        'query_param_b': request.rel_url.query.get('b'),
    })

async def redirected(request):
    location = request.app.router['default'].url_for()
    raise web.HTTPFound(location=location)

async def index(request):
    return web.FileResponse('./index.html')

app = web.Application()
app.add_routes([
    web.get('/', redirected),
    web.get('/empty', empty),
    web.get('/json/{name}', get_json),
    web.get('/index', index, name='default'),
])

web.run_app(app)

This is the last example and also the most elaborate. It represents a web application with 4 routes: one which redirects to another route, one which returns an empty response, one which serves a static HTML file, and one which takes a path variable and query parameters and returns a JSON document.

Looping at ultraviolet speeds

One of the interesting things about the asyncio event loop is the fact that it is pluggable. This means that you can supply your own implementation. Though the standard implementation, built on Python’s selectors module, is already quite good, there is another option. A faster option! It’s called uvloop, and it’s based on libuv, the asynchronous I/O library originally developed for Node.js that is now also used in Julia, among other things. The uvloop library is easy to use and makes just about anything you do with asyncio go faster, so it would be a shame not to mention it here. The project’s README shows how to get started, and the authors publish benchmark results if you’re still not convinced it’s that fast.
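Switching is essentially a one-liner: install uvloop’s event loop policy before starting your application. The try/except fallback here is my addition so the snippet also runs where uvloop isn’t installed:

```python
import asyncio

try:
    import uvloop
    # Make every asyncio loop created from here on a libuv-based uvloop one
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
    loop_impl = 'uvloop'
except ImportError:
    loop_impl = 'default'  # uvloop not installed; stdlib event loop

async def main():
    return 'hello'

print(asyncio.run(main()), 'from the', loop_impl, 'event loop')
```

Because the policy swap happens at the event loop level, none of the asyncio or aiohttp code shown earlier needs to change.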

Conclusion

You may be wondering at this point why there’s an article on performing and handling HTTP requests in an asynchronous way in Python on the Medium blog for an AI company. Well, a good machine learning model isn’t worth much if you can’t put it into production, and it’s quite common for predictions made with such a model to be served up via some sort of REST API that clients can call. We also sometimes need to pull in a lot of data over HTTP from different sources, and the number of requests can grow pretty quickly. In these cases good performance is important, and for now it seems that pure aiohttp with uvloop can’t be beaten in terms of raw performance, if Python is your language of choice as it is for us. So if you have workloads like these, why not try some async for yourself today!