Making an Unlimited Number of Requests with Python aiohttp + pypeln
Results
I am going to run each of the clients described here, in order, with 100_000 requests (for the sake of time) using the `timed.sh` script.
```
➜ bash timed.sh python client-async-sem.py 100_000
Memory usage: 352684KB    Time: 154.87 seconds    CPU usage: 38%

➜ bash timed.sh python client-async-as-completed.py 100_000
Memory usage: 57548KB     Time: 154.91 seconds    CPU usage: 100%

➜ bash timed.sh python client-task-pool.py 100_000
Memory usage: 58188KB     Time: 153.40 seconds    CPU usage: 36%

➜ bash timed.sh python client-pypeln-io.py 100_000
Memory usage: 63624KB     Time: 154.39 seconds    CPU usage: 37%
```
A few things to note:

- Paweł Miech's semaphore approach (`client-async-sem.py`) has much higher memory usage (almost 10x the others) and would blow up if we used a much bigger number of requests.
- Balaam's continuous monitoring approach (`client-async-as-completed.py`) uses the least memory, but its CPU consumption is excessive: it saturates one of the cores at 100%.
- Both the pure `TaskPool` (`client-task-pool.py`) and the `pypeln.io.each` (`client-pypeln-io.py`) approaches have fairly similar metrics: they are equally fast, memory efficient, and have low CPU usage. Judging by the numbers, they are possibly the best methods.
- If you truly want to make an unlimited number of requests, you can use an iterable/generator that doesn't terminate instead of `range`.
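To illustrate why the task-pool style scales to unbounded input, here is a minimal sketch of the idea: tasks are only created as capacity frees up, so even an infinite iterable like `itertools.count()` never materializes more than `limit` tasks at once. All names here are illustrative; this is not the actual `TaskPool` from `client-task-pool.py`, and the worker simulates IO with `asyncio.sleep` instead of a real aiohttp request.

```python
import asyncio
import itertools

class TaskPool:
    """Bounded pool: at most `limit` tasks are alive at any moment,
    and new tasks are created lazily as old ones finish."""

    def __init__(self, limit):
        self._sem = asyncio.Semaphore(limit)
        self._tasks = set()

    async def put(self, coro):
        await self._sem.acquire()            # wait until there is room
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        task.add_done_callback(self._on_done)

    def _on_done(self, task):
        self._tasks.discard(task)
        self._sem.release()                  # free a slot for the next item

    async def join(self):
        await asyncio.gather(*self._tasks)   # wait for whatever is still running

async def demo(n, limit):
    results = []

    async def work(i):
        await asyncio.sleep(0)               # stand-in for an HTTP request
        results.append(i * 2)

    pool = TaskPool(limit)
    # count() is infinite; islice caps it here only so the demo terminates.
    for i in itertools.islice(itertools.count(), n):
        await pool.put(work(i))
    await pool.join()
    return results
```

Because `put` blocks on the semaphore before creating each task, the loop never runs ahead of the pool, which is exactly the property the semaphore-only client lacks (it creates all 100_000 tasks up front, hence the ~10x memory footprint).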
Conclusion
The `asyncio` module and the new `async`/`await` syntax enable us to create very powerful IO programs with Python that were once only within the grasp of languages like Erlang/Elixir, Go, or even Node.js. However, some things are hard to get right, especially since there is very little material out there, libraries for these kinds of tasks are only now being written, and the paradigm itself is quite different.
I hope this post is useful to those wanting to do high-performance IO applications in Python. Thanks to Andy Balaam for his post which served as an inspiration when implementing my code and for his feedback.
In the future I want to make a more real-world benchmark that involves downloading, resizing, and storing a huge number of images.
STAY TUNED!