Performance¶
Note
This page is a stub; it presently contains only reusable snippets.
Software performance depends on many choices: language (like Rust versus Python), framework (like FastAPI versus Django), architecture (e.g. map-reduce), networking (e.g. batch requests), etc. Many of these choices are costly to change at a later date (e.g. requiring a full rewrite).
Profiling¶
Use profiling to:
Identify slow dependencies, in case faster alternatives can be easily swapped in
Find major hotspots, like a loop that runs in exponential time instead of quadratic time
Find minor hotspots, if changing language, etc. is too costly
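Besides the command-line invocation shown below, the standard library's cProfile module can also be driven programmatically, which is handy for profiling one suspect function in isolation. A minimal sketch (the `slow()` function is a made-up placeholder, not code from any of our projects):

```python
import cProfile
import io
import pstats

def slow():
    # Placeholder workload standing in for a real hotspot.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow()
profiler.disable()

# Print the five most expensive calls, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```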
Once a hotspot is found, the solution might be to:
Call it once, via refactoring: for example, traversing JSON once for all CoVE calculations, instead of once for each calculation, in lib-cove
Call it less, via batching: for example, reducing the number of SQL queries in Django projects
Cache the results: for example, caching mergers in Kingfisher Process
Process in parallel: for example, distributing work to multiple threads, like we do with RabbitMQ
Replace it entirely: for example, using the orjson package instead of the json library
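As a minimal sketch of the caching approach, using the standard library's functools.lru_cache (the fib function is a hypothetical stand-in, not code from Kingfisher Process):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache, this recursion repeats work exponentially;
    # with it, each value of n is computed only once.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))  # → 832040
```

The same decorator applies to any pure function whose results are worth reusing; for cached values that must expire, a library like cachetools offers TTL variants.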
CPU¶
For example:
```bash
cat packages.json | python -m cProfile -o code.prof ocdskit/__main__.py compile > /dev/null
gprof2dot -f pstats code.prof | dot -Tpng -o output.png
open output.png
```
To see where a running program is spending its time, use `py-spy top` (attach to a running process with `py-spy top --pid PID`).
Memory¶
For example:
```bash
pip install memory_profiler matplotlib
time mprof run libcoveoc4ids data.json
mprof plot
```
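If a plot is unnecessary and you only need allocation figures from within a Python process, the standard library's tracemalloc module is a lighter-weight alternative to mprof (the list comprehension is a placeholder workload):

```python
import tracemalloc

tracemalloc.start()

# Placeholder workload: allocate some lists to produce measurable usage.
data = [list(range(1_000)) for _ in range(100)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current={current / 1024:.1f} KiB, peak={peak / 1024:.1f} KiB")
```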