Performance¶
Note
This page is a stub; it presently contains only reusable snippets.
Software performance depends on many choices: language (like Rust versus Python), framework (like FastAPI versus Django), architecture (e.g. map-reduce), networking (e.g. batch requests), etc. Many of these choices are costly to change at a later date (e.g. requiring a full rewrite).
Profiling¶
Use profiling to:
Identify slow dependencies, in case faster alternatives can be easily swapped in
Find major hotspots, like a loop that runs in exponential time instead of quadratic time
Find minor hotspots, if changing language, etc. is too costly
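Besides the command-line invocation shown below, the standard library's cProfile module can also be driven programmatically, which is handy for profiling one suspect function in isolation. A minimal sketch (the `slow()` function is a made-up placeholder, not code from any of our projects):

```python
import cProfile
import io
import pstats

def slow():
    # Placeholder workload standing in for a real hotspot.
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
slow()
profiler.disable()

# Print the five most expensive calls, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```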
Once a hotspot is found, the solution might be to:
Call it once, via refactoring: for example, traversing JSON once for all CoVE calculations, instead of once for each calculation, in lib-cove
Call it less, via batching: for example, reducing the number of SQL queries in Django projects
Cache the results: for example, caching mergers in Kingfisher Process
Process in parallel: for example, distributing work to multiple threads, like we do with RabbitMQ
Replace it entirely: for example, using the orjson package instead of the json library
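As a minimal sketch of the caching approach, using the standard library's functools.lru_cache (the fib function is a hypothetical stand-in, not code from Kingfisher Process):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Without the cache, this recursion repeats work exponentially;
    # with it, each value of n is computed only once.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(30))  # → 832040
```

The same decorator applies to any pure function whose results are worth reusing; for cached values that must expire, a library like cachetools offers TTL variants.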
CPU¶
For example:
```bash
cat packages.json | python -m cProfile -o code.prof ocdskit/__main__.py compile > /dev/null
gprof2dot -f pstats code.prof | dot -Tpng -o output.png
open output.png
```
To see where a running program is spending its time, use `py-spy top` (attach to a running process with `py-spy top --pid PID`).
Memory¶
For example:
```bash
pip install memory_profiler matplotlib
time mprof run libcoveoc4ids data.json
mprof plot
```
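If a plot is unnecessary and you only need allocation figures from within a Python process, the standard library's tracemalloc module is a lighter-weight alternative to mprof (the list comprehension is a placeholder workload):

```python
import tracemalloc

tracemalloc.start()

# Placeholder workload: allocate some lists to produce measurable usage.
data = [list(range(1_000)) for _ in range(100)]

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current={current / 1024:.1f} KiB, peak={peak / 1024:.1f} KiB")
```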