Continuous CPU Profiling for Python Applications

There are several reasons a program may consume more CPU resources than expected. When an algorithm has high computational complexity, CPU usage grows with the amount of data it operates on. Even in I/O-intensive programs, processing the data may be the bottleneck. Garbage collection activity is another usual suspect.

To optimize or troubleshoot an application’s CPU consumption, a CPU profiler is necessary. Without one, it takes a lot of guesswork, code modification and diagnostics to locate the CPU hot spots, i.e. the lines of code where most of the CPU time is spent.
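As a minimal illustration of what a profiler reports, the sketch below uses cProfile and pstats from the Python standard library. The functions handle_request and slow_sum_of_squares are hypothetical stand-ins for application code, and cProfile attributes time to functions rather than to individual lines.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    # Deliberately CPU-heavy: the hot spot we expect the profiler to surface.
    total = 0
    for i in range(n):
        total += i * i
    return total


def handle_request():
    # Hypothetical entry point that mixes cheap and expensive work.
    cheap = len("request payload")
    expensive = slow_sum_of_squares(200_000)
    return cheap + expensive


def profile_call(func):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(handle_request)
print(report)
```

In the printed report, slow_sum_of_squares should account for nearly all of the time, which is the kind of answer that would otherwise take guesswork to reach.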

The problem of profiling production applications

The profilers traditionally used in development environments are not suitable for modern cloud applications. One reason is the data, which is usually not available offline for debugging and for reproducing problems. Think of machine learning algorithms or other data-intensive applications. Another reason is the difference between production and development environments in terms of configuration, infrastructure, types of possible errors, etc. Last but not least, the overhead of development profilers is typically very high.
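The overhead point can be checked with a rough measurement. The sketch below times the same workload with and without cProfile, a deterministic profiler from the standard library that traces every function call. The workload and tiny functions are hypothetical, and the slowdown depends on the machine and on how call-heavy the code is, so no particular figure is implied.

```python
import cProfile
import time


def tiny(x):
    # A very small function: per-call tracing cost dominates here.
    return x + 1


def workload(n=200_000):
    total = 0
    for _ in range(n):
        total = tiny(total)
    return total


def timed(func):
    """Return (result, elapsed seconds) for a plain call."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def timed_with_profiler(func):
    """Return (result, elapsed seconds) for a call traced by cProfile."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    result = profiler.runcall(func)
    return result, time.perf_counter() - start


plain_result, plain_seconds = timed(workload)
profiled_result, profiled_seconds = timed_with_profiler(workload)

print(f"plain:    {plain_seconds:.4f}s")
print(f"profiled: {profiled_seconds:.4f}s")
print(f"slowdown: {profiled_seconds / plain_seconds:.1f}x")
```

Code made of many short function calls, as in this example, tends to show the largest slowdown under a tracing profiler.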

The missing profile history

Another problem is that profiling is often irregular or done on demand as a one-off. For applications that run constantly in production, it is important to know how the hot spots change over time, together with historical context such as the application release version, the runtime version or other metrics.

In other words, it is equally important to understand when a problem started and for what reason.
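To make the idea of a profile history concrete, here is a small sketch that tags each profile with context before storing it. It is an illustration built on the standard library only, not a description of how any particular tool stores profiles. The busy_work function, the version strings and the record layout are all assumptions.

```python
import cProfile
import platform
import pstats
import time


def busy_work(n=50_000):
    # Stand-in for application code; all CPU time is spent in this function.
    total = 0
    for i in range(n):
        total += i * i
    return total


def record_profile(func, app_version, history):
    """Profile one call of func and append a context-tagged record to history."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    # pstats keeps a raw table keyed by (file, line, function name); each value
    # is (primitive calls, total calls, own time, cumulative time, callers).
    raw = pstats.Stats(profiler).stats
    (_, lineno, name), entry = max(raw.items(), key=lambda item: item[1][2])
    history.append({
        "timestamp": time.time(),
        "app_version": app_version,                   # release context
        "python_version": platform.python_version(),  # runtime context
        "hot_spot": name,
        "hot_spot_line": lineno,
        "own_seconds": entry[2],
    })


history = []
# Two hypothetical releases; the second one does twice as much work.
record_profile(busy_work, "1.0.0", history)
record_profile(lambda: busy_work(100_000), "1.1.0", history)

for record in history:
    print(record["app_version"], record["hot_spot"], f"{record['own_seconds']:.4f}s")
```

Comparing records across releases shows when a hot spot became more expensive and which release or runtime change coincided with it.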

Using StackImpact for continuous CPU profiling

StackImpact is designed for profiling and monitoring production environments. It completely automates the collection of CPU profiles. The StackImpact agent, which is initialized in the application, records and reports regular CPU profiles to the Dashboard. Here is how to add the agent to the application:

Get an agent key at stackimpact.com, install the agent with pip install stackimpact, and add the following code snippet to the main thread of your application.

import stackimpact

...

# Start the agent once, from the main thread, at application startup.
agent = stackimpact.start(
    agent_key='agent key here',
    app_name='MyPythonApp')

See the documentation for detailed setup instructions. The agent is available on GitHub.

After restarting/deploying the application, the profiles will be available in the Dashboard in a historically comparable form with context information for each profile.

Similar profile history is automatically available for:

  • Memory allocations
  • Blocking calls
  • Errors

Metrics from the Python runtime are also available in the Dashboard.