CPU Profiling in Production Node.js Applications

There are several reasons why a program may consume more CPU resources than expected. For a computationally complex algorithm, CPU usage is driven by the amount of data it operates on. Even in I/O-intensive programs, data processing may be the bottleneck. Garbage collection activity is another usual suspect.

To optimize or troubleshoot an application’s consumption of CPU resources, a CPU profiler is necessary. Without one, it takes a lot of guesswork, code modification and diagnostics to locate the CPU hot spots, i.e. the lines of code where most of the CPU time is spent.
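To make the notion of a hot spot concrete, here is a minimal sketch of a one-off profile taken with Node's built-in inspector module (not StackImpact). The helper names captureCpuProfile and summarizeHotSpots are hypothetical, introduced only for this illustration. The sketch assumes the V8 profile layout exposed by the inspector protocol, where each node carries a callFrame and a hitCount of samples in which that function was on top of the stack.

```javascript
const inspector = require('inspector');

// Hypothetical helper: profile the current process for `durationMs`
// milliseconds, then pass the raw V8 profile to `callback(err, profile)`.
function captureCpuProfile(durationMs, callback) {
  const session = new inspector.Session();
  session.connect();
  const fail = (err) => {
    session.disconnect();
    callback(err);
  };
  session.post('Profiler.enable', (enableErr) => {
    if (enableErr) return fail(enableErr);
    session.post('Profiler.start', (startErr) => {
      if (startErr) return fail(startErr);
      setTimeout(() => {
        session.post('Profiler.stop', (stopErr, result) => {
          session.disconnect();
          callback(stopErr, stopErr ? undefined : result.profile);
        });
      }, durationMs);
    });
  });
}

// Hypothetical helper: rank functions by their share of self samples.
// The same function can appear in several nodes (one per call path),
// so hits are summed per function name and source location.
function summarizeHotSpots(profile, limit = 5) {
  const selfHits = new Map();
  let totalHits = 0;
  for (const node of profile.nodes) {
    const hits = node.hitCount || 0;
    if (hits === 0) continue;
    const { functionName, url, lineNumber } = node.callFrame;
    // callFrame.lineNumber is zero-based; add 1 for a human-readable line.
    const key = `${functionName || '(anonymous)'} ${url}:${lineNumber + 1}`;
    selfHits.set(key, (selfHits.get(key) || 0) + hits);
    totalHits += hits;
  }
  return [...selfHits.entries()]
    .map(([location, hits]) => ({
      location,
      selfPercent: (100 * hits) / totalHits,
    }))
    .sort((a, b) => b.selfPercent - a.selfPercent)
    .slice(0, limit);
}
```

Calling captureCpuProfile(10000, (err, profile) => console.log(summarizeHotSpots(profile))) while the application is under load would print the top entries. This kind of one-off capture is useful in development, but it gives no history and no release context.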

The challenge of profiling production and cloud applications

The profilers traditionally used in development environments are not suitable for modern cloud applications. One reason is that production data is usually not available offline for debugging or problem reproduction; think of machine learning algorithms or other data-intensive applications. Another reason is the difference between production and development environments in terms of configuration, infrastructure, types of possible errors, etc. Last but not least, a development profiler’s overhead is typically very high.

The missing profile history

Another problem is irregular or on-demand, one-off profiling. For apps that run continuously in production, it is important to know how the hot spots change over time, together with the associated historical context (e.g. the application release version, runtime version or other metrics).

In other words, it is equally important to understand when the problem started and why.

Using StackImpact for continuous CPU profiling

StackImpact is designed for profiling and monitoring both production and development environments. It completely automates the collection of CPU profiles. The StackImpact agent is initialized inside the application, where it records CPU profiles at regular intervals and reports them to the Dashboard.

To add the agent to your application, get an agent key at stackimpact.com, run npm install stackimpact and add the following code snippet to the application. For applications that use the cluster module, i.e. master and worker configurations, initialize the agent in the worker code.

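// Load the StackImpact agent module.
const stackimpact = require('stackimpact');

// ...

// Start the agent in this process with your agent key and application name.
const agent = stackimpact.start({
  agentKey: "agent key here",
  appName: "MyNodejsApp",
});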

See the StackImpact Node.js agent’s GitHub page or the documentation for detailed setup instructions.
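For clustered applications, the worker-only initialization described above could look roughly like the following sketch. The bootstrap and main helpers, the port number and the request handler are hypothetical placeholders; only stackimpact.start with agentKey and appName comes from the snippet above. The collaborators are passed in as functions so that the branching can be exercised without forking processes or loading the agent.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Hypothetical helper: run either the primary or the worker branch.
function bootstrap({ isPrimary, workerCount, fork, startAgent, startServer }) {
  if (isPrimary) {
    // The primary only forks workers, so no agent is started here.
    for (let i = 0; i < workerCount; i++) fork();
    return 'primary';
  }
  // Each worker runs application code, so each worker starts its own agent
  // before it begins serving requests.
  startAgent();
  startServer();
  return 'worker';
}

// Hypothetical wiring for a real entry point.
function main() {
  return bootstrap({
    // cluster.isPrimary exists in Node.js 16+; older versions use isMaster.
    isPrimary: cluster.isPrimary ?? cluster.isMaster,
    workerCount: os.cpus().length,
    fork: () => cluster.fork(),
    startAgent: () =>
      require('stackimpact').start({
        agentKey: 'agent key here',
        appName: 'MyNodejsApp',
      }),
    // Placeholder server: port 3000 and the handler are assumptions.
    startServer: () =>
      http.createServer((req, res) => res.end('ok\n')).listen(3000),
  });
}
```

Calling main() at the bottom of the entry file would start the application. Because the agent is required inside the worker branch, the primary process never loads it.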

After the application is restarted or redeployed, the profiles become available in the Dashboard in a historically comparable form, with context information for each profile.

The CPU profiles in the following Dashboard screenshot show that a high percentage of CPU usage is caused by garbage collection and by a function in app.js, the example application we used for this demonstration.

Screenshot

The memory allocation profile, in turn, will help us understand where exactly in the code most of the memory allocations happen.

A similar profile history is automatically available for asynchronous calls and errors. Metrics from the Node.js runtime are also available in the Dashboard.