AWS Lambda CPU and Memory Profiling (Node.js)

Profiling cloud applications

Performance profiling is essential for optimizing an application's resource consumption and response time and for diagnosing its failures. A performance issue without an execution profile is like an error without a stack trace: getting to the root cause takes a lot of manual work.

Profiling cloud applications and functions such as AWS Lambda requires profiling tools designed for cloud production environments, since it is unrealistic to simulate a cloud environment locally with all its data, traffic and configuration.

Adding the StackImpact profiler agent to the Lambda function

The StackImpact cloud profiler was specifically designed for production environments. Unlike traditional profilers, which usually run only locally, the StackImpact profiler runs inside cloud applications and completely automates the burdensome process of profiling CPU, memory allocations and other aspects of the application. Additionally, it reports various health metrics and errors.

The following simple AWS Lambda function simulates some CPU work and memory allocations. Adding the StackImpact agent takes only a couple of statements. Make sure you install the StackImpact Node.js package locally with npm install stackimpact before bundling the Lambda package.

const stackimpact = require('stackimpact');

const agent = stackimpact.start({
  agentKey: 'agent key here',
  appName: 'LambdaDemoNode',
  appEnvironment: 'prod',
  autoProfiling: false,
  debug: true
});

function simulateCpuWork() {
  for(let i = 0; i < 1000000; i++) {
    Math.random();
  }
}

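// Module-level reference, so the most recent allocation stays
// reachable after the handler returns.
let mem;
function simulateMemAlloc() {
  mem = []; // assign to the outer variable instead of shadowing it
  for(let i = 0; i < 10000; i++) {
    mem.push({v: i});
  }
}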

exports.handler = function(event, context, callback) {
  const span = agent.profile();

  simulateCpuWork();
  simulateMemAlloc();

  setTimeout(() => {
    let response = {
      statusCode: 200,
      body: 'Done.'
    };

    span.stop();

    agent.report(() => {
      callback(null, response);
    });
  }, Math.random() * 10);
};

You can get an agent key by signing up for a free account. Note that the autoProfiling option is set to false. This is because the Node.js process is frozen between requests, so the agent cannot rely on timers to report performance data to the Dashboard. Instead, the report() call in the handler takes over the periodic data reporting.
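When several handlers need the same profile, stop and report sequence, it can be factored into a small wrapper. The following is a minimal sketch, not part of the StackImpact package: withProfiling is a hypothetical helper, and it assumes only that the agent exposes profile() and report(callback) in the way the example above uses them.

```javascript
// Hypothetical helper (not part of the stackimpact package).
// Wraps a callback-style Lambda handler so that every invocation is
// profiled and the collected data is reported before Lambda freezes
// the process again.
function withProfiling(agent, handler) {
  return function (event, context, callback) {
    const span = agent.profile();
    let finished = false;

    // Stops the span, reports, and only then completes the invocation.
    const finish = (err, result) => {
      if (finished) return; // ignore repeated completions
      finished = true;
      span.stop();
      agent.report(() => callback(err, result));
    };

    try {
      handler(event, context, finish);
    } catch (err) {
      if (finished) throw err; // failure happened after completion
      finish(err);             // synchronous failure in the handler
    }
  };
}

// Usage, assuming `agent` was created with stackimpact.start(...) as above:
// exports.handler = withProfiling(agent, (event, context, done) => {
//   simulateCpuWork();
//   simulateMemAlloc();
//   done(null, { statusCode: 200, body: 'Done.' });
// });
```

Passing the agent in as an argument keeps the wrapper independent of how the agent is configured, and makes it possible to exercise the handler with a stub agent outside of Lambda.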

Locating CPU hot spots and memory leaks

When requests are constantly generated against this Lambda function, the CPU hot spots can be located in the reported profiles.

The memory allocation rate per function call can be found in the Hot spots section as well. Using these rates, we can locate exactly where most of the memory that is not immediately released gets allocated. The allocation profiler is disabled by default, since V8's heap sampling is still experimental. To enable it, add allocationProfilerDisabled: false to the startup options.
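As an illustration, the startup options from the example function could look as follows with the allocation profiler switched on. This is a sketch only: buildAgentOptions is a hypothetical helper introduced here for readability, while the option names are the ones used in this article.

```javascript
// Hypothetical helper (not part of the stackimpact package) that returns
// the startup options used in this article, with allocation profiling on.
function buildAgentOptions(agentKey) {
  return {
    agentKey: agentKey,
    appName: 'LambdaDemoNode',
    appEnvironment: 'prod',
    autoProfiling: false,              // Lambda freezes the process between requests
    allocationProfilerDisabled: false  // opt in to the experimental V8 heap sampling
  };
}

// Usage in the Lambda module:
// const stackimpact = require('stackimpact');
// const agent = stackimpact.start(buildAgentOptions('agent key here'));
```

Since heap sampling is experimental, it may be safer to try this setting in a test environment before enabling it for production traffic.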

Continuous performance profiling

One of the important benefits of continuously profiling an application is that profiles can be analysed and compared historically. Unlike one-time call graphs, a call graph history per process and application allows for a much deeper understanding of application execution. For example, the root cause of a performance regression in a new version of the function can be easily identified.

Since there can be many instances of Lambda containers, only a small subset of them will have active profiling agents at any time (the size of the subset is adjustable from the Dashboard).

See the full documentation for more details.