Profiling AWS Lambda Go Functions: CPU, Memory and Latency

Performance profiling is essential for understanding and optimizing an application’s resource consumption and latency. A performance issue without an execution profile is like an error without a stack trace: getting to the root cause requires a lot of manual work.

Profiling cloud applications and functions such as AWS Lambda requires profiling tools designed for cloud environments, and FaaS in particular, because of their restrictions on background tasks, system calls, incoming connections and so on. On top of that, it is quite unrealistic to simulate a cloud environment, with all its data, traffic and configuration, locally.

Adding the StackImpact profiler agent to the Lambda function

The StackImpact cloud profiler was specifically designed for cloud production environments. Unlike traditional profilers, which usually run only locally, the StackImpact profiler runs inside the application itself and automates the burdensome process of profiling CPU usage, memory allocations and other aspects of the application.

The following simple AWS Lambda Go function simulates some CPU work, a memory leak and a blocking call. Adding the StackImpact Go agent takes only a few lines of code. More details can be found on the StackImpact Go package GitHub page or in the documentation.

package main

import (
	"time"
	"math/rand"

	"github.com/aws/aws-lambda-go/events"
	"github.com/aws/aws-lambda-go/lambda"
	"github.com/stackimpact/stackimpact-go"
)

var agent *stackimpact.Agent

var mem []string = make([]string, 0)

func Handler(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
	// profile this handler
	span := agent.Profile()
	defer span.Stop()

	// simulate cpu work
	for i := 0; i < 1000000; i++ {
		rand.Intn(1000)
	}

	// simulate memory leak
	for i := 0; i < 1000; i++ {
		mem = append(mem, strconv.Itoa(i))
	}

	// simulate blocking call
	done := make(chan bool)
	go func() {
		time.Sleep(200 * time.Millisecond)
		done <- true
	}()
	<-done

	return events.APIGatewayProxyResponse{
		Body:       "Hello",
		StatusCode: 200,
	}, nil

}

func main() {
	agent = stackimpact.Start(stackimpact.Options{
		AgentKey: "agent key here",
		AppName:  "LambdaGo",
		DisableAutoProfiling: true,
	})

	lambda.Start(Handler)
}

You can get the agent key by signing up for a free trial account. Automatic profiling is disabled with the DisableAutoProfiling: true agent startup option, since the agent cannot initiate profiling on its own because of Lambda’s restrictions on background tasks. More precisely, the Lambda process is frozen while the handler is inactive, so the agent cannot use timers to report performance data to the Dashboard. Instead, the handler is profiled explicitly with the Profile() method, and span.Stop() takes over the periodic data reporting on behalf of the executing Lambda function.
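
Because every profiled invocation carries some overhead, it may make sense to profile only a fraction of invocations. The following sketch reuses the agent and handler from the example above; the counter and the 1-in-10 sampling ratio are arbitrary illustrative choices, not something the agent requires, and less frequent profiling also means the agent reports data less often.

var invocationCount int

func Handler(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
	invocationCount++

	// Profile roughly every 10th invocation; a deferred call registered
	// inside the if block still runs when the handler returns.
	if invocationCount%10 == 0 {
		span := agent.Profile()
		defer span.Stop()
	}

	// ... the rest of the handler stays the same ...

	return events.APIGatewayProxyResponse{
		Body:       "Hello",
		StatusCode: 200,
	}, nil
}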

Locating CPU hot spots

After generating requests against this Lambda function for some period of time, the CPU hot spots can be easily located in the reported profiles in the StackImpact Dashboard’s Hot spots / CPU section.

Screenshot
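
Requests against the function can be generated with a small driver like the sketch below. The endpoint URL is a placeholder for the deployed function’s actual API Gateway URL, and the request count and pacing are arbitrary.

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// Placeholder; replace with the deployed function's API Gateway URL.
	url := "https://example.execute-api.us-east-1.amazonaws.com/prod/hello"

	for i := 0; i < 500; i++ {
		resp, err := http.Get(url)
		if err != nil {
			fmt.Println("request failed:", err)
			continue
		}
		resp.Body.Close()

		time.Sleep(100 * time.Millisecond)
	}
}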

Finding memory leaks

Screenshot

Similarly, memory allocation hot spots are shown in the Hot spots / Memory section. In addition, we can see that the allocated but uncollected memory is increasing over time, which may indicate a memory leak at the highlighted code location.
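
In the sample function the growth comes from the package-level mem slice, which survives between invocations because Lambda reuses the container and therefore the Go process. Assuming the data is only needed within a single invocation, one way to remove the leak is to keep the slice local to the handler, as sketched below, so it becomes eligible for garbage collection once the handler returns.

func Handler(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
	span := agent.Profile()
	defer span.Stop()

	// A local slice is released after the handler returns instead of
	// accumulating in a package-level variable across invocations.
	buf := make([]string, 0, 1000)
	for i := 0; i < 1000; i++ {
		buf = append(buf, strconv.Itoa(i))
	}
	_ = buf // stand-in for whatever actually consumes the data

	return events.APIGatewayProxyResponse{
		Body:       "Hello",
		StatusCode: 200,
	}, nil
}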

Identifying latency bottlenecks

Service latency is perhaps the most important metric when it comes to user experience. Understanding where the Lambda function is waiting is critical for improving its response time. The Bottlenecks / Latency section of the Dashboard contains latency profiles with per-function-call blocking time information. Filtering the profile by our sample Lambda function name “main” shows function calls along with their waiting times, more precisely the 95th percentile of all sampled function call waiting times.

Screenshot
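
In the sample function the 200 ms sleep, which stands in for an external call, dominates the handler’s wait time. One common way to reduce such end-to-end latency, sketched below for the sample code only, is to start the blocking operation first and overlap it with the CPU-bound work instead of running the two sequentially.

func Handler(request events.APIGatewayProxyRequest) (events.APIGatewayProxyResponse, error) {
	span := agent.Profile()
	defer span.Stop()

	// Start the blocking operation first so it runs while the CPU work proceeds.
	done := make(chan bool)
	go func() {
		time.Sleep(200 * time.Millisecond) // stands in for an external call
		done <- true
	}()

	// CPU-bound work overlaps with the simulated blocking call.
	for i := 0; i < 1000000; i++ {
		rand.Intn(1000)
	}

	<-done

	return events.APIGatewayProxyResponse{
		Body:       "Hello",
		StatusCode: 200,
	}, nil
}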

Continuous performance profiling

One of the important benefits of continuously profiling an application is that the profiles can be analyzed and compared historically. Unlike one-time call graphs, a call graph history per process and application allows a much deeper understanding of application execution. For example, the root cause of a performance regression introduced in a new version of the function can be identified easily.

Since many Lambda container instances may be running at the same time, only a small subset of them will have active profiling agents.

See the full documentation for more details.