StackImpact is a performance profiling and monitoring service for production Go (Golang) applications. It gives you continuous visibility with line-of-code precision into application performance, such as CPU, memory and I/O hot spots as well as execution bottlenecks, allowing you to optimize applications and troubleshoot issues before they impact customers.


  • Automatic hot spot profiling for CPU, memory allocations, network, system calls and lock contention.
  • Automatic bottleneck tracing for HTTP handlers and HTTP clients.
  • Error and panic monitoring.
  • Health monitoring including CPU, memory, garbage collection and other runtime metrics.
  • Alerts on hot spot anomalies.
  • Multiple account users for team collaboration.

Learn more on the features page (with screenshots).

The StackImpact agent reports performance information to the Dashboard, which runs as SaaS or On-Premise.

Supported platforms and languages

Linux, OS X and Windows. Go version 1.5+.

Getting started with Go profiling

Create StackImpact account

Sign up for a free account at

Installing the agent

Install the Go agent by running

go get

Then import the package in your application.

Configuring the agent


Start the agent by specifying the agent key and application name. The agent key can be found in your account's Configuration section.

agent := stackimpact.NewAgent()
agent.Start(stackimpact.Options{
    AgentKey: "agent key here",
    AppName: "MyGoApp",
})
Other initialization options:

  • AppVersion (Optional) Sets application version, which can be used to associate profiling information with the source code release.
  • AppEnvironment (Optional) Used to differentiate applications in different environments.
  • HostName (Optional) Overrides the reported host name; by default the OS hostname is used.
  • Debug (Optional) Enables debug logging.
  • DashboardAddress (Optional) Used by on-premises deployments only.

Basic example

package main

import (
    "fmt"
    "net/http"

    "github.com/stackimpact/stackimpact-go"
)

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Hello world!")
}

func main() {
    agent := stackimpact.NewAgent()
    agent.Start(stackimpact.Options{
        AgentKey:       "agent key here",
        AppName:        "Basic Go Server",
        AppVersion:     "1.0.0",
        AppEnvironment: "production",
    })

    http.HandleFunc(agent.MeasureHandlerSegment("/", handler)) // MeasureHandlerSegment wrapper is optional
    http.ListenAndServe(":8080", nil)
}

Measuring code segments (optional)

To measure the execution time of arbitrary parts of the application, the segment API can be used.

// Starts measurement of execution time of a code segment.
// To stop measurement call Stop on returned Segment object.
// After calling Stop the segment is recorded, aggregated and
// reported with regular intervals.
segment := agent.MeasureSegment("Segment1")
defer segment.Stop()
// A helper function to measure HTTP handlers by wrapping HandleFunc parameters.
// Usage example:
//   http.HandleFunc(agent.MeasureHandlerSegment("/some-path", someHandlerFunc))
pattern, wrappedHandlerFunc := agent.MeasureHandlerSegment(pattern, handlerFunc)

Monitoring errors (optional)

To monitor exceptions and panics with stack traces, the error recording API can be used.

Recording handled errors:

// Aggregates and reports errors with regular intervals.
agent.RecordError(err)

Recording panics without recovering:

// Aggregates and reports panics with regular intervals.
defer agent.RecordPanic()

Recording and recovering from panics:

// Aggregates and reports panics with regular intervals. This function also
// recovers from panics.
defer agent.RecordAndRecoverPanic()
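The recover-based mechanics behind RecordAndRecoverPanic can be illustrated in plain Go; recordAndRecoverPanic below is a hypothetical simplification, not the agent's code.

```go
package main

import "fmt"

// recordAndRecoverPanic is a hypothetical sketch of what the agent's
// RecordAndRecoverPanic does: recover from a panic and record it.
// It must be called via defer, since recover only works inside a
// deferred function.
func recordAndRecoverPanic(counts map[string]int) {
	if err := recover(); err != nil {
		counts[fmt.Sprint(err)]++ // aggregate panics by value
	}
}

func risky(counts map[string]int) {
	defer recordAndRecoverPanic(counts)
	panic("something went wrong")
}

func main() {
	counts := map[string]int{}
	risky(counts) // the panic is recovered, so the program continues
	fmt.Println(counts["something went wrong"]) // prints "1"
}
```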

Analyzing performance data in the Dashboard

Once your application is restarted, you can observe regular and anomaly-triggered CPU, memory, I/O and other hot spot profiles, execution bottlenecks, as well as process metrics in the Dashboard.


To enable debug logging, add Debug: true to the startup options. If the debug log doesn't give you any hints on how to fix a problem, please report it to our support team in your account's Support section.


Hot spot profiling

Profile recording and reporting by the agent

Each profiling report represents a series of profiles recorded by the agent regularly or on an application anomaly, such as a rapid change in a metric relevant to the profile type. Regular recording intervals are normally a few minutes long. The recording duration of a profile is limited to a few seconds, depending on the profile type and the associated overhead.

Historical profile grouping

Reports are shown for a selected application and time frame. The default view is Timeline, which presents profiles from multiple subsequent sources, e.g. machines or containers, in a single time sequence. If multiple sources report profiles simultaneously, e.g. when the application is scaled out to multiple machines or containers, not all profiles will be visible. It is also possible to select a single source.

Additionally, a time frame can be selected to filter recorded profiles.

Profiles history

The profiles chart shows a key measurement, e.g. the total or the maximum, for each recorded profile over time. Clicking a measurement point in the chart shows the profile for the selected time point as an expandable call tree. Every call (stack frame) in the call tree shows its own share of the total measurement as well as a trend based on the values of the same call in previous profiles.

Profile context

Profile context, which is displayed as a list of tags, reflects the application environment and state at the time the profile was recorded. The following entries are possible.

  • Host name – the host name of the host, instance or container the application is running on. The value is obtained from the system.
  • Runtime type – a language or platform, e.g. Go.
  • Runtime version – a version of the language or platform.
  • Application version – can be defined by the developer in the agent initialization statement.
  • Build ID – a prefix of the SHA-1 hash of the program.
  • Run ID – a unique ID for every (re)start of the application.
  • Agent version – version of the agent, which recorded the profile.

CPU usage profile

CPU usage profiles are recorded by Go’s built-in sampling profiler. In the Dashboard a profile is represented by a call tree with nodes corresponding to function calls. Each node’s value represents the percentage of absolute time the call was executing while the profile was being recorded. The percentage is a best-effort estimate of absolute execution time, calculated using the number of cores available to the process and the profiler's sampling rate. Additionally, the number of samples for each call is provided.

High CPU usage can have many root causes. Some of them are:

  • Algorithm complexity, i.e. the code has high time complexity. For example, it performs exponentially more steps relative to the size of the data it is processing.
  • Extensive garbage collection caused by too many objects being allocated and released.
  • Infinite or tight loops.

See also:

Memory allocation profile

Memory profiles are recorded by reading the current heap allocation statistics. Each node in the memory allocation profile call tree represents a line of code where memory was allocated and not released after garbage collection. The value of a node is the number of bytes allocated by a function call, or by some of its nested calls that allocate at some point. If a node has multiple child nodes, the node's value is the total of its children's values. The number of samples, shown next to the allocated size, corresponds to the number of allocated objects.

A single profile is not a good indicator of a memory leak, since memory can be released shortly after the allocation statistics were read. A better indication of a memory leak is a continuous increase of allocated memory at a single call node relative to its previous readings. Different types of memory leaks may manifest themselves at different time scales.

Memory leaks can have different root causes. Some of them are:

  • The pointer to which an object is assigned after allocation stays unreleased, e.g. it has a wrong scope.
  • A pointer is assigned to another pointer, which itself stays unreleased, as in the previous point.
  • Unintended allocation of memory, e.g. in a loop.

See also:

Network, system, lock and channel wait profiles

Wait time profiles represent a call tree where each node is a function call that waits for an event. The value is the aggregated waiting time of a call during one second. It can be larger than one second, because the same call can wait for an event simultaneously in different goroutines/threads. Events can be network reads and writes, system calls and mutex waits, as well as channel synchronization. The number of samples, shown next to the wait time, corresponds to the number of executions of the function call.

See also:

Bottleneck profiling

Bottleneck profiles are recorded, reported and represented identically to hot spot profiles, except that the values of function calls in the call trees are not aggregate values over time, but 95th percentiles of the call execution times during the recording period. The number of samples is the count of call executions seen by the profiler.

Unlike hot spot profiles, bottleneck profiles are built around a specific type of functionality, for example HTTP handlers. The call tree node values are the times the calls spent waiting on some blocking event, such as I/O, system calls, mutexes, etc.

HTTP handler profile

The HTTP handler bottleneck profile includes all function calls related to HTTP request execution that were waiting for some blocking event.

See also:

HTTP client profile

The HTTP client bottleneck profile includes all function calls related to outgoing HTTP client requests that were waiting for some blocking event.

Database client profile

The database client bottleneck profile includes all function calls related to database client commands that were waiting for some blocking event. Currently supported clients are SQL packages implementing the standard database/sql interface, as well as the most popular MongoDB and Redis packages.

See also:

Segment measurements

Measuring the execution time of custom code segments is possible using the agent's API. The agent aggregates and reports the execution time of a segment as the 95th percentile of all instances of the same subsegment during 60 seconds.

A helper wrapper for measuring execution time of HTTP handler functions is also available.

Measurement charts will be available in Dashboard's Bottlenecks section under Segments.

See also:

Error monitoring

The agent provides an API for reporting errors. When it is used, the Events -> Errors section will contain error profile reports for different types of errors. Each report is a collection of error profiles for a sequence of sources (Timeline), e.g. hosts or containers, or for a single source over an adjustable period of time.

A chart shows the total number of errors of a particular type over time. Clicking on a point selects the error profile corresponding to the selected time.

An error profile is a call tree, where each branch is an error stack trace and each node is an error stack frame. The value of a node indicates the number of times an error has occurred during 60 seconds.

See also:

Health monitoring

Agents report various metrics related to application execution, runtime, operating system and so on. The measurements are taken every 60 seconds. The following metrics are supported for Golang applications:

  • CPU
    • CPU usage – percentage of total time the CPU was busy. This is a best effort to calculate absolute CPU usage based on the number of cores available to the process.
    • CPU time – similar to CPU usage, but not converted to percentage.
  • Memory
    • Allocated memory – total number of bytes allocated and not garbage collected.
    • Mallocs – number of malloc operations per measurement interval.
    • Frees – number of free operations per measurement interval.
    • Lookups – number of pointer lookup operations per measurement interval.
    • Heap objects – number of heap objects.
    • Heap non-idle – heap space in bytes currently in use.
    • Heap idle – heap space in bytes currently unused.
    • Current RSS – resident set size, the portion of the process's memory held in RAM.
    • Max RSS – peak resident set size during application execution.
  • Garbage collection
    • Number of GCs – number of garbage collection cycles per measurement interval.
    • GC CPU fraction – fraction of CPU used by garbage collection.
    • GC total pause – amount of time garbage collection paused the application during the measurement interval.
  • Runtime
    • Number of goroutines – number of currently running goroutines.
    • Number of cgo calls – number of cgo calls made per measurement interval.

See also:

Agent overhead

Reporting CPU, network and system profiles requires regular and anomaly-triggered profiling and tracing to be activated for short periods of time. Unlike memory profiling and process-level metric reporting, these produce some overhead while active. The agent makes sure profiling is active no more than 5% of the time; while active, the overhead stays very low and has no significant effect on application execution.