The most surprising thing about Go’s pprof is how little instrumentation it actually needs to give you a crystal-clear picture of your application’s performance.

Let’s see it in action. Imagine you have a simple Go web server that does some work, and you want to profile it.

package main

import (
	"fmt"
	"net/http"
	_ "net/http/pprof" // Blank import registers the pprof handlers on http.DefaultServeMux
	"runtime"
	"time"
)

func heavyWork() {
	// Simulate some CPU-bound work
	var sum int64
	for i := 0; i < 1000000; i++ {
		sum += int64(i)
	}
	runtime.KeepAlive(sum) // Prevent compiler from optimizing away the work

	// Simulate some memory allocation
	_ = make([]byte, 1024*1024) // Allocate 1MB
}

func handler(w http.ResponseWriter, r *http.Request) {
	heavyWork()
	fmt.Fprintf(w, "Done with heavy work!")
}

func main() {
	http.HandleFunc("/", handler)
	fmt.Println("Starting server on :8080")
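	go func() {
		// Background goroutine that wakes every 10 seconds. It is almost always
		// idle, so it appears in the goroutine profile rather than the CPU profile.
		for {
			time.Sleep(10 * time.Second)
		}
	}()
	// A nil handler means http.DefaultServeMux, where the pprof handlers live.
	if err := http.ListenAndServe(":8080", nil); err != nil {
		fmt.Println("server error:", err)
	}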
}

To enable pprof, you just need the blank import _ "net/http/pprof". Its init function registers handlers under /debug/pprof/ on http.DefaultServeMux, which is the mux used when you pass nil to http.ListenAndServe. Once the server is running, you can access these endpoints.

Run this code: go run your_program.go

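Then, in another terminal, you can fetch profiles. For CPU profiling, you typically collect for a short duration, say 30 seconds. Send some requests to the server while the profile is being collected, otherwise there is very little to sample. Quote the URL so the shell does not interpret the ? character:

go tool pprof "http://localhost:8080/debug/pprof/profile?seconds=30"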

This command fetches the CPU profile and drops you into the pprof interactive shell. Here are some essential commands:

  • top: Shows the functions consuming the most CPU time.
  • list <regex>: Shows the source code of every function matching the regular expression, with line-by-line CPU usage.
  • web: Generates a call graph visualization and opens it in a browser (requires Graphviz).
  • peek <regex>: Shows callers and callees of the matching functions.

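Let’s look at memory. Beyond the CPU profile, the net/http/pprof package exposes several other profiles, including:

  • /debug/pprof/heap: A sampling of allocations that are still live on the heap.
  • /debug/pprof/allocs: A sampling of all allocations since program start, including those already freed.
  • /debug/pprof/goroutine: Stack traces of all current goroutines (not a memory profile, but useful alongside the other two).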
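The same named profiles can also be read from inside the program through runtime/pprof, which is handy for a quick sanity check. The sketch below uses pprof.Lookup to read the profile behind /debug/pprof/goroutine; goroutineCount and spawnBlocked are hypothetical helpers written for this example.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
	"sync"
)

// goroutineCount reports how many goroutines the "goroutine" profile
// currently contains.
func goroutineCount() int {
	return pprof.Lookup("goroutine").Count()
}

// spawnBlocked starts n goroutines that block until the returned stop
// function is called. They stand in for idle or leaked goroutines.
func spawnBlocked(n int) (stop func()) {
	done := make(chan struct{})
	var started, finished sync.WaitGroup
	started.Add(n)
	finished.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer finished.Done()
			started.Done()
			<-done // block here until stop is called
		}()
	}
	started.Wait()
	return func() {
		close(done)
		finished.Wait()
	}
}

func main() {
	before := goroutineCount()
	stop := spawnBlocked(3)
	defer stop()

	fmt.Printf("goroutines before: %d, now: %d\n", before, goroutineCount())

	// debug=1 writes a human-readable text form instead of the binary format.
	if err := pprof.Lookup("goroutine").WriteTo(os.Stdout, 1); err != nil {
		fmt.Println("write goroutine profile:", err)
	}
}
```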

To analyze heap usage:

go tool pprof http://localhost:8080/debug/pprof/heap

And for cumulative allocations:

go tool pprof http://localhost:8080/debug/pprof/allocs

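The pprof tool lets you analyze these profiles with the same commands (top, list, web, peek). The heap and allocs endpoints carry the same four sample types (inuse_space, inuse_objects, alloc_space, alloc_objects) and differ only in which one is displayed by default. You can switch between them with the -sample_index flag, for example go tool pprof -sample_index=alloc_space http://localhost:8080/debug/pprof/heap. Note that sample_index is an option of the pprof tool, not a URL parameter, and it selects a sample type rather than a number of samples. The sampling rate itself is set inside the program through runtime.MemProfileRate.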

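Understanding the output is key. In top, the flat and flat% columns cover the time spent in the function itself, while cum and cum% also include the time spent in its callees. For memory profiles (heap, allocs), the values are bytes or object counts depending on the selected sample type: heap defaults to memory currently in use (inuse_space) and allocs to everything allocated since the program started (alloc_space).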
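In top's output, flat counts only the samples where the function itself was running, and cum also counts the samples where it was somewhere further up the call stack. The following toy model makes that concrete. It is not pprof's actual implementation: each sample is a call stack listed leaf first, and the stacks themselves are made up for illustration.

```go
package main

import "fmt"

// flatAndCum tallies samples per function. Each sample is one call stack,
// ordered leaf first. flat counts samples where the function is the leaf;
// cum counts samples where it appears anywhere in the stack, at most once
// per sample so that recursion is not double counted.
func flatAndCum(samples [][]string) (flat, cum map[string]int) {
	flat = make(map[string]int)
	cum = make(map[string]int)
	for _, stack := range samples {
		if len(stack) == 0 {
			continue
		}
		flat[stack[0]]++
		seen := make(map[string]bool)
		for _, fn := range stack {
			if !seen[fn] {
				seen[fn] = true
				cum[fn]++
			}
		}
	}
	return flat, cum
}

func main() {
	// Hypothetical samples, loosely modeled on the server above.
	samples := [][]string{
		{"main.heavyWork", "main.handler", "net/http.serve"},
		{"main.heavyWork", "main.handler", "net/http.serve"},
		{"runtime.mallocgc", "main.heavyWork", "main.handler", "net/http.serve"},
		{"main.handler", "net/http.serve"},
	}
	flat, cum := flatAndCum(samples)
	for _, fn := range []string{"main.heavyWork", "main.handler"} {
		fmt.Printf("%-16s flat=%d cum=%d\n", fn, flat[fn], cum[fn])
	}
}
```

With these four samples, main.handler has a flat count of 1 but a cum count of 4: it is rarely the function doing the work, yet every sampled stack passes through it.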

The allocs profile is particularly useful for identifying allocation hotspots that might lead to increased garbage collection pressure, even if the current heap size isn’t alarming. For instance, if allocs shows a significant percentage in a function that’s called frequently but whose allocations are short-lived, it’s still a potential GC problem.

The mental model here is that pprof works by sampling. For CPU, the runtime interrupts the program about 100 times per second and records the call stack that was running. For memory, the runtime samples allocations as they happen (by default about one sample for every 512 KiB allocated) and also notes when a sampled object is freed; the heap and allocs views are both derived from those same samples. The more frequently something happens or the larger it is, the more likely it is to be captured, making it appear prominently in the profiles.
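The memory sampling rate is exposed as runtime.MemProfileRate. As a rough sketch, the program below raises it so that every allocation is sampled and then writes a heap profile without going through HTTP. The helpers allocate and heapProfile and the file name heap.pprof are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// retained keeps allocations reachable so they count as live heap.
var retained [][]byte

// allocate creates n buffers of the given size and retains them.
func allocate(n, size int) {
	for i := 0; i < n; i++ {
		retained = append(retained, make([]byte, size))
	}
}

// heapProfile returns the current heap profile in pprof's binary format.
func heapProfile() ([]byte, error) {
	runtime.GC() // get up-to-date statistics
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Sample every allocation instead of the default of roughly one sample
	// per 512 KiB allocated. Set this once, as early as possible. It adds
	// noticeable overhead, so treat it as a debugging setting.
	runtime.MemProfileRate = 1

	allocate(100, 64*1024)

	data, err := heapProfile()
	if err != nil {
		fmt.Println("heap profile:", err)
		return
	}
	if err := os.WriteFile("heap.pprof", data, 0o644); err != nil {
		fmt.Println("write file:", err)
		return
	}
	fmt.Printf("wrote %d bytes; inspect with: go tool pprof heap.pprof\n", len(data))
}
```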

A common mistake is to look only at the heap profile. It shows what is live right now, whereas the allocs profile shows how much each call site has allocated since the program started, which is often a more direct indicator of GC pressure. A function responsible for a large share of total allocations adds GC work even if those allocations are freed quickly.
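The gap between the two views can be seen without pprof at all. The sketch below uses runtime.MemStats as a stand-in: TotalAlloc plays roughly the role of the allocs profile and HeapAlloc the role of the heap profile. The helpers churn and measure are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink forces each buffer onto the heap; overwriting it turns the
// previous buffer into garbage.
var sink []byte

// churn allocates n short-lived buffers of size bytes each.
func churn(n, size int) {
	for i := 0; i < n; i++ {
		sink = make([]byte, size)
	}
	sink = nil
}

// measure runs fn and reports how many bytes were allocated in total and
// how much the live heap grew once garbage had been collected.
func measure(fn func()) (allocated uint64, liveGrowth int64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	fn()
	runtime.GC()
	runtime.ReadMemStats(&after)
	allocated = after.TotalAlloc - before.TotalAlloc
	liveGrowth = int64(after.HeapAlloc) - int64(before.HeapAlloc)
	return allocated, liveGrowth
}

func main() {
	allocated, liveGrowth := measure(func() { churn(200, 1<<20) })
	fmt.Printf("allocated during run: %d MiB\n", allocated>>20)
	fmt.Printf("live heap growth after GC: %d KiB\n", liveGrowth/1024)
}
```

A heap profile taken after this run would show almost nothing for churn, while an allocs profile would attribute every buffer it created to it.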

The next concept you’ll likely encounter is understanding how to interpret goroutine profiles and identify goroutine leaks, especially when dealing with concurrent applications.
