Go’s garbage collector is a concurrent, tri-color mark-and-sweep collector that aims for low latency.

Let’s say you’re building a microservice that processes user requests. You’ve got a goroutine spinning up for each request, and you’re seeing latency spikes. You suspect the GC might be involved.

Before blaming the GC, it helps to look at the Go runtime’s scheduler, which is crucial for understanding how goroutines are managed and how they interact with the underlying OS threads.

The Go scheduler’s primary job is to multiplex a large number of goroutines onto a smaller number of OS threads (M:N scheduling). This allows Go to achieve high concurrency without the overhead of creating an OS thread for every concurrent task.
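To make the M:N idea concrete, here is a minimal sketch that starts many goroutines, parks them all on a channel, and counts how many exist at once. The helper name spawnAndCount is hypothetical and only serves this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// spawnAndCount starts n goroutines that park on a channel and returns
// how many goroutines exist while all of them are still alive.
// (Hypothetical helper, for illustration only.)
func spawnAndCount(n int) int {
	release := make(chan struct{})
	var started, done sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-release // parked here; a parked goroutine holds no OS thread
		}()
	}
	started.Wait() // every goroutine has started and cannot exit yet
	count := runtime.NumGoroutine()
	close(release) // wake everyone up
	done.Wait()
	return count
}

func main() {
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))
	fmt.Println("goroutines alive at once:", spawnAndCount(10000))
}
```

On a typical machine the goroutine count is in the thousands while GOMAXPROCS stays at the number of available CPUs, which is the whole point of multiplexing.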

Imagine you have 10,000 goroutines but only 4 CPU cores. The scheduler needs to efficiently switch between these goroutines to make progress on all of them. It does this by managing three key data structures:

  • P (Processor): Represents a logical processor, a scheduling context that an OS thread (M) must hold in order to run Go code. There are GOMAXPROCS Ps, and a P is not permanently bound to one M. Each P has a local queue of goroutines to execute.
  • M (Machine/OS Thread): An operating system thread that executes Go code while it holds a P. Ms are created dynamically by the Go runtime as needed, for example when existing Ms are blocked in system calls.
  • G (Goroutine): The lightweight concurrent execution unit in Go.
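Two of these three counts can be read directly from the runtime package. The sketch below is only illustrative (schedulerCounts is a hypothetical helper): runtime.GOMAXPROCS(0) queries the number of Ps without changing it, and runtime.NumGoroutine() reports the number of Gs. There is no equally direct call for the number of Ms.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerCounts reports the number of Ps, the number of logical CPUs,
// and the number of goroutines that currently exist.
// (Hypothetical helper, for illustration only.)
func schedulerCounts() (procs, cpus, goroutines int) {
	// Passing 0 to GOMAXPROCS returns the current value without changing it.
	return runtime.GOMAXPROCS(0), runtime.NumCPU(), runtime.NumGoroutine()
}

func main() {
	procs, cpus, gs := schedulerCounts()
	fmt.Printf("P (GOMAXPROCS): %d\n", procs)
	fmt.Printf("logical CPUs:   %d\n", cpus)
	fmt.Printf("G (goroutines): %d\n", gs)
}
```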

Here’s a simplified view of the scheduling loop:

  1. Goroutine runs: An M holding a P picks a G from that P’s local run queue and starts executing it. If the local queue is empty, it looks in the global queue or steals work from another P.
  2. Preemption/Yield: If the G runs for too long (the runtime’s background monitor requests preemption after roughly 10 ms), or if it yields explicitly (runtime.Gosched()), the G goes back onto a run queue and the M picks another G to run.
  3. System Call: If a G makes a blocking system call, its M blocks in the kernel with it. The runtime detaches the P from that M and hands it to another M (an idle one or a newly created one) so the P’s other Gs keep running. This is how Go avoids blocking all threads on I/O.
  4. GC Pause: The collector runs mostly concurrently, but each cycle includes brief stop-the-world phases. During these, the scheduler brings all running goroutines to a stop at safe points and resumes them afterwards.
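The preemption step can be observed directly. The sketch below assumes Go 1.14 or later on a platform where the runtime can interrupt tight loops asynchronously; on older versions it would hang. It restricts the program to a single P, starts a goroutine that spins without calling any function, and checks that the main goroutine still gets to run. The name preemptDemo is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// preemptDemo runs a tight loop on the only P and returns how many
// iterations it completed before the main goroutine managed to stop it.
// (Hypothetical helper; assumes Go 1.14+ asynchronous preemption.)
func preemptDemo() int64 {
	prev := runtime.GOMAXPROCS(1) // one P, so the goroutines must share it
	defer runtime.GOMAXPROCS(prev)

	var stop int32
	done := make(chan int64)
	go func() {
		var n int64
		for atomic.LoadInt32(&stop) == 0 {
			n++ // no function calls here, so no cooperative yield points
		}
		done <- n
	}()

	time.Sleep(50 * time.Millisecond) // main parks; the spinner takes the P
	// This line is reached only because the runtime preempted the spinner.
	atomic.StoreInt32(&stop, 1)
	return <-done
}

func main() {
	fmt.Println("spinner iterations before it was stopped:", preemptDemo())
}
```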

Let’s see this in action. Consider this simple concurrent program:

package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func worker(id int, wg *sync.WaitGroup) {
	defer wg.Done()
	fmt.Printf("Worker %d starting\n", id)
	// Simulate some work
	time.Sleep(time.Second)
	fmt.Printf("Worker %d done\n", id)
}

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU()) // Number of Ps, i.e. threads running Go code at once; explicit for illustration, since Go already defaults to the available CPUs
	var wg sync.WaitGroup
	numWorkers := 100

	fmt.Printf("Starting %d workers...\n", numWorkers)
	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go worker(i, &wg)
	}

	wg.Wait()
	fmt.Println("All workers finished.")
}

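When you run this, Go’s scheduler takes the 100 worker goroutines and multiplexes them onto OS threads, with at most runtime.GOMAXPROCS of those threads executing Go code at any moment. If runtime.NumCPU() is 4, the 100 goroutines share 4 Ps. Because time.Sleep parks a goroutine without holding a thread, all 100 workers sleep concurrently and the program finishes in about one second rather than 25.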
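A quick way to check how sleeping goroutines share threads is to time them. The following sketch (runSleepers is a hypothetical helper) starts n goroutines that each sleep for the same duration and measures the total wall-clock time. If each sleeper occupied a thread for its full duration, the total would grow with n; in practice it stays close to a single sleep.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runSleepers starts n goroutines that each sleep for ms milliseconds and
// returns the wall-clock time until all of them have finished.
// (Hypothetical helper, for illustration only.)
func runSleepers(n, ms int) time.Duration {
	start := time.Now()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Sleep parks the goroutine; its M is free to run other Gs.
			time.Sleep(time.Duration(ms) * time.Millisecond)
		}()
	}
	wg.Wait()
	return time.Since(start)
}

func main() {
	elapsed := runSleepers(100, 100)
	fmt.Printf("100 sleepers of 100ms each finished in %v\n", elapsed.Round(time.Millisecond))
}
```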

You can observe the scheduler’s behavior with the GODEBUG=schedtrace=1000 environment variable. The value is an interval in milliseconds, so this prints one summary line per second to standard error, showing the state of the scheduler: the number of P’s (gomaxprocs), how many of them are idle, the number of threads (M’s), and the lengths of the global and per-P run queues.

GODEBUG=schedtrace=1000 go run your_program.go

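The output might look something like this (the values are illustrative and vary by machine; newer Go releases print additional fields):

SCHED 0ms: gomaxprocs=4 idleprocs=2 threads=5 spinningthreads=1 idlethreads=1 runqueue=0 [0 0 0 0]
SCHED 1001ms: gomaxprocs=4 idleprocs=4 threads=6 spinningthreads=0 idlethreads=3 runqueue=0 [0 0 0 0]
...

This output, while dense, shows the scheduler’s activity. runqueue is the number of runnable goroutines in the global queue, the bracketed numbers are the lengths of each P’s local run queue, and idlethreads is the number of idle M’s.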

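The runtime.Gosched() function is a direct way to yield the current goroutine, allowing the scheduler to pick another one to run. The yielding goroutine stays runnable and resumes later, so this gives other goroutines a chance to execute without blocking. Since Go 1.14 the runtime can preempt tight loops on its own, so explicit yields like the one below are rarely required, but they remain a valid hint.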

func busyLoop() {
	for i := 0; i < 1000000; i++ {
		// Do some computation
		// ...
		if i%1000 == 0 {
			runtime.Gosched() // Yield to other goroutines periodically
		}
	}
}

A common misconception is that goroutines are directly mapped to OS threads. They are not; the M:N multiplexing is the key. Another is that runtime.Gosched() always causes a context switch. It allows one, but if no other goroutine is runnable, the caller simply continues.

The most surprising thing about the Go scheduler is how it handles blocking system calls. When a goroutine makes a blocking system call, the OS thread it’s running on (M) is blocked. To prevent this from halting all other goroutines, the Go runtime detaches the P from the blocked M and assigns it to another M (either an existing idle one or a new one). The blocked M continues to wait for the syscall to return, but the P and its goroutines are no longer held up. Once the syscall returns, the goroutine needs a P again: if one is free it resumes right away, otherwise it is queued as runnable and its M goes idle for later reuse. This is a core reason for Go’s excellent concurrency handling.
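The effect on the thread count can be made visible. This is a rough sketch under two assumptions: a Unix-like system (it uses syscall.Pipe to get a raw, blocking pipe that bypasses the runtime's network poller), and a half-second wait being long enough for every goroutine to reach its read. It uses the predefined "threadcreate" profile from runtime/pprof to count the threads the runtime has created. The name threadsWhileBlocked is hypothetical.

```go
package main

import (
	"fmt"
	"runtime/pprof"
	"sync"
	"syscall"
	"time"
)

// threadsWhileBlocked blocks n goroutines in a read on a raw pipe and
// returns the number of OS threads created before and while they block.
// (Hypothetical helper; assumes a Unix-like system.)
func threadsWhileBlocked(n int) (before, during int, err error) {
	var fds [2]int
	if err = syscall.Pipe(fds[:]); err != nil {
		return 0, 0, err
	}
	defer syscall.Close(fds[0])
	defer syscall.Close(fds[1])

	before = pprof.Lookup("threadcreate").Count()

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buf := make([]byte, 1)
			syscall.Read(fds[0], buf) // blocks this goroutine's M in the kernel
		}()
	}

	// Assumption: this is long enough for the runtime to hand off Ps and
	// for all n goroutines to reach the blocking read.
	time.Sleep(500 * time.Millisecond)
	during = pprof.Lookup("threadcreate").Count()

	syscall.Write(fds[1], make([]byte, n)) // one byte per blocked reader
	wg.Wait()
	return before, during, nil
}

func main() {
	before, during, err := threadsWhileBlocked(8)
	if err != nil {
		fmt.Println("pipe failed:", err)
		return
	}
	fmt.Printf("threads created before: %d, while 8 reads block: %d\n", before, during)
}
```

Since each blocked read pins one M, the second number should be at least the number of blocked goroutines, even when GOMAXPROCS is smaller than that.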

The next concept to explore is how the Go compiler and runtime work together to manage memory and perform garbage collection.
