CGo doesn’t just add a tiny overhead; it fundamentally changes how your Go program interacts with the outside world, and that interaction can become a surprisingly expensive bottleneck.
Let’s see this in action. Imagine a simple Go function that calls a C function, which in turn does almost nothing.
```go
package main

/*
#include <stdio.h>

void noop() {
    // Does absolutely nothing.
}
*/
import "C"

import (
	"fmt"
	"time"
)

func main() {
	start := time.Now()
	for i := 0; i < 1000000; i++ {
		C.noop()
	}
	duration := time.Since(start)
	fmt.Printf("Took %s for 1,000,000 CGo calls.\n", duration)
}
```
If you run this, you might be surprised by the duration. A plain Go function call typically costs around a nanosecond or less, while a CGo call often costs tens or even hundreds of nanoseconds depending on the Go version and hardware, so a million calls add up significantly. This isn’t just a function call; it’s a cross-language boundary traversal.
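To put the CGo number in context, it helps to time an equivalent pure-Go call with the same loop. The following is a minimal sketch, not a rigorous benchmark: `goNoop` and `nsPerCall` are hypothetical names introduced here, and the `//go:noinline` directive is there only so the compiler does not optimize the call away.

```go
package main

import (
	"fmt"
	"time"
)

// goNoop is a hypothetical pure-Go counterpart to C.noop.
// The noinline directive keeps the compiler from removing the call.
//
//go:noinline
func goNoop() {}

// nsPerCall invokes f n times (n must be > 0) and returns the
// average wall-clock cost of one invocation in nanoseconds.
func nsPerCall(n int, f func()) float64 {
	start := time.Now()
	for i := 0; i < n; i++ {
		f()
	}
	return float64(time.Since(start).Nanoseconds()) / float64(n)
}

func main() {
	const calls = 1_000_000
	fmt.Printf("Pure-Go call: %.2f ns on average\n", nsPerCall(calls, goNoop))
}
```

In the CGo program, passing `func() { C.noop() }` to the same helper would give a per-call figure measured the same way. For numbers you intend to rely on, the `testing` package's benchmark support is a better tool than a hand-rolled loop.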
The core problem CGo solves is enabling Go programs to call into C libraries. This is incredibly powerful for leveraging existing C code, accessing operating system primitives, or using high-performance C libraries. However, the mechanism by which this happens is what incurs the cost.
When a Go function calls a C function using CGo, the Go runtime has to perform a series of actions to transition from Go’s execution environment to C’s. This involves:
- Stack Management: Goroutines run on small, growable stacks that C code cannot use safely. For the duration of the call, the runtime switches to the OS thread’s system stack, which is large enough for ordinary C code, and switches back when the C function returns. Stack frames are not copied.
- Argument Marshaling: Go types such as strings, slices, and maps do not share a layout with C types, and CGo does not convert them automatically. You convert them explicitly: C.CString allocates memory on C’s heap and copies the string data, while a slice is usually passed as a pointer to its first element plus a length. Maps have no C equivalent and cannot be passed directly.
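- Runtime State Switching: The Go scheduler and garbage collector need to know that the goroutine has left Go code. The runtime treats the call much like a blocking system call, recording the state change on entry and undoing it on return. While the call is in progress, the C code occupies an OS thread and cannot be preempted by the Go scheduler; if it runs long, the runtime may let another thread take over scheduling so that other goroutines keep running.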
- System Call Overhead: If the C code itself makes system calls, those calls will also incur their usual overhead, but now they’re wrapped in the CGo transition.
- Return Value Handling: When the C function returns, the process reverses. Numeric results come back as C types such as C.int and need only a cheap conversion, but turning a char* into a Go string with C.GoString allocates Go memory and copies the bytes.
The C.noop() example highlights the overhead of just the transition itself. Even if the C function does nothing, the Go runtime still has to perform the stack setup, state switching, and return path setup.
Consider a more realistic scenario where you’re using a C library for high-performance I/O or cryptography. You might be passing Go strings to a C function that expects a char*.
```go
package main

/*
#include <string.h>
#include <stdlib.h>

void process_string(const char* str) {
    // In a real scenario, this would do something with the string.
    // For demonstration, we'll just use strlen, which is a C function.
    size_t len = strlen(str);
    // Pretend we did something computationally intensive.
    volatile size_t dummy = len * 2;
}
*/
import "C"

import (
	"fmt"
	"time"
	"unsafe"
)

func main() {
	goStr := "This is a moderately long string that we will pass to C."
	cStr := C.CString(goStr)           // Convert Go string to C string
	defer C.free(unsafe.Pointer(cStr)) // Free C string memory

	start := time.Now()
	for i := 0; i < 100000; i++ {
		C.process_string(cStr)
	}
	duration := time.Since(start)
	fmt.Printf("Took %s for 100,000 CGo calls with string processing.\n", duration)
}
```
In this example, C.CString(goStr) allocates memory on the C heap and copies the Go string data, and C.free is then necessary to prevent a memory leak. Because the conversion happens once, before the timer starts, the loop measures only the CGo transition plus a short strlen on each call to C.process_string. If the conversion were moved inside the loop, every iteration would also pay for an allocation, a copy, and a free. If process_string were more complex, it would add to the cost, but the CGo boundary is a constant tax.
The most surprising part for many is how much work C.CString and C.GoString actually do. They aren’t pointer casts: each one allocates memory and copies the data byte by byte, C.CString from Go’s memory into C’s heap and C.GoString in the other direction. This copy is often the hidden cost, especially when dealing with large strings or data buffers.
If you find yourself making frequent, small CGo calls, especially those involving string or slice conversions, you’re likely paying a significant performance penalty. Long-running or blocking C calls have a different cost: each one ties up an OS thread that the Go scheduler cannot preempt, so many concurrent calls can multiply the number of threads and, in the worst case, leave other goroutines waiting for CPU time.
The next pitfall you’ll likely encounter is understanding how to manage the C heap when interacting with C libraries, particularly around memory allocation and deallocation to avoid leaks or double-frees.