GraphQL’s ability to fetch exactly the data you need is powerful, but it can also obscure what’s happening under the hood, making it hard to debug performance issues or security concerns.

Let’s see this in action. Imagine a simple GraphQL API backed by a few microservices. A client asks for user(id: 1) { name posts { title } }.

query GetUserAndPosts($userId: ID!) {
  user(id: $userId) {
    name
    posts {
      title
    }
  }
}

Without proper logging and tracing, when this request is slow, you’re flying blind. You don’t know if it’s the GraphQL server itself, the user resolver, the posts resolver, or one of the downstream services that’s the bottleneck.

The Core Problem: Opaque Execution

The GraphQL server receives a query, parses it, validates it, and then executes it. This execution is often a tree of resolver calls, potentially spanning multiple services. Each resolver might perform I/O, computation, or call other services. The problem is that by default, the server doesn’t tell you how long each of these steps took, or what data was actually requested and returned at each stage.
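To make the missing information concrete, here is a minimal sketch of timing a resolver by hand. The names withTiming, TimingRecord, and the sample resolver are hypothetical and not part of any library; a tracing extension automates roughly this idea for every field.

```typescript
// A resolver here is any async function from (parent, args) to a value.
type Resolver<P, A, R> = (parent: P, args: A) => Promise<R>;

interface TimingRecord {
  field: string;      // e.g. "Query.user"
  durationMs: number; // wall-clock time spent inside the resolver
}

// Wraps a resolver so that every call appends a TimingRecord to `sink`,
// whether the resolver succeeds or throws.
function withTiming<P, A, R>(
  field: string,
  resolver: Resolver<P, A, R>,
  sink: TimingRecord[],
): Resolver<P, A, R> {
  return async (parent: P, args: A): Promise<R> => {
    const start = Date.now();
    try {
      return await resolver(parent, args);
    } finally {
      sink.push({ field, durationMs: Date.now() - start });
    }
  };
}

// Hypothetical resolver standing in for a call to a user service.
const timings: TimingRecord[] = [];
const userResolver = withTiming(
  'Query.user',
  async (_parent: unknown, args: { id: string }) => ({ id: args.id, name: 'Ada' }),
  timings,
);
```

Wrapping every resolver this way by hand does not scale, which is why the rest of this article relies on an extension that instruments all fields automatically.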

Introducing Request Logging and Tracing

Request logging captures details about each incoming GraphQL request: the query itself, variables, and potentially the operation name. Tracing, on its own or integrated with logging, breaks down the execution of that request into granular spans, showing the duration of each resolver and the relationships between them.

Let’s Build It: A Practical Setup

We’ll use Apollo Server as our GraphQL server and apollo-tracing to collect per-resolver timing data, which we’ll then export to a distributed tracing backend.

1. Enable Tracing in Apollo Server:

First, ensure you have graphql, graphql-tools, express, and apollo-server-express (or the integration for your chosen framework) installed.

npm install express graphql apollo-server-express graphql-tools
npm install --save apollo-tracing

In your Apollo Server setup, you’ll add a factory that creates a TracingExtension to the server configuration’s extensions array. This relies on the extensions option from Apollo Server 2.

import { ApolloServer } from 'apollo-server-express';
import { makeExecutableSchema } from 'graphql-tools';
import { GraphQLSchema } from 'graphql';
import { TracingExtension } from 'apollo-tracing'; // Import this

// Your typeDefs and resolvers go here
const typeDefs = `...`;
const resolvers = { ... };

const schema: GraphQLSchema = makeExecutableSchema({ typeDefs, resolvers });

const server = new ApolloServer({
  schema,
  extensions: [
    () => new TracingExtension(), // Add the tracing extension
  ],
});

// ... rest of your Express app setup

When a request comes in, apollo-tracing automatically generates trace data for each field being resolved.
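The trace data is attached to the GraphQL response under extensions.tracing, with durations and offsets expressed in nanoseconds. As a rough sketch of how you might inspect it, the helper below ranks resolvers by duration. The slowestResolvers function and the sample numbers are invented for illustration, and the interface only covers the fields used here.

```typescript
// Subset of the Apollo Tracing payload used by this sketch.
interface ResolverTrace {
  path: (string | number)[]; // e.g. ["user", "posts", 0, "title"]
  parentType: string;
  fieldName: string;
  returnType: string;
  startOffset: number; // nanoseconds since the request started
  duration: number;    // nanoseconds
}

interface TracingPayload {
  version: number;
  startTime: string;
  endTime: string;
  duration: number;
  execution: { resolvers: ResolverTrace[] };
}

// Returns the `limit` slowest resolvers, with durations converted to milliseconds.
function slowestResolvers(
  trace: TracingPayload,
  limit: number,
): { path: string; ms: number }[] {
  return trace.execution.resolvers
    .slice() // copy so the original array is not reordered
    .sort((a, b) => b.duration - a.duration)
    .slice(0, limit)
    .map((r) => ({ path: r.path.join('.'), ms: r.duration / 1e6 }));
}

// Made-up sample data shaped like a response's extensions.tracing field.
const sampleTrace: TracingPayload = {
  version: 1,
  startTime: '2024-01-01T00:00:00.000Z',
  endTime: '2024-01-01T00:00:00.100Z',
  duration: 100e6,
  execution: {
    resolvers: [
      { path: ['user'], parentType: 'Query', fieldName: 'user', returnType: 'User', startOffset: 1e6, duration: 12e6 },
      { path: ['user', 'name'], parentType: 'User', fieldName: 'name', returnType: 'String', startOffset: 13e6, duration: 5e4 },
      { path: ['user', 'posts'], parentType: 'User', fieldName: 'posts', returnType: '[Post]', startOffset: 13e6, duration: 85e6 },
    ],
  },
};

const top = slowestResolvers(sampleTrace, 2);
```

With this sample, the posts resolver ranks first, which is the kind of signal you would otherwise have to guess at.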

2. Sending Traces to a Backend:

The TracingExtension generates trace data, but it needs to be sent somewhere for analysis. A common destination is a tracing backend like Jaeger, Zipkin, or Datadog. You’ll need a way to export these traces. Apollo Server doesn’t do this export directly; you’d typically add another extension or middleware for that.

For example, to send traces to Jaeger via its native protocol (using jaeger-client):

npm install jaeger-client

Then, in your server setup, you’d initialize the Jaeger tracer and configure an exporter. This is a bit more involved and often done at the application’s entry point.

// Example using a simplified Jaeger exporter setup
import { initTracer } from 'jaeger-client';

const config = {
  serviceName: 'my-graphql-api',
  sampler: {
    type: 'const',
    param: 1,
  },
  reporter: {
    logSpans: true, // For debugging, logs spans to console
    // You'd configure a real reporter for production, e.g., UDP to Jaeger agent
  },
};
const options = {
  // ... other tracer options
};
const tracer = initTracer(config, options);

// ... in your ApolloServer setup, you'd pass this tracer
// to a custom extension that pushes spans to the tracer.
// apollo-tracing's TracingExtension doesn't export directly,
// you'd create your own extension that consumes its data.
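One way to approach that custom extension is to first turn the per-resolver timings into plain span descriptions, and only then hand them to a tracer. The sketch below covers the first half. SpanDescription and toSpanDescriptions are hypothetical names, and the code deliberately avoids calling any real tracer API.

```typescript
// Per-resolver timing as produced by the tracing extension (subset of fields).
interface ResolverTiming {
  path: (string | number)[];
  parentType: string;
  fieldName: string;
  startOffset: number; // nanoseconds since the request started
  duration: number;    // nanoseconds
}

// Tracer-agnostic description of one span.
interface SpanDescription {
  name: string;              // e.g. "User.posts"
  path: string;              // e.g. "user.posts"
  parentPath: string | null; // nearest ancestor that has its own span
  startMs: number;           // epoch milliseconds
  endMs: number;             // epoch milliseconds
}

// Walks up the path until it finds an ancestor that was itself traced.
// List indices such as 0 in ["user", "posts", 0, "title"] have no span of their own.
function nearestTracedParent(
  path: (string | number)[],
  traced: Set<string>,
): string | null {
  for (let end = path.length - 1; end > 0; end--) {
    const candidate = path.slice(0, end).join('.');
    if (traced.has(candidate)) return candidate;
  }
  return null;
}

function toSpanDescriptions(
  requestStart: Date,
  resolvers: ResolverTiming[],
): SpanDescription[] {
  const traced = new Set(resolvers.map((r) => r.path.join('.')));
  const startEpochMs = requestStart.getTime();
  return resolvers.map((r) => {
    const startMs = startEpochMs + r.startOffset / 1e6;
    return {
      name: `${r.parentType}.${r.fieldName}`,
      path: r.path.join('.'),
      parentPath: nearestTracedParent(r.path, traced),
      startMs,
      endMs: startMs + r.duration / 1e6,
    };
  });
}
```

From here, each description could be replayed into the Jaeger tracer by starting a span at startMs under the span created for parentPath and finishing it at endMs. The exact options for explicit start and finish times depend on the tracer client, so check its documentation before relying on this mapping.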

3. Accessing Trace Data:

Once traces are being sent to your backend, you can use its UI to visualize them. You’ll see a timeline for each request, with individual resolver calls represented as colored bars. You can click on a bar to see its duration, the parent operation, and any tags (like the GraphQL field name).

Request Logging: The Complement

While tracing shows how long things took, logging shows what happened. You can add middleware to your Express app (or equivalent) to log details about each incoming request before it hits Apollo Server.

import express from 'express';
import { ApolloServer } from 'apollo-server-express';
// ... other imports

const app = express();

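// Middleware for logging incoming requests on '/graphql', Apollo Server's default path.
// express.json() must run first so that req.body is populated.
app.use('/graphql', express.json(), (req, res, next) => {
  // Assumes POST requests with a JSON body; GET requests have no body to log
  const { query, operationName } = req.body || {};
  if (typeof query === 'string') {
    console.log(`GraphQL Request: Operation=${operationName || 'N/A'}, Query=${query.substring(0, 100)}...`);
  }
  next();
});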

const server = new ApolloServer({ /* ... schema and extensions */ });
server.applyMiddleware({ app });

// ... start server

This gives you a chronological log of what was asked. Combining this with trace data allows you to correlate a slow request in your logs with its detailed execution breakdown in your tracing system.
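Correlation only works if the log line and the trace share an identifier. A minimal sketch, assuming you generate a request ID per request and attach it both to the log entry and, as a tag, to the root span: buildLogEntry and RequestLogEntry are hypothetical names, and the ID here is passed in rather than generated.

```typescript
// Shape of a GraphQL-over-HTTP POST body.
interface GraphQLRequestBody {
  query?: string;
  variables?: Record<string, unknown>;
  operationName?: string | null;
}

// Structured log entry; requestId is the value to reuse as a span tag.
interface RequestLogEntry {
  requestId: string;
  operationName: string;
  queryPreview: string;
  variableNames: string[]; // names only, since values may be sensitive
}

function buildLogEntry(body: GraphQLRequestBody, requestId: string): RequestLogEntry {
  const query = body.query ?? '';
  return {
    requestId,
    operationName: body.operationName ?? 'N/A',
    // Truncate long queries so log lines stay readable
    queryPreview: query.length > 100 ? `${query.slice(0, 100)}...` : query,
    variableNames: Object.keys(body.variables ?? {}),
  };
}

const entry = buildLogEntry(
  {
    query: 'query GetUserAndPosts($userId: ID!) { user(id: $userId) { name } }',
    variables: { userId: '1' },
    operationName: 'GetUserAndPosts',
  },
  'req-42',
);
const logLine = JSON.stringify(entry);
```

Emitting the entry as JSON makes it searchable by requestId in a log aggregator, and logging variable names instead of values avoids writing user data into the logs.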

The Mental Model: A Distributed System View

Think of your GraphQL API not as a monolith, but as the orchestrator of a microservice ecosystem. Each GraphQL field resolution is a potential call to another service. Tracing provides the end-to-end view of this orchestration, showing where delays occur. Logging provides the audit trail of requests made.

The Counterintuitive Part: Resolver Performance is Not Uniform

It’s tempting to assume that if a GraphQL query is slow, a single resolver must be slow. In reality, the cumulative effect of many individually fast resolvers can lead to a slow overall request. A query asking for 100 posts, each requiring a separate database lookup, might have resolvers that take only 5ms each, but if those lookups run one after another, 100 * 5ms = 500ms. Tracing will reveal this aggregation of work, whereas simple logging might just show the total request time.
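To spot this pattern in trace data, it helps to group resolver calls by field and add up their durations. The sketch below does that. totalsByField is a hypothetical helper, Post.author is an invented field standing in for the per-post lookup, and the total is only an upper bound on added latency because resolvers may overlap in time.

```typescript
// One resolver call from the trace; duration is in nanoseconds.
interface ResolverCall {
  parentType: string;
  fieldName: string;
  duration: number;
}

interface FieldTotal {
  field: string;   // e.g. "Post.author"
  calls: number;   // how many times the resolver ran
  totalMs: number; // summed duration in milliseconds
}

// Groups calls by "Type.field" and sorts by total time, largest first.
function totalsByField(calls: ResolverCall[]): FieldTotal[] {
  const totals = new Map<string, FieldTotal>();
  for (const call of calls) {
    const field = `${call.parentType}.${call.fieldName}`;
    const entry = totals.get(field) ?? { field, calls: 0, totalMs: 0 };
    entry.calls += 1;
    entry.totalMs += call.duration / 1e6;
    totals.set(field, entry);
  }
  return Array.from(totals.values()).sort((a, b) => b.totalMs - a.totalMs);
}

// Made-up trace: one 20 ms list resolver plus 100 lookups of 5 ms each.
const calls: ResolverCall[] = [
  { parentType: 'Query', fieldName: 'posts', duration: 20e6 },
];
for (let i = 0; i < 100; i++) {
  calls.push({ parentType: 'Post', fieldName: 'author', duration: 5e6 });
}

const totals = totalsByField(calls);
```

In this sample, the 5ms resolver tops the list with 500ms across 100 calls, even though no single call looks slow.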

The Next Step: Performance Optimization

Once you can see which resolvers are slow or contributing to overall latency, you can start optimizing. This might involve batching requests to downstream services (e.g., instead of 100 individual DB lookups, fetch all 100 IDs in one go), caching responses, or re-architecting parts of your data-fetching layer.
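As an illustration of batching, here is a minimal sketch that collects every key requested in the same tick and resolves them all with one downstream call. createBatcher and the fake fetch function are hypothetical. In practice you would more likely reach for an existing library such as dataloader, which also handles caching and error cases this sketch ignores.

```typescript
// Fetches values for many keys at once; must return values in key order.
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface PendingLoad<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

// Returns a load(key) function that groups all calls made in the same tick
// into a single batchFn invocation.
function createBatcher<K, V>(batchFn: BatchFn<K, V>): (key: K) => Promise<V> {
  let pending: PendingLoad<K, V>[] = [];

  return (key: K) =>
    new Promise<V>((resolve, reject) => {
      pending.push({ key, resolve, reject });
      if (pending.length === 1) {
        // First key of this tick: schedule one flush on the microtask queue.
        Promise.resolve().then(() => {
          const batch = pending;
          pending = [];
          batchFn(batch.map((p) => p.key)).then(
            (values) => batch.forEach((p, i) => p.resolve(values[i] as V)),
            (error) => batch.forEach((p) => p.reject(error)),
          );
        });
      }
    });
}

// Fake downstream call that counts how often it is invoked.
let batchCalls = 0;
const loadTitle = createBatcher<number, string>(async (ids) => {
  batchCalls += 1;
  return ids.map((id) => `Post ${id}`);
});

// Three loads issued together result in a single downstream call.
const titlesPromise = Promise.all([loadTitle(1), loadTitle(2), loadTitle(3)]);
```

Each resolver still calls loadTitle with its own ID, so the resolver code stays simple while the number of downstream requests drops from one per post to one per tick.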
