GraphQL Query Complexity: The Hidden Danger

GraphQL lets clients compose arbitrarily complex queries, which opens the door to denial-of-service attacks: a malicious client sends a single query that exhausts server resources and crashes the application.

Imagine a client asking for a user’s profile, then their friends, then each friend’s posts, and then each post’s comments, and for each comment, the author’s profile, and so on, recursively. A naive server would happily churn through this, burning CPU and memory until it can’t serve legitimate requests.

This is the problem: uncontrolled query depth and breadth.

To combat this, we implement query complexity analysis on the server. The core idea is to assign a "cost" to different parts of a GraphQL query and reject queries that exceed a predefined budget. The goal isn’t to block queries in general, only excessive ones.

Here’s how it works in practice. Most GraphQL server implementations (like Apollo Server for Node.js) offer middleware or plugins to hook into the query parsing and validation phase. Before the actual query execution even begins, we can analyze the Abstract Syntax Tree (AST) of the incoming query.
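As a rough illustration of analyzing a query before executing it, the sketch below measures nesting depth and rejects anything past a limit. The FieldSelection type, selectionDepth, and assertDepthWithin are hypothetical names for this example; a real server would walk the AST produced by its GraphQL parser rather than this simplified tree.

```typescript
// Hypothetical, simplified stand-in for a parsed selection set.
// A real server would walk the AST produced by its GraphQL parser instead.
interface FieldSelection {
  name: string;
  children?: FieldSelection[];
}

// Depth of a selection: 1 for a leaf field, 1 + the deepest child otherwise.
function selectionDepth(field: FieldSelection): number {
  const children = field.children ?? [];
  if (children.length === 0) return 1;
  return 1 + Math.max(...children.map((child) => selectionDepth(child)));
}

// Throws before "execution" if the query nests deeper than the allowed limit.
function assertDepthWithin(root: FieldSelection, maxDepth: number): void {
  const depth = selectionDepth(root);
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds limit ${maxDepth}`);
  }
}

// user -> friends -> posts -> comments -> author -> name
const nestedQuery: FieldSelection = {
  name: 'user',
  children: [{
    name: 'friends',
    children: [{
      name: 'posts',
      children: [{
        name: 'comments',
        children: [{ name: 'author', children: [{ name: 'name' }] }],
      }],
    }],
  }],
};

console.log(selectionDepth(nestedQuery));
```

A depth limit alone is a blunt tool, since it says nothing about how wide each level is, which is why the rest of this article focuses on cost.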

Let’s say we have a simple schema:

type User {
  id: ID!
  name: String!
  friends: [User!]!
  posts: [Post!]!
}

type Post {
  id: ID!
  title: String!
  comments: [Comment!]!
}

type Comment {
  id: ID!
  text: String!
  author: User!
}

type Query {
  user(id: ID!): User
}

We can define a cost function. A common approach is to assign a base cost of 1 to each field, then scale the cost of list fields by a factor, such as a pagination argument or a configured list size. For instance, fetching a User might cost 1, but fetching a User and their friends might cost 1 (for the user) plus N for the friends list, where N is a configurable factor.

Consider this query:

query GetUserAndFriends($userId: ID!) {
  user(id: $userId) {
    id
    name
    friends {
      id
      name
    }
  }
}

A simple cost calculator might see:

query (root): cost 0
user(id: $userId): cost 1 (base cost for the field)
id (under user): cost 1
name (under user): cost 1
friends: cost 1 (base cost for the field)
id (under friend): cost 1
name (under friend): cost 1

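Added up, the flat cost is 6, but that figure ignores how large the friends list is. Because the analysis runs before execution, the calculator cannot know how many friends the user actually has, so it has to work from an assumed or client-requested list size (a limit argument, for example). If we assume up to 100 items per list and charge 10 for each item, the friends field alone would cost 100 * 10 = 1000.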
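The walkthrough above can be sketched as a small recursive function. This is a minimal illustration under one possible convention (a list field multiplies the cost of its children by an assumed list size), not the algorithm of any particular library. FieldNodeLike, CostOptions, and estimateCost are hypothetical names.

```typescript
// Hypothetical, simplified stand-in for a parsed query; not a real GraphQL AST.
interface FieldNodeLike {
  name: string;
  isList?: boolean;
  children?: FieldNodeLike[];
}

interface CostOptions {
  fieldCost: number;       // base cost charged for every field
  assumedListSize: number; // static guess for how many items a list returns
}

// Cost of a field = its base cost + (list multiplier * cost of its children).
function estimateCost(field: FieldNodeLike, opts: CostOptions): number {
  const childCost = (field.children ?? []).reduce(
    (sum, child) => sum + estimateCost(child, opts),
    0,
  );
  const multiplier = field.isList ? opts.assumedListSize : 1;
  return opts.fieldCost + multiplier * childCost;
}

// The GetUserAndFriends query from above, written out by hand.
const getUserAndFriends: FieldNodeLike = {
  name: 'user',
  children: [
    { name: 'id' },
    { name: 'name' },
    { name: 'friends', isList: true, children: [{ name: 'id' }, { name: 'name' }] },
  ],
};

// Flat counting: every field costs 1, lists are not weighted.
const flatCost = estimateCost(getUserAndFriends, { fieldCost: 1, assumedListSize: 1 });
// Weighted: assume the friends list may return up to 100 items.
const weightedCost = estimateCost(getUserAndFriends, { fieldCost: 1, assumedListSize: 100 });

console.log(flatCost, weightedCost); // 6 204
```

Under this convention the weighted figure is 1 (user) + 1 (id) + 1 (name) + 1 (friends) + 100 * 2 (id and name per friend) = 204. Other conventions, such as a fixed charge per list item, give different totals; what matters is that the same rule is applied to every query.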

To implement this, we’d use a library like graphql-query-complexity. You’d typically configure it like this in your Apollo Server setup:

import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';
import { makeExecutableSchema } from '@graphql-tools/schema';
import { readFileSync } from 'fs';
import { GraphQLError } from 'graphql';
import { getComplexity, simpleEstimator } from 'graphql-query-complexity';

const typeDefs = readFileSync('./schema.graphql', 'utf-8');

// ... your resolvers ...

// Build the schema once so the complexity check and the server share it.
const schema = makeExecutableSchema({ typeDefs, resolvers });

// The maximum complexity a single operation may have.
const MAX_COMPLEXITY = 1000;

const complexityPlugin = {
  async requestDidStart() {
    return {
      // Runs after parsing and validation, before any resolver executes.
      async didResolveOperation({ request, document }) {
        const complexity = getComplexity({
          schema,
          operationName: request.operationName,
          query: document,
          variables: request.variables,
          estimators: [
            // Fallback estimator: every field costs 1 plus the cost of its children.
            simpleEstimator({ defaultComplexity: 1 }),
          ],
        });

        // Optional: log the computed complexity while tuning the budget.
        // console.log(`Query Complexity: ${complexity}`);

        if (complexity > MAX_COMPLEXITY) {
          throw new GraphQLError(
            `Query is too complex: ${complexity}. Maximum allowed complexity: ${MAX_COMPLEXITY}`,
            { extensions: { code: 'QUERY_COMPLEXITY_TOO_HIGH' } },
          );
        }
      },
    };
  },
};

const server = new ApolloServer({ schema, plugins: [complexityPlugin] });

const { url } = await startStandaloneServer(server);
console.log(`Server ready at ${url}`);

With this configuration, if a query’s calculated complexity exceeds 1000, the server rejects it with an error such as QUERY_COMPLEXITY_TOO_HIGH before any resolvers are invoked. The simpleEstimator is a good starting point: it charges a flat cost (1 by default) for every field plus the cost of that field’s children. It does not weight list fields on its own, so list-aware costs have to come from an additional estimator or a custom estimator function. You can customize costs per field. For example, fetching a user’s posts might be more expensive than fetching their name.
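A per-field cost table is one way to express "posts cost more than name". The sketch below is written against a reduced, assumed shape of the arguments an estimator receives (parent type name, field name, and the already-computed cost of the children). EstimatorArgsLike, FIELD_COSTS, and perFieldEstimator are hypothetical names, and the cost values are arbitrary.

```typescript
// Assumed, reduced shape of what an estimator is given for each field.
// Real libraries pass richer objects; only these three properties are used here.
interface EstimatorArgsLike {
  type: { name: string };   // parent type, e.g. "User"
  field: { name: string };  // field being costed, e.g. "posts"
  childComplexity: number;  // cost already computed for the field's selections
}

// Illustrative overrides, keyed by "Type.field". Values are arbitrary.
const FIELD_COSTS: Record<string, number> = {
  'User.friends': 5,
  'User.posts': 5,
  'Post.comments': 5,
};

// Charge the override if one exists, otherwise 1, plus the children's cost.
function perFieldEstimator(args: EstimatorArgsLike): number {
  const key = `${args.type.name}.${args.field.name}`;
  const ownCost = FIELD_COSTS[key] ?? 1;
  return ownCost + args.childComplexity;
}

// A scalar field with no override costs 1.
console.log(perFieldEstimator({ type: { name: 'User' }, field: { name: 'name' }, childComplexity: 0 }));
// posts selecting two scalar fields: 5 + 2 = 7.
console.log(perFieldEstimator({ type: { name: 'User' }, field: { name: 'posts' }, childComplexity: 2 }));
```

Keeping the overrides in one table makes it easy to review which fields are considered expensive and to adjust them as resolvers change.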

The key levers you control are the maximum complexity budget and the per-field costs in your estimators (simpleEstimator or a custom estimator function). You tune the budget based on your server’s capacity and typical query patterns. You might start with 1000 and monitor for rejected queries. If legitimate queries are being rejected, you increase it; if you’re still seeing performance issues or suspect abuse, you decrease it.
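One way to make that tuning loop concrete, assuming you log the computed complexity of each request, is to replay the logged values against candidate budgets and see what share would have been rejected. The rejectionRate helper is hypothetical and the sample numbers are made up purely for illustration.

```typescript
// Fraction of observed queries that a given budget would reject.
function rejectionRate(observed: number[], budget: number): number {
  if (observed.length === 0) return 0;
  const rejected = observed.filter((complexity) => complexity > budget).length;
  return rejected / observed.length;
}

// Made-up complexities, standing in for values logged from real traffic.
const loggedComplexities = [12, 40, 85, 230, 610, 950, 1200, 4800];

for (const budget of [500, 1000, 2000]) {
  const rate = rejectionRate(loggedComplexities, budget);
  console.log(`budget ${budget}: ${(rate * 100).toFixed(1)}% rejected`);
}
```

A budget that rejects a large share of known-good traffic is too tight. One that rejects nothing, even for the heaviest outliers, may be too loose to protect the server.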

The most surprising thing about query complexity is that it’s not just about preventing deeply nested queries; it’s also about preventing highly branched queries that, while shallow, can still fetch an enormous number of individual items. A query asking for the top 100 users, and for each user, their top 100 posts, and for each post, its top 100 comments, could touch up to 100 * 100 * 100 = 1,000,000 comments even though the "depth" is only 3.
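The arithmetic behind that example is worth spelling out. The fanOut helper below is a hypothetical name for this sketch; it multiplies the per-level list limits to show how many items each level can produce in the worst case.

```typescript
// Worst-case item count at each nesting level, given the list limit per level.
function fanOut(limits: number[]): number[] {
  const perLevel: number[] = [];
  let running = 1;
  for (const limit of limits) {
    running *= limit;
    perLevel.push(running);
  }
  return perLevel;
}

// 100 users, 100 posts per user, 100 comments per post.
const levels = fanOut([100, 100, 100]);
const totalItems = levels.reduce((sum, count) => sum + count, 0);

console.log(levels);     // [ 100, 10000, 1000000 ]
console.log(totalItems); // 1010100
```

Three levels of nesting are enough to reach over a million items, which is why a depth limit by itself does not bound the work a query can trigger.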

The next problem you’ll face is handling batched queries, where multiple independent GraphQL requests are sent in a single HTTP POST.