Neo4j relationships have direction, and it matters more than you think for how fast your queries run.

Let’s see it in action. Imagine a simple social network in which people are connected by KNOWS relationships.

// Create the nodes and relationships in a single statement.
// Variables such as alice are scoped to one statement, so separate
// CREATE statements would attach KNOWS to new, unlabeled nodes.
CREATE (alice:Person {name: 'Alice'}),
       (bob:Person {name: 'Bob'}),
       (charlie:Person {name: 'Charlie'}),
       (alice)-[:KNOWS]->(bob),
       (bob)-[:KNOWS]->(charlie),
       (alice)-[:KNOWS]->(charlie);

Now, let’s query: "Who does Alice know?"

MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend:Person)
RETURN friend.name;

This query is fast. Once the Alice node has been located (through an index on name if one exists, otherwise by scanning the Person label), Neo4j follows only her outgoing KNOWS relationships to reach her friends. It never has to look at relationships that belong to other nodes.

What if we ask, "Who knows Alice?"

MATCH (person:Person)-[:KNOWS]->(alice:Person {name: 'Alice'})
RETURN person.name;

This query is also fast. Neo4j can start at the Alice node and follow her incoming KNOWS relationships. A relationship is stored once and can be traversed from either end, so the database handles both directions of traversal equally well.
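As a small illustration, the same question can be written from Alice's side with the arrow reversed; both spellings describe one pattern. Prefixing a query with PROFILE runs it and shows the plan, which is one way to check which node the planner actually starts from. The plan you see depends on your Neo4j version, indexes, and data, so treat this as a way to investigate rather than a guaranteed result.

```cypher
// Same pattern as the query above, written from Alice's side with the arrow reversed.
MATCH (alice:Person {name: 'Alice'})<-[:KNOWS]-(person:Person)
RETURN person.name;

// PROFILE executes the query and reports the plan operators and the rows each one produced.
PROFILE
MATCH (person:Person)-[:KNOWS]->(alice:Person {name: 'Alice'})
RETURN person.name;
```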

The real performance difference emerges when your graph gets large and your queries involve multiple hops or complex patterns where directionality isn’t perfectly aligned with your question.

Consider a scenario where you want to find people Alice knows who also know Charlie.

MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend:Person)-[:KNOWS]->(charlie:Person {name: 'Charlie'})
RETURN friend.name;

A typical plan starts at Alice, expands her outgoing KNOWS relationships, and then, for each friend found, expands that friend's outgoing KNOWS relationships looking for Charlie. This is efficient because both hops state a direction, so each expansion touches only relationships that can match.
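For contrast, here is a sketch of the same two-hop pattern with and without arrowheads. The undirected form is not just slower on large graphs; it also answers a different question, because it accepts KNOWS relationships pointing either way on each hop.

```cypher
// Directed: people Alice knows who in turn know Charlie.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend:Person)-[:KNOWS]->(charlie:Person {name: 'Charlie'})
RETURN friend.name;

// Undirected: anyone linked to both Alice and Charlie by KNOWS in any direction.
// Each hop expands incoming and outgoing relationships, so more paths are explored.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]-(friend:Person)-[:KNOWS]-(charlie:Person {name: 'Charlie'})
RETURN DISTINCT friend.name;
```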

Now, what if your data model isn’t so neat? Suppose you model FOLLOWS relationships, where Alice FOLLOWS Bob and Bob FOLLOWS Charlie. To ask "Who is followed by Alice?", you’d write:

MATCH (alice:Person {name: 'Alice'})-[:FOLLOWS]->(followed:Person)
RETURN followed.name;

This is fine. But what if you want to find people who follow Alice? If your data is only modeled as (follower)-[:FOLLOWS]->(followed), you’d need to search for incoming FOLLOWS relationships to Alice.

MATCH (follower:Person)-[:FOLLOWS]->(alice:Person {name: 'Alice'})
RETURN follower.name;

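This is still okay. The problem arises when a model mixes relationship types whose directions don’t intuitively map to your questions. For instance, suppose you have (person1)-[:WORKS_WITH]->(person2) and (person3)-[:MANAGES]->(person1), and you want everyone who manages Alice or works with Alice. MANAGES has to be followed against its arrow, into Alice. WORKS_WITH is symmetric in meaning, yet each relationship was stored in one arbitrary direction, so a pattern that follows only outgoing WORKS_WITH relationships silently misses the colleagues who were stored as the start node.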
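A minimal sketch of those two questions follows. It assumes the people in the MANAGES and WORKS_WITH example carry the Person label and a name property, which the example above does not state.

```cypher
// Everyone who manages Alice: follow MANAGES into Alice.
MATCH (manager:Person)-[:MANAGES]->(alice:Person {name: 'Alice'})
RETURN manager.name;

// Everyone who works with Alice, whichever way the relationship was stored.
// Omitting the arrowhead matches WORKS_WITH in both directions.
MATCH (alice:Person {name: 'Alice'})-[:WORKS_WITH]-(colleague:Person)
RETURN DISTINCT colleague.name;
```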

The core of Neo4j’s performance is its index-free adjacency: each node links directly to the relationships attached to it, both incoming and outgoing, so a traversal follows those links instead of consulting a global index. A relationship is stored once, with a start node and an end node, and is reachable from both. Traversing it against the direction in which it was created therefore costs the same as traversing it along that direction. What does cost more is leaving the direction (or the type) out of a pattern: Neo4j then expands relationships in both directions, which means more relationships visited and more intermediate rows, and on nodes with many relationships that extra work is noticeable.
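One way to get a feel for how much work an expansion from a node involves is to count its relationships per direction. The sketch below assumes Neo4j 5 or later, where the COUNT { ... } subquery syntax is available; on older versions a different counting expression would be needed.

```cypher
// Assumes Neo4j 5+ for COUNT { ... } subqueries.
MATCH (p:Person {name: 'Alice'})
RETURN COUNT { (p)-[:KNOWS]->() } AS outgoing,  // relationships starting at Alice
       COUNT { (p)<-[:KNOWS]-() } AS incoming,  // relationships ending at Alice
       COUNT { (p)-[:KNOWS]-() }  AS either;    // both directions together
```

If outgoing and incoming differ greatly for your busiest nodes, a pattern without an arrowhead will do correspondingly more work than a directed one.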

The key takeaway is to design your relationship directions so that your most frequent and performance-critical query patterns can always state a direction. Pick the direction that reads naturally for each relationship type and apply it consistently. If the fact is "A manages B", model it as (A)-[:MANAGES]->(B); "Who manages X?" then follows the relationship into X, and "Whom does X manage?" follows it out of X.

You can create relationships in both directions, e.g., (alice)-[:KNOWS]->(bob) and (bob)-[:KNOWS]->(alice), but this doubles storage for that relationship type and is rarely needed for speed, because a single relationship can already be traversed from either end. Reserve it for cases where the two directions mean different things, such as Alice following Bob without Bob following Alice. For symmetric facts, a single directed relationship is sufficient and more memory-efficient. Note that MATCH (a)-[:R]->(b) and MATCH (b)<-[:R]-(a) are the same pattern written two ways, and both are highly performant if the relationship is modeled as (a)-[:R]->(b).
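As a hedged sketch of the single-relationship approach, the queries below read and write a symmetric KNOWS fact without storing it twice. They assume the Person nodes for Alice and Bob already exist.

```cypher
// Read: match the one stored relationship from either side by omitting the arrowhead.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]-(other:Person)
RETURN DISTINCT other.name;

// Write: an undirected MERGE reuses an existing KNOWS in either direction
// and only creates a new relationship if none exists between the two nodes.
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
MERGE (a)-[:KNOWS]-(b);
```

The undirected MERGE is a convenient guard against accidentally ending up with one relationship in each direction for the same fact.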

When modeling, think about the verb of the relationship. "Alice knows Bob" is naturally (Alice)-[:KNOWS]->(Bob). "Bob is known by Alice" is the same fact read from the other end, so it needs no second relationship. To answer "Who is Bob known by?", keep the model as (Alice)-[:KNOWS]->(Bob) and query MATCH (bob:Person {name: 'Bob'})<-[:KNOWS]-(knower:Person), which will be as fast as querying the outgoing direction. The system doesn’t "prefer" outgoing or incoming traversals; it prefers patterns that state a type and a direction over patterns that leave them open.

The most surprising aspect is how little difference direction makes for simple, single-hop queries, which leads many to underestimate its importance. As query complexity grows, or as the graph scales to billions of relationships, every hop that expands more relationships than it needs to multiplies the work of the hops after it, and minor inefficiencies compound into significant performance degradation. The database is direction-agnostic at the conceptual level, in that any relationship can be queried from either end, but direction-aware at the physical storage and traversal level, where each relationship is recorded with exactly one start node and one end node.
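To see the compounding effect on your own data, one option is to compare a directed and an undirected variable-length traversal. This is only a sketch: the hop limit of 3 is an arbitrary choice for illustration, and the size of the gap depends entirely on the shape of your graph.

```cypher
// Directed: every hop follows outgoing KNOWS relationships only.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS*1..3]->(reachable:Person)
RETURN count(DISTINCT reachable) AS directed_reach;

// Undirected: every hop may follow KNOWS in either direction,
// so the number of candidate paths grows much faster with each hop.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS*1..3]-(reachable:Person)
RETURN count(DISTINCT reachable) AS undirected_reach;
```

Running both under PROFILE shows how many rows each expansion produces, which makes the difference concrete.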

The next logical step is understanding how to optimize these relationships further using indexes and constraints.
