Neo4j Fabric lets you query across multiple independent Neo4j databases as if they were one, without moving the data.

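Here’s what that looks like in practice. Imagine you have two distinct Neo4j databases: users_db and orders_db. (Neo4j restricts database names to ASCII letters, digits, dots, and dashes, so treat the underscore names in this walkthrough as placeholders and adapt them to your own naming.)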

users_db sample data:

(:User {userId: 'u123', name: 'Alice'})
(:User {userId: 'u456', name: 'Bob'})

orders_db sample data:

(:Order {orderId: 'o789', amount: 100, userId: 'u123'})
(:Order {orderId: 'o012', amount: 250, userId: 'u456'})
(:Order {orderId: 'o345', amount: 75, userId: 'u123'})
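To follow along, the sample data above could be loaded with two small statements, one per database. This is a minimal sketch; how you switch databases (for example with :use in Neo4j Browser or cypher-shell) depends on your client.

```cypher
// Run while connected to users_db
CREATE (:User {userId: 'u123', name: 'Alice'}),
       (:User {userId: 'u456', name: 'Bob'});

// Run while connected to orders_db
CREATE (:Order {orderId: 'o789', amount: 100, userId: 'u123'}),
       (:Order {orderId: 'o012', amount: 250, userId: 'u456'}),
       (:Order {orderId: 'o345', amount: 75, userId: 'u123'});
```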

With Fabric, you can write a single query that joins data from both. In Neo4j 5, Fabric is exposed through composite databases: you create a composite database (the unified view, called the "Fabric" here) and add one database alias per constituent database (the "Cores").

Let’s assume you’ve installed Neo4j 5.13+ Enterprise Edition (composite databases are an Enterprise feature) and have two running Neo4j instances (or have created two separate databases within a single instance using different database names).

1. Create the Fabric:

In your Neo4j browser, connect to the system database of the instance that will host your Fabric and create a composite database. This is the entity that will orchestrate queries across the Cores.

CREATE COMPOSITE DATABASE my_fabric

This command creates an empty composite database named my_fabric. It has to exist before any Core can be attached to it.

2. Define the Cores (Databases):

Each Core is a database alias created inside the composite database.

To register users_db (assuming it’s hosted on the same instance as the Fabric, at bolt://localhost:7687), a local alias is enough:

CREATE ALIAS my_fabric.users_core FOR DATABASE users_db

To register orders_db (assuming it’s on a second instance listening on localhost:7688), create a remote alias. The neo4j:// URI scheme, the user fabric_user, and the password below are placeholders for your own connection details, and remote aliases need additional server-side configuration that is described in the Neo4j operations manual:

CREATE ALIAS my_fabric.orders_core FOR DATABASE orders_db AT "neo4j://localhost:7688" USER fabric_user PASSWORD '<password>'

These commands tell Fabric where to find your individual graph databases and give them logical names within the Fabric: my_fabric.users_core and my_fabric.orders_core.
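In Neo4j 5, where Fabric is provided by composite databases and each Core is a database alias, one way to check the setup is to list the databases and aliases from the system database. This is a sketch for verification only; the exact columns shown depend on your Neo4j version.

```cypher
// Run while connected to the system database.

// The listing should include my_fabric.
SHOW DATABASES;

// The listing should include one alias per Core,
// each pointing at its target database.
SHOW ALIASES FOR DATABASE;
```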

3. Query Across Databases:

Now, the magic. You can query across users_db and orders_db using the Fabric. Connect to the Neo4j instance hosting the Fabric, select my_fabric as the current database, and use the USE clause to specify which Core each part of your query runs against.

To find all users and their orders:

CALL {
  USE my_fabric.users_core
  MATCH (u:User)
  RETURN u.userId AS userId, u.name AS userName
}
CALL {
  USE my_fabric.orders_core
  WITH userId
  MATCH (o:Order) WHERE o.userId = userId
  RETURN collect(properties(o)) AS orders
}
RETURN userName, orders

Output:

[
  {
    "userName": "Alice",
    "orders": [
      {"orderId": "o789", "amount": 100, "userId": "u123"},
      {"orderId": "o345", "amount": 75, "userId": "u123"}
    ]
  },
  {
    "userName": "Bob",
    "orders": [
      {"orderId": "o012", "amount": 250, "userId": "u456"}
    ]
  }
]

This query works by:

  • CALL { USE my_fabric.users_core ... }: The first subquery runs against users_core. It matches every User node and returns userId and name as plain values, because a node from one Core cannot be used to match in another.
  • CALL { USE my_fabric.orders_core ... }: The second subquery runs once per row produced by the first. WITH userId imports the value from the outer scope, and the subquery finds the Order nodes in orders_core whose userId matches it.
  • collect(properties(o)): Each user’s orders are aggregated into a list of property maps.
  • RETURN userName, orders: Fabric combines the rows from both Cores and returns one row per user.

Two details matter here. There is no MATCH at the top level, because the Fabric itself holds no data to match against. And the two subqueries are siblings rather than nested, because a subquery that targets one Core cannot contain a USE for a different Core.
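The same pattern extends to aggregates that span both databases. The sketch below assumes the Cores are reachable as my_fabric.users_core and my_fabric.orders_core (the qualified form that composite databases in Neo4j 5 use) and computes each user's order count and total spend.

```cypher
// Read users from one Core, passing plain values onward.
CALL {
  USE my_fabric.users_core
  MATCH (u:User)
  RETURN u.userId AS userId, u.name AS userName
}
// For each user row, aggregate matching orders in the other Core.
CALL {
  USE my_fabric.orders_core
  WITH userId
  MATCH (o:Order) WHERE o.userId = userId
  RETURN count(o) AS orderCount, sum(o.amount) AS total
}
RETURN userName, orderCount, total
ORDER BY total DESC
```

With the sample data above, this should return Bob with one order totalling 250, followed by Alice with two orders totalling 175.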

Fabric solves the problem of distributed graph data where you have logical separations (e.g., by domain, by tenant, by region) but need to perform analytical queries that span these boundaries without the overhead of data replication or complex ETL pipelines. It allows you to maintain independent operational databases while enabling holistic analytics.

The core mechanism is the USE clause within a Fabric query. It’s not just a directive; it determines which Core a query or subquery runs against, and Fabric opens a transaction on each Core the query touches. A single query can read from several Cores, but it can write to only one Core per transaction, so Fabric does not provide atomic commits that span databases. If a query targets a Core that is unavailable, the query fails with an error rather than returning partial results. It’s this ability to target and query specific, independently managed graph databases from one statement that makes Fabric so useful for federated graph management.

When you define a Fabric, you’re not creating a new store of graph data; you’re creating a query endpoint (in Neo4j 5, a composite database) that understands how to route Cypher statements to different underlying databases based on the USE clause. Each Core is essentially a logical name (a database alias) mapping to a specific Neo4j database, plus connection details when that database lives on another instance. The Fabric itself doesn’t store any graph data; it’s purely a routing and orchestration layer.
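Because the Fabric is only a routing layer, the target of USE can also be chosen at query time. The following sketch assumes a Neo4j 5 composite database named my_fabric as the current database and uses the built-in graph.names() and graph.byName() functions to run the same subquery against every Core.

```cypher
// graph.names() lists the Cores of the current composite database.
UNWIND graph.names() AS graphName
CALL {
  // Route this subquery to the Core named by the current row.
  USE graph.byName(graphName)
  MATCH (n)
  RETURN count(n) AS nodeCount
}
RETURN graphName, nodeCount
```

With the sample data, this should report 2 nodes for the users Core and 3 for the orders Core.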

The next step is understanding how to manage schema and indexing across these distributed databases for optimal performance.
