Keycloak can run as a single instance, but production deployments need high availability (HA). With only one instance, an outage means your users can’t log in to anything. An HA cluster runs multiple Keycloak instances that all handle traffic and share session data, so if one dies, the others take over.

Let’s see Keycloak in action with a basic HA setup. Imagine two Keycloak instances, keycloak-0 and keycloak-1, both running and configured to talk to each other.

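Here’s an illustrative snippet in the style of standalone-ha.xml, the clustered configuration file of the WildFly-based Keycloak distribution, on keycloak-0. Treat the namespaces and attribute names as a sketch and check them against the file shipped with your version: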

<subsystem xmlns="urn:jboss:domain:ha-singleton:1.0">
    <singleton name="keycloak-cluster" default-missing="CREATE">
        <simple-protocol multicast="true" port="1122" />
        <deployment-override>
            <source path="${jboss.server.config.dir}/standalone-ha.xml"/>
        </deployment-override>
    </singleton>
</subsystem>
<subsystem xmlns="urn:jboss:domain:messaging-activemq:1.0">
    <server name="default">
        <security enabled="false"/>
        <http-connector name="http-connector" socket-binding="http" endpoint="ajp"/>
        <in-vm-connector name="in-vm" />
        <in-vm-acceptor name="in-vm" />
        <remote-connector name="netty" socket-binding="messaging"/>
        <remote-acceptor name="netty" socket-binding="messaging"/>
        <pooled-jms-queue name="expiringQueue" entries="java:/queue/expiringQueue" durable="false"/>
        <connection-factory name="RemoteConnectionFactory" entries="java:jboss/exported/jms/RemoteConnectionFactory" connectors="netty"/>
        <jms-queue name="InQueue" entries="java:/queue/InQueue"/>
        <jms-queue name="OutQueue" entries="java:/queue/OutQueue"/>
        <jms-queue name="ha-singleton-message-queue" entries="java:/queue/ha-singleton-message-queue"/>
        <topic name="ha-singleton-topic" entries="java:/topic/ha-singleton-topic"/>
        <topic name="ha-singleton-broadcast-topic" entries="java:/topic/ha-singleton-broadcast-topic"/>
    </server>
</subsystem>
<subsystem xmlns="urn:jboss:domain:clustering:1.0">
    <infinispan>
        <cache-container name="keycloak" default-cache="sessions">
            <transport lock-timeout="60000"/>
            <replicated-cache name="sessions" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
            <replicated-cache name="authenticationSessions" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
            <replicated-cache name="offlineSessions" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
            <replicated-cache name="clientSessions" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
            <replicated-cache name="userSessions" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
            <replicated-cache name="authorizationKeys" mode="SYNC" batching="false">
                <locking acquire-timeout="60000" striping="false"/>
                <transaction mode="NONE"/>
                <eviction strategy="LRU" max-entries="10000"/>
                <expiration interval="0" max-idle="3600000"/>
            </replicated-cache>
        </cache-container>
    </infinispan>
</subsystem>
<subsystem xmlns="urn:jboss:domain:remoting:1.0">
    <endpoint worker-threads="10"/>
    <http-connector name="http-remoting-connector" socket-binding="http"/>
</subsystem>

Notice the infinispan section. This is where Keycloak keeps its session data (user sessions, authentication sessions, offline sessions, client sessions, and so on). In this setup, the caches are replicated across all nodes in the cluster. When a user logs in on keycloak-0, the new session is copied to keycloak-1 as part of the write. If keycloak-0 fails, keycloak-1 already holds the session information it needs to keep serving the user without a visible disruption.

The ha-singleton subsystem is responsible for managing services that should only run on one node at a time, like background tasks or specific listeners. It ensures that if a node running a singleton service fails, another node takes over that service.

In this snippet, the messaging-activemq subsystem defines the queues and topics that the HA singleton service uses to communicate between nodes.

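This entire setup relies on nodes being able to discover each other. Discovery is handled by JGroups, and the default stack in standalone-ha.xml uses UDP multicast. If multicast isn’t an option in your network, you’ll need to configure a discovery mechanism that doesn’t depend on it, such as a static host list (TCPPING) or a shared database table (JDBC_PING).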
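As a rough sketch of the non-multicast option, the JGroups subsystem can point its channel at a TCP stack that lists the cluster members explicitly. The schema version in the namespace, the port 7600, and the host names keycloak-0 and keycloak-1 are assumptions for illustration; keep the remaining protocols as they appear in the stack shipped with your version.

```xml
<!-- Sketch only: namespace version, host names, and port are assumptions -->
<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
    <channels default="ee">
        <!-- Use the "tcp" stack instead of the multicast-based "udp" stack -->
        <channel name="ee" stack="tcp"/>
    </channels>
    <stacks>
        <stack name="tcp">
            <transport type="TCP" socket-binding="jgroups-tcp"/>
            <!-- A static member list takes the place of multicast discovery -->
            <protocol type="TCPPING">
                <property name="initial_hosts">keycloak-0[7600],keycloak-1[7600]</property>
                <property name="port_range">0</property>
            </protocol>
            <!-- ... remaining protocols as shipped in the default tcp stack ... -->
        </stack>
    </stacks>
</subsystem>
```

Every node needs the same host list, so this approach suits small clusters with stable addresses better than environments where nodes come and go.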

The core problem Keycloak HA solves is state consistency across multiple instances. Without it, each Keycloak instance would have its own, isolated session data. If a user logged into keycloak-0 and then their next request hit keycloak-1, keycloak-1 wouldn’t know about that session, and the user would be unauthenticated. By replicating session data using Infinispan, all nodes have a consistent view of active sessions.

A less obvious point about Keycloak HA is that it’s not just about replicating session data. It also has to ensure that critical background processes and administrative tasks run on only one instance at a time, which prevents race conditions and duplicate operations.

One aspect that often trips people up is the cache configuration. The mode="SYNC" on the Infinispan caches means a write is acknowledged by the other nodes before the operation is considered complete. This guarantees consistency but adds latency to every write. For performance-critical setups, you might explore ASYNC replication, but then nodes can briefly disagree about a session, and a change can be lost if a node fails before propagating it. The batching="false" setting also matters: it disables Infinispan’s invocation batching, so each cache update is replicated on its own rather than grouped with others, which suits session data where immediacy is key.
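To make the trade-off concrete, here is how the sessions cache from the snippet above might look with asynchronous replication. It reuses that snippet’s attribute names, which is an assumption: whether a mode attribute is accepted depends on the Infinispan subsystem schema version your server ships.

```xml
<!-- Sketch, mirroring the snippet above: same cache, asynchronous replication -->
<replicated-cache name="sessions" mode="ASYNC" batching="false">
    <locking acquire-timeout="60000" striping="false"/>
    <transaction mode="NONE"/>
    <eviction strategy="LRU" max-entries="10000"/>
    <expiration interval="0" max-idle="3600000"/>
</replicated-cache>
```

With this setting a write returns without waiting for the other nodes, so a login completes faster, but a node that crashes right afterwards can take the new session with it.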

The next hurdle you’ll likely face is configuring sticky sessions on your load balancer, so that a user’s requests consistently reach the same Keycloak node. This matters most if you move away from full cache replication for some data, because then not every node holds every entry.
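One common way to move away from full replication is a distributed cache, where each entry lives on a fixed number of nodes rather than on all of them. The fragment below is a sketch of that idea in the style of the WildFly Infinispan subsystem; the owners value of 2 is an assumption, chosen so that each session survives the loss of one node.

```xml
<!-- Sketch: each session entry is stored on 2 nodes instead of on every node -->
<cache-container name="keycloak">
    <transport lock-timeout="60000"/>
    <distributed-cache name="sessions" owners="2"/>
    <distributed-cache name="authenticationSessions" owners="2"/>
    <distributed-cache name="offlineSessions" owners="2"/>
    <distributed-cache name="clientSessions" owners="2"/>
</cache-container>
```

Once the cluster has more nodes than owners, a request that lands on a non-owner has to fetch the session over the network, which is why sticky sessions pay off in this layout.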
