OIDC: OAuth 2.0's Identity Layer

OpenID Connect on Kafka is less about securing Kafka itself and more about using Kafka as a conduit for distributing identity information after a user has authenticated via an OpenID Connect provider.

Let’s watch a user log in. Alice wants to access a Kafka topic.

1. Alice’s application (the "client") redirects her to the OpenID Connect provider (e.g., Okta, Auth0, Google).
2. Alice logs into the provider.
3. The provider redirects Alice back to her application with an authorization code.
4. Alice’s application exchanges the authorization code for an ID Token and an Access Token from the provider.
5. Now, Alice’s application has proof of her identity and authorization. This is where Kafka comes in. The application can publish a message to a Kafka topic containing Alice’s identity information (e.g., her user ID, roles, claims) extracted from the ID Token.
6. Other services that need to know about Alice (e.g., an authorization service, a data processing pipeline) subscribe to this Kafka topic and consume the identity information. They don’t need to talk to the OIDC provider directly; they trust the identity information distributed via Kafka.
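
The redirect, callback, and code exchange described above can be sketched with nothing but the standard library. This is a minimal illustration, not a complete client: the endpoint, client ID, and redirect URI are hypothetical placeholders, and a real application would normally rely on an OIDC client library and send the token request over HTTPS.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values; real ones come from the provider's discovery
# document and from your client registration.
AUTHORIZATION_ENDPOINT = "https://idp.example.com/oauth2/authorize"
CLIENT_ID = "demo-client"
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(state: str, nonce: str) -> str:
    """Build the URL the application redirects Alice's browser to."""
    params = {
        "response_type": "code",          # authorization code flow
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid profile email",  # "openid" is what makes this OIDC
        "state": state,                   # echoed back; guards against CSRF
        "nonce": nonce,                   # echoed in the ID Token; guards against replay
    }
    return f"{AUTHORIZATION_ENDPOINT}?{urlencode(params)}"


def extract_code(callback_url: str, expected_state: str) -> str:
    """Pull the authorization code out of the redirect back to the app."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; rejecting callback")
    return query["code"][0]


def build_token_request(code: str) -> dict:
    """Form fields to POST to the provider's token endpoint.

    Confidential clients must also authenticate themselves (for example
    with a client secret); that part is omitted here.
    """
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


state = secrets.token_urlsafe(16)
nonce = secrets.token_urlsafe(16)
url = build_authorization_url(state, nonce)

# Simulated callback from the provider after Alice logs in.
callback = f"{REDIRECT_URI}?{urlencode({'code': 'abc123', 'state': state})}"
token_request = build_token_request(extract_code(callback, state))
```

The provider’s response to the token request carries the ID Token and Access Token used in the remaining steps.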

Here’s a simplified conceptual flow, using JSON for the message that Alice’s application publishes after a successful OIDC login:

{
  "eventType": "userLoggedIn",
  "timestamp": "2023-10-27T10:30:00Z",
  "userId": "alice_12345",
  "username": "alice@example.com",
  "roles": ["user", "editor"],
  "claims": {
    "tenant_id": "abc-789",
    "department": "engineering"
  },
  "oidcProvider": "mycorp.okta.com"
}

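This Kafka topic acts as a distributed, near-real-time feed of identity data. It is populated by client applications, using claims from tokens issued by the authoritative OIDC provider; the provider itself never writes to Kafka. Services consuming these messages can make authorization decisions or personalize user experiences based on this identity data, as long as they trust the applications permitted to produce to the topic.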

The core problem this solves is decoupling services from direct OIDC provider interactions. Instead of every microservice needing to integrate with Okta, Keycloak, or another provider, they only need to integrate with Kafka. The OIDC provider authenticates the user once, issues tokens, and the client application then broadcasts the essential identity attributes to Kafka. This simplifies service-to-service authorization and reduces the load on the OIDC provider.

Internally, the "client" application (which could be a web app, a mobile backend, or even another microservice acting on behalf of a user) uses a standard OIDC client library. After obtaining an ID Token (which is a JWT), it validates the token (signature, issuer, audience, and expiry) and parses out the relevant claims. It then uses a Kafka producer library (like kafka-python, librdkafka, or the Apache Kafka Java client, kafka-clients) to send a message containing these claims to a designated Kafka topic. The "consumer" services use Kafka consumer libraries to read from this topic. These consumers then parse the incoming identity messages and use that information to grant or deny access, filter data, or perform other context-aware actions.
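
A rough sketch of the producer side follows, under several assumptions. The payload is decoded without verifying the signature, purely to show the JWT structure; production code should validate the token with an OIDC or JWT library before trusting any claim. The `roles` claim, the topic name `identity-events`, and the `RecordingProducer` stand-in are all hypothetical. The stand-in mirrors the `send(topic, key=..., value=...)` call shape of kafka-python’s `KafkaProducer`, so a real producer could be passed in its place.

```python
import base64
import json
from datetime import datetime, timezone


def decode_claims_unverified(id_token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying it (illustration only)."""
    payload_b64 = id_token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def build_login_event(claims: dict) -> dict:
    """Map ID Token claims onto the message schema shown earlier."""
    return {
        "eventType": "userLoggedIn",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "userId": claims["sub"],
        "username": claims.get("email"),
        "roles": claims.get("roles", []),  # assumption: custom "roles" claim
        "oidcProvider": claims["iss"],
    }


def publish_login_event(producer, topic: str, claims: dict) -> dict:
    """Send the event keyed by userId so one user's events share a partition."""
    event = build_login_event(claims)
    producer.send(
        topic,
        key=event["userId"].encode("utf-8"),
        value=json.dumps(event).encode("utf-8"),
    )
    return event


class RecordingProducer:
    """Hypothetical stand-in for a Kafka producer; records what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, topic, key=None, value=None):
        self.sent.append((topic, key, value))


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment (demo helper)."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# A fabricated, unsigned token used only to exercise the code path.
demo_token = ".".join([
    _segment({"alg": "none"}),
    _segment({
        "iss": "https://mycorp.okta.com",
        "sub": "alice_12345",
        "email": "alice@example.com",
        "roles": ["user", "editor"],
    }),
    "signature-placeholder",
])

producer = RecordingProducer()
event = publish_login_event(
    producer, "identity-events", decode_claims_unverified(demo_token)
)
```

Keying by `userId` is one possible choice; it keeps each user’s events in order within a partition.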

The levers you control are:

OIDC Provider Configuration: Which scopes you request (e.g., openid, profile, email, custom scopes for roles/permissions), which claims are included in the ID Token.
Client Application Logic: How the ID Token is parsed, which claims are extracted, how they are transformed into Kafka messages, and what the Kafka message schema looks like.
Kafka Topic Design: The name of the topic, partitioning strategy (e.g., by userId for ordered processing), and retention policies.
Consumer Service Logic: How consumers subscribe to topics, how they process the identity messages, and how they use the information for downstream decisions.
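
To make the consumer-side lever concrete, here is one possible shape for the consumer logic. It is a sketch under assumptions: the in-memory `registry` dict, the function names, and the role-based check are illustrative choices, not a prescribed design. With kafka-python, each record’s `value` arrives as bytes by default and could be passed straight to `apply_identity_event`.

```python
import json


def apply_identity_event(registry: dict, raw_value: bytes) -> None:
    """Update a local view of known users from one identity message."""
    event = json.loads(raw_value)
    if event.get("eventType") != "userLoggedIn":
        return  # ignore event types this sketch does not handle
    registry[event["userId"]] = {
        "roles": set(event.get("roles", [])),
        "claims": event.get("claims", {}),
    }


def is_allowed(registry: dict, user_id: str, required_role: str) -> bool:
    """Grant access only to known users that hold the required role."""
    user = registry.get(user_id)
    return user is not None and required_role in user["roles"]


# The sample message from earlier in this article, as a consumer would see it.
sample = json.dumps({
    "eventType": "userLoggedIn",
    "timestamp": "2023-10-27T10:30:00Z",
    "userId": "alice_12345",
    "username": "alice@example.com",
    "roles": ["user", "editor"],
    "claims": {"tenant_id": "abc-789", "department": "engineering"},
    "oidcProvider": "mycorp.okta.com",
}).encode("utf-8")

registry = {}
apply_identity_event(registry, sample)

can_edit = is_allowed(registry, "alice_12345", "editor")
can_admin = is_allowed(registry, "alice_12345", "admin")
unknown_user = is_allowed(registry, "bob_67890", "user")
```

Unknown users are denied by default, which seems the safer behavior when a consumer has not yet seen a login event for someone.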

A surprising number of engineers assume that Kafka itself is performing the OIDC authentication or validation. In this pattern, Kafka is merely a message bus. The actual authentication and issuance of tokens are handled entirely by the external OIDC provider. Kafka’s role is to distribute the results of that authentication – the identity claims – to interested parties. The security of the identity data relies on the security of the OIDC provider, the client application’s secure handling of tokens, and the security of the Kafka cluster itself (e.g., TLS encryption for data in transit, SASL for authentication to Kafka).
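
As an illustration of the last point, client settings for TLS plus SASL might look roughly like the following. The keyword names are kafka-python client options as I understand them; the broker address, environment variable names, and the choice of SCRAM-SHA-512 are assumptions, so check them against your client version and cluster setup.

```python
import os


def kafka_security_config() -> dict:
    """Build connection settings for a TLS-encrypted, SASL-authenticated client."""
    return {
        # Hypothetical broker address and environment variable names.
        "bootstrap_servers": os.environ.get("KAFKA_BOOTSTRAP", "broker.example.com:9093"),
        "security_protocol": "SASL_SSL",    # TLS in transit plus SASL authentication
        "sasl_mechanism": "SCRAM-SHA-512",  # assumption: SCRAM is enabled on the brokers
        "sasl_plain_username": os.environ.get("KAFKA_USERNAME", ""),
        "sasl_plain_password": os.environ.get("KAFKA_PASSWORD", ""),
        "ssl_cafile": os.environ.get("KAFKA_CA_FILE"),  # CA bundle used to verify brokers
    }


config = kafka_security_config()
# With kafka-python installed, these settings could be passed as keyword
# arguments, e.g. KafkaProducer(**config) or KafkaConsumer("topic", **config).
```

Reading credentials from the environment keeps secrets out of source code; a secrets manager would serve the same purpose.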

The next conceptual hurdle is managing the lifecycle of these identity events, particularly how to handle user de-provisioning or revocation when the OIDC provider signals it.