The MessageSizeTooLargeException means a producer tried to send a message larger than the broker is configured to accept.
Here are the common causes and how to fix them:
1. Producer max.request.size is too small.
This is the most frequent culprit. The producer has a limit on how large a single request (which can contain multiple messages) it will send.
- Diagnosis: On the producer side, check your producer configuration for max.request.size. If it is not set explicitly, it defaults to 1MB (1048576 bytes).
- Fix: Increase max.request.size in your producer’s configuration. For example, to allow requests up to 10MB, set the following in producer.properties (the comment goes on its own line because a trailing comment would be read as part of the value):
    # 10MB
    max.request.size=10485760
- Why it works: This setting caps the size of a single request the producer will send to the broker, which also caps the largest single message it will attempt to send.
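As a rough pre-flight check, the sketch below compares a serialized key and value against the configured limit before the message is handed to the producer. The function name fits_request_limit and the 1024-byte overhead allowance are assumptions made for illustration; the real producer adds its own record and request framing, so treat the result as an estimate rather than the client's exact calculation.

```python
from typing import Optional

DEFAULT_MAX_REQUEST_SIZE = 1048576  # producer default for max.request.size (1MB)


def fits_request_limit(
    key: Optional[bytes],
    value: bytes,
    max_request_size: int = DEFAULT_MAX_REQUEST_SIZE,
    overhead_allowance: int = 1024,
) -> bool:
    """Estimate whether a serialized message fits under max.request.size.

    overhead_allowance is a hypothetical safety margin for record and
    request framing; it is not a Kafka constant.
    """
    payload_bytes = len(value) + (len(key) if key is not None else 0)
    return payload_bytes + overhead_allowance <= max_request_size
```

A check like this lets the application log or reroute an oversized payload before the producer rejects it.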
2. Broker message.max.bytes is too small.
This broker configuration defines the maximum size of a single message that the broker will accept. It’s often overlooked because people focus on the producer setting.
- Diagnosis: On the broker(s), check server.properties for message.max.bytes. The default is roughly 1MB (1048588 bytes in recent Kafka versions).
- Fix: Increase message.max.bytes on all brokers in the cluster. For example, to allow messages up to 10MB, set the following in server.properties:
    # 10MB
    message.max.bytes=10485760
- Why it works: This is the hard limit on the broker’s ingestion side for individual messages.
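To check a setting across several brokers, it can help to read it from the properties text directly. The sketch below is a simplified reader, assuming plain key=value lines and full-line comments only; read_property is a hypothetical helper, not part of any Kafka tooling, and it ignores escapes and line continuations that the full properties format allows.

```python
from typing import Optional


def read_property(properties_text: str, key: str,
                  default: Optional[str] = None) -> Optional[str]:
    """Return the value for key from key=value text, or default if absent.

    The last occurrence wins, as it does when a properties file is loaded.
    """
    found = default
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith(("#", "!")):
            continue
        name, separator, value = line.partition("=")
        if separator and name.strip() == key:
            found = value.strip()
    return found
```

Passing the default explicitly keeps the helper independent of any particular Kafka version's defaults.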
3. Broker replica.fetch.max.bytes is too small.
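When a leader broker receives a large message, it needs to replicate it to its followers. This setting limits the size of data fetched by a replica from the leader. If it is smaller than message.max.bytes (or the actual message size), replication can stall on older brokers, and the producer might eventually get an error. Brokers from 0.10.1 onward still return an oversized first batch so that replication can make progress, but keeping the two settings aligned remains the safe choice.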
- Diagnosis: On the broker(s), check server.properties for replica.fetch.max.bytes. The default is 1MB (1048576 bytes).
- Fix: Increase replica.fetch.max.bytes on all brokers. It should be at least as large as message.max.bytes. For example, to allow fetches up to 10MB, set the following in server.properties:
    # 10MB
    replica.fetch.max.bytes=10485760
- Why it works: This ensures that replicas can pull the entire large message from the leader during replication.
4. Broker log.segment.bytes is too small.
Kafka brokers store messages in log segments, which are files on disk. This setting defines the maximum size of a single segment file. It is not a direct cause of MessageSizeTooLargeException, but if your messages are large and this value is low, Kafka rolls segments frequently, which can cause other errors that mask the real problem. A single message larger than the segment size cannot be written to a segment at all, though this is uncommon because log.segment.bytes is usually much larger than message.max.bytes.
- Diagnosis: On the broker(s), check server.properties for log.segment.bytes. The default is 1GB (1073741824 bytes), so this is rarely the culprit for typical message size issues but is good to be aware of.
- Fix: Increase log.segment.bytes if your messages are exceptionally large and you’re hitting segment size limits before message.max.bytes. The setting is a 32-bit integer, so the largest valid value is 2147483647 (just under 2GB); 2147483648 would be rejected. For example, in server.properties:
    # Just under 2GB, the maximum for a 32-bit integer
    log.segment.bytes=2147483647
- Why it works: This allows larger individual log files, accommodating larger messages within a single segment if necessary.
5. Producer batching behavior.
While max.request.size limits the request, batch.size limits how many bytes of messages the producer groups into one batch. MessageSizeTooLargeException is typically raised when a single message is too large for the broker’s message.max.bytes or the producer’s max.request.size, not when many small messages accumulate. The producer simply stops adding messages to the current batch once the next one would not fit. A message larger than batch.size is sent in a batch of its own, and in the Java client a message larger than max.request.size is rejected before it is sent, which is where the exception usually originates.
- Diagnosis: Examine your producer’s batch.size and linger.ms settings. If you have very large messages, consider setting batch.size to a value equal to or slightly larger than your expected maximum message size, and linger.ms to a small value to send batches more frequently.
- Fix: If you have a few very large messages and many small ones, set batch.size to a value that accommodates your largest expected message, or set it to max.request.size if you only send large messages. For example, in producer.properties:
    # Match or exceed the maximum message size
    batch.size=10485760
    # Send batches more frequently
    linger.ms=10
- Why it works: Adjusting batch.size controls how messages are grouped before max.request.size is considered. A smaller batch.size might help isolate large messages into their own requests more efficiently.
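The grouping behavior can be pictured with a deliberately simplified model. The sketch below is not the client's actual implementation: group_into_batches is a hypothetical function that ignores per-partition queues, linger.ms, compression, and record overhead, and only shows how a size limit splits a stream of messages, with an oversized message ending up alone in its own batch.

```python
from typing import List


def group_into_batches(message_sizes: List[int], batch_size: int) -> List[List[int]]:
    """Group message sizes into batches of at most batch_size bytes.

    A message larger than batch_size is placed in a batch by itself,
    because an empty batch always accepts its first message.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in message_sizes:
        # Close the current batch if the next message would not fit.
        if current and current_bytes + size > batch_size:
            batches.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

With a limit of 1000 bytes, sizes of 100, 200, 5000, and 100 split into three batches, and the 5000-byte message travels alone.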
Important Note: You must ensure that message.max.bytes on the broker is at least as large as the producer’s max.request.size. If max.request.size is larger than message.max.bytes, the producer might send a request that the broker cannot accept, leading to this error.
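The two sizing rules in this guide (message.max.bytes at least as large as max.request.size, and replica.fetch.max.bytes at least as large as message.max.bytes) can be checked together. The sketch below assumes the three values have already been collected from the producer and broker configuration; check_size_limits is a hypothetical helper written for illustration.

```python
from typing import List


def check_size_limits(max_request_size: int, message_max_bytes: int,
                      replica_fetch_max_bytes: int) -> List[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems: List[str] = []
    if message_max_bytes < max_request_size:
        problems.append(
            "message.max.bytes (%d) is smaller than max.request.size (%d)"
            % (message_max_bytes, max_request_size)
        )
    if replica_fetch_max_bytes < message_max_bytes:
        problems.append(
            "replica.fetch.max.bytes (%d) is smaller than message.max.bytes (%d)"
            % (replica_fetch_max_bytes, message_max_bytes)
        )
    return problems
```

Running this after raising only the producer limit to 10MB would report both mismatches, which is a common way the error survives a partial fix.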
After fixing these, you might encounter UnknownTopicOrPartitionException if the topic or partition the producer writes to does not exist on the cluster, or if there are other configuration inconsistencies across your cluster.