Monolith Performance: Profile and Fix Bottlenecks (2026)

The core issue is that your monolith’s request handling has become a distributed system problem, but it’s still running as a single process.

Here’s how you find and fix the bottlenecks:

Common Causes and Fixes

Database Connection Pool Exhaustion
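- Diagnosis: Check your database connection pool metrics. Look for ActiveConnections approaching MaxConnections or ThreadsWaitingForConnection rising. In PostgreSQL, query pg_stat_activity and group by state: a pool pinned at its maximum with most connections active or idle in transaction points to exhaustion, while a large share of plainly idle connections suggests the pool is not the bottleneck.
- Fix: Increase the maximum pool size in your application’s database connection configuration. For example, if you’re using HikariCP, change maximumPoolSize=10 to maximumPoolSize=25. Keep the total across all application instances below the database’s own connection limit (max_connections in PostgreSQL).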
- Why it works: This allows more concurrent requests to acquire a database connection simultaneously, preventing threads from being blocked waiting for an available connection.
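
To make the effect of the pool size concrete, here is a minimal sketch that models a pool with a JDK Semaphore instead of a real driver. The class name PoolExhaustionSketch, the method timedOutRequests, and the 5 ms wait are all made up for illustration; it is a toy model of overlapping requests, not HikariCP's behavior.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a fixed-size connection pool; permits stand in for connections. */
final class PoolExhaustionSketch {

    /**
     * Simulates requests that all hold a connection at the same time and
     * returns how many gave up waiting for one.
     */
    static int timedOutRequests(int poolSize, int concurrentRequests, long waitMillis) {
        Semaphore connections = new Semaphore(poolSize, true); // fair: FIFO waiters
        int timedOut = 0;
        for (int i = 0; i < concurrentRequests; i++) {
            try {
                // Nothing is released in this loop, which mimics fully overlapping requests.
                if (!connections.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                    timedOut++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                timedOut++;
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        // Hypothetical load: 25 overlapping requests, each willing to wait 5 ms.
        System.out.println("pool=10: " + timedOutRequests(10, 25, 5) + " requests timed out");
        System.out.println("pool=25: " + timedOutRequests(25, 25, 5) + " requests timed out");
    }
}
```

With 25 overlapping requests, the pool of 10 leaves 15 without a connection and the pool of 25 leaves none. In a real system requests also release connections, so how long each request holds one matters as much as the limit.
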

Excessive Garbage Collection (GC) Pauses
- Diagnosis: Monitor your JVM GC logs. Look for long Full GC pauses (e.g., > 500ms) or frequent Minor GC pauses that significantly impact application throughput. Tools like VisualVM or GCViewer can visualize this.
- Fix:
  - Tune GC Algorithm: Switch to a more modern GC like G1GC (-XX:+UseG1GC, already the default since JDK 9) or ZGC (-XX:+UseZGC, production-ready since JDK 15).
  - Increase Heap Size: If objects are being promoted to the old generation too quickly, increase the heap size: -Xmx4g -Xms4g.
  - Reduce Object Allocation: Profile your code to find high-allocation sites and optimize them.
- Why it works: Different GC algorithms have different pause time characteristics. G1GC and ZGC are designed for lower pause times. A larger heap gives the GC more room to work before needing to collect, and reducing object churn directly lowers GC pressure.
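
If GC logs are not at hand, the same counters can be read from inside the process. The sketch below uses the standard java.lang.management API. The class name GcStatsSketch and the churn helper are hypothetical, and the 64 KB allocation size is an arbitrary choice to create some pressure.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads cumulative GC counters from the running JVM. */
final class GcStatsSketch {

    /** Total collections across all collectors; -1 ("undefined") is counted as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    /** Allocates short-lived arrays; returns a checksum so the work is not optimized away. */
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] garbage = new byte[64 * 1024];
            garbage[i % garbage.length] = 1;
            checksum += garbage[i % garbage.length];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        churn(200_000);
        System.out.println("collections during churn: " + (totalCollections() - countBefore));
        System.out.println("GC time during churn (ms): " + (totalCollectionMillis() - millisBefore));
    }
}
```

Sampling these two numbers before and after a suspect code path gives a rough sense of how much collection work it triggers. The exact values depend on the collector and heap settings.
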

Thread Contention and Deadlocks
- Diagnosis: Use thread dumps to identify threads in BLOCKED or WAITING states. Look for patterns where threads are waiting on each other for locks. jstack <pid> or jcmd <pid> Thread.print are your friends here. Profilers like YourKit or JProfiler can visualize lock contention.
- Fix:
  - Reduce Synchronization Scope: Make synchronized blocks as small as possible.
  - Use Concurrent Data Structures: Replace Collections.synchronizedMap wrappers with ConcurrentHashMap.
  - Avoid Nested Locks: If you must use multiple locks, acquire them in a consistent order across all threads.
- Why it works: Minimizing the time spent holding locks, or using concurrent data structures that lock at a much finer granularity than one global monitor, reduces the probability of threads blocking each other. Consistent lock ordering prevents circular dependencies that lead to deadlocks.
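
The consistent-ordering rule can be sketched with two accounts and opposing transfers. Account, transfer, and runOpposingTransfers are illustrative names, and ordering by a numeric id is one possible convention among several.

```java
/** Two threads transfer in opposite directions; locks are always taken in id order. */
final class LockOrderingSketch {

    static final class Account {
        final int id;
        long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    static void transfer(Account from, Account to, long amount) {
        // Always lock the account with the smaller id first, whatever the direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    /** Runs opposing transfers on two threads and returns the combined balance afterwards. */
    static long runOpposingTransfers(int transfersPerThread) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < transfersPerThread; i++) transfer(b, a, 1);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.balance + b.balance;
    }

    public static void main(String[] args) {
        System.out.println("combined balance: " + runOpposingTransfers(100_000));
    }
}
```

If transfer locked from and then to instead, the two threads could each hold one lock and wait forever for the other. With id ordering both threads compete for the same first lock, so that cycle cannot form.
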

Inefficient Application Code (CPU-Bound)
- Diagnosis: Use a CPU profiler (e.g., async-profiler, perf, VisualVM’s sampler) to identify methods consuming the most CPU time. Look for hot spots in your application logic, not just external calls.
- Fix:
  - Algorithmic Improvements: Refactor inefficient algorithms (e.g., O(n^2) to O(n log n)).
  - Caching: Implement in-memory caches for frequently accessed, expensive-to-compute data.
  - Batching: Group similar operations to reduce overhead.
- Why it works: Directly optimizing the code that consumes the most CPU cycles reduces the overall processing time per request, freeing up threads and resources.
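
As one illustration of the caching fix, the sketch below memoizes a deliberately slow function with ConcurrentHashMap.computeIfAbsent. MemoSketch, countPrimesBelow, and the COMPUTATIONS counter are hypothetical names. The cache is unbounded for brevity, which a production cache should not be.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Memoizes an expensive pure function so repeated calls skip the computation. */
final class MemoSketch {

    // Unbounded for brevity; a real cache needs a size limit or expiry.
    private static final Map<Integer, Integer> CACHE = new ConcurrentHashMap<>();

    /** Counts how many times the expensive function actually ran. */
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    /** Deliberately slow: counts primes below n by trial division. */
    static int countPrimesBelow(int n) {
        COMPUTATIONS.incrementAndGet();
        int count = 0;
        for (int candidate = 2; candidate < n; candidate++) {
            boolean prime = true;
            for (int d = 2; (long) d * d <= candidate; d++) {
                if (candidate % d == 0) {
                    prime = false;
                    break;
                }
            }
            if (prime) count++;
        }
        return count;
    }

    /** Computes on the first call for a given n, then serves the stored result. */
    static int cachedCountPrimesBelow(int n) {
        return CACHE.computeIfAbsent(n, MemoSketch::countPrimesBelow);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        cachedCountPrimesBelow(200_000);
        long firstCall = System.nanoTime() - start;

        start = System.nanoTime();
        cachedCountPrimesBelow(200_000);
        long secondCall = System.nanoTime() - start;

        System.out.println("first call ns:  " + firstCall);
        System.out.println("second call ns: " + secondCall);
    }
}
```

This only pays off when the function is pure and the same inputs recur. Confirm both with the profiler before adding a cache.
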

Slow External API Calls (Network I/O Bound)
- Diagnosis: Monitor your application’s network I/O. Look for high latency on outbound HTTP requests or other network calls. Tracing tools like Jaeger or Zipkin are invaluable here, showing the duration of each span, including external service calls.
- Fix:
  - Asynchronous Calls: Use non-blocking I/O (e.g., CompletableFuture, reactive libraries) for external calls.
  - Timeouts and Retries: Configure aggressive but reasonable timeouts for external calls (connectTimeout=500ms, readTimeout=1000ms for HTTP clients). Retry only idempotent calls, with exponential backoff between attempts.
  - Circuit Breakers: Implement circuit breaker patterns (e.g., Resilience4j) to quickly fail requests to an unhealthy external service.
- Why it works: Non-blocking I/O allows your threads to do other work while waiting for external responses. Proper timeouts and circuit breakers prevent cascading failures and stop your application from spending excessive time waiting on unresponsive services.
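
The timeout-with-fallback idea can be sketched with CompletableFuture alone (Java 9+). The remote call is simulated with Thread.sleep, and TimeoutSketch, simulatedRemoteCall, and the "fallback" value are placeholders. The simulated call still occupies a pool thread while it sleeps. A genuinely non-blocking HTTP client would avoid that.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Bounds how long a caller waits on a slow dependency and falls back when it is late. */
final class TimeoutSketch {

    // Daemon threads so a still-sleeping simulated call never keeps the JVM alive.
    private static final ExecutorService IO_POOL = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable, "io-sketch");
        thread.setDaemon(true);
        return thread;
    });

    /** Stand-in for an outbound HTTP call that takes latencyMillis to answer. */
    static String simulatedRemoteCall(long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "remote-response";
    }

    /** Completes with the remote result, or with "fallback" once timeoutMillis has passed. */
    static CompletableFuture<String> callWithTimeout(long latencyMillis, long timeoutMillis) {
        return CompletableFuture
                .supplyAsync(() -> simulatedRemoteCall(latencyMillis), IO_POOL)
                .completeOnTimeout("fallback", timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        System.out.println("healthy dependency: " + callWithTimeout(10, 1_000).join());
        System.out.println("slow dependency:    " + callWithTimeout(2_000, 200).join());
    }
}
```

The caller of the slow dependency gets an answer after roughly 200 ms instead of 2 seconds. A circuit breaker goes one step further and skips the call entirely while the dependency is known to be unhealthy.
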

Insufficient Application Server Threads
- Diagnosis: Monitor your application server’s thread pool (e.g., Tomcat’s maxThreads, Undertow’s worker-threads). If ActiveThreads are consistently at maxThreads and requests are queued or dropped, this is your bottleneck.
- Fix: Increase the maxThreads setting in your application server’s configuration. For Tomcat, this is the maxThreads attribute on the Connector in server.xml; the default is 200, so a change looks like maxThreads="400".
- Why it works: More threads can handle more concurrent requests, especially if the application is I/O bound (waiting for databases or external services). Be mindful of the trade-off with memory consumption and potential for increased contention.
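
The saturation behavior can be reproduced with a plain ThreadPoolExecutor standing in for the server's worker pool. This is a simplified model, not how Tomcat or Undertow are implemented. WorkerPoolSketch, rejectedRequests, and the numbers used are illustrative assumptions.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Models a request pool with a fixed thread count and a bounded accept queue. */
final class WorkerPoolSketch {

    /**
     * Submits `incoming` requests that all block on a slow backend and
     * returns how many were rejected because threads and queue were full.
     */
    static int rejectedRequests(int maxThreads, int queueCapacity, int incoming) {
        CountDownLatch backendResponds = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        int rejected = 0;
        try {
            for (int i = 0; i < incoming; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            backendResponds.await(); // every request waits on I/O
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // default AbortPolicy: no free thread, queue full
                }
            }
        } finally {
            backendResponds.countDown(); // let the blocked requests finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("maxThreads=2: " + rejectedRequests(2, 2, 10) + " rejected");
        System.out.println("maxThreads=8: " + rejectedRequests(8, 2, 10) + " rejected");
    }
}
```

With ten simultaneous I/O-bound requests and a queue of two, two threads reject six requests while eight threads reject none. The same arithmetic explains why raising maxThreads helps I/O-bound workloads far more than CPU-bound ones.
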

Memory Leaks
- Diagnosis: Observe your application’s heap usage over time. If it steadily increases and never returns to a baseline, even after GC, you likely have a memory leak. Heap dumps analyzed with tools like Eclipse MAT can pinpoint leaking objects.
- Fix: Identify the leaking objects (e.g., unclosed resources, static collections holding references) and ensure they are properly released or cleared.
- Why it works: Eliminating memory leaks prevents the JVM from constantly running GC on ever-growing memory, which degrades performance and can lead to OutOfMemoryError.
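
A static collection that only ever grows is one of the most common leak shapes. The sketch below contrasts an unbounded map with a size-capped LRU map built on LinkedHashMap. BoundedCacheSketch, boundedLru, and the limit of 100 entries are made-up names and values, and the map shown is not thread-safe.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Contrasts an ever-growing map with one that evicts its least recently used entry. */
final class BoundedCacheSketch {

    /** Access-ordered LinkedHashMap that drops the eldest entry beyond maxEntries. */
    static <K, V> Map<K, V> boundedLru(int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Inserts `inserts` distinct 1 KB values and returns how many the map retained. */
    static int sizeAfterInserts(Map<Integer, byte[]> cache, int inserts) {
        for (int i = 0; i < inserts; i++) {
            cache.put(i, new byte[1024]);
        }
        return cache.size();
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> leaky = new HashMap<>();
        Map<Integer, byte[]> bounded = boundedLru(100);
        System.out.println("unbounded map retains: " + sizeAfterInserts(leaky, 1_000));
        System.out.println("bounded map retains:   " + sizeAfterInserts(bounded, 1_000));
    }
}
```

The unbounded map keeps all 1,000 values reachable, while the bounded one never holds more than 100. In a heap dump, the first pattern shows up as a single collection dominating retained size.
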

Once heap usage is under control, the next error you’ll likely hit is java.lang.OutOfMemoryError: Metaspace, which typically appears if you’ve been dynamically loading and unloading classes without managing the classloader lifecycle.