Antonio Casado

Posted on Jun 17

Monitoring in Spring Boot

Monitoring, logging, and operational support in Spring Boot

In production, it is not enough for a Spring Boot application to run; it must also be observable, diagnosable, and supportable. That means health checks, metrics, logs, traces, alerts, dashboards, and clear incident procedures.

1. Monitoring in Spring Boot

Spring Boot provides Actuator for production-ready monitoring and management. It offers endpoints such as health, metrics, info, loggers, mappings, env, and prometheus, which can be exposed over HTTP or JMX. In recent Spring Boot versions only the health endpoint is exposed over HTTP by default, so the others must be included explicitly.

Typical dependency:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Useful endpoints:

/actuator/health
/actuator/info
/actuator/metrics
/actuator/prometheus
/actuator/loggers
/actuator/mappings

Example configuration:

management:
  endpoints:
    web:
      exposure:
        include: health,info,metrics,prometheus,loggers
  endpoint:
    health:
      show-details: when-authorized
  metrics:
    tags:
      application: customer-service

What to monitor

For a backend service, I would monitor:

Area	Examples
Availability	health status, uptime, readiness, liveness
Performance	response time, p95/p99 latency, throughput
Errors	HTTP 4xx/5xx, exceptions, failed Kafka messages
JVM	heap, GC, threads, CPU, memory
Database	connection pool usage, slow queries, timeouts
Kafka	consumer lag, producer errors, retry rate
Infrastructure	pod restarts, CPU/memory, Kubernetes health
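
To make the p95/p99 latency row concrete, here is a minimal plain-Java sketch of the nearest-rank percentile over raw samples. The class and method names are hypothetical. In a real service Micrometer and Prometheus would normally derive percentiles from histograms, so this only illustrates what the number means.

```java
import java.util.Arrays;

/** Nearest-rank percentile over raw latency samples (illustrative only). */
final class LatencyPercentiles {

    private LatencyPercentiles() {
    }

    /**
     * Returns the smallest sample such that at least {@code percentile} percent
     * of all samples are less than or equal to it.
     */
    static long percentile(long[] latenciesMillis, double percentile) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        if (percentile <= 0.0 || percentile > 100.0) {
            throw new IllegalArgumentException("percentile must be in (0, 100]");
        }
        long[] sorted = latenciesMillis.clone(); // do not mutate the caller's array
        Arrays.sort(sorted);
        // Nearest-rank method: 1-based rank = ceil(p * n / 100).
        int rank = (int) Math.ceil(percentile * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

For the samples 10, 20, 30, 40, 50 ms, percentile(samples, 50) returns 30 and percentile(samples, 95) returns 50, which shows why a few slow requests dominate the high percentiles.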

Spring Boot uses Micrometer as the metrics facade, and Actuator auto-configures metrics collection for many parts of the application.

For Prometheus integration:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Then make sure the prometheus endpoint is in the exposure list:

management:
  endpoints:
    web:
      exposure:
        include: prometheus,health,metrics

Prometheus can scrape /actuator/prometheus, and Grafana can visualize the metrics. Spring Boot’s Prometheus endpoint exposes application metrics in the format expected by Prometheus.

2. Logging in Spring Boot

Logging is used to understand what happened inside the application, especially during failures.

Spring Boot supports plain-text logging and also structured logging, where logs are emitted in machine-readable formats such as JSON. Since Spring Boot 3.4, the documentation lists built-in structured logging support for the ECS, GELF, and Logstash formats.

Typical logging levels:

Level	Usage
ERROR	Something failed and needs attention
WARN	Unexpected but recoverable situation
INFO	Business or operational events
DEBUG	Technical details for development/debugging
TRACE	Very detailed low-level flow

Example:

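import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

// @Slf4j is a Lombok annotation that generates the "log" field.
// PaymentRequest and PaymentResult are assumed to be application records.
@Slf4j
@Service
public class PaymentService {

    public PaymentResult processPayment(PaymentRequest request) {
        // Parameterized messages keep identifiers searchable in the log output.
        log.info("Processing payment. orderId={}, customerId={}",
                request.orderId(), request.customerId());

        try {
            // business logic
            return new PaymentResult("SUCCESS");
        } catch (Exception ex) {
            // Passing the exception as the last argument logs the full stack trace.
            log.error("Payment processing failed. orderId={}",
                    request.orderId(), ex);
            throw ex;
        }
    }
}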

Good logging practices:

Do:
- Log business identifiers: orderId, customerId, transactionId
- Use correlation IDs or trace IDs
- Use structured logs in production
- Log exceptions with stack traces
- Mask or omit sensitive data

Do not:
- Log passwords, tokens, card numbers, personal data
- Log too much inside loops
- Use System.out.println()
- Hide errors with empty catch blocks
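
One way to follow the rule about card numbers is to mask the value before it reaches a log statement. The sketch below is a plain-Java illustration. The class and method names are hypothetical, and keeping the last four digits is an assumption that your own compliance rules may not allow.

```java
/** Masks sensitive values before they are logged (illustrative only). */
final class LogMasking {

    private LogMasking() {
    }

    /**
     * Replaces every digit except the last four with '*'.
     * Spaces and dashes are dropped; very short inputs are masked completely.
     */
    static String maskCardNumber(String cardNumber) {
        if (cardNumber == null) {
            return "****";
        }
        String digits = cardNumber.replaceAll("[^0-9]", "");
        if (digits.length() <= 4) {
            return "****"; // too short to reveal anything safely
        }
        int visibleFrom = digits.length() - 4;
        return "*".repeat(visibleFrom) + digits.substring(visibleFrom);
    }
}
```

A call such as log.info("Card accepted. card={}", LogMasking.maskCardNumber(number)) would then log ************1111 for the input 4111 1111 1111 1111.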

Example with correlation ID:

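import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.util.Optional;
import java.util.UUID;
import org.slf4j.MDC;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

// jakarta.servlet applies to Spring Boot 3.x; use javax.servlet on Spring Boot 2.x.
@Component
public class CorrelationIdFilter extends OncePerRequestFilter {

    private static final String CORRELATION_ID = "X-Correlation-Id";
    private static final String MDC_KEY = "correlationId";

    @Override
    protected void doFilterInternal(
            HttpServletRequest request,
            HttpServletResponse response,
            FilterChain filterChain
    ) throws ServletException, IOException {

        // Reuse the caller's ID when present; otherwise generate one lazily.
        String correlationId = Optional
                .ofNullable(request.getHeader(CORRELATION_ID))
                .filter(id -> !id.isBlank())
                .orElseGet(() -> UUID.randomUUID().toString());

        MDC.put(MDC_KEY, correlationId);
        response.setHeader(CORRELATION_ID, correlationId);

        try {
            filterChain.doFilter(request, response);
        } finally {
            // Remove only our key so MDC entries set by other components survive.
            MDC.remove(MDC_KEY);
        }
    }
}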

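This helps connect all logs belonging to the same request, provided the log output includes the MDC value (for example %X{correlationId} in a Logback pattern).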
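
Because the correlation ID comes from a client-controlled header, it can be worth validating it before it is written to logs. The following plain-Java sketch shows one possible check. The class name, the allowed character set, and the 64-character limit are all assumptions for illustration, not part of any Spring API.

```java
import java.util.UUID;
import java.util.regex.Pattern;

/** Accepts a caller-supplied correlation ID only if it looks safe to log (illustrative only). */
final class CorrelationIds {

    // Assumption: 1 to 64 characters drawn from letters, digits, '.', '_' and '-'.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z0-9._-]{1,64}");

    private CorrelationIds() {
    }

    /** Returns the header value when it matches the safe pattern, otherwise a fresh random UUID. */
    static String resolve(String headerValue) {
        if (headerValue != null && SAFE_ID.matcher(headerValue).matches()) {
            return headerValue;
        }
        return UUID.randomUUID().toString();
    }
}
```

A filter could call CorrelationIds.resolve(request.getHeader("X-Correlation-Id")) so that values containing line breaks or very long strings never reach the log files.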

3. Operational support

Operational support means keeping the application stable after deployment.

Typical responsibilities include:

Area	Responsibility
Incident support	investigate production issues
Monitoring	check dashboards and alerts
Logging	analyze application logs
Deployment support	validate releases, rollback if needed
Performance support	investigate latency, memory, CPU, database slowness
Kafka support	check lag, retries, dead-letter topics
Database support	analyze slow queries, connection pool problems
Documentation	runbooks, known issues, recovery steps
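
The Kafka lag mentioned in the table is, per partition, the difference between the latest offset in the log and the offset the consumer group has committed. Here is a small plain-Java sketch of that arithmetic. The class name is hypothetical, and treating a partition without a committed offset as committed at 0 is an assumption made here for simplicity.

```java
import java.util.Map;

/** Computes total consumer lag from per-partition offsets (illustrative only). */
final class ConsumerLag {

    private ConsumerLag() {
    }

    /**
     * Sums (end offset - committed offset) over all partitions.
     * Assumption: a partition with no committed offset counts as committed at 0.
     */
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0L;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            // Guard against a negative value if the two snapshots were taken at different times.
            total += Math.max(0L, entry.getValue() - committed);
        }
        return total;
    }
}
```

With end offsets {0: 100, 1: 50} and committed offsets {0: 90, 1: 50}, the total lag is 10. A lag that keeps growing usually means the consumer cannot keep up or has stopped.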

A good Spring Boot service should have:

- Health checks
- Metrics
- Centralized logs
- Distributed tracing
- Alerts
- Dashboards
- Runbooks
- Rollback strategy
- Error handling
- Retry and timeout policies
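
As an illustration of the retry policy in the list above, here is a minimal plain-Java sketch of retrying with exponential backoff. The class and method names are hypothetical. A production service would more likely use a library such as Resilience4j or Spring Retry, and would combine this with timeouts on the underlying call.

```java
import java.util.concurrent.Callable;

/** Retries an action with exponentially growing delays (illustrative only). */
final class Retry {

    private Retry() {
    }

    /**
     * Calls the action up to maxAttempts times, doubling the delay after each failure.
     * Rethrows the last failure when all attempts are exhausted.
     */
    static <T> T withBackoff(Callable<T> action, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (InterruptedException ex) {
                // Never retry on interruption; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw ex;
            } catch (Exception ex) {
                last = ex;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        throw last;
    }
}
```

Retries should only wrap operations that are safe to repeat, and the number of attempts is worth exposing as a metric so that a rising retry rate is visible on dashboards.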

4. Health checks

Spring Boot Actuator provides /actuator/health, which can report whether the application is up and whether dependencies such as database, disk, Redis, Kafka, or other systems are healthy.

Example custom health indicator:

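// Package names as of Spring Boot 2.x and 3.x.
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class ExternalSystemHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        boolean externalSystemAvailable = checkExternalSystem();

        if (externalSystemAvailable) {
            return Health.up()
                    .withDetail("externalSystem", "Available")
                    .build();
        }

        // A DOWN status here also affects the aggregated application status.
        return Health.down()
                .withDetail("externalSystem", "Unavailable")
                .build();
    }

    // Placeholder: replace with a real, fast check (for example a ping with a short timeout).
    private boolean checkExternalSystem() {
        return true;
    }
}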

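Spring Boot registers the indicator under the name externalSystem (the bean name without the HealthIndicator suffix). It then appears as a component, with details shown according to the show-details setting, in:

/actuator/health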
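
The overall status returned by the health endpoint is an aggregate of all component statuses, where the most severe one wins. The plain-Java sketch below mimics that idea. The class is hypothetical, and the severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN is my understanding of Actuator's default, so check it against the documentation for your version.

```java
import java.util.Collection;

/** Picks the most severe status among health components (illustrative only). */
final class HealthAggregation {

    /** Declared from most to least severe; the order is an assumption, see the text above. */
    enum Status { DOWN, OUT_OF_SERVICE, UP, UNKNOWN }

    private HealthAggregation() {
    }

    /** Returns the most severe status, or UNKNOWN when there are no components. */
    static Status aggregate(Collection<Status> componentStatuses) {
        Status worst = Status.UNKNOWN;
        for (Status status : componentStatuses) {
            // A lower ordinal means a more severe status in this enum.
            if (status.ordinal() < worst.ordinal()) {
                worst = status;
            }
        }
        return worst;
    }
}
```

This is why a single failing dependency, such as the external system above, can turn the whole endpoint DOWN. It is also a reason to think carefully about which indicators belong in a Kubernetes liveness probe.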

5. Metrics example

You can create custom business metrics using Micrometer.

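import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class OrderService {

    private final Counter orderCreatedCounter;

    // The MeterRegistry is auto-configured by Actuator and injected by Spring.
    public OrderService(MeterRegistry meterRegistry) {
        // Register the counter once; tags become labels in the monitoring system.
        this.orderCreatedCounter = Counter.builder("orders.created")
                .description("Number of created orders")
                .tag("service", "order-service")
                .register(meterRegistry);
    }

    public void createOrder() {
        // business logic
        // Count only after the business logic has succeeded.
        orderCreatedCounter.increment();
    }
}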

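This metric can later be scraped by Prometheus and shown in Grafana. With the Prometheus registry, Micrometer's naming convention typically exposes the counter as orders_created_total.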

6. Logging and monitoring architecture

A common production setup:

Spring Boot App
   |
   |-- Actuator metrics
   v
Prometheus
   |
   v
Grafana dashboards

Spring Boot App
   |
   |-- JSON logs
   v
Log collector: Fluent Bit / Filebeat / Logstash
   |
   v
Elasticsearch / OpenSearch / Loki / Splunk

Spring Boot App
   |
   |-- traces
   v
OpenTelemetry / Jaeger / Tempo / Zipkin

7. Example interview answer

You can say:

In Spring Boot, I use Actuator and Micrometer for monitoring. Actuator gives production-ready endpoints such as health, metrics, info, and Prometheus. Metrics are collected by Micrometer and exposed to Prometheus, then visualized in Grafana.

For logging, I use SLF4J with Logback, with structured JSON output in production. I make sure logs include correlation IDs, business identifiers, useful error messages, and stack traces, but never sensitive data.

For operational support, I focus on dashboards, alerts, runbooks, incident investigation, log analysis, performance troubleshooting, database monitoring, Kafka lag monitoring, and deployment validation. The goal is to detect issues early, reduce recovery time, and keep the service stable in production.

8. Practical checklist

For a production Spring Boot service:

Monitoring:
[ ] Add spring-boot-starter-actuator
[ ] Expose health, metrics, prometheus
[ ] Add Micrometer Prometheus registry
[ ] Create Grafana dashboards
[ ] Add alerts for errors, latency, CPU, memory, restarts

Logging:
[ ] Use SLF4J
[ ] Use structured JSON logs
[ ] Add correlation ID
[ ] Avoid sensitive data
[ ] Centralize logs in ELK, Loki, Splunk, or similar

Operations:
[ ] Add readiness and liveness probes
[ ] Create runbooks
[ ] Define rollback strategy
[ ] Monitor database pool
[ ] Monitor Kafka lag
[ ] Track p95/p99 latency
[ ] Document common incidents and resolutions

A strong production Spring Boot application should be easy to monitor, easy to debug, and easy to recover when something goes wrong.
