Add advance health check for database#361
Conversation
…ance check for database
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 WalkthroughWalkthroughBuild now emits git metadata during Maven initialize; /version returns JSON built from generated git.properties; health checks gained a background MySQL diagnostics scheduler, richer per-component status and lifecycle shutdown; auth filter now exempts /health and /version from JWT checks. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant Scheduler
participant HealthService
participant MySQL as MySQL Database
participant Cache as Severity Cache
participant Redis as Redis
Note over Scheduler,HealthService: Background diagnostics (startup & periodic)
Scheduler->>HealthService: trigger runAdvancedMySQLDiagnostics
HealthService->>MySQL: run diagnostic queries (stuck, long tx, deadlocks, slow queries)
MySQL-->>HealthService: diagnostic results
HealthService->>HealthService: compute worst-case severity
HealthService->>Cache: update cached DB severity
Note over Client,HealthService: Health check request flow
Client->>HealthService: GET /health
HealthService->>MySQL: SELECT 1 (bounded timeout)
MySQL-->>HealthService: connectivity result
HealthService->>Cache: read cached DB severity
HealthService->>Redis: check connectivity
Redis-->>HealthService: connectivity result
HealthService->>Client: return per-component map (status, severity, checkedAt)
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java (1)
239-256:⚠️ Potential issue | 🟡 MinorDuplicate
/user/refreshTokenentry with inconsistent match semantics.Line 243 already checks
path.equals(contextPath + "/user/refreshToken"), and line 254 addspath.startsWith(contextPath + "/user/refreshToken"). ThestartsWithvariant is strictly more permissive (it would also match/user/refreshTokenEvil) and makes theequalscheck on line 243 redundant. Remove the duplicate and decide which semantic is intended —equalsis safer.Proposed fix
|| path.startsWith(contextPath + "/user/validateSecurityQuestionAndAnswer") || path.startsWith(contextPath + "/user/logOutUserFromConcurrentSession") - || path.startsWith(contextPath + "/user/refreshToken") || path.equals(contextPath + "/health") || path.equals(contextPath + "/version");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java` around lines 239 - 256, The shouldSkipAuthentication method contains duplicate checks for "/user/refreshToken": one using equals (path.equals(contextPath + "/user/refreshToken")) and another using startsWith (path.startsWith(contextPath + "/user/refreshToken")); remove the redundant check and choose the intended semantic (prefer equals for strict match). Edit the shouldSkipAuthentication function to eliminate the duplicate and keep only the intended condition (either the equals variant for exact match or the startsWith variant if prefix matching is required), ensuring no other logic references rely on the removed form.
🧹 Nitpick comments (3)
src/main/java/com/iemr/common/controller/version/VersionController.java (1)
58-76: Consider caching git properties — they never change at runtime.
loadGitProperties()re-reads git.properties from the classpath on every/versioncall. Since these values are immutable after build time, loading them once (e.g., in@PostConstructor a lazy-init field) would be cleaner and avoids repeated I/O, even if the endpoint is low-traffic.Also, the
logger.info("version Controller Start/End")calls on lines 61 and 74 will fire on every request to a public, unauthenticated endpoint — consider lowering todebug.Proposed refactor — load once at startup
`@RestController` public class VersionController { private final Logger logger = LoggerFactory.getLogger(this.getClass().getSimpleName()); private static final String UNKNOWN_VALUE = "unknown"; + private final Map<String, String> versionInfo; + + public VersionController() { + Map<String, String> info = new LinkedHashMap<>(); + try { + Properties gitProperties = loadGitProperties(); + info.put("buildTimestamp", gitProperties.getProperty("git.build.time", UNKNOWN_VALUE)); + info.put("version", gitProperties.getProperty("git.build.version", UNKNOWN_VALUE)); + info.put("branch", gitProperties.getProperty("git.branch", UNKNOWN_VALUE)); + info.put("commitHash", gitProperties.getProperty("git.commit.id.abbrev", UNKNOWN_VALUE)); + } catch (Exception e) { + logger.error("Failed to load version information", e); + info.put("buildTimestamp", UNKNOWN_VALUE); + info.put("version", UNKNOWN_VALUE); + info.put("branch", UNKNOWN_VALUE); + info.put("commitHash", UNKNOWN_VALUE); + } + this.versionInfo = Map.copyOf(info); + } `@Operation`(summary = "Get version information") `@GetMapping`(value = "/version", produces = MediaType.APPLICATION_JSON_VALUE) public ResponseEntity<Map<String, String>> versionInformation() { - Map<String, String> response = new LinkedHashMap<>(); - try { - logger.info("version Controller Start"); - Properties gitProperties = loadGitProperties(); - response.put("buildTimestamp", gitProperties.getProperty("git.build.time", UNKNOWN_VALUE)); - response.put("version", gitProperties.getProperty("git.build.version", UNKNOWN_VALUE)); - response.put("branch", gitProperties.getProperty("git.branch", UNKNOWN_VALUE)); - response.put("commitHash", gitProperties.getProperty("git.commit.id.abbrev", UNKNOWN_VALUE)); - } catch (Exception e) { - logger.error("Failed to load version information", e); - response.put("buildTimestamp", UNKNOWN_VALUE); - response.put("version", UNKNOWN_VALUE); - response.put("branch", UNKNOWN_VALUE); - response.put("commitHash", UNKNOWN_VALUE); - } - logger.info("version Controller End"); - return ResponseEntity.ok(response); + return ResponseEntity.ok(versionInfo); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/controller/version/VersionController.java` around lines 58 - 76, The versionInformation() method currently calls loadGitProperties() on every request and logs "version Controller Start/End" at INFO; change this to load git.properties once at startup and use DEBUG for request tracing: create a private field (e.g., gitPropertiesCache or cachedGitInfo) populated in a `@PostConstruct` method or as a lazy-initialized singleton that calls loadGitProperties() once and falls back to UNKNOWN_VALUE on error, update versionInformation() to read from the cache instead of calling loadGitProperties(), and change logger.info(...) to logger.debug(...) to avoid noisy INFO logs for the public endpoint.src/main/java/com/iemr/common/service/health/HealthService.java (2)
50-62: Unused constantRESPONSE_TIME_SLOW_MS.
RESPONSE_TIME_SLOW_MS(line 56) is declared but never referenced anywhere in this file. Remove it to avoid confusion, or implement the response-time check it was intended for.Remove dead constant
private static final String LOG_EVENT_POOL_EXHAUSTED = "MYSQL_POOL_EXHAUSTED"; - private static final long RESPONSE_TIME_SLOW_MS = 2000; // > 2s → SLOW private static final int STUCK_PROCESS_THRESHOLD = 5; // > 5 stuck → WARNING🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines 50 - 62, The constant RESPONSE_TIME_SLOW_MS is declared but never used; either remove the unused constant from HealthService or implement the intended response-time check using it (e.g., in any method that evaluates query latency) so the value is referenced. Locate RESPONSE_TIME_SLOW_MS in HealthService and either delete that declaration along with any related comment, or add logic (using RESPONSE_TIME_SLOW_MS) where query/response times are assessed (e.g., in methods that compute slow query metrics) to classify responses as SLOW when > RESPONSE_TIME_SLOW_MS.
202-355: Background diagnostic approach is well-structured overall.The per-check isolation with individual try-catch blocks, structured log events with tags, and the escalation model are all sound. A few minor observations beyond what's already flagged:
CHECK 2 (long-running transactions, line 253): Escalates directly to
CRITICAL, which maps toDOWNin the health endpoint. A single long transaction (e.g., a legitimate batch job > 30s) would mark the entire DB as DOWN. Consider usingWARNINGhere instead, or making the threshold configurable.CHECK 5 (lines 304-337):
Threads_connectedis the server-wide thread count, not the application's pool usage. In shared-database deployments, other apps' connections inflate this metric and could trigger false alerts. The log message says "connection pool" but this is actually server-level connection usage — consider clarifying the log or adding a note.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines 202 - 355, runAdvancedMySQLDiagnostics: for CHECK 2 (InnoDB long-running transactions / LOG_EVENT_LOCK_WAIT) avoid mapping a single long transaction directly to CRITICAL/DOWN—change the escalation to WARNING (use escalate(worstSeverity, "WARNING")) or make the threshold/severity configurable (introduce a config flag used by the check) so legitimate long-running jobs don't mark DB as DOWN; for CHECK 5 (Threads_connected / LOG_EVENT_POOL_EXHAUSTED / LOG_EVENT_CONN_USAGE) update the log messages and variable names to reflect these are server-level connection metrics (e.g., mention "server-level Threads_connected (may include other apps)" instead of "connection pool") and/or add a comment explaining this limitation so alerts aren't misleading in shared-database deployments.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 141-200: checkDatabaseConnectivity currently returns the
cachedDbSeverity (which defaults to "INFO") on a successful SELECT 1 while Redis
returns "OK", causing inconsistent healthy severities; modify
checkDatabaseConnectivity so that on successful connectivity it sets severity to
"OK" (and uses resolveDatabaseStatus("OK") for status) instead of using
cachedDbSeverity, and ensure any related logic in runAdvancedMySQLDiagnostics
continues to update cachedDbSeverity for degraded states only so healthy checks
remain "OK".
- Around line 284-301: The slow query check uses the cumulative MySQL
"Slow_queries" counter and will permanently warn after the first slow query; add
a long field (e.g., previousSlowQueryCount) to HealthService and update the
CHECK 4 block to compute a delta = slowQueries - previousSlowQueryCount, only
warn/escalate (use LOG_EVENT_SLOW_QUERIES and escalate(...)) when delta > 0,
include the delta and cumulative value in the log, and then assign
previousSlowQueryCount = slowQueries so subsequent runs only flag new slow
queries (follow the same delta-tracking pattern used by previousDeadlockCount).
---
Outside diff comments:
In `@src/main/java/com/iemr/common/utils/JwtUserIdValidationFilter.java`:
- Around line 239-256: The shouldSkipAuthentication method contains duplicate
checks for "/user/refreshToken": one using equals (path.equals(contextPath +
"/user/refreshToken")) and another using startsWith (path.startsWith(contextPath
+ "/user/refreshToken")); remove the redundant check and choose the intended
semantic (prefer equals for strict match). Edit the shouldSkipAuthentication
function to eliminate the duplicate and keep only the intended condition (either
the equals variant for exact match or the startsWith variant if prefix matching
is required), ensuring no other logic references rely on the removed form.
---
Nitpick comments:
In `@src/main/java/com/iemr/common/controller/version/VersionController.java`:
- Around line 58-76: The versionInformation() method currently calls
loadGitProperties() on every request and logs "version Controller Start/End" at
INFO; change this to load git.properties once at startup and use DEBUG for
request tracing: create a private field (e.g., gitPropertiesCache or
cachedGitInfo) populated in a `@PostConstruct` method or as a lazy-initialized
singleton that calls loadGitProperties() once and falls back to UNKNOWN_VALUE on
error, update versionInformation() to read from the cache instead of calling
loadGitProperties(), and change logger.info(...) to logger.debug(...) to avoid
noisy INFO logs for the public endpoint.
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 50-62: The constant RESPONSE_TIME_SLOW_MS is declared but never
used; either remove the unused constant from HealthService or implement the
intended response-time check using it (e.g., in any method that evaluates query
latency) so the value is referenced. Locate RESPONSE_TIME_SLOW_MS in
HealthService and either delete that declaration along with any related comment,
or add logic (using RESPONSE_TIME_SLOW_MS) where query/response times are
assessed (e.g., in methods that compute slow query metrics) to classify
responses as SLOW when > RESPONSE_TIME_SLOW_MS.
- Around line 202-355: runAdvancedMySQLDiagnostics: for CHECK 2 (InnoDB
long-running transactions / LOG_EVENT_LOCK_WAIT) avoid mapping a single long
transaction directly to CRITICAL/DOWN—change the escalation to WARNING (use
escalate(worstSeverity, "WARNING")) or make the threshold/severity configurable
(introduce a config flag used by the check) so legitimate long-running jobs
don't mark DB as DOWN; for CHECK 5 (Threads_connected / LOG_EVENT_POOL_EXHAUSTED
/ LOG_EVENT_CONN_USAGE) update the log messages and variable names to reflect
these are server-level connection metrics (e.g., mention "server-level
Threads_connected (may include other apps)" instead of "connection pool") and/or
add a comment explaining this limitation so alerts aren't misleading in
shared-database deployments.
src/main/java/com/iemr/common/service/health/HealthService.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/main/java/com/iemr/common/service/health/HealthService.java (2)
258-281: Stuck process check: warn-level log butOKseverity for counts 1–5.When
stuckCountis between 1 andSTUCK_PROCESS_THRESHOLD(5), the code logs atWARNlevel but returnsSEVERITY_OK, which means no escalation. This is a bit inconsistent — either the situation is worth alerting on (and should return at leastINFOor a lower-tier signal) or the log should be atINFO/DEBUGlevel.If this is intentional observability without escalation, consider logging at
INFOinstead ofWARNto avoid noisy log-based alerts that don't match the reported severity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines 258 - 281, performStuckProcessCheck logs a WARN when any stuckCount > 0 but still returns SEVERITY_OK for counts <= STUCK_PROCESS_THRESHOLD, causing inconsistent noisy warnings; change the non-escalating branch so that when stuckCount > 0 && stuckCount <= STUCK_PROCESS_THRESHOLD you call logger.info (not logger.warn) and retain returning SEVERITY_OK, and keep logger.warn + return SEVERITY_WARNING only when stuckCount > STUCK_PROCESS_THRESHOLD (refer to performStuckProcessCheck, STUCK_PROCESS_THRESHOLD, STUCK_PROCESS_SECONDS, logger.warn/info, SEVERITY_OK, SEVERITY_WARNING).
79-79: Remove unused constantRESPONSE_TIME_SLOW_MS.This constant is declared at line 79 but never used anywhere in the codebase. Remove it to keep the code clean.
🧹 Proposed fix
- private static final long RESPONSE_TIME_SLOW_MS = 2000; // > 2s → SLOW🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` at line 79, Remove the unused constant RESPONSE_TIME_SLOW_MS from HealthService to clean up dead code: locate the declaration "private static final long RESPONSE_TIME_SLOW_MS = 2000;" in the HealthService class and delete that field, then run a build/compile to confirm no references remain (if any exist, replace usages with the appropriate existing threshold or refactor to use the existing RESPONSE_TIME_* constants).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 174-183: When a successful SELECT 1 proves DB connectivity, avoid
reporting DOWN based on background diagnostics: in the block that reads
cachedDbSeverity and sets result.put(FIELD_STATUS...) (using
cachedDbSeverity.get() and resolveDatabaseStatus(severity)), check if
resolveDatabaseStatus(severity) would return DOWN and, if so, override/cap the
severity to DEGRADED before putting values into result (e.g., set severity =
"DEGRADED" and use resolveDatabaseStatus("DEGRADED") for FIELD_STATUS); make the
same change in the similar block around lines 392-398 so that DOWN is reserved
for actual connection failures handled in the catch branch.
- Around line 283-304: performLongTransactionCheck currently marks any single
long-running InnoDB transaction (> STUCK_PROCESS_SECONDS) as CRITICAL; change it
to use a count threshold similar to STUCK_PROCESS_THRESHOLD (e.g.,
STUCK_TRANSACTION_THRESHOLD) and implement graduated escalation: if lockCount ==
0 return SEVERITY_OK; if 0 < lockCount <= STUCK_TRANSACTION_THRESHOLD return
SEVERITY_WARNING (and log with LOG_EVENT_LOCK_WAIT, lockCount and threshold); if
lockCount > STUCK_TRANSACTION_THRESHOLD return SEVERITY_CRITICAL (and log
likewise). Keep existing exception logging behavior but include lockCount-based
decision logic inside performLongTransactionCheck and add/define the new
STUCK_TRANSACTION_THRESHOLD constant.
---
Duplicate comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 330-353: The current delta logic in performSlowQueryCheck uses
previousSlowQueryCount.getAndSet(slowQueries) which can produce a false-positive
warning on first run if previousSlowQueryCount defaults to 0; change the
initialization/flow so the first observed slowQueries value is only used to seed
previousSlowQueryCount (no warning) and subsequent runs compute delta normally.
Specifically, in performSlowQueryCheck, after reading slowQueries from
ResultSet, detect an uninitialized sentinel (e.g., previousSlowQueryCount == 0
or a dedicated negative sentinel) and set previousSlowQueryCount to slowQueries
without logging; otherwise compute delta = slowQueries - previousSlow and only
log/warn when delta > 0 (retain STATUS_VALUE and LOG_EVENT_SLOW_QUERIES
references).
---
Nitpick comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 258-281: performStuckProcessCheck logs a WARN when any stuckCount
> 0 but still returns SEVERITY_OK for counts <= STUCK_PROCESS_THRESHOLD, causing
inconsistent noisy warnings; change the non-escalating branch so that when
stuckCount > 0 && stuckCount <= STUCK_PROCESS_THRESHOLD you call logger.info
(not logger.warn) and retain returning SEVERITY_OK, and keep logger.warn +
return SEVERITY_WARNING only when stuckCount > STUCK_PROCESS_THRESHOLD (refer to
performStuckProcessCheck, STUCK_PROCESS_THRESHOLD, STUCK_PROCESS_SECONDS,
logger.warn/info, SEVERITY_OK, SEVERITY_WARNING).
- Line 79: Remove the unused constant RESPONSE_TIME_SLOW_MS from HealthService
to clean up dead code: locate the declaration "private static final long
RESPONSE_TIME_SLOW_MS = 2000;" in the HealthService class and delete that field,
then run a build/compile to confirm no references remain (if any exist, replace
usages with the appropriate existing threshold or refactor to use the existing
RESPONSE_TIME_* constants).
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/main/java/com/iemr/common/service/health/HealthService.java (2)
102-103: Both delta counters start at0, causing a false-positive WARNING on the first diagnostic run.
Innodb_deadlocksandSlow_queriesare server-lifetime cumulative counters. On first run after service startup,previousDeadlockCountandpreviousSlowQueryCountare0, so any non-zero current value (accumulated before this service started) registers as a new delta and emits a WARNING — triggering alert noise on every deployment.Consider seeding the counters from MySQL during the first diagnostic run using a sentinel value (e.g.,
-1L) to skip delta comparison:♻️ Proposed approach
- private final AtomicLong previousDeadlockCount = new AtomicLong(0); - private final AtomicLong previousSlowQueryCount = new AtomicLong(0); + // -1L = "first run, seed only — do not emit a delta warning" + private final AtomicLong previousDeadlockCount = new AtomicLong(-1L); + private final AtomicLong previousSlowQueryCount = new AtomicLong(-1L);Then in each delta check:
- long previousDeadlocks = previousDeadlockCount.getAndSet(currentDeadlocks); - if (currentDeadlocks > previousDeadlocks) { + long previousDeadlocks = previousDeadlockCount.getAndSet(currentDeadlocks); + if (previousDeadlocks >= 0 && currentDeadlocks > previousDeadlocks) {Apply the same guard to
performSlowQueryCheck.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` around lines 102 - 103, Previous delta counters previousDeadlockCount and previousSlowQueryCount are initialized to 0 causing false positives on first run; change their initial sentinel to -1L and update the diagnostic logic in the methods that compute deltas (the deadlock check and performSlowQueryCheck) to treat a previous value of -1 as "first run"—on first run set previousDeadlockCount/previousSlowQueryCount to the current MySQL-retrieved count and skip emitting a WARNING or computing a delta, otherwise compute delta normally and update the previous counter after the check.
79-79:RESPONSE_TIME_SLOW_MSis declared but never used — either wire it in or remove it.If the intent is to classify a slow-but-alive DB response (e.g., response time > 2 s →
SLOW/WARNING), the elapsed-time measurement and comparison logic is missing fromcheckDatabaseConnectivity. Otherwise, remove the dead constant.#!/bin/bash # Confirm RESPONSE_TIME_SLOW_MS is not referenced anywhere else in the codebase. rg -n "RESPONSE_TIME_SLOW_MS"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/java/com/iemr/common/service/health/HealthService.java` at line 79, RESPONSE_TIME_SLOW_MS is declared but unused; either remove the constant or wire it into checkDatabaseConnectivity by measuring DB call elapsed time (capture start/end around the DB ping in checkDatabaseConnectivity), compare the elapsed millis to RESPONSE_TIME_SLOW_MS and return/mark a SLOW or WARNING health state accordingly (use the existing health result/enum used by checkDatabaseConnectivity), or if you choose deletion simply remove the RESPONSE_TIME_SLOW_MS constant and any related dead code; update unit tests if present to reflect the chosen behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 177-181: The health check currently only sets
stmt.setQueryTimeout(3) but does not bound dataSource.getConnection(), so
getConnection() can block up to the pool's connectionTimeout and violate the
"/health must never block > 3s" guarantee; either ensure the connection pool
(HikariCP) configuration sets connectionTimeout ≤ 3000 ms or wrap the entire
connection+query block in a timed task (e.g., ExecutorService.submit(() -> { try
(Connection c = dataSource.getConnection(); Statement s = c.createStatement()) {
s.setQueryTimeout(3); s.execute("SELECT 1"); } })).get(3, TimeUnit.SECONDS) and
handle TimeoutException by failing fast and closing resources, referencing
dataSource.getConnection(), stmt.setQueryTimeout(...), and
performConnectionUsageCheck to keep consistent behavior.
- Around line 269-276: The current logic logs sub-threshold stuck MySQL process
counts using logger.warn while still returning SEVERITY_OK; change the logging
level for the non-escalating branch so that when stuckCount > 0 but <=
STUCK_PROCESS_THRESHOLD (use the same check around stuckCount and
STUCK_PROCESS_THRESHOLD) you call logger.info instead of logger.warn and keep
returning SEVERITY_OK, and reserve logger.warn (or leave as-is) only in the
branch where stuckCount > STUCK_PROCESS_THRESHOLD that returns SEVERITY_WARNING;
update the logging call that references LOG_EVENT_STUCK_PROCESS and
STUCK_PROCESS_SECONDS accordingly.
- Line 31: Update the incompatible import in HealthService: replace the javax
annotation import with the Jakarta equivalent by changing the import of
PreDestroy from javax.annotation.PreDestroy to jakarta.annotation.PreDestroy so
the HealthService class compiles under Spring Boot 3.2.2 (locate the import
statement in the HealthService file and update it accordingly).
---
Duplicate comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 183-186: When the background diagnostic cachedDbSeverity is
CRITICAL but the live "SELECT 1" check succeeds, cap the reported severity to
DEGRADED before resolving status so we don't report DOWN; update the block that
currently does result.put(FIELD_STATUS, resolveDatabaseStatus(severity)) and
result.put(FIELD_SEVERITY, severity) to first map SEVERITY_CRITICAL ->
SEVERITY_DEGRADED (e.g., compute a cappedSeverity variable from
cachedDbSeverity.get()), use cappedSeverity when calling resolveDatabaseStatus
and when putting FIELD_SEVERITY, and keep reporting STATUS_DOWN only in the
existing catch branch that handles exceptions.
---
Nitpick comments:
In `@src/main/java/com/iemr/common/service/health/HealthService.java`:
- Around line 102-103: Previous delta counters previousDeadlockCount and
previousSlowQueryCount are initialized to 0 causing false positives on first
run; change their initial sentinel to -1L and update the diagnostic logic in the
methods that compute deltas (the deadlock check and performSlowQueryCheck) to
treat a previous value of -1 as "first run"—on first run set
previousDeadlockCount/previousSlowQueryCount to the current MySQL-retrieved
count and skip emitting a WARNING or computing a delta, otherwise compute delta
normally and update the previous counter after the check.
- Line 79: RESPONSE_TIME_SLOW_MS is declared but unused; either remove the
constant or wire it into checkDatabaseConnectivity by measuring DB call elapsed
time (capture start/end around the DB ping in checkDatabaseConnectivity),
compare the elapsed millis to RESPONSE_TIME_SLOW_MS and return/mark a SLOW or
WARNING health state accordingly (use the existing health result/enum used by
checkDatabaseConnectivity), or if you choose deletion simply remove the
RESPONSE_TIME_SLOW_MS constant and any related dead code; update unit tests if
present to reflect the chosen behavior.
src/main/java/com/iemr/common/service/health/HealthService.java
Outdated
Show resolved
Hide resolved
|



📋 Description
JIRA ID:
This PR implements /health and /version endpoints in the Common-API as part of Issue PSMRI/AMRIT#102
What’s changed
/healthendpoint/versionendpointSummary by CodeRabbit
New Features
Security