feat: structured health check endpoints with dependency status #3304

Open
nikhil-bora wants to merge 1 commit into GoogleCloudPlatform:main from arcline-eng:feat/structured-health-checks

@nikhil-bora

Summary

  • Add reusable healthcheck Go package with three-state model (healthy/degraded/unhealthy) and per-dependency probing
  • Critical dependencies (product-catalog, currency, cart, checkout) cause unhealthy status; non-critical (recommendation, shipping, ad) cause degraded
  • Run dependency checks concurrently with a configurable timeout, and provide a helper for checking gRPC connection state
  • Wire /health (structured JSON) and /livez (simple liveness) endpoints into the frontend service
  • Unit tests covering all status transitions, HTTP status codes, and timeout behavior

Test plan

  • Run go test ./healthcheck/... to verify all unit tests pass
  • Deploy to staging and verify /health returns JSON with all dependency statuses
  • Stop a non-critical service (ad) and confirm /health returns degraded (200)
  • Stop a critical service (cart) and confirm /health returns unhealthy (503)
  • Verify /livez always returns 200 as long as the process is running
  • Confirm that a K8s readinessProbe can use the /health endpoint

🤖 Generated with Claude Code

Add a reusable healthcheck package to the frontend service that reports
per-dependency health with aggregate roll-up:

- Three-state model: healthy / degraded / unhealthy
- Critical vs non-critical dependencies: a critical failure (e.g. cart,
  checkout) makes the service unhealthy; non-critical failure (e.g. ad
  service) results in degraded status
- Concurrent dependency probes with configurable timeout
- Convenience helper for gRPC connection state checks
- /health returns structured JSON with per-dependency latency and errors
- /livez serves a simple liveness response for the K8s livenessProbe
- Unit tests covering all status transitions and timeout behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@nikhil-bora requested review from a team and yoshi-approver as code owners on April 6, 2026, 07:38

google-cla bot commented Apr 6, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
