HYPERFLEET-634 - doc: sentinel operator guide#63
HYPERFLEET-634 - doc: sentinel operator guide#6386254860 wants to merge 2 commits intoopenshift-hyperfleet:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughA new comprehensive Sentinel Operator guide was added at docs/sentinel-operator-guide.md. The document defines Sentinel as a polling-based event publisher for cluster lifecycle orchestration, describes the Decision Engine (generation- and time-based reconciliation), resource filtering and sharding, configuration schema (including CEL-based message_data and adapter contracts), broker integrations (RabbitMQ, Google Pub/Sub), CloudEvents payload structure, Helm deployment workflows for single- and multi-instance setups, and operational troubleshooting and checklists. Estimated code review effort🎯 1 (Trivial) | ⏱️ ~2 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 150-152: Several fenced code blocks are missing language
identifiers (causing lint/render issues); update the three blocks containing the
lines "next_event = reference_time + max_age_interval", "Error: invalid config:
message_data.id: empty CEL expression is not allowed", and
"{namespace}-{resourceType}" to include a language tag (e.g., ```text) on the
opening fence so each block becomes ```text ... ``` to satisfy linting and
improve rendering.
- Around line 618-620: Update the readiness documentation to explicitly state
that the /readyz endpoint remains false until the first successful poll
completes (i.e., s.LastSuccessfulPoll() is non-zero) and broker health checks
pass, so pods intentionally stay unready during initial startup; reference
HYPERFLEET-552 as the change that introduced this behavior and add guidance to
tune probe timing if startup latency causes false-positive failures, and mirror
this note where readiness is mentioned alongside the health check lines.
- Around line 7-22: The TOC and in-document links are pointing to invalid
fragment IDs (e.g., "#broker-setup")—update either the heading IDs or the links
so they match the actual generated anchor names: use the exact slug form the
renderer creates (lowercase, spaces → hyphens, remove/normalize punctuation like
parentheses). Specifically, fix TOC entries for "Decision Engine", "Time-Based
Reconciliation (Max Age Intervals)", "Resource Filtering and Sharding", "Broker
Configuration"/"Broker Setup" and all refs to the broker section so the hrefs
match the headings' slugs (or add explicit anchor tags next to headings such as
<!-- anchor: broker-setup --> if your renderer supports it); verify and update
the broken references mentioned (the ones pointing to broker-setup and the other
listed anchors) so clicking the TOC navigates to the correct sections.
- Line 535: The heading "Phase 1: Configuration Planning" is using #### which
jumps the hierarchy from ## to ####; change that heading marker to ### so it
follows the preceding `##` level and keeps the document structure valid—locate
the "Phase 1: Configuration Planning" heading and replace the four-sharp prefix
with three-sharps.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1b37f46e-bab2-4792-af3e-01da0c9a288d
📒 Files selected for processing (1)
docs/sentinel-operator-guide.md
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
docs/sentinel-operator-guide.md (4)
536-536:⚠️ Potential issue | 🟡 MinorFix heading level increment at Line 536.
This jumps from
##to####; use###to keep hierarchy valid.🛠️ Proposed heading fix
-#### Phase 1: Configuration Planning +### Phase 1: Configuration Planning🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 536, The heading "#### Phase 1: Configuration Planning" uses too many hashes and breaks the document hierarchy; change that heading to "### Phase 1: Configuration Planning" so it correctly follows the previous "##" level—update the line containing the exact heading text to use three hashes instead of four.
620-620:⚠️ Potential issue | 🟡 MinorDocument current
/readyzstartup behavior to avoid false alarms.Line 620 and Line 677 still imply generic readiness failures, but
/readyzis expected to stay unready until the first successful poll completes and broker checks pass. Add this note in both checklist and troubleshooting text.Based on learnings: In HYPERFLEET-552, the sentinel service readiness semantics were changed: /readyz now requires one successful poll completion (s.LastSuccessfulPoll() non-zero) plus broker health checks.🛠️ Proposed doc update
- [ ] Check readiness endpoint: `curl http://<sentinel-service>:8080/readyz` + - Note: `/readyz` remains unready until Sentinel completes at least one successful poll cycle and broker health checks pass.-| **Health/readiness endpoints return errors** | Configuration validation failed | Check pod logs for startup errors: `kubectl logs -n hyperfleet-system -l app.kubernetes.io/name=sentinel`. Verify all required config fields. | +| **Health/readiness endpoints return errors** | Startup not complete or configuration issue | `/readyz` is expected to be unready until the first successful poll and broker health checks pass. If it persists, check pod logs: `kubectl logs -n hyperfleet-system -l app.kubernetes.io/name=sentinel` and verify required config fields. |Also applies to: 677-677
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 620, Update the checklist entry and the troubleshooting section to document that the /readyz endpoint intentionally remains unready until the service has completed at least one successful poll and broker health checks pass; explicitly state the readiness criteria as "s.LastSuccessfulPoll() must be non-zero AND broker checks healthy" and mention that this change came from HYPERFLEET-552 so readers don't treat an unready /readyz as a generic failure until those conditions are met.
151-153:⚠️ Potential issue | 🟡 MinorAdd language identifiers to fenced code blocks.
The fences at Line 151, Line 409, and Line 474 are still missing language tags.
🛠️ Proposed fence-language fix
- ``` + ```text next_event = reference_time + max_age_interval ``` - ``` + ```text Error: invalid config: message_data.id: empty CEL expression is not allowed-
+text
{namespace}-{resourceType}Also applies to: 409-411, 474-476
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 151 - 153, Several fenced code blocks in the document are missing language identifiers; update the three blocks containing the snippets "next_event = reference_time + max_age_interval", "Error: invalid config: message_data.id: empty CEL expression is not allowed", and "{namespace}-{resourceType}" to include a language tag (e.g., ```text) immediately after the opening backticks so the fences read ```text ... ``` for each occurrence.
7-22:⚠️ Potential issue | 🟡 MinorFix invalid in-document anchor fragments.
Several links still point to non-existent fragments (for example Line 7–Line 22 TOC entries and Line 524
#broker-setup). Update links to the renderer-generated slugs so navigation works.🛠️ Proposed anchor fix
-1. [Introduction](`#introduction`) +1. [Introduction](`#1-introduction`) - - [What is Sentinel?](`#what-is-sentinel`) + - [What is Sentinel?](`#11-what-is-sentinel`) - - [When to Use Sentinel](`#when-to-use-sentinel`) + - [When to Use Sentinel](`#12-when-to-use-sentinel`) ... -2. [Core Concepts](`#core-concepts`) +2. [Core Concepts](`#2-core-concepts`) ... -3. [Configuration Reference](`#configuration-reference`) +3. [Configuration Reference](`#3-configuration-reference`) ... -4. [Deployment Checklist](`#deployment-checklist`) +4. [Deployment Checklist](`#4-deployment-checklist`)-When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Deployment: Broker Setup](`#broker-setup`) for details. +When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Configure Authentication and Permissions](`#configure-authentication-and-permissions`) for details.Also applies to: 524-524, 544-544, 552-552, 557-557, 563-563
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 7 - 22, The TOC contains invalid in-document anchor fragments (e.g., "Decision Engine", "Message Data (CEL Expressions)", "Broker Configuration", "Broker Setup") that don't match the renderer-generated slugs; update each TOC link to the exact slugified heading used by the renderer (match casing, replace spaces/special chars with hyphens, remove parentheses, etc.) so links like "Decision Engine", "Generation-Based Reconciliation", "Time-Based Reconciliation (Max Age Intervals)", "Resource Selector", "Message Data (CEL Expressions)" and "Broker Configuration"/"Broker Setup" point to the correct generated anchors; verify and correct the other reported anchors (those around Broker setup and nearby headings) to ensure navigation works.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 630-633: The fenced code block containing the expected log output
(the triple-backtick block that starts before "Fetched resources count=15...")
is missing a language tag; update that block to include a language identifier
(e.g., add "text" after the opening ```) so the fence reads ```text to ensure
proper rendering of the log output.
---
Duplicate comments:
In `@docs/sentinel-operator-guide.md`:
- Line 536: The heading "#### Phase 1: Configuration Planning" uses too many
hashes and breaks the document hierarchy; change that heading to "### Phase 1:
Configuration Planning" so it correctly follows the previous "##" level—update
the line containing the exact heading text to use three hashes instead of four.
- Line 620: Update the checklist entry and the troubleshooting section to
document that the /readyz endpoint intentionally remains unready until the
service has completed at least one successful poll and broker health checks
pass; explicitly state the readiness criteria as "s.LastSuccessfulPoll() must be
non-zero AND broker checks healthy" and mention that this change came from
HYPERFLEET-552 so readers don't treat an unready /readyz as a generic failure
until those conditions are met.
- Around line 151-153: Several fenced code blocks in the document are missing
language identifiers; update the three blocks containing the snippets
"next_event = reference_time + max_age_interval", "Error: invalid config:
message_data.id: empty CEL expression is not allowed", and
"{namespace}-{resourceType}" to include a language tag (e.g., ```text)
immediately after the opening backticks so the fences read ```text ... ``` for
each occurrence.
- Around line 7-22: The TOC contains invalid in-document anchor fragments (e.g.,
"Decision Engine", "Message Data (CEL Expressions)", "Broker Configuration",
"Broker Setup") that don't match the renderer-generated slugs; update each TOC
link to the exact slugified heading used by the renderer (match casing, replace
spaces/special chars with hyphens, remove parentheses, etc.) so links like
"Decision Engine", "Generation-Based Reconciliation", "Time-Based Reconciliation
(Max Age Intervals)", "Resource Selector", "Message Data (CEL Expressions)" and
"Broker Configuration"/"Broker Setup" point to the correct generated anchors;
verify and correct the other reported anchors (those around Broker setup and
nearby headings) to ensure navigation works.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9fd2d867-4cdf-427a-8b37-514713b3c5b7
📒 Files selected for processing (1)
docs/sentinel-operator-guide.md
There was a problem hiding this comment.
♻️ Duplicate comments (1)
docs/sentinel-operator-guide.md (1)
7-22:⚠️ Potential issue | 🟡 MinorFix invalid internal anchors in TOC and section cross-references.
Several links still point to fragments that don’t exist in the rendered document (for example, Line 524:
#broker-setup, plus multiple TOC/Reference links). This breaks in-doc navigation and keeps MD051 warnings open.Suggested anchor/link fix
-1. [Introduction](`#introduction`) - - [What is Sentinel?](`#what-is-sentinel`) - - [When to Use Sentinel](`#when-to-use-sentinel`) -2. [Core Concepts](`#core-concepts`) - - [Decision Engine](`#decision-engine`) - - [Generation-Based Reconciliation](`#generation-based-reconciliation`) - - [Time-Based Reconciliation (Max Age Intervals)](`#time-based-reconciliation-max-age-intervals`) - - [Resource Filtering and Sharding](`#resource-filtering-and-sharding`) -3. [Configuration Reference](`#configuration-reference`) - - [Configuration File Structure](`#configuration-file-structure`) - - [Required Fields](`#required-fields`) - - [Optional Fields](`#optional-fields`) - - [Resource Selector](`#resource-selector`) - - [Message Data (CEL Expressions)](`#message-data-cel-expressions`) - - [Broker Configuration](`#broker-configuration`) -4. [Deployment Checklist](`#deployment-checklist`) +1. [Introduction](`#1-introduction`) + - [What is Sentinel?](`#11-what-is-sentinel`) + - [When to Use Sentinel](`#12-when-to-use-sentinel`) +2. [Core Concepts](`#2-core-concepts`) + - [Decision Engine](`#21-decision-engine`) + - [Generation-Based Reconciliation](`#211-generation-based-reconciliation`) + - [Time-Based Reconciliation (Max Age Intervals)](`#212-time-based-reconciliation-max-age-intervals`) + - [Resource Filtering and Sharding](`#22-resource-filtering-and-sharding`) +3. [Configuration Reference](`#3-configuration-reference`) + - [Configuration File Structure](`#31-configuration-file-structure`) + - [Required Fields](`#32-required-fields`) + - [Optional Fields](`#33-optional-fields`) + - [Resource Selector](`#34-resource-selector`) + - [Message Data (CEL Expressions)](`#35-message-data-cel-expressions`) + - [Broker Configuration](`#36-broker-configuration`) +4. [Deployment Checklist](`#4-deployment-checklist`)-When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Deployment: Broker Setup](`#broker-setup`) for details. +When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Configure Authentication and Permissions](`#configure-authentication-and-permissions`) for details.Also applies to: 524-524, 544-544, 552-552, 557-557, 563-563
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 7 - 22, The TOC and cross-reference anchors are invalid (e.g., references to `#broker-setup` and other non-existent fragments) which breaks navigation and triggers MD051; locate the section headings and either update the links in the TOC (entries like "Broker Configuration", "Message Data (CEL Expressions)", "Resource Selector", "Deployment Checklist", and "Configuration File Structure") to match the exact rendered anchors (lowercase, hyphenated) or rename/add the corresponding headings to match the anchors (for example create or rename a heading to "Broker Setup" if links use `#broker-setup`), then run the renderer/markdown linter to confirm anchors resolve and remove duplicate or stale fragment links.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 7-22: The TOC and cross-reference anchors are invalid (e.g.,
references to `#broker-setup` and other non-existent fragments) which breaks
navigation and triggers MD051; locate the section headings and either update the
links in the TOC (entries like "Broker Configuration", "Message Data (CEL
Expressions)", "Resource Selector", "Deployment Checklist", and "Configuration
File Structure") to match the exact rendered anchors (lowercase, hyphenated) or
rename/add the corresponding headings to match the anchors (for example create
or rename a heading to "Broker Setup" if links use `#broker-setup`), then run the
renderer/markdown linter to confirm anchors resolve and remove duplicate or
stale fragment links.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: df4d77ff-40d3-4718-abb2-b373a27ae635
📒 Files selected for processing (1)
docs/sentinel-operator-guide.md
There was a problem hiding this comment.
♻️ Duplicate comments (4)
docs/sentinel-operator-guide.md (4)
7-22:⚠️ Potential issue | 🟡 MinorTOC anchor links remain broken despite previous fix attempt.
The TOC links still don't match the auto-generated heading anchors. For example,
#introductionshould be#1-introduction,#what-is-sentinelshould be#11-what-is-sentinel, etc.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 7 - 22, The TOC links use unnumbered anchors but your renderer generates numbered heading IDs (e.g., headings like "Introduction" and "What is Sentinel" produce anchors "1-introduction" and "11-what-is-sentinel"); update each TOC entry to use the numbered anchors that match the generated IDs (e.g., change "#introduction" → "#1-introduction", "#what-is-sentinel" → "#11-what-is-sentinel", etc.), or regenerate the TOC using the same tool/command that produces the document headings to ensure the TOC and header IDs remain in sync (verify for entries such as "Core Concepts", "Decision Engine", "Generation-Based Reconciliation" and "Time-Based Reconciliation (Max Age Intervals)").
524-524:⚠️ Potential issue | 🟡 MinorFix broken anchor link.
The link
#broker-setupdoesn't match any heading in the document. Based on context about Workload Identity Federation authentication, this should likely point to the "Configure Authentication and Permissions" section.🔗 Proposed fix
-When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Deployment: Broker Setup](`#broker-setup`) for details. +When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Configure Authentication and Permissions](`#configure-authentication-and-permissions`) for details.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 524, The anchor link "#broker-setup" is broken; update the link in the sentence referencing Workload Identity Federation to point to the correct heading id for the "Configure Authentication and Permissions" section (e.g., replace "#broker-setup" with "#configure-authentication-and-permissions" or the actual slug used by the document generator) so the link navigates to the Configure Authentication and Permissions section.
634-634:⚠️ Potential issue | 🟡 MinorUpdate log duration format to match actual code output.
The example shows
duration=125msbut the code formats it asduration=0.125susing%.3fs.📝 Proposed fix
- Trigger cycle completed total=15 published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters subset=clusters + Trigger cycle completed total=15 published=3 skipped=12 duration=0.125s topic=hyperfleet-dev-clusters subset=clusters🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 634, The example log line in the docs shows duration=125ms but the runtime uses printf formatting with '%.3fs' (seconds to 3 decimals); update the example log entry (the line beginning "Trigger cycle completed total=15 published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters subset=clusters") to use the seconds format produced by '%.3fs' (e.g., duration=0.125s) so the docs match the actual output.
420-420:⚠️ Potential issue | 🟡 MinorFix CloudEvents type field to match actual implementation.
The example shows
"type": "hyperfleet.clusters.reconcile"but the code generatescom.redhat.hyperfleet.Cluster.reconcile(withcom.redhat.prefix and singular capitalizedClusterfromresource.Kind).📝 Proposed fix
- "type": "hyperfleet.clusters.reconcile", + "type": "com.redhat.hyperfleet.Cluster.reconcile",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 420, The CloudEvents "type" example is incorrect; update the example value to match the implementation which constructs the type as "com.redhat.hyperfleet.<ResourceKind>.reconcile" (note singular, capitalized ResourceKind from resource.Kind and the com.redhat. prefix). Locate the sample line with `"type": "hyperfleet.clusters.reconcile"` and replace it with the form produced by the code (e.g., `"type": "com.redhat.hyperfleet.Cluster.reconcile"`) so it matches the actual behavior.
🧹 Nitpick comments (9)
docs/sentinel-operator-guide.md (9)
472-476: Clarify that topic naming convention is a default, not a requirement.The documentation presents
{namespace}-{resourceType}as the topic naming pattern, but this is the Helm chart default convention. Different deployments (GCP, ROSA, etc.) may use their own naming strategies. Consider clarifying this is a recommended default.📝 Suggested clarification
-The `BROKER_TOPIC` environment variable sets the full topic name where events are published. When using Helm, the default topic follows the pattern: +The `BROKER_TOPIC` environment variable sets the full topic name where events are published. When using the provided Helm chart, the default topic follows this pattern: ```text {namespace}-{resourceType}</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against the current code and only fix it if needed.
In
@docs/sentinel-operator-guide.mdaround lines 472 - 476, The docs currently
present the BROKER_TOPIC pattern "{namespace}-{resourceType}" as the topic name;
update the text to clarify this is the Helm chart default (a recommended
convention) and not a hard requirement so other deployments (e.g., GCP, ROSA)
can use different naming schemes—refer to the BROKER_TOPIC env var and
explicitly state "Helm default: {namespace}-{resourceType}" and mention that
custom topic naming is supported and may vary by deployment.</details> --- `41-41`: **Remove specific broker implementations from core responsibilities list to avoid staleness.** Hardcoding "RabbitMQ or Google Pub/Sub" in the introduction will require updates whenever broker support changes. Consider using generic "message broker" here and let the configuration section enumerate current implementations. <details> <summary>📝 Proposed change</summary> ```diff -3. **Publish CloudEvents** to a message broker (RabbitMQ or Google Pub/Sub) +3. **Publish CloudEvents** to a message broker🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 41, Replace the hardcoded broker examples in the intro line "Publish CloudEvents to a message broker (RabbitMQ or Google Pub/Sub)" with a generic phrase such as "Publish CloudEvents to a message broker" and move specific implementations into the configuration/implementation section; update the sentence containing "Publish CloudEvents" so it no longer names RabbitMQ or Google Pub/Sub and ensure the docs elsewhere (configuration section) list the currently supported brokers.
214-229: Clarify that multiple adapters can consume from the same topic.The architecture diagram shows a 1:1 relationship between topics and adapters, which might imply that only one adapter can consume from each topic. In reality, multiple adapters can subscribe to the same topic in a fan-out pattern. Consider adding a note to clarify this.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 214 - 229, The diagram currently implies a 1:1 mapping between Broker Topics (B1, B2, B3) and Adapters (A1, A2, A3); add a short clarifying note right below the Mermaid block stating that multiple adapters can subscribe to the same Broker Topic (fan-out) so A1/A2/A3 are examples and not exclusive consumers, and optionally show one example edge or parenthetical (e.g., "multiple adapters may consume from B1") to make the fan-out behavior explicit.
661-753: Consider linking to detailed docs instead of duplicating reference material in appendices.The appendices duplicate content that exists in dedicated documentation (metrics.md, configuration.md). While having quick-reference tables is convenient for operators, this creates maintenance burden and risks drift between the operator guide and detailed docs. Consider replacing appendices with links to the authoritative sources.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 661 - 753, Remove the duplicated detailed reference tables in Appendix A and B and replace them with concise “Quick links” and a short one-line summary pointing to the authoritative docs (e.g., configuration.md and metrics.md); specifically, delete the large Configuration Fields Reference and Metrics Catalog Reference tables and instead add links labeled (Configuration reference → configuration.md) and (Metrics catalog → metrics.md) plus a brief note that quick examples remain here for convenience; locate these changes around the headings "Appendix A: Troubleshooting" / "Appendix B: Quick Reference" and update any in-text references (like PromQL examples) to point readers to the external docs for full details.
515-519: Clarify development vs production authentication methods.The section shows two authentication options but doesn't clearly indicate that "Option 1: Use Application Default Credentials" with
gcloud auth application-default loginis for local development only. In production, ADC should use Workload Identity (already mentioned at line 522-524) or service account keys.📝 Suggested clarification
# Environment variables export BROKER_TOPIC="hyperfleet-prod-clusters" -# Option 1: Use Application Default Credentials (recommended for GKE) +# Option 1: Use Application Default Credentials (local development) gcloud auth application-default login -# Option 2: Use service account key file +# Option 2: Use service account key file (not recommended for production) export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 515 - 519, Update the "Option 1: Use Application Default Credentials (recommended for GKE)" section to clarify that the gcloud auth application-default login command is intended for local development only (not for production), and explicitly state that in production ADC should rely on Workload Identity or a service account key as already mentioned; modify the heading/text for "Option 1" to mark it as "local development" and add one sentence pointing readers to the existing Workload Identity guidance or the "Option 2" service account key option for production usage.
180-180: Consider more precise terminology than "sharding".The term "sharding" typically implies automatic distribution with guaranteed coverage, but this system requires manual label-based filtering where operators must ensure all resources are covered (as noted in lines 233-236). Consider using "Resource Filtering and Partitioning" or "Label-Based Resource Distribution" for more accurate terminology.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 180, Update the heading "### 2.2 Resource Filtering and Sharding" and all occurrences of "sharding" in this section to a more accurate term like "Resource Filtering and Partitioning" or "Label-Based Resource Distribution"; also update the explanatory text that references manual label-based filtering (around the noted lines referring to operator responsibility) to reflect that distribution is manual/label-based rather than automatic sharding, ensuring wording is consistent across the section and any examples or labels shown.
56-56: Clarify what "independently" means for operator understanding.Consider expanding to explain that Sentinel is unaware of downstream adapters and publishes events in a fan-out pattern where multiple adapters can consume from the broker topic.
📝 Suggested expansion
-Sentinel operates independently of adapters and uses a **dual-trigger reconciliation strategy**: +Sentinel operates independently of adapters (it is unaware of which adapters consume its events) and publishes to message broker topics in a fan-out pattern. It uses a **dual-trigger reconciliation strategy**:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 56, Update the sentence "Sentinel operates independently of adapters and uses a **dual-trigger reconciliation strategy**:" to clarify that "independently" means Sentinel is adapter-agnostic (it does not track or call specific downstream adapters), and explicitly describe that it publishes events to a broker topic in a fan-out pattern so multiple adapters can independently consume the same events; mention implications such as no built-in adapter-specific retry/ack semantics and that adapters must handle their own delivery guarantees. Reference the existing phrase "Sentinel operates independently of adapters", "dual-trigger reconciliation strategy", and "broker topic" when adding this explanatory expansion.
33-35: Add HyperFleet system context for better operator understanding.The introduction jumps directly to technical details without explaining Sentinel's role in the broader HyperFleet system. Consider adding context about why Sentinel exists and its purpose in cluster lifecycle orchestration.
📝 Suggested addition
### 1.1 What is Sentinel? +Sentinel is a component within the HyperFleet cluster lifecycle management system that triggers reconciliation events when managed resources change or require periodic validation. + HyperFleet Sentinel is a **polling-based event publisher** that monitors HyperFleet API resources and publishes CloudEvents to a message broker when reconciliation is needed. It acts as the trigger mechanism for the HyperFleet orchestration platform.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 33 - 35, The intro section titled "1.1 What is Sentinel?" is too technical and lacks system-level context; update that paragraph to briefly state Sentinel's role within the broader HyperFleet system (e.g., it acts as the event trigger for cluster lifecycle orchestration), why it exists (to detect resource drift and initiate reconciliations), and how it interacts with other components (HyperFleet API, orchestrator, message broker) so operators understand its purpose and place in workflows; modify the existing text under the "What is Sentinel?" heading to include these sentences while keeping the existing line about polling-based event publishing and CloudEvents.
134-136: Clarifystatus.readyfield vs conditions.The documentation uses
status.readythroughout (e.g., in the table and CEL expressions), but reviewer feedback indicates the API only returnsconditionsarrays, not a directstatus.readyfield.Consider either:
- Clarifying that
status.readyis a computed/convenience field derived fromconditions[type=Ready].value == "True"- Using the actual API structure:
resource.status.conditions.filter(c, c.type=="Ready")[0].value == "True"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 134 - 136, The docs reference a non-existent direct field status.ready; update the text and CEL examples to either (a) explain that status.ready is a computed/convenience property derived from the Ready condition (i.e., the document should state that status.ready == true when conditions[type=="Ready"].value == "True"), or (b) replace all occurrences of status.ready in prose and CEL examples with the actual API expression using resource.status.conditions to find the Ready condition (e.g., filter for c.type=="Ready" and check its value). Ensure references to CEL expressions and the table rows (the "**Ready** (`status.ready: true`)" and "**Not Ready** (`status.ready: false`)") are updated accordingly and include a brief note describing that this is derived from the Ready condition.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 7-22: The TOC links use unnumbered anchors but your renderer
generates numbered heading IDs (e.g., headings like "Introduction" and "What is
Sentinel" produce anchors "1-introduction" and "11-what-is-sentinel"); update
each TOC entry to use the numbered anchors that match the generated IDs (e.g.,
change "#introduction" → "#1-introduction", "#what-is-sentinel" →
"#11-what-is-sentinel", etc.), or regenerate the TOC using the same tool/command
that produces the document headings to ensure the TOC and header IDs remain in
sync (verify for entries such as "Core Concepts", "Decision Engine",
"Generation-Based Reconciliation" and "Time-Based Reconciliation (Max Age
Intervals)").
- Line 524: The anchor link "#broker-setup" is broken; update the link in the
sentence referencing Workload Identity Federation to point to the correct
heading id for the "Configure Authentication and Permissions" section (e.g.,
replace "#broker-setup" with "#configure-authentication-and-permissions" or the
actual slug used by the document generator) so the link navigates to the
Configure Authentication and Permissions section.
- Line 634: The example log line in the docs shows duration=125ms but the
runtime uses printf formatting with '%.3fs' (seconds to 3 decimals); update the
example log entry (the line beginning "Trigger cycle completed total=15
published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters
subset=clusters") to use the seconds format produced by '%.3fs' (e.g.,
duration=0.125s) so the docs match the actual output.
- Line 420: The CloudEvents "type" example is incorrect; update the example
value to match the implementation which constructs the type as
"com.redhat.hyperfleet.<ResourceKind>.reconcile" (note singular, capitalized
ResourceKind from resource.Kind and the com.redhat. prefix). Locate the sample
line with `"type": "hyperfleet.clusters.reconcile"` and replace it with the form
produced by the code (e.g., `"type": "com.redhat.hyperfleet.Cluster.reconcile"`)
so it matches the actual behavior.
---
Nitpick comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 472-476: The docs currently present the BROKER_TOPIC pattern
"{namespace}-{resourceType}" as the topic name; update the text to clarify this
is the Helm chart default (a recommended convention) and not a hard requirement
so other deployments (e.g., GCP, ROSA) can use different naming schemes—refer to
the BROKER_TOPIC env var and explicitly state "Helm default:
{namespace}-{resourceType}" and mention that custom topic naming is supported
and may vary by deployment.
- Line 41: Replace the hardcoded broker examples in the intro line "Publish
CloudEvents to a message broker (RabbitMQ or Google Pub/Sub)" with a generic
phrase such as "Publish CloudEvents to a message broker" and move specific
implementations into the configuration/implementation section; update the
sentence containing "Publish CloudEvents" so it no longer names RabbitMQ or
Google Pub/Sub and ensure the docs elsewhere (configuration section) list the
currently supported brokers.
- Around line 214-229: The diagram currently implies a 1:1 mapping between
Broker Topics (B1, B2, B3) and Adapters (A1, A2, A3); add a short clarifying
note right below the Mermaid block stating that multiple adapters can subscribe
to the same Broker Topic (fan-out) so A1/A2/A3 are examples and not exclusive
consumers, and optionally show one example edge or parenthetical (e.g.,
"multiple adapters may consume from B1") to make the fan-out behavior explicit.
- Around line 661-753: Remove the duplicated detailed reference tables in
Appendix A and B and replace them with concise “Quick links” and a short
one-line summary pointing to the authoritative docs (e.g., configuration.md and
metrics.md); specifically, delete the large Configuration Fields Reference and
Metrics Catalog Reference tables and instead add links labeled (Configuration
reference → configuration.md) and (Metrics catalog → metrics.md) plus a brief
note that quick examples remain here for convenience; locate these changes
around the headings "Appendix A: Troubleshooting" / "Appendix B: Quick
Reference" and update any in-text references (like PromQL examples) to point
readers to the external docs for full details.
- Around line 515-519: Update the "Option 1: Use Application Default Credentials
(recommended for GKE)" section to clarify that the gcloud auth
application-default login command is intended for local development only (not
for production), and explicitly state that in production ADC should rely on
Workload Identity or a service account key as already mentioned; modify the
heading/text for "Option 1" to mark it as "local development" and add one
sentence pointing readers to the existing Workload Identity guidance or the
"Option 2" service account key option for production usage.
- Line 180: Update the heading "### 2.2 Resource Filtering and Sharding" and all
occurrences of "sharding" in this section to a more accurate term like "Resource
Filtering and Partitioning" or "Label-Based Resource Distribution"; also update
the explanatory text that references manual label-based filtering (around the
noted lines referring to operator responsibility) to reflect that distribution
is manual/label-based rather than automatic sharding, ensuring wording is
consistent across the section and any examples or labels shown.
- Line 56: Update the sentence "Sentinel operates independently of adapters and
uses a **dual-trigger reconciliation strategy**:" to clarify that
"independently" means Sentinel is adapter-agnostic (it does not track or call
specific downstream adapters), and explicitly describe that it publishes events
to a broker topic in a fan-out pattern so multiple adapters can independently
consume the same events; mention implications such as no built-in
adapter-specific retry/ack semantics and that adapters must handle their own
delivery guarantees. Reference the existing phrase "Sentinel operates
independently of adapters", "dual-trigger reconciliation strategy", and "broker
topic" when adding this explanatory expansion.
- Around line 33-35: The intro section titled "1.1 What is Sentinel?" is too
technical and lacks system-level context; update that paragraph to briefly state
Sentinel's role within the broader HyperFleet system (e.g., it acts as the event
trigger for cluster lifecycle orchestration), why it exists (to detect resource
drift and initiate reconciliations), and how it interacts with other components
(HyperFleet API, orchestrator, message broker) so operators understand its
purpose and place in workflows; modify the existing text under the "What is
Sentinel?" heading to include these sentences while keeping the existing line
about polling-based event publishing and CloudEvents.
- Around line 134-136: The docs reference a non-existent direct field
status.ready; update the text and CEL examples to either (a) explain that
status.ready is a computed/convenience property derived from the Ready condition
(i.e., the document should state that status.ready == true when
conditions[type=="Ready"].value == "True"), or (b) replace all occurrences of
status.ready in prose and CEL examples with the actual API expression using
resource.status.conditions to find the Ready condition (e.g., filter for
c.type=="Ready" and check its value). Ensure references to CEL expressions and
the table rows (the "**Ready** (`status.ready: true`)" and "**Not Ready**
(`status.ready: false`)") are updated accordingly and include a brief note
describing that this is derived from the Ready condition.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 652ff7c7-945e-4077-a440-de7f9c6901cb
📒 Files selected for processing (1)
docs/sentinel-operator-guide.md
There was a problem hiding this comment.
♻️ Duplicate comments (3)
docs/sentinel-operator-guide.md (3)
632-635:⚠️ Potential issue | 🟡 MinorMake the sample log duration match the real log format.
The example shows
125ms, but Sentinel logs this duration in seconds with fractional precision. Keeping the example as-is makes copy/paste verification misleading.🛠️ Proposed fix
Fetched resources count=15 label_selectors=1 topic=hyperfleet-dev-clusters subset=clusters - Trigger cycle completed total=15 published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters subset=clusters + Trigger cycle completed total=15 published=3 skipped=12 duration=0.125s topic=hyperfleet-dev-clusters subset=clusters🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 632 - 635, The sample log in the snippet uses a millisecond duration ("duration=125ms") but real Sentinel logs emit durations in seconds with fractional precision; update the example line (the "Trigger cycle completed total=15 published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters subset=clusters" entry) to use seconds with fractional precision (e.g., "duration=0.125s") so the sample matches the actual log format.
524-524:⚠️ Potential issue | 🟡 MinorFix the remaining broken section links.
These references still point to anchors that do not exist in the rendered document. The headings in this guide use numbered slugs, so these links should target those numbered fragments instead.
🛠️ Proposed fix
-When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Deployment: Broker Setup](`#broker-setup`) for details. +When deploying to GKE, use **Workload Identity Federation** for authentication instead of service account keys. See [Configure Authentication and Permissions](`#configure-authentication-and-permissions`) for details. - - Reference: [Resource Filtering and Sharding](`#resource-filtering-and-sharding`) + - Reference: [Resource Filtering and Sharding](`#22-resource-filtering-and-sharding`) - - Reference: [Optional Fields](`#optional-fields`) + - Reference: [Optional Fields](`#33-optional-fields`) -- [ ] Reference: [Message Data (CEL Expressions)](`#message-data-cel-expressions`) +- [ ] Reference: [Message Data (CEL Expressions)](`#35-message-data-cel-expressions`) - - Reference: [Required Fields](`#required-fields`) + - Reference: [Required Fields](`#32-required-fields`)Also applies to: 544-544, 552-552, 557-557, 563-563
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` at line 524, Links like "Deployment: Broker Setup" target non-existent anchors; update those Markdown link targets to the rendered numbered heading slugs used in this guide (replace the plain anchor fragments with the corresponding numbered-fragment slug for the "Deployment: Broker Setup" heading and the other similar links referenced around the same sections). Find the occurrences of the link text "Deployment: Broker Setup" (and the other broken anchors called out) and change their href targets to the guide's actual numbered slugs so they resolve to the rendered headings.
417-421:⚠️ Potential issue | 🟠 MajorUpdate the CloudEvent
typeexample to match what Sentinel emits.The example still documents
hyperfleet.clusters.reconcile, which does not match the actual event type format used by Sentinel. That makes the payload example misleading for downstream consumers and test fixtures.🛠️ Proposed fix
- "type": "hyperfleet.clusters.reconcile", + "type": "com.redhat.hyperfleet.Cluster.reconcile",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/sentinel-operator-guide.md` around lines 417 - 421, The CloudEvent `type` example currently shows "hyperfleet.clusters.reconcile" which is incorrect; locate the JSON example that sets the "type" field (the fragment containing "type": "hyperfleet.clusters.reconcile") and replace that string with the actual event type Sentinel emits (retrieve the exact value from the Sentinel emitter or event docs), then update any related examples/fixtures/tests to use the same authoritative event type so documentation and downstream consumers remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 632-635: The sample log in the snippet uses a millisecond duration
("duration=125ms") but real Sentinel logs emit durations in seconds with
fractional precision; update the example line (the "Trigger cycle completed
total=15 published=3 skipped=12 duration=125ms topic=hyperfleet-dev-clusters
subset=clusters" entry) to use seconds with fractional precision (e.g.,
"duration=0.125s") so the sample matches the actual log format.
- Line 524: Links like "Deployment: Broker Setup" target non-existent anchors;
update those Markdown link targets to the rendered numbered heading slugs used
in this guide (replace the plain anchor fragments with the corresponding
numbered-fragment slug for the "Deployment: Broker Setup" heading and the other
similar links referenced around the same sections). Find the occurrences of the
link text "Deployment: Broker Setup" (and the other broken anchors called out)
and change their href targets to the guide's actual numbered slugs so they
resolve to the rendered headings.
- Around line 417-421: The CloudEvent `type` example currently shows
"hyperfleet.clusters.reconcile" which is incorrect; locate the JSON example that
sets the "type" field (the fragment containing "type":
"hyperfleet.clusters.reconcile") and replace that string with the actual event
type Sentinel emits (retrieve the exact value from the Sentinel emitter or event
docs), then update any related examples/fixtures/tests to use the same
authoritative event type so documentation and downstream consumers remain
consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ad176138-e4b2-4478-802b-3e0fb3a3ea08
📒 Files selected for processing (1)
docs/sentinel-operator-guide.md
642fd69 to
9375468
Compare
1f2f9c3 to
de08269
Compare
Summary by CodeRabbit