Refactor architecture documentation with clearer component overview#2013
…l architecture - Rewrite architecture.rst around Agent/Platform/Data Sources model with new architecture diagram showing the in-cluster Agent, Robusta Platform (SaaS or self-hosted), data source integrations, and notification channels - Update oss-vs-saas.rst to highlight HolmesGPT as the flagship open source project for AI-powered root cause analysis and position Robusta Classic as the deterministic alert enrichment engine https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
✅ Docker image ready for
Use this tag to pull the image for testing.
📋 Copy commands
gcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:c6583e3
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/robusta-runner:c6583e3 me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:c6583e3
docker push me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:c6583e3
Patch Helm values in one line:
helm upgrade --install robusta robusta/robusta \
--reuse-values \
--set runner.image=me-west1-docker.pkg.dev/robusta-development/development/robusta-runner-dev:c6583e3
Walkthrough
Reworks architecture and deployment docs to present HolmesGPT as the in-cluster Agent that directly accesses data sources and reports investigation results to the Robusta Platform; restructures docs navigation (added hidden toctrees), removes some FAQ/content, and updates sinks, security, and OSS-vs-SaaS guidance.
Sequence Diagram(s)
sequenceDiagram
participant User as User (Operator)
participant Platform as Robusta Platform
participant Agent as HolmesGPT Agent (in-cluster)
participant Data as In-cluster Data Sources
participant Sink as Notification Sinks
User->>Platform: View investigations / request RCA
Platform->>Agent: Request investigation / query context
Agent->>Data: Fetch logs, metrics, events
Data-->>Agent: Return data
Agent->>Agent: Correlate evidence & run LLM analysis
Agent-->>Platform: Report findings / evidence / RCA
Platform-->>User: Display results (UI or API)
Agent->>Sink: Send notifications or enriched alerts (optional)
Sink-->>User: Deliver notification
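The fetch, correlate, and report steps in the diagram above can be sketched in a few lines of Python. This is only an illustration of the flow, not HolmesGPT's implementation: `Finding`, `investigate`, and `count_based_summary` are hypothetical names, the data sources are plain callables, and the LLM analysis step is replaced by a trivial counting function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical data-source type: takes an alert name, returns evidence strings.
# Real integrations would query logs, metrics, and events APIs instead.
DataSource = Callable[[str], List[str]]


@dataclass
class Finding:
    """What the agent would report back to the platform (illustrative only)."""
    alert: str
    evidence: Dict[str, List[str]] = field(default_factory=dict)
    summary: str = ""


def investigate(
    alert: str,
    sources: Dict[str, DataSource],
    analyze: Callable[[Dict[str, List[str]]], str],
) -> Finding:
    """Fetch data from every source, then hand the evidence to an analyzer."""
    evidence = {name: fetch(alert) for name, fetch in sources.items()}
    return Finding(alert=alert, evidence=evidence, summary=analyze(evidence))


def count_based_summary(evidence: Dict[str, List[str]]) -> str:
    # Stand-in for the LLM analysis step: it only counts evidence items.
    total = sum(len(items) for items in evidence.values())
    return f"{total} evidence items from {len(evidence)} sources"
```

Passing the analyzer in as a parameter keeps the sketch independent of any particular model provider.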
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
- architecture.rst: Lead with HolmesGPT as the core AI investigation engine, frame the entire architecture around how HolmesGPT investigates alerts (receive → pull data → correlate → report) - oss-vs-saas.rst: Make HolmesGPT the headline open source project, demote Robusta Classic to a small footnote section at the bottom for readers who encounter references to it elsewhere in the docs https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/how-it-works/architecture.rst`:
- Line 29: Update the second occurrence of the Prometheus component name from
"AlertManager" to the correct capitalization "Alertmanager" in the architecture
text; search for the string "AlertManager" (the repeated Prometheus component
label) and replace that instance with "Alertmanager" to match the first
occurrence and Prometheus naming conventions.
- Line 17: Update the sentence "The in-cluster Agent receives the alert from
Prometheus AlertManager (or other sources)" in architecture.rst to use the
correct Prometheus component name "Alertmanager" (one word, lowercase 'm');
locate the exact string "Prometheus AlertManager" and replace it with
"Prometheus Alertmanager" to ensure accurate naming.
- Line 68: Replace the broken Sphinx doc reference string "And :doc:`many more
sinks <../configuration/configuring-sinks>`" with the correct target by updating
the reference to "And :doc:`many more sinks
<../notification-routing/configuring-sinks>`" so the link points to the existing
configuring-sinks document; locate the line containing the original doc
reference and edit the target path accordingly.
In `@docs/how-it-works/oss-vs-saas.rst`:
- Line 7: Update the HolmesGPT hyperlink in the sentence that currently links to
`https://github.com/robusta-dev/holmesGPT` so it points directly to the
canonical repository URL `https://github.com/HolmesGPT/holmesgpt`; locate the
inline link in the sentence referencing HolmesGPT and replace the redirected
robusta-dev URL with the canonical HolmesGPT URL to avoid the redirect.
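Broken `:doc:` targets like the one flagged above can be caught before review with a small script. The sketch below assumes that relative `:doc:` targets resolve against the directory of the referencing file and that every document uses the `.rst` suffix; `broken_doc_refs` is a hypothetical helper, not part of Sphinx, and it only inspects the ``:doc:`label <target>` `` form.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches :doc:`label <target>` and captures the target.
# The label may wrap across lines, as it does in the review comment above.
DOC_REF = re.compile(r":doc:`[^`<]*<([^`>]+)>`")


def broken_doc_refs(rst_file: Path) -> List[Tuple[str, Path]]:
    """Return (target, resolved_path) for each relative :doc: target with no .rst file."""
    text = rst_file.read_text(encoding="utf-8")
    missing = []
    for target in DOC_REF.findall(text):
        target = target.strip()
        if target.startswith("/"):
            # Absolute targets are relative to the Sphinx source root; skipped here.
            continue
        resolved = (rst_file.parent / (target + ".rst")).resolve()
        if not resolved.exists():
            missing.append((target, resolved))
    return missing
```

A full Sphinx build remains the authoritative check; this only gives faster feedback on the common case.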
---
Duplicate comments:
In `@docs/how-it-works/architecture.rst`:
- Line 4: Update the GitHub link text to use consistent casing across
documentation: change the "HolmesGPT <https://github.com/robusta-dev/holmesGPT>"
anchor in the architecture.rst entry to match the same repository URL casing
used in oss-vs-saas.rst (ensure both use the identical path e.g.,
https://github.com/robusta-dev/holmesgpt or
https://github.com/robusta-dev/holmesGPT), and verify the anchor label
"HolmesGPT" remains consistent across both files so links resolve reliably.
In `@docs/how-it-works/oss-vs-saas.rst`:
- Line 34: The reStructuredText link `HolmesGPT on GitHub
<https://github.com/robusta-dev/holmesGPT>`_ in
docs/how-it-works/oss-vs-saas.rst may have incorrect casing; verify the URL
matches the repository's exact casing and update it if necessary, and also check
and correct the similar GitHub link referenced earlier (the one on line 7) so
both links point to the exact same, correctly-cased GitHub repository.
…move Notification Channels - Remove "Receiving alerts" and "Running HolmesGPT" bullets from Agent section (redundant with intro) - Remove standalone Notification Channels section; mention sinks briefly under Robusta Classic instead - Move Robusta Classic footnote from oss-vs-saas.rst to architecture.rst https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/how-it-works/architecture.rst`:
- Line 17: Update the wording that currently says "Prometheus AlertManager" to
use the official component name "Prometheus Alertmanager" (one word, lowercase
'm'); locate the phrase "Receiving alerts from Prometheus AlertManager and other
sources" and replace it with "Receiving alerts from Prometheus Alertmanager and
other sources".
- Line 56: The documentation reference is broken: in
docs/how-it-works/architecture.rst replace the doc link target string
"../configuration/configuring-sinks" with the correct
"../notification-routing/configuring-sinks" so the :doc:`many more sinks <...>`
cross-reference points to the proper document.
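Because the "AlertManager" casing comes up in more than one review round, a quick scan for it may help. The following is a minimal sketch: `miscased_alertmanager` is a hypothetical helper, and the pattern deliberately flags only the capital-M spellings so that lowercase path segments such as `alertmanager-integration` are left alone.

```python
import re
from typing import List, Tuple

# Flags "AlertManager", "Alert Manager", and "Alert-Manager".
# The spelling requested in the review, "Alertmanager", does not match.
MISCASED = re.compile(r"\bAlert[ -]?Manager\b")


def miscased_alertmanager(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, found_spelling) for each miscased occurrence."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in MISCASED.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```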
🧹 Nitpick comments (1)
docs/how-it-works/architecture.rst (1)
6-8: Add alt text for the architecture diagram.
This improves accessibility and makes the diagram meaningful for screen readers.
📝 Suggested update
 .. image:: ../images/architecture-overview.png
+    :alt: Overview of Robusta architecture: in-cluster Agent, Platform, and Integrations/data sources
     :width: 800
     :align: center
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/how-it-works/architecture.rst` around lines 6 - 8, The architecture diagram image directive lacks alt text for accessibility; update the RST image block (the .. image:: ../images/architecture-overview.png directive) to include an :alt: "..." option with a concise descriptive string (e.g., "Architecture overview showing components and data flows between services") so screen readers can convey the diagram meaning; keep the alt text short and descriptive and add it as another indented option line alongside :width: and :align:.
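The same check can be run across a docs tree with a short script. This sketch assumes image options are written as indented `:name:` lines directly under the `.. image::` directive, as in the suggested update; `images_missing_alt` is a hypothetical helper and is not a full reStructuredText parser.

```python
import re
from typing import List

IMAGE_DIRECTIVE = re.compile(r"^\s*\.\. image::\s*(\S+)")
OPTION_LINE = re.compile(r"^\s+:\w[\w-]*:")


def images_missing_alt(rst_text: str) -> List[str]:
    """Return the path of each '.. image::' directive that has no ':alt:' option."""
    lines = rst_text.splitlines()
    missing = []
    i = 0
    while i < len(lines):
        match = IMAGE_DIRECTIVE.match(lines[i])
        if not match:
            i += 1
            continue
        i += 1
        has_alt = False
        # Options are the indented ':name:' lines directly under the directive.
        while i < len(lines) and OPTION_LINE.match(lines[i]):
            if lines[i].strip().startswith(":alt:"):
                has_alt = True
            i += 1
        if not has_alt:
            missing.append(match.group(1))
    return missing
```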
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/how-it-works/architecture.rst`:
- Line 66: The cross-reference "configuring-sinks" used in
docs/how-it-works/architecture.rst may point to a nonexistent target; open
architecture.rst and verify the reference target string "configuring-sinks"
exists as a document name or label in the docs tree (search for
configuring-sinks.rst or a :ref: label), then update the reference to the
correct target (either change the link to the actual document filename or label,
e.g., "../configuration/actual-filename" or use :ref:`label-name
<configuring-sinks>`), ensuring the text in architecture.rst and the referenced
doc title/label match.
- Replace architecture-overview.png with new diagram - Update data sources to match diagram: Prometheus, Datadog, AWS, Grafana, Jira (remove New Relic, CloudWatch, NPAW, Conviva, ServiceNow) - Rewrite Kubernetes bullet as optional for K8s-specific use cases - Simplify Data Sources section to a single list https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Drop "across multiple clusters" to avoid Kubernetes-only framing. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Notification routing is a Robusta Classic feature. Replace the Platform bullet with a reference to the HolmesGPT bot for Slack and Teams. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
- Delete usage-faq.rst and its toctree entry - Rewrite landing page to focus on AI-powered investigation, data sources, Slack/Teams bot, and the Platform as SRE agent control center - Remove Kubernetes-centric OSS vs Pro comparison table - Replace Robusta Classic features (Smart Grouping, Enrichment, Remediation) with HolmesGPT-focused feature list https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
…eplacement These questions are no longer relevant given the updated messaging around data source integrations. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Move 5 sidebar sections (Send Alerts, Track Config Changes, Notification Sinks, Alert Routing, Advanced - Playbooks) under a single "Other Features" group. Add hidden toctrees to each section's index page so sub-pages remain accessible. No file paths changed so all existing URLs continue to work. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
- Move Connect Metrics and CRDs Monitoring under Other Features - Rename "Robusta Pro Features" to "HTTP APIs" - Move Holmes Chat API from AI Analysis to HTTP APIs - Rewrite robusta-pro-features.rst as HTTP APIs overview - Add hidden toctree to metric-providers.rst for sub-pages - All file paths unchanged so existing URLs keep working https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/notification-routing/configuring-sinks.rst (1)
40-40: ⚠️ Potential issue | 🟡 Minor
Fix minor typos/grammar in sink docs.
Spotted a few small errors: “lets” → “let’s”, “previou” → “previous”, and “On some scenarios” → “In some scenarios”.
✍️ Proposed edits
-For example, lets add a :ref:`Microsoft Teams Sink <MS Teams>`:
+For example, let's add a :ref:`Microsoft Teams Sink <MS Teams>`:
-| | | sent a previou sink that set `stop: true`) | :ref:`Routing (scopes) <sink-scope-matching>` |
+| | | sent a previous sink that set `stop: true`) | :ref:`Routing (scopes) <sink-scope-matching>` |
-On some scenarios, you may want to ignore Sinks initialization errors.
+In some scenarios, you may want to ignore Sinks initialization errors.
Also applies to: 98-100, 124-124
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/notification-routing/configuring-sinks.rst` at line 40, Fix minor typos/grammar in the sink docs by correcting “lets” to “let’s” in the sentence that introduces the Microsoft Teams Sink (the phrase "For example, lets add a :ref:`Microsoft Teams Sink <MS Teams>`"), change any occurrences of “previou” to “previous”, and replace “On some scenarios” with “In some scenarios” in the affected paragraphs (also update the same corrections around the lines referenced: the earlier sentence and the block near lines 98–100 and line 124). Ensure the quoted phrases are updated verbatim in the docs/notification-routing/configuring-sinks.rst content so spelling and grammar are consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/how-it-works/architecture.rst`:
- Line 63: Update the broken Sphinx doc reference in the sentence that mentions
"Robusta Classic" by replacing the link target text ':doc:`other notification
channels <../configuration/configuring-sinks>`' with ':doc:`other notification
channels <../notification-routing/configuring-sinks>`' so the reference points
to the correct notification-routing page; ensure the surrounding markup remains
valid Sphinx ReST syntax and run a local build to verify the link resolves.
🧹 Nitpick comments (2)
docs/index.rst (2)
44-56: Consider feature discoverability under generic "Other Features" heading.
The consolidation of previously top-level sections (Notification Sinks, Alert Routing, etc.) under "Other Features" may make it harder for users to discover these capabilities. While this aligns with the PR's goal of streamlining toward a HolmesGPT-centric narrative, consider whether a more descriptive caption (e.g., "Configuration & Integrations" or "Advanced Configuration") would improve navigation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/index.rst` around lines 44 - 56, The "Other Features" toctree caption reduces discoverability; update the toctree caption in the toctree block (the :caption: line) to a more descriptive label such as "Configuration & Integrations" or "Advanced Configuration" so users can find sections like Notification Sinks, Alert Routing, and Playbooks more easily—locate the toctree block containing ":caption: Other Features" and replace that caption text accordingly.
69-69: Line 69 is excessively long (345 characters) and introduces multiple concepts; consider breaking it into 2-3 shorter sentences.
Line 69 is quite long and introduces many concepts at once (Robusta, HolmesGPT, AI agent, multiple data sources like Prometheus, Datadog, AWS, Grafana, Jira, and LLMs). Breaking it into multiple sentences would improve readability, especially for users encountering the product for the first time.
The external links and file references have been verified and are correct.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/index.rst` at line 69, Split the long sentence starting "Robusta is an AI-powered SRE agent that automatically investigates alerts and finds root causes..." into 2–3 shorter sentences: 1) introduce Robusta and its relationship to HolmesGPT (keep the existing HolmesGPT link), 2) list the supported data sources as a short comma-separated clause (Prometheus, Datadog, AWS, Grafana, Jira), and 3) state that Robusta uses LLMs to pinpoint issues; preserve the original link text and verified references while improving readability and sentence flow.
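Overlong lines of this kind are easy to find mechanically. The sketch below is one way to list them; `long_lines` is a hypothetical helper, and the 120-character default is an assumed threshold, not a limit stated anywhere in this PR.

```python
from typing import List, Tuple


def long_lines(text: str, limit: int = 120) -> List[Tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit` characters."""
    return [
        (lineno, len(line))
        for lineno, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```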
- Remove "Robusta Pro Features - Detailed breakdown" from Learn More (page was renamed to HTTP APIs) - Add brief Robusta Classic section explaining the rule-based automation engine that predates HolmesGPT https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
…rding - Replace inline data source lists with link to holmesgpt.dev/data-sources/ - Point "Get started" to platform.robusta.dev/signup instead of install ref - Soften Robusta Classic wording to "can be installed as part of" https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
- Replace embedded videos and usage examples with link to home.robusta.dev - Add signup link to Next Steps on main-features - Remove "Robusta UI sink enabled" prerequisite from getting-started - Link Robusta SaaS account prerequisite to signup page https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
…rt tab - Change "GPT-4o" to "frontier models from Anthropic, OpenAI, and more" - Remove Test Your Setup section - Replace inline AI provider configs with link to holmesgpt.dev/ai-providers/ - Add ?tab=robusta-helm-chart to all holmesgpt.dev links (docs + conf.py redirects) - Remove redundant note about Helm chart configuration sections https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
…ve pages - Remove Common Issues and trim Next Steps in getting-started (keep only Configure Data Sources) - Fix sidebar label: "Managed Prometheus Alerts" -> "Send Alerts to Robusta" - Point GitHub Issue link to holmesgpt repo instead of robusta - Update README: mark as Robusta Classic, add HolmesGPT link at top - Remove Contributing page, docs-contributions, and Community Tutorials - Update conf.py redirects for removed pages https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
- Change repo_url and social link in conf.py to holmesGPT repo - Change Other Features toctree maxdepth from 4 to 1 so sub-pages nest under their parent (e.g. PagerDuty under Send Alerts to Robusta) instead of all appearing as flat top-level items https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
The sphinx_immaterial theme renders ALL pages from nested toctrees as flat items in the sidebar nav, ignoring maxdepth. Fix by removing the hidden toctree directives from parent pages under Other Features: - configuration/index.rst (Send Alerts to Robusta) - notification-routing/configuring-sinks.rst (Notification Sinks) - notification-routing/index.rst (Alert Routing) - playbook-reference/index.rst (Playbooks) - track-changes/kubernetes-changes.rst (Track Config Changes) - configuration/metric-providers.rst (Connect Metrics) Child pages remain accessible via in-page content links and grid cards. Also suppress toc.excluded warnings for these now-orphaned child pages. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Create how-it-works/index.rst as a parent container with toctree so that Architecture and Open Source vs SaaS pages appear nested under "Overview" in the sidebar, matching the pattern used by GitOps > ArgoCD. Previously they were listed flat as orphan pages. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Remove subtitle and HolmesGPT callout from header. Move navigation links (How it Works, Installation, etc.) under the What Can Robusta Do section. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Add hidden toctree listing all alertmanager-integration child pages so they appear properly in the sidebar navigation. Remove extra HolmesGPT sentence from oss-vs-saas page. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Remove toc.excluded and toc.not_readable from suppress_warnings so these warnings surface during builds. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Move toctree to end of configuration/index.rst (after content) to match the pattern used by gitops/index.rst. When the toctree was before the heading, Sphinx rendered children as flat siblings. With toctree after content, alertmanager pages now nest properly under "Send Alerts to Robusta" in the sidebar. Also set Other Features maxdepth to 4 to allow deep nesting. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Section headings (Prometheus & AlertManager, Other, Advanced) after the toctree caused Sphinx to nest toctree children under the last section in the sidebar. Moving the toctree before the headings (like actions/index.rst already does) keeps children flat under the page title while preserving proper RST section headings. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
59f8919 to
fd08b54
- Remove stale "Looking to get push notifications" note from kubernetes-changes.rst - Add LaunchDarkly to configuration/index.rst toctree and grid - Add toctree to configuring-sinks.rst so individual sink pages (Slack, Teams, PagerDuty, etc.) appear as sidebar children - Add toctree to notification-routing/index.rst so routing pages appear as sidebar children under Alert Routing - Add toctree to playbook-reference/index.rst so sub-sections appear as sidebar children under Playbooks https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
LaunchDarkly was originally a child of Track Config Changes on master, not Send Alerts. Move it there by adding a toctree to kubernetes-changes.rst and removing it from configuration/index.rst. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Replace how-it-works/index with its children (architecture, oss-vs-saas) directly in the root toctree so they appear as siblings of Welcome to Robusta instead of nested under a duplicate Overview entry. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
- Remove link to pro features from "complete feature set" text - Change MIT licensed to CNCF sandbox project - Simplify Robusta Classic migration paragraph https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Update sidebar caption and cross-reference in playbook-reference. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Change the Get Started link on the index page from the local installation docs to the platform signup URL. https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS
Summary
Restructured the architecture documentation to provide a clearer, more intuitive explanation of Robusta's three main components and how they interact. Updated the OSS vs SaaS documentation to better distinguish between the open source projects (HolmesGPT and Robusta Classic) and deployment options.
Key Changes
Architecture Documentation (docs/how-it-works/architecture.rst)
OSS vs SaaS Documentation (docs/how-it-works/oss-vs-saas.rst)
Assets
architecture-overview.png diagram to visualize the three-component architecture
Notable Details
https://claude.ai/code/session_0184fwhrjC4ELrYQBMCbyhfS