Conversation
- passlib -> pwdlib
- more precise limit enforcement for k8s pods, plus PSA labels
- bumped dependencies (previous versions had CVEs); added Redis password
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough
Adds Kubernetes namespace hardening and RBAC updates, migrates password hashing from passlib to pwdlib, enables CSP nonce and ANSI-library changes in the frontend, requires Redis authentication and exposes deployment secrets, and adds pod cleanup on terminal events; documentation and CI updated accordingly.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    rect rgba(46,134,193,0.5)
        participant Runner as RunK8sWorker
    end
    rect rgba(39,174,96,0.5)
        participant Worker as KubernetesWorker
    end
    rect rgba(241,196,15,0.5)
        participant NetAPI as networking_v1 API
        participant CoreAPI as core/v1 API
        participant AppsAPI as apps/v1 API
    end
    Runner->>Worker: start() / init()
    Worker->>NetAPI: patch NetworkPolicy(namespace, default-deny)
    NetAPI-->>Worker: patched
    Worker->>CoreAPI: patch ResourceQuota(namespace)
    CoreAPI-->>Worker: patched
    Worker->>CoreAPI: patch Namespace labels (PSA)
    CoreAPI-->>Worker: patched
    Worker->>AppsAPI: patch_namespaced_daemon_set(pre-puller)
    AppsAPI-->>Worker: patched
    Worker-->>Runner: namespace security ensured
```
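The patch sequence above centers on a default-deny NetworkPolicy. As a rough illustration of what such a manifest contains (a sketch only — the `executor` namespace and policy name here are placeholders, not values from the PR):

```python
def default_deny_policy(namespace: str, name: str = "default-deny-all") -> dict:
    """Build a NetworkPolicy manifest that blocks all pod traffic.

    An empty podSelector matches every pod in the namespace; listing both
    policy types with no ingress/egress rules denies all traffic.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector = all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],  # no rules listed => deny all
        },
    }


policy = default_deny_policy("executor")
```

The worker would then hand a manifest like this to the networking_v1 API (create, or replace if it already exists).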
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ❌ 1 failed (1 warning), ✅ 2 passed
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
3 issues found across 11 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/secrets.example.toml">
<violation number="1" location="backend/secrets.example.toml:18">
P2: Avoid committing a real/default Redis password in the example secrets template; use a CHANGE_ME placeholder to prevent insecure deployments reusing a weak default.</violation>
</file>
<file name="backend/app/services/k8s_worker/worker.py">
<violation number="1" location="backend/app/services/k8s_worker/worker.py:302">
P1: Security controls fail silently: if applying the NetworkPolicy, ResourceQuota, or PSA labels fails (e.g., due to RBAC misconfiguration), the error is only logged and the worker starts up without security controls. For a security-critical path, these exceptions should propagate so the worker either retries or refuses to start without its security posture in place. Consider re-raising after logging, or at minimum returning a status from each helper so `ensure_namespace_security` can raise if any control failed.</violation>
</file>
<file name="frontend/nginx.conf.template">
<violation number="1" location="frontend/nginx.conf.template:17">
P2: `style-src` still includes `'unsafe-inline'`, which negates hash-based CSP hardening for inline styles and keeps CSS injection risks open. If the intent is to harden CSP, drop `unsafe-inline` and rely on hashes/nonces for inline styles.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
1 issue found across 12 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs/operations/nginx-configuration.md">
<violation number="1" location="docs/operations/nginx-configuration.md:259">
P3: Admonition content is no longer indented, so the warning block won’t render as a warning in MkDocs. Re-indent the lines under `!!! warning` to keep them part of the admonition.</violation>
</file>
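For reference, MkDocs admonition bodies only render inside the block while they are indented four spaces under the marker; a correctly indented block looks like this sketch (the warning text is a placeholder, not the doc's actual content):

```markdown
!!! warning
    This content is indented four spaces, so it renders inside the
    warning admonition. An unindented line here would end the block.
```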
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Pull request overview
This PR focuses on tightening security across the stack (frontend HTTP headers, local dev infra, authentication hashing, and Kubernetes execution isolation) to reduce attack surface and strengthen runtime controls.
Changes:
- Hardened frontend CSP by removing `unsafe-inline` from `script-src` and adding script/style hashes.
- Secured local Redis with `requirepass`, and pinned MongoDB/Redis image patch versions in docker-compose.
- Added Kubernetes namespace security initialization (default-deny NetworkPolicy, ResourceQuota, PSA labels) and added optional RuntimeClass support for execution pods; migrated password hashing from `passlib` to `pwdlib[bcrypt]`.
Reviewed changes
Copilot reviewed 20 out of 22 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| frontend/nginx.conf.template | Updates CSP to use hashes and reduce inline script allowance. |
| docker-compose.yaml | Pins DB/cache image versions; adds Redis auth and updates healthcheck accordingly. |
| backend/workers/run_k8s_worker.py | Runs new namespace security bootstrap on worker startup. |
| backend/uv.lock | Updates lockfile for dependency changes (adds pwdlib/bcrypt, removes passlib). |
| backend/secrets.example.toml | Adds REDIS_PASSWORD example secret. |
| backend/scripts/seed_users.py | Migrates seed script hashing from passlib to pwdlib bcrypt. |
| backend/pyproject.toml | Replaces passlib with pwdlib[bcrypt] dependency. |
| backend/app/settings.py | Adds K8S_POD_RUNTIME_CLASS_NAME and includes Redis password setting in Settings. |
| backend/app/services/k8s_worker/worker.py | Adds namespace security enforcement (NetworkPolicy/Quota/PSA labels). |
| backend/app/services/k8s_worker/pod_builder.py | Adds RuntimeClass support and sets host_users=False for additional pod isolation. |
| backend/app/core/security.py | Migrates application password hashing/verification from passlib to pwdlib bcrypt. |
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
Actionable comments posted: 6
🧹 Nitpick comments (7)
docs/operations/nginx-configuration.md (1)
56-60: Consider documenting a production path with upstream TLS verification enabled.

This section is clear, but adding a "preferred production setup" (`proxy_ssl_verify on` + trusted CA) would strengthen the guidance and avoid normalizing verification-off configurations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/operations/nginx-configuration.md` around lines 56 - 60, add a recommended production configuration note that contrasts the existing insecure example: explain enabling TLS verification by setting proxy_ssl_verify on and configuring a trusted CA bundle via proxy_ssl_trusted_certificate (and optionally proxy_ssl_verify_depth), and mention using real upstream hostnames with proxy_pass (still set via envsubst) and preserving Host header with proxy_set_header Host $host; include a short guidance to mount or reference the trusted CA file into the container and avoid proxy_ssl_verify off in production.

frontend/src/components/editor/OutputPanel.svelte (1)
98-118: Precompute rendered ANSI HTML to avoid repeated conversion on re-renders.

Line 98 and Line 118 currently do conversion + sanitization in-template. Moving this to `$derived` keeps rendering lighter and easier to reason about.

♻️ Proposed refactor

```diff
+  const stdoutHtml = $derived.by(() =>
+    result?.stdout ? sanitize(ansiConverter.ansi_to_html(result.stdout)) : ''
+  );
+
+  const stderrHtml = $derived.by(() =>
+    result?.stderr ? sanitize(ansiConverter.ansi_to_html(result.stderr)) : ''
+  );
@@
-  <pre class="output-pre custom-scrollbar" data-testid="output-pre">{@html sanitize(ansiConverter.ansi_to_html(result.stdout))}</pre>
+  <pre class="output-pre custom-scrollbar" data-testid="output-pre">{@html stdoutHtml}</pre>
@@
-  <pre class="text-xs text-red-600 dark:text-red-300 whitespace-pre-wrap break-words font-mono bg-transparent p-0 pr-8">{@html sanitize(ansiConverter.ansi_to_html(result.stderr))}</pre>
+  <pre class="text-xs text-red-600 dark:text-red-300 whitespace-pre-wrap break-words font-mono bg-transparent p-0 pr-8">{@html stderrHtml}</pre>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/components/editor/OutputPanel.svelte` around lines 98 - 118, The template is calling ansiConverter.ansi_to_html(...) and sanitize(...) inline for result.stdout and result.stderr which repeats work on every render; precompute sanitized HTML once using a reactive derived value (e.g., create reactive variables like stdoutHtml and stderrHtml or use a Svelte derived store) that call ansiConverter.ansi_to_html(result.stdout) and sanitize(...) (and likewise for stderr) and then replace the inline {@html ...} uses with those precomputed variables to avoid repeated conversion and sanitization; reference ansiConverter, sanitize, result.stdout, result.stderr, and the new reactive stdoutHtml/stderrHtml variables in the fix.

backend/scripts/seed_users.py (1)
26-26: Avoid hardcoding bcrypt cost in the seed script.

Using a fixed rounds value here can drift from the runtime setting and create weaker-or-inconsistent seeded credentials over time.
♻️ Proposed refactor
```diff
-pwd_hasher = PasswordHash((BcryptHasher(rounds=12),))
-
 async def upsert_user(
     db: AsyncDatabase[dict[str, Any]],
+    pwd_hasher: PasswordHash,
     username: str,
@@
 async def seed_users(settings: Settings) -> None:
     """Seed default users using provided settings for MongoDB connection."""
+    pwd_hasher = PasswordHash((BcryptHasher(rounds=settings.BCRYPT_ROUNDS),))
@@
     await upsert_user(
         db,
+        pwd_hasher,
         username="user",
@@
     await upsert_user(
         db,
+        pwd_hasher,
         username="admin",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/scripts/seed_users.py` at line 26, The seed script hardcodes bcrypt cost in the PasswordHash instantiation (pwd_hasher = PasswordHash((BcryptHasher(rounds=12),))) — change this to read the bcrypt rounds from the application's runtime configuration or environment instead of using the literal 12; locate PasswordHash and BcryptHasher usage in seed_users.py and replace the hardcoded rounds with the shared config value (e.g., import the bcrypt rounds constant or a get_bcrypt_rounds() helper from the app settings or config module) and keep a sensible default fallback if the config is missing.

backend/app/services/k8s_worker/worker.py (3)
296-303: Use structured logging fields in new security methods.

The new logs interpolate values via f-strings. Switch to bound/keyword fields (`policy_name=...`, `namespace=...`, `reason=...`) for safer and queryable logs.

As per coding guidelines: "Use only structured logging via injected structlog.stdlib.BoundLogger ... pass user-controlled data as keyword arguments, never interpolate into log messages".
Also applies to: 330-337, 350-352
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 296 - 303, Replace f-string interpolation in the logging calls with structured keyword fields: change self.logger.info(f"NetworkPolicy '{policy_name}' updated in namespace {namespace}") to self.logger.info("NetworkPolicy updated", policy_name=policy_name, namespace=namespace) and similarly for the create/info and error paths—use self.logger.info("NetworkPolicy created", policy_name=policy_name, namespace=namespace) and self.logger.error("Failed to apply NetworkPolicy", policy_name=policy_name, namespace=namespace, reason=e.reason) when catching ApiException from networking_v1.create_namespaced_network_policy; make the same pattern for the other logger usages referenced (the similar info/error calls around the other NetworkPolicy operations) so all user-controlled values are passed as keyword fields instead of interpolated into the message.
257-263: New docstrings should follow Google style sections.

The added method docstrings are descriptive but missing explicit `Args`/`Returns`/`Raises` sections required for backend Python files.

As per coding guidelines: "Use Google-style docstrings with Args/Returns/Raises sections".
Also applies to: 270-270, 305-306, 339-339
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 257 - 263, The docstring that begins "Apply security controls to the executor namespace at startup." is missing Google-style sections; update that function's docstring to include explicit Args:, Returns:, and Raises: sections describing each parameter (type and purpose), what the function returns (or None), and any exceptions it may raise; apply the same change to the other new docstrings referenced in the diff (the other startup/security-related docstrings), ensuring each uses Google-style formatting and accurate parameter/return/exception descriptions so linters and readers have clear API contracts.
341-343: Pin `pod-security.kubernetes.io/enforce-version` to a specific Kubernetes minor version.

Using `latest` can change enforcement behavior during cluster upgrades. Kubernetes best practices recommend pinning `enforce-version` to your current cluster's Kubernetes minor version (e.g., `v1.35`) to keep admission policy behavior stable. Set `warn-version` and `audit-version` to `latest` to track upcoming standard changes before they become blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 341 - 343, Update the pod-security label values in the labels dict where these keys are set (look for "pod-security.kubernetes.io/enforce", "pod-security.kubernetes.io/enforce-version", "pod-security.kubernetes.io/warn" in worker.py) by replacing the current enforce-version value "latest" with your cluster's Kubernetes minor version string (e.g., "v1.35"); also set "pod-security.kubernetes.io/warn-version" and "pod-security.kubernetes.io/audit-version" to "latest" if you want to track upcoming changes. Ensure you only change the label values (not key names) so admission behavior is pinned for enforce-version while warn/audit remain at latest.backend/app/services/k8s_worker/pod_builder.py (1)
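The fix this comment describes amounts to a labels dict like the following sketch (the `restricted` level and the `v1.31` pin are illustrative assumptions, not values confirmed by the PR):

```python
def psa_labels(enforce_version: str = "v1.31") -> dict[str, str]:
    """Build Pod Security Admission labels with a pinned enforce version.

    Enforcement is pinned to one cluster minor version so upgrades cannot
    silently change admission behavior; warn/audit track `latest` to surface
    upcoming standard changes before they become blocking.
    """
    return {
        "pod-security.kubernetes.io/enforce": "restricted",
        "pod-security.kubernetes.io/enforce-version": enforce_version,
        "pod-security.kubernetes.io/warn": "restricted",
        "pod-security.kubernetes.io/warn-version": "latest",
        "pod-security.kubernetes.io/audit": "restricted",
        "pod-security.kubernetes.io/audit-version": "latest",
    }


labels = psa_labels()
```

Only the label values change relative to the current code; the key names stay exactly as Kubernetes defines them.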
109-110: Gate `host_users=False` behind Kubernetes version/feature-gate support.

The `hostUsers` field requires Kubernetes v1.25+ with the `UserNamespacesSupport` feature gate enabled (disabled by default until v1.33). Setting `host_users=False` unconditionally will cause pod creation to fail on unsupported clusters (< v1.25) or clusters running v1.28–v1.32 without the feature gate explicitly enabled. Add a settings flag or cluster version check before including this field to prevent runtime pod creation failures in mixed or older cluster environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/pod_builder.py` around lines 109 - 110, The pod spec currently sets host_users=False unconditionally which will break on clusters that don't support UserNamespacesSupport; update the PodBuilder creation logic (where host_users is passed, e.g., in the code referencing host_users and self._settings.K8S_POD_RUNTIME_CLASS_NAME) to only include the host_users field when the cluster supports user namespaces or when an explicit settings flag is enabled (add a new setting like K8S_ENABLE_USER_NAMESPACES or perform a cluster version/feature-gate check before setting host_users). Ensure the code path that builds the pod spec omits the host_users parameter entirely for unsupported clusters and document/validate the new setting in the settings accessor used by PodBuilder.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/k8s_worker/worker.py`:
- Around line 291-303: The ApiException handler in the NetworkPolicy apply logic
swallows non-404 errors (in the block calling
networking_v1.read_namespaced_network_policy /
replace_namespaced_network_policy), which lets startup continue without required
hardening; after logging the error (self.logger.error(...)), re-raise the
exception so startup fails fast. Update the except ApiException as e branches to
call raise after the error log for non-404 cases; apply the same change to the
similar handlers around the other NetworkPolicy operations (the blocks at the
other mentions, e.g., where
replace_namespaced_network_policy/create_namespaced_network_policy are used).
- Around line 317-323: The ResourceQuota "hard" dict is using hardcoded "128Mi"
and 1 CPU assumptions and must be computed from the per-pod settings so quotas
track changes to K8S_POD_*; update the construction of the hard dict (the block
that references self._settings.K8S_MAX_CONCURRENT_PODS) to derive values by
multiplying per-pod request/limit settings (e.g.
self._settings.K8S_POD_CPU_REQUEST, self._settings.K8S_POD_CPU_LIMIT,
self._settings.K8S_POD_MEMORY_REQUEST_MB, self._settings.K8S_POD_MEMORY_LIMIT_MB
or the actual K8S_POD_* names present in settings) by
self._settings.K8S_MAX_CONCURRENT_PODS, format CPU values as strings like "1" or
"0.5" and memory as "{N}Mi", and replace the hardcoded "128Mi"/implicit CPU
values so the quota matches configured per-pod resources.
- Around line 292-295: The code calls read_namespaced_network_policy and
read_namespaced_resource_quota but discards the returned objects before calling
replace_namespaced_network_policy and replace_namespaced_resource_quota, which
causes Kubernetes to reject updates; fix by capturing the read results (e.g.,
existing_policy = await
self.networking_v1.read_namespaced_network_policy(name=policy_name,
namespace=namespace) and existing_quota = await
self.core_v1.read_namespaced_resource_quota(name=quota_name,
namespace=namespace)) and copy existing_policy.metadata.resource_version into
policy.metadata.resource_version and existing_quota.metadata.resource_version
into quota.metadata.resource_version before calling
replace_namespaced_network_policy and replace_namespaced_resource_quota so the
optimistic concurrency resourceVersion is preserved.
In `@backend/secrets.example.toml`:
- Line 18: Replace the usable default Redis password value "redispassword" in
the sample secrets with a non-usable placeholder to prevent accidental
deployment; update the REDIS_PASSWORD entry to something like
"CHANGE_ME_REDIS_PASSWORD" (or similar clearly-not-secret text) so consumers
must deliberately set a real secret before use and it’s obvious in the
REDIS_PASSWORD field.
In `@docker-compose.yaml`:
- Around line 74-82: The compose file currently falls back to a default by using
${REDIS_PASSWORD:-redispassword}; update both places (the redis-server command
string and the healthcheck redis-cli invocation) to use the variable without a
default (use ${REDIS_PASSWORD}) so startup will fail fast when the secret is not
provided rather than silently using a known password; ensure REDIS_PASSWORD is
documented/provided in the environment or .env for deployments.
In `@frontend/public/index.html`:
- Around line 34-37: The spinner lacks accessible semantics—update the container
with id "app-loading" (and/or the inner wrapper "app-loading-inner") to include
role="status" and aria-live="polite" (and optionally aria-busy="true"), and add
a hidden textual status node (e.g., a <span> with a visually-hidden class or
aria-label text like "Loading…") inside the container so screen readers announce
the state; keep the decorative spinner element (class "app-loading-spinner")
aria-hidden="true".
---
Nitpick comments:
In `@backend/app/services/k8s_worker/pod_builder.py`:
- Around line 109-110: The pod spec currently sets host_users=False
unconditionally which will break on clusters that don't support
UserNamespacesSupport; update the PodBuilder creation logic (where host_users is
passed, e.g., in the code referencing host_users and
self._settings.K8S_POD_RUNTIME_CLASS_NAME) to only include the host_users field
when the cluster supports user namespaces or when an explicit settings flag is
enabled (add a new setting like K8S_ENABLE_USER_NAMESPACES or perform a cluster
version/feature-gate check before setting host_users). Ensure the code path that
builds the pod spec omits the host_users parameter entirely for unsupported
clusters and document/validate the new setting in the settings accessor used by
PodBuilder.
In `@backend/app/services/k8s_worker/worker.py`:
- Around line 296-303: Replace f-string interpolation in the logging calls with
structured keyword fields: change self.logger.info(f"NetworkPolicy
'{policy_name}' updated in namespace {namespace}") to
self.logger.info("NetworkPolicy updated", policy_name=policy_name,
namespace=namespace) and similarly for the create/info and error paths—use
self.logger.info("NetworkPolicy created", policy_name=policy_name,
namespace=namespace) and self.logger.error("Failed to apply NetworkPolicy",
policy_name=policy_name, namespace=namespace, reason=e.reason) when catching
ApiException from networking_v1.create_namespaced_network_policy; make the same
pattern for the other logger usages referenced (the similar info/error calls
around the other NetworkPolicy operations) so all user-controlled values are
passed as keyword fields instead of interpolated into the message.
- Around line 257-263: The docstring that begins "Apply security controls to the
executor namespace at startup." is missing Google-style sections; update that
function's docstring to include explicit Args:, Returns:, and Raises: sections
describing each parameter (type and purpose), what the function returns (or
None), and any exceptions it may raise; apply the same change to the other new
docstrings referenced in the diff (the other startup/security-related
docstrings), ensuring each uses Google-style formatting and accurate
parameter/return/exception descriptions so linters and readers have clear API
contracts.
- Around line 341-343: Update the pod-security label values in the labels dict
where these keys are set (look for "pod-security.kubernetes.io/enforce",
"pod-security.kubernetes.io/enforce-version", "pod-security.kubernetes.io/warn"
in worker.py) by replacing the current enforce-version value "latest" with your
cluster's Kubernetes minor version string (e.g., "v1.35"); also set
"pod-security.kubernetes.io/warn-version" and
"pod-security.kubernetes.io/audit-version" to "latest" if you want to track
upcoming changes. Ensure you only change the label values (not key names) so
admission behavior is pinned for enforce-version while warn/audit remain at
latest.
In `@backend/scripts/seed_users.py`:
- Line 26: The seed script hardcodes bcrypt cost in the PasswordHash
instantiation (pwd_hasher = PasswordHash((BcryptHasher(rounds=12),))) — change
this to read the bcrypt rounds from the application's runtime configuration or
environment instead of using the literal 12; locate PasswordHash and
BcryptHasher usage in seed_users.py and replace the hardcoded rounds with the
shared config value (e.g., import the bcrypt rounds constant or a
get_bcrypt_rounds() helper from the app settings or config module) and keep a
sensible default fallback if the config is missing.
In `@docs/operations/nginx-configuration.md`:
- Around line 56-60: Add a recommended production configuration note that
contrasts the existing insecure example: explain enabling TLS verification by
setting proxy_ssl_verify on and configuring a trusted CA bundle via
proxy_ssl_trusted_certificate (and optionally proxy_ssl_verify_depth), and
mention using real upstream hostnames with proxy_pass (still set via envsubst)
and preserving Host header with proxy_set_header Host $host; include a short
guidance to mount or reference the trusted CA file into the container and avoid
proxy_ssl_verify off in production.
In `@frontend/src/components/editor/OutputPanel.svelte`:
- Around line 98-118: The template is calling ansiConverter.ansi_to_html(...)
and sanitize(...) inline for result.stdout and result.stderr which repeats work
on every render; precompute sanitized HTML once using a reactive derived value
(e.g., create reactive variables like stdoutHtml and stderrHtml or use a Svelte
derived store) that call ansiConverter.ansi_to_html(result.stdout) and
sanitize(...) (and likewise for stderr) and then replace the inline {`@html` ...}
uses with those precomputed variables to avoid repeated conversion and
sanitization; reference ansiConverter, sanitize, result.stdout, result.stderr,
and the new reactive stdoutHtml/stderrHtml variables in the fix.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `backend/uv.lock` is excluded by `!**/*.lock`
- `frontend/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (20)
- .github/workflows/release-deploy.yml
- backend/app/core/security.py
- backend/app/services/k8s_worker/pod_builder.py
- backend/app/services/k8s_worker/worker.py
- backend/app/settings.py
- backend/pyproject.toml
- backend/scripts/seed_users.py
- backend/secrets.example.toml
- backend/workers/run_k8s_worker.py
- docker-compose.yaml
- docs/SECURITY.md
- docs/operations/nginx-configuration.md
- docs/security/policies.md
- frontend/nginx.conf.template
- frontend/package.json
- frontend/public/index.html
- frontend/src/app.css
- frontend/src/components/editor/CodeMirrorEditor.svelte
- frontend/src/components/editor/OutputPanel.svelte
- frontend/src/components/editor/__tests__/CodeMirrorEditor.test.ts
Codecov Report
❌ Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
- index.html: aria labels for loading wheel
- worker: well-typed objs for k8s-client internal stuff
- seed user: moved pwd hasher creation to main, taking num of rounds from settings obj
- docs: update on diff between style-src-elem vs. style-src-attr
- docs: fix of indentation in nginx-config doc
🧹 Nitpick comments (4)
frontend/public/index.html (1)
41-48: Consider setting `aria-busy="false"` when hiding the loader.

The loader correctly hides via `classList.add('app-loading-hidden')`, but `aria-busy` remains `true` even after the app loads. While the element is hidden, updating the ARIA state ensures proper semantics if the element is ever shown again or inspected by assistive tech.

♿ Proposed fix

```diff
 <script nonce="**CSP_NONCE**">
   window.addEventListener('DOMContentLoaded', () => {
     setTimeout(() => {
       const loader = document.getElementById('app-loading');
-      if (loader) loader.classList.add('app-loading-hidden');
+      if (loader) {
+        loader.classList.add('app-loading-hidden');
+        loader.setAttribute('aria-busy', 'false');
+      }
     }, 100);
   });
 </script>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/public/index.html` around lines 41 - 48, When hiding the loader inside the DOMContentLoaded handler (the setTimeout block that queries the element with id 'app-loading' and adds the 'app-loading-hidden' class), also update its ARIA state by setting aria-busy to "false" (e.g., element.setAttribute('aria-busy', 'false')) so the loader's accessible state reflects that loading has finished; locate the script that listens for 'DOMContentLoaded' and the code that modifies the 'app-loading' element and add the aria update immediately after adding the class.

backend/scripts/seed_users.py (1)
27-36: Update `upsert_user` docstring to Google style (including new `pwd_hasher` arg).

The function docstring should include `Args`/`Returns` (and `Raises` if applicable), especially now that the signature changed.

Suggested docstring update

```diff
 async def upsert_user(
     db: AsyncDatabase[dict[str, Any]],
     pwd_hasher: PasswordHash,
     username: str,
@@
 ) -> None:
-    """Create user or update if exists."""
+    """Create a user or update an existing user.
+
+    Args:
+        db: MongoDB async database handle.
+        pwd_hasher: Password hasher used to hash plaintext passwords.
+        username: Username to upsert.
+        email: Email address for the user.
+        password: Plaintext password to hash and store.
+        role: Role assigned to the user.
+        is_superuser: Whether the user has superuser privileges.
+
+    Returns:
+        None
+    """
```

As per coding guidelines: Use Google-style docstrings with Args/Returns/Raises sections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/scripts/seed_users.py` around lines 27 - 36, The docstring for upsert_user is missing Google-style Args/Returns (and Raises if applicable) and does not document the new pwd_hasher parameter; update the upsert_user docstring to Google style by adding an Args section that documents db (AsyncDatabase[dict[str, Any]]), pwd_hasher (PasswordHash), username (str), email (str), password (str), role (str), and is_superuser (bool), a Returns section indicating it returns None, and a Raises section for any exceptions the function may propagate (e.g., database errors) so callers know what to expect.

backend/app/services/k8s_worker/worker.py (2)
384-386: Consider whether daemonset failures should halt startup.

The error handling logs but doesn't re-raise, allowing startup to continue without the image pre-puller. This contrasts with the security hardening methods which fail fast.
If the pre-puller is purely an optimization (faster cold starts), the current graceful degradation is reasonable. If image availability is required for execution, this should re-raise. The distinction from security methods suggests this is intentional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 384 - 386, The exception handlers around applying the DaemonSet (catching ApiException as e and the generic except Exception as e) currently log errors but swallow them, allowing startup to continue; decide whether image pre-puller failures must be fatal and implement accordingly: if image availability is required, re-raise the caught exceptions (or raise a custom exception) from the except blocks in the apply DaemonSet logic (referencing ApiException, daemonset_name and the generic except Exception handler) so startup fails fast; if pre-puller is optional, add an explicit comment and leave graceful degradation but change the log level/message to clearly indicate non-fatal behavior.
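If the team decides the pre-puller is optional while the hardening calls are not, one way to make that distinction explicit is a small helper (a sketch with invented names — `apply_control` and `NamespaceSecurityError` are not from the PR):

```python
class NamespaceSecurityError(RuntimeError):
    """Raised when a required startup control could not be applied."""


def apply_control(name, apply_fn, *, required, on_error):
    """Run one startup step; fail fast only when it is marked required.

    Returns True on success, False when an optional step failed. Required
    steps re-raise (wrapped) so the worker refuses to start without them.
    """
    try:
        apply_fn()
        return True
    except Exception as exc:
        on_error(name, exc)  # e.g. a structured log with name/reason fields
        if required:
            raise NamespaceSecurityError(name) from exc
        return False
```

The NetworkPolicy/quota/PSA steps would pass `required=True`; the pre-puller would pass `required=False` with a clearly non-fatal log level.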
309-315: Parsing lacks validation for settings configuration errors.

The `.removesuffix()` approach assumes CPU values end in `"m"` and memory in `"Mi"`, matching the documented TOML defaults (`"1000m"`, `"128Mi"`). However, there is no validation on these settings fields, so misconfiguration in the TOML file will cause silent bugs or runtime failures:

| Misconfigured TOML value | Behavior |
|---|---|
| `"1000m"` | ✓ Works correctly |
| `"1"` | Calculates 1m instead of 1000m (silent bug) |
| `"0.5"` | Raises `ValueError` at runtime |
| `"256M"` | Fails to parse as integer at runtime |

Consider adding a Pydantic field validator to `Settings` to validate format and normalize units, ensuring configuration errors are caught early.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 309 - 315, Add validation and normalization for K8S_POD_CPU_REQUEST, K8S_POD_CPU_LIMIT, K8S_POD_MEMORY_REQUEST, and K8S_POD_MEMORY_LIMIT on the Settings model instead of using .removesuffix() raw in worker.py; implement Pydantic validators on those fields (or a shared validator) to enforce allowed formats (e.g., CPU must end with "m" or be a plain integer convertible to millicores, memory must end with "Mi"), normalize and store them in a canonical form (e.g., "NNNm" and "NNNMi") or expose numeric millicore/MeBi values, and raise clear ValidationError on misconfiguration so the hard={...} block in k8s_worker.worker (which references those settings names) can safely multiply parsed integers without runtime ValueError or silent unit mistakes.
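The validation this comment asks for can be sketched as plain parsing helpers (function names here are invented for illustration; the real fix would live as validators on the Pydantic `Settings` model):

```python
def cpu_to_millicores(value: str) -> int:
    """Parse a Kubernetes CPU quantity ("1000m", "1", "0.5") into millicores."""
    if value.endswith("m"):
        return int(value[:-1])
    # Plain/fractional cores: "1" -> 1000m, "0.5" -> 500m.
    return int(float(value) * 1000)


def memory_to_mi(value: str) -> int:
    """Parse a memory quantity; only the "Mi" suffix is accepted here."""
    if not value.endswith("Mi"):
        raise ValueError(f"expected a Mi-suffixed quantity, got {value!r}")
    return int(value[:-2])


def quota_hard(cpu_limit: str, mem_limit: str, max_pods: int) -> dict[str, str]:
    """Derive ResourceQuota 'hard' values from per-pod limit settings."""
    return {
        "limits.cpu": f"{cpu_to_millicores(cpu_limit) * max_pods}m",
        "limits.memory": f"{memory_to_mi(mem_limit) * max_pods}Mi",
        "pods": str(max_pods),
    }
```

With parsing centralized like this, the quota automatically tracks changes to the per-pod settings instead of hardcoded `"128Mi"`/CPU assumptions, and a malformed value fails loudly at startup.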
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/app/services/k8s_worker/worker.py`:
- Around line 384-386: The exception handlers around applying the DaemonSet
(catching ApiException as e and the generic except Exception as e) currently log
errors but swallow them, allowing startup to continue; decide whether image
pre-puller failures must be fatal and implement accordingly: if image
availability is required, re-raise the caught exceptions (or raise a custom
exception) from the except blocks in the apply DaemonSet logic (referencing
ApiException, daemonset_name and the generic except Exception handler) so
startup fails fast; if pre-puller is optional, add an explicit comment and leave
graceful degradation but change the log level/message to clearly indicate
non-fatal behavior.
- Around line 309-315: Add validation and normalization for K8S_POD_CPU_REQUEST,
K8S_POD_CPU_LIMIT, K8S_POD_MEMORY_REQUEST, and K8S_POD_MEMORY_LIMIT on the
Settings model instead of using .removesuffix() raw in worker.py; implement
Pydantic validators on those fields (or a shared validator) to enforce allowed
formats (e.g., CPU must end with "m" or be a plain integer convertible to
millicores, memory must end with "Mi"), normalize and store them in a canonical
form (e.g., "NNNm" and "NNNMi") or expose numeric millicore/MeBi values, and
raise clear ValidationError on misconfiguration so the hard={...} block in
k8s_worker.worker (which references those settings names) can safely multiply
parsed integers without runtime ValueError or silent unit mistakes.
In `@backend/scripts/seed_users.py`:
- Around line 27-36: The docstring for upsert_user is missing Google-style
Args/Returns (and Raises if applicable) and does not document the new pwd_hasher
parameter; update the upsert_user docstring to Google style by adding an Args
section that documents db (AsyncDatabase[dict[str, Any]]), pwd_hasher
(PasswordHash), username (str), email (str), password (str), role (str), and
is_superuser (bool), a Returns section indicating it returns None, and a Raises
section for any exceptions the function may propagate (e.g., database errors) so
callers know what to expect.
In `@frontend/public/index.html`:
- Around line 41-48: When hiding the loader inside the DOMContentLoaded handler
(the setTimeout block that queries the element with id 'app-loading' and adds
the 'app-loading-hidden' class), also update its ARIA state by setting aria-busy
to "false" (e.g., element.setAttribute('aria-busy', 'false')) so the loader's
accessible state reflects that loading has finished; locate the script that
listens for 'DOMContentLoaded' and the code that modifies the 'app-loading'
element and add the aria update immediately after adding the class.
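For the `seed_users` docstring comment above, a hedged sketch of the requested Google-style docstring (parameter names follow the review's list; the real type annotations and function body are assumed and elided here):

```python
from typing import Any


async def upsert_user(
    db: Any,          # AsyncDatabase[dict[str, Any]] in the real script
    pwd_hasher: Any,  # pwdlib PasswordHash instance
    username: str,
    email: str,
    password: str,
    role: str,
    is_superuser: bool,
) -> None:
    """Insert the user if absent, otherwise update the existing document.

    Args:
        db: Async database handle used for the upsert.
        pwd_hasher: Hasher used to hash ``password`` before storage.
        username: Unique login name for the user.
        email: Email address stored on the user document.
        password: Plaintext password; never stored as-is.
        role: Application role assigned to the user.
        is_superuser: Whether the user receives superuser privileges.

    Returns:
        None.

    Raises:
        Exception: Propagates errors from the underlying database driver.
    """
```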
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `backend/app/services/k8s_worker/worker.py`
- `backend/scripts/seed_users.py`
- `docs/operations/nginx-configuration.md`
- `frontend/public/index.html`
There was a problem hiding this comment.
2 issues found across 4 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/app/services/k8s_worker/worker.py">
<violation number="1" location="backend/app/services/k8s_worker/worker.py:288">
P1: Loss of error isolation: the old code handled each security control independently (catch and log), so a failure in one wouldn't prevent the others. Now if `_ensure_executor_network_policy` throws (e.g., cluster doesn't support NetworkPolicy), `_ensure_executor_resource_quota` and `_apply_psa_labels` are skipped entirely. Consider wrapping each call in `ensure_namespace_security` with individual error handling, or restoring try/except in each method.</violation>
<violation number="2" location="backend/app/services/k8s_worker/worker.py:311">
P2: Brittle resource-value parsing: `removesuffix` + `int()` will crash with `ValueError` if settings don't use the exact expected suffix (e.g., CPU `"1"` instead of `"1000m"`, or memory `"1Gi"` instead of `"1024Mi"`). Consider a parsing helper that handles standard Kubernetes resource quantity formats, or at minimum validate the format at settings load time.</violation>
</file>
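A parsing helper of the kind violation 2 suggests might look like this — a sketch covering only the formats mentioned in the review, not the full Kubernetes quantity grammar:

```python
def parse_cpu_to_milli(value: str) -> int:
    """Convert a CPU quantity such as "500m", "1", or "0.5" to milliCPU."""
    if value.endswith("m"):
        return int(value.removesuffix("m"))
    # Bare values are core counts; scale to millicores.
    return round(float(value) * 1000)


def parse_memory_to_mi(value: str) -> int:
    """Convert a memory quantity such as "128Mi" or "1Gi" to MiB."""
    if value.endswith("Gi"):
        return int(value.removesuffix("Gi")) * 1024
    if value.endswith("Mi"):
        return int(value.removesuffix("Mi"))
    raise ValueError(f"unsupported memory quantity: {value!r}")
```

The quota block could then emit `f"{parse_cpu_to_milli(limit) * n}m"` without assuming a particular suffix in the settings.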
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cert-generator/setup-k8s.sh`:
- Around line 145-148: The RBAC rules grant overly broad permissions for the
resources "networkpolicies" and "resourcequotas"; update the Role/ClusterRole
entries that list these resources so their verbs are reduced to only
["get","list","watch","patch"] (remove "delete" and any other unnecessary
verbs), ensuring the rules match the reconciliation code which uses server-side
apply (patch) only; locate the Role/ClusterRole block that references resources:
["networkpolicies"] and ["resourcequotas"] and replace the verbs array
accordingly.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/app/services/pod_monitor/monitor.py">
<violation number="1" location="backend/app/services/pod_monitor/monitor.py:191">
P1: Pod deletion can occur even when terminal event publication fails, causing potential event loss. Only delete after confirming all terminal events were published successfully.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
🧹 Nitpick comments (1)
backend/app/services/pod_monitor/monitor.py (1)
222-238: Use structured logging with keyword arguments instead of f-string interpolation.

Per coding guidelines, user-controlled or external data should be passed as keyword arguments to the logger rather than interpolated into log messages. While `pod_name` comes from Kubernetes metadata, using keyword arguments enables better log aggregation and filtering.

♻️ Proposed refactor for structured logging

```diff
 pod_name = pod.metadata.name
 try:
     await self._v1.delete_namespaced_pod(
         name=pod_name,
         namespace=pod.metadata.namespace,
         grace_period_seconds=0,
     )
-    self.logger.info(f"Deleted completed pod {pod_name}")
+    self.logger.info("Deleted completed pod", pod_name=pod_name)
 except ApiException as e:
     if e.status == 404:
-        self.logger.debug(f"Pod {pod_name} already deleted")
+        self.logger.debug("Pod already deleted", pod_name=pod_name)
     else:
-        self.logger.warning(f"Failed to delete pod {pod_name}: {e.reason}")
+        self.logger.warning("Failed to delete pod", pod_name=pod_name, reason=e.reason)
```

As per coding guidelines: "Use only structured logging via injected structlog.stdlib.BoundLogger... pass user-controlled data as keyword arguments, never interpolate into log messages."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/pod_monitor/monitor.py` around lines 222 - 238, In the async method _delete_pod, replace f‑string logging with structured logging by passing pod metadata and error details as keyword arguments to the injected logger instead of interpolating into messages; update the logger.info call that reports deletion, the logger.debug call for 404 cases, and the logger.warning call in the ApiException branch to use messages like "Deleted completed pod", "Pod already deleted", and "Failed to delete pod" respectively while supplying pod_name=pod_name (and namespace=pod.metadata.namespace) and for exceptions include error_reason=e.reason and error_status=e.status as keyword args so structured loggers can index the fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/app/services/pod_monitor/monitor.py`:
- Around line 222-238: In the async method _delete_pod, replace f‑string logging
with structured logging by passing pod metadata and error details as keyword
arguments to the injected logger instead of interpolating into messages; update
the logger.info call that reports deletion, the logger.debug call for 404 cases,
and the logger.warning call in the ApiException branch to use messages like
"Deleted completed pod", "Pod already deleted", and "Failed to delete pod"
respectively while supplying pod_name=pod_name (and
namespace=pod.metadata.namespace) and for exceptions include
error_reason=e.reason and error_status=e.status as keyword args so structured
loggers can index the fields.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cert-generator/setup-k8s.sh">
<violation number="1" location="cert-generator/setup-k8s.sh:93">
P1: The cleanup selector is too broad: `status.phase!=Running` also deletes valid `Pending` executor pods, which can interrupt active executions during startup. Restrict cleanup to terminal phases only.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 2
♻️ Duplicate comments (1)
cert-generator/setup-k8s.sh (1)
147-151: ⚠️ Potential issue | 🟠 Major

Still over-permissive RBAC on policy/quota resources.

`networkpolicies` and `resourcequotas` still include destructive verbs beyond reconciliation needs. Keep these scoped to read/watch + patch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cert-generator/setup-k8s.sh` around lines 147 - 151, The RBAC rules for resources "networkpolicies" and "resourcequotas" are over-permissive; update the verbs arrays for the entries that reference resources: ["networkpolicies"] and resources: ["resourcequotas"] so they only include read/watch plus patch (e.g., ["get","list","watch","patch"]); remove create/update/delete from those verbs and ensure the corresponding apiGroups entry is present/appropriate for each resource if required.
🧹 Nitpick comments (1)
backend/app/services/k8s_worker/worker.py (1)
293-294: Use structured log fields instead of f-strings in new log lines.

Please switch these to static messages + keyword fields for consistency and safer log processing.

🔧 Suggested fix

```diff
- self.logger.info(f"NetworkPolicy '{policy_name}' applied in namespace {namespace}")
+ self.logger.info("NetworkPolicy applied", policy_name=policy_name, namespace=namespace)
@@
- self.logger.info(f"ResourceQuota '{quota_name}' applied in namespace {namespace}")
+ self.logger.info("ResourceQuota applied", quota_name=quota_name, namespace=namespace)
@@
- self.logger.info(f"Pod Security Admission labels applied to namespace {namespace}")
+ self.logger.info("Pod Security Admission labels applied", namespace=namespace)
@@
- self.logger.info(f"DaemonSet '{daemonset_name}' applied successfully")
+ self.logger.info("DaemonSet applied successfully", daemonset_name=daemonset_name, namespace=namespace)
```

As per coding guidelines "Use only structured logging via injected structlog.stdlib.BoundLogger ... pass user-controlled data as keyword arguments, never interpolate into log messages".
Also applies to: 324-325, 336-336, 404-404
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/k8s_worker/worker.py` around lines 293 - 294, Replace f-string log interpolation with structured logging: change calls like self.logger.info(f"NetworkPolicy '{policy_name}' applied in namespace {namespace}") to a static message and pass dynamic values as keyword fields (e.g., self.logger.info("NetworkPolicy applied", policy_name=policy_name, namespace=namespace)). Do the same for the other similar uses of self.logger.info/error in this file that interpolate policy_name, namespace, pod_name, config_map_name, etc., ensuring all user-controlled data is passed as keyword arguments rather than embedded in the message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/k8s_worker/worker.py`:
- Around line 311-314: The CPU/memory quota math assumes CPU strings end with
"m" and does int(removesuffix('m')) which breaks for values like "1" or "0.5";
implement a small parser (e.g., parse_cpu_to_milli) that converts any CPU string
to an integer milliCPU value: if it ends with "m" strip and int, else parse as
float and multiply by 1000 (round or int), then use
parse_cpu_to_milli(self._settings.K8S_POD_CPU_REQUEST) and
parse_cpu_to_milli(self._settings.K8S_POD_CPU_LIMIT) when computing
"requests.cpu" and "limits.cpu" (e.g., f"{parse_cpu_to_milli(...) * n}m"); keep
existing memory handling but consider a similar parser for
K8S_POD_MEMORY_REQUEST/K8S_POD_MEMORY_LIMIT if they might omit "Mi".
In `@cert-generator/setup-k8s.sh`:
- Around line 92-93: The current kubectl delete line (kubectl delete pods -n
integr8scode -l component=executor --field-selector=status.phase!=Running ...)
can remove non-terminal Pending pods; replace it with explicit deletes for
terminal phases only by invoking kubectl delete pods -n integr8scode -l
component=executor --field-selector='status.phase==Succeeded' ... and a second
kubectl delete pods -n integr8scode -l component=executor
--field-selector='status.phase==Failed' ..., each keeping --ignore-not-found and
the existing redirection/|| true so only Succeeded and Failed executor pods are
cleaned up and Pending/Running pods are preserved.
---
Duplicate comments:
In `@cert-generator/setup-k8s.sh`:
- Around line 147-151: The RBAC rules for resources "networkpolicies" and
"resourcequotas" are over-permissive; update the verbs arrays for the entries
that reference resources: ["networkpolicies"] and resources: ["resourcequotas"]
so they only include read/watch plus patch (e.g.,
["get","list","watch","patch"]); remove create/update/delete from those verbs
and ensure the corresponding apiGroups entry is present/appropriate for each
resource if required.
---
Nitpick comments:
In `@backend/app/services/k8s_worker/worker.py`:
- Around line 293-294: Replace f-string log interpolation with structured
logging: change calls like self.logger.info(f"NetworkPolicy '{policy_name}'
applied in namespace {namespace}") to a static message and pass dynamic values
as keyword fields (e.g., self.logger.info("NetworkPolicy applied",
policy_name=policy_name, namespace=namespace)). Do the same for the other
similar uses of self.logger.info/error in this file that interpolate
policy_name, namespace, pod_name, config_map_name, etc., ensuring all
user-controlled data is passed as keyword arguments rather than embedded in the
message.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `backend/app/services/k8s_worker/worker.py`
- `backend/app/services/pod_monitor/config.py`
- `cert-generator/setup-k8s.sh`
…ated limits for system-wide k8s quota on cpu and memory
1 issue found across 6 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docs/security/policies.md">
<violation number="1" location="docs/security/policies.md:66">
P2: The ResourceQuota description is behaviorally incorrect: exceeding quota does not keep extra pods Pending; Kubernetes rejects creation with `403 Forbidden (exceeded quota)`. This can mislead operators during incident debugging.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
1 issue found across 9 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cert-generator/setup-k8s.sh">
<violation number="1" location="cert-generator/setup-k8s.sh:95">
P1: Applying manifests from a remote URL without integrity verification introduces a supply-chain risk in cluster bootstrap. Prefer vendoring the manifest (or verifying checksum/signature before apply) so bootstrap does not trust mutable external content at runtime.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
2 issues found across 5 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cert-generator/setup-k8s.sh">
<violation number="1" location="cert-generator/setup-k8s.sh:96">
P1: Avoid predictable `/tmp` filenames for downloaded manifests; use a secure unique temp file (e.g., `mktemp`) to prevent symlink/race attacks.</violation>
</file>
<file name="backend/app/services/pod_monitor/monitor.py">
<violation number="1" location="backend/app/services/pod_monitor/monitor.py:211">
P1: A Kafka publish failure now aborts the rest of pod-event processing, which can skip terminal pod cleanup and remaining event publishes.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.



Summary by cubic
Hardens auth, Kubernetes isolation, and frontend CSP, and shifts pod concurrency to Kubernetes with Kueue. Applies default‑deny networking, PSA labels, and quotas; secures the pre‑puller; enforces Redis auth; pins DB images; and auto‑cleans executor pods.
New Features
Migration
Written for commit 43da70d. Summary will update on new commits.
Summary by CodeRabbit
New Features
Security
Updates
Documentation