Skip to content

feat: add language variant support#286

Open
userlaurin wants to merge 9 commits intospcl:masterfrom
userlaurin:feature/language-variants
Open

feat: add language variant support#286
userlaurin wants to merge 9 commits intospcl:masterfrom
userlaurin:feature/language-variants

Conversation

@userlaurin
Copy link
Contributor

@userlaurin userlaurin commented Feb 27, 2026

  • Add --language-variant CLI flag to local start and experiment invoke
  • Select variant-tagged Docker images (run.local.lang.variant.ver)
  • Apply per-variant patch files (benchmarks/bm/lang/variant/patch.diff) during code packaging
  • Extend cache keys to include variant; store variant in FunctionConfig
  • Add build_docker_images.py support for variant image tags
  • Added requirement for language variant patches "patch-ng" that can apply a patch for a variant to the default

Summary by CodeRabbit

  • New Features

    • Language variants support (e.g., bun, llrt, pypy) with per-variant runtime images, cache isolation, startup validation, and variant-aware build/publish flows.
    • Added --language-variant option to benchmark and image-build CLIs; tooling can target a specific variant or build all variants.
  • Documentation

    • Added Language Variants guide with strategies, directory layouts, image tagging/build commands, registration notes, and a checklist.
  • Chores

    • Added new Python dependency: patch-ng.

- Add --language-variant CLI flag to local start and experiment invoke
- Select variant-tagged Docker images (run.local.<lang>.<variant>.<ver>)
- Apply per-variant patch files (benchmarks/<bm>/<lang>/<variant>/patch.diff)
  during code packaging
- Extend cache keys to include variant; store variant in FunctionConfig
- Add build_docker_images.py support for variant image tags
@coderabbitai
Copy link

coderabbitai bot commented Feb 27, 2026

📝 Walkthrough

Walkthrough

Adds language-variant support: docs, CLI, runtime model, caching, local container naming, and Docker build tooling; introduces LanguageSpec and Variant enums, variant-aware packaging (patch/overlay), per-variant image paths/tags, and a new Python dependency for applying patch diffs.

Changes

Cohort / File(s) Summary
Documentation
docs/build.md
New "Language Variants" section describing variant strategies (patch.diff vs overlay), directory layouts, systems.json registration, Dockerfile.run placement, image tagging, build commands, and per-variant cache isolation.
Dependency
requirements.txt
Added patch-ng dependency for applying patch.diff variants.
CLI & Entry
sebs.py
Added --language-variant option and propagated language_variant through parse_common_params into experiments.runtime.variant in generated config.
Benchmark Model
sebs/benchmark.py
Added LanguageSpec, replaced languages with language_specs, added variant-aware properties/methods, extended hashing/copy/build paths to include variant, and implemented patch/overlay variant application in copy_code.
Runtime / Function Model
sebs/faas/function.py
Added Variant namespace (per-language enums) and Architecture enum; extended Runtime dataclass with variant (serialize/deserialize) and defaulting; FunctionConfig now uses variant from benchmark.
Caching Layer
sebs/cache.py
Switched to using cache_language_key (language+variant) for code/package cache operations and function entries.
Config Management
sebs/config.py
Added supported_variants(deployment_name, language_name) to read per-language variant lists from systems config with fallback ["default"].
Local Runner
sebs/local/local.py
Container naming updated to include variant when non-default (adds variant segment to container image/name).
AWS Provider
sebs/aws/aws.py
Minor stylistic/whitespace change in package_code; no functional behavior change.
Build Tooling
tools/build_docker_images.py
Added --language-variant CLI flag; build() accepts variant; resolves Dockerfile paths that may include .../<language>/<variant>/Dockerfile.*; image tags include variant; added build_variant_language helper and logic to build specific or all variants per language.

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI
    participant BuildTool as tools/build_docker_images
    participant Repo as Repository (code + variants)
    participant Docker as Docker Daemon
    participant Registry as Image Registry
    participant Benchmark as Benchmark flow
    participant Cache as Cache store

    CLI->>BuildTool: request build(system, language, variant?)
    BuildTool->>Repo: locate Dockerfile (system/language(/variant)/Dockerfile.run)
    BuildTool->>Docker: docker build (tag includes variant)
    Docker-->>BuildTool: image built
    BuildTool->>Registry: push image
    CLI->>Benchmark: run benchmark(language, variant)
    Benchmark->>Repo: resolve code (base + overlay OR apply patch.diff)
    Benchmark->>Cache: compute cache_language_key(language+variant) and check/store package
    Cache-->>Benchmark: return cache result / ack store
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

Suggested reviewers

  • mcopik

Poem

🐰 I nibble code and stitch a patch,
Bun hops in while pypy watches the latch.
Docker builds variants in a row,
Cache keys hop where outputs go.
New runtimes bloom — the rabbits glow!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: add language variant support' directly and clearly describes the main feature being added across the codebase.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

laurin and others added 3 commits February 27, 2026 15:26
- Replace flat Variant enum with nested namespace class (Variant.Python,
  Variant.NodeJS) keyed by Language in Variant._LANG_MAP; new languages
  add one nested Enum + one map entry
- Fix Runtime.__post_init__ to skip variant resolution when language is
  None (used as sentinel in ExperimentConfig)
- Fix FunctionConfig._from_benchmark to carry variant through to Runtime
- Remove FIXME comments and unnecessary forward-reference quotes now that
  benchmark.py has from __future__ import annotations
- Document patch-based vs overlay-based variant formats in benchmark.py
- Simplify build_variant_language to delegate to build() so SeBS version
  suffix and future push support are handled consistently
@userlaurin userlaurin marked this pull request as ready for review March 2, 2026 14:45
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/build_docker_images.py (1)

47-55: ⚠️ Potential issue | 🔴 Critical

Critical: Image tag mismatch with local.py will cause container startup failures.

The build script appends the SeBS version suffix to image tags (line 55), producing tags like:

run.local.nodejs.bun.18-1.2.0

However, sebs/local/local.py lines 149-160 construct container names without this suffix:

# For variant != "default"
container_name = "{}:run.local.{}.{}.{}".format(
    self._system_config.docker_repository(),
    code_package.language_name,
    variant,
    code_package.language_version,
)

This produces: run.local.nodejs.bun.18

Docker will fail to find the images when starting containers because the tags don't match. Either the build script must stop appending the SeBS version suffix to tags, or local.py must be updated to include the suffix when referencing built images.

🧹 Nitpick comments (2)
tools/build_docker_images.py (1)

102-115: Consider extracting shared config iteration logic.

The config iteration logic (lines 104-112) duplicates build_language (lines 84-92). While acceptable for a small script, extracting this could reduce maintenance burden.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/build_docker_images.py` around lines 102 - 115, The loop that builds
the version/base_image list is duplicated between build_variant_language and
build_language; extract that logic into a helper (e.g., get_language_configs or
iterate_language_configs) that accepts language_config and args.language_version
and returns the list of (version, base_image) pairs (handling the "base_images"
-> "x64" branch and the fallback [None, None]); then replace the duplicated
loops in build_variant_language and build_language to call this helper and
iterate its result before calling build("run", ...) so behavior remains
identical.
docs/build.md (1)

117-119: Add language specifiers to fenced code blocks.

Static analysis flagged several code blocks showing directory structures without a language specifier. Consider using text or plaintext for consistency:

  • Lines 117-119
  • Lines 128-135
  • Lines 147-154
  • Lines 195-197
  • Lines 201-205
  • Lines 209-211
Example fix
-```
+```text
 benchmarks/<id>/<language>/<variant>/
</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/build.md around lines 117 - 119, Several fenced code blocks (e.g., the
block containing the literal "benchmarks////") are
missing language specifiers; update each of the listed fenced code blocks to use
a language token such as text or plaintext (for example change totext)
so static analysis stops flagging them; locate the exact fence instances by
searching for the literal snippets shown in the diff (e.g.,
"benchmarks////") and apply the same fix to the other
blocks mentioned in the comment.


</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @docs/build.md:

  • Around line 170-178: sebs/config.py currently uses the wrong key when reading
    language variants: update the SeBSConfig.supported_variants() (or the method
    that returns supported variants) to read "variant_images" from the languages
    entry instead of "supported_variants" so it matches docs and
    tools/build_docker_images.py; change the get call that currently does
    languages.get(language_name, {}).get("supported_variants", ["default"]) to use
    "variant_images" with the same default (["default"]) so configured
    variant_images are respected throughout the codebase.

In @sebs/aws/aws.py:

  • Around line 135-136: The comparison and dict lookup are using language.value
    (a string) against Language enum members and CONFIG_FILES keys (which are enum
    members), causing mismatches and KeyError; change the condition to compare the
    enum directly (e.g., use language == Language.PYTHON or language in
    (Language.PYTHON, Language.NODEJS)) and use the enum language (not
    language.value) when indexing CONFIG_FILES to retrieve package_config;
    alternatively, if CONFIG_FILES must use strings, normalize keys consistently but
    prefer the simpler fix of comparing and indexing with the Language enum
    (symbols: language, Language, CONFIG_FILES, package_config).
  • Line 155: The comparison uses language.value (a string) against an enum member
    (Language.CPP), causing mismatches; update the check in the same block (the elif
    that currently reads language.value == Language.CPP) to compare like-for-like —
    either compare the enum instance (language == Language.CPP) or compare the
    string value to the enum's value (language.value == Language.CPP.value) so the
    condition correctly detects the C++ branch.
  • Around line 177-182: The comparisons are incorrectly using language.value (a
    string) against Language enum members; update the conditional checks to compare
    the enum itself (e.g., change checks to language == Language.NODEJS, language ==
    Language.CPP, and language in [Language.PYTHON]) and keep using language.value
    when building return strings (e.g., f"{language.value}{language_version}" or
    f"{language.value}{language_version}.x" as appropriate) so the logic in aws.py
    matches enum semantics.

In @sebs/benchmark.py:

  • Around line 264-266: The hash calculation is passing a string
    (self.language_name) into Benchmark.hash_directory which expects a Language enum
    to index FILES and WRAPPERS, causing a KeyError; change the call to pass the
    Language enum (e.g., self.language or convert the name to the enum via
    Language[self.language_name] or Language(self.language_name) depending on how
    Language is defined) so Benchmark.hash_directory receives a Language enum rather
    than a string; ensure any other callers also pass the enum and update the call
    site that currently uses self.language_name accordingly.
  • Around line 53-61: LanguageSpec.deserialize currently imports Language from
    sebs.faas.function (wrong enum), causing missing CPP support; remove the local
    import in deserialize and use the module-level Language (sebs.types.Language)
    already imported at module scope so LanguageSpec.deserialize(...) resolves CPP
    and other language variants correctly, and ensure both branches (str and dict)
    call Language.deserialize from the module-level enum.

In @sebs/config.py:

  • Around line 54-63: The supported_variants method in sebs/config.py is reading
    the wrong key from the systems config; update the lookup inside
    supported_variants (the languages variable usage) to read "variant_images"
    instead of "supported_variants" so it matches tools/build_docker_images.py and
    docs, while preserving the fallback of ["default"]; adjust the get call on
    languages.get(language_name).get(...) to use "variant_images" and leave behavior
    otherwise unchanged.

In @sebs/faas/function.py:

  • Around line 335-339: The error message in deserialize(val: str) for
    Architecture is misleading because it logs the loop variable member instead of
    the invalid input; update the exception to include the original val (e.g., raise
    Exception(f"Unknown architecture type {val}")) so the error reports the actual
    bad config value; locate the deserialize function and replace the exception
    message that currently references member with one referencing val.
  • Around line 271-324: Remove the locally defined Language enum so the module
    uses the imported sebs.types.Language (avoids enum identity mismatches), update
    Variant._LANG_MAP to include a mapping for Language.CPP -> Variant., and ensure Variant.for_language continues
    to use the mapped enum class for lookups; also fix Architecture.deserialize to
    reference the input parameter (val) in its error message instead of the
    undefined variable member.

In @sebs/local/local.py:

  • Around line 147-160: The container image tag is missing the SeBS version
    suffix; retrieve the SeBS version from system config (e.g., sebs_version =
    self._system_config.sebs_version()) and include it in the formatted
    container_name strings so the tag matches build_docker_images.py output. Update
    both branches that set container_name (the two format calls that use
    self._system_config.docker_repository(), code_package.language_name, variant,
    and code_package.language_version) to insert the sebs_version token (e.g., as
    .{sebs_version} before the language_version) so both default and variant paths
    produce identical SeBS-versioned tags.

Nitpick comments:
In @docs/build.md:

  • Around line 117-119: Several fenced code blocks (e.g., the block containing
    the literal "benchmarks////") are missing language
    specifiers; update each of the listed fenced code blocks to use a language token
    such as text or plaintext (for example change totext) so static analysis
    stops flagging them; locate the exact fence instances by searching for the
    literal snippets shown in the diff (e.g.,
    "benchmarks////") and apply the same fix to the other
    blocks mentioned in the comment.

In @tools/build_docker_images.py:

  • Around line 102-115: The loop that builds the version/base_image list is
    duplicated between build_variant_language and build_language; extract that logic
    into a helper (e.g., get_language_configs or iterate_language_configs) that
    accepts language_config and args.language_version and returns the list of
    (version, base_image) pairs (handling the "base_images" -> "x64" branch and the
    fallback [None, None]); then replace the duplicated loops in
    build_variant_language and build_language to call this helper and iterate its
    result before calling build("run", ...) so behavior remains identical.

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: defaults

**Review profile**: CHILL

**Plan**: Pro

**Run ID**: `89084e4e-de4a-4e55-a9fb-905420aca9aa`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between ce1dacbbbad71f3fce877f7adcd15f131b6928e9 and 2aba9eacf9d7af2ab1863e769a87ddb1bb9d3488.

</details>

<details>
<summary>📒 Files selected for processing (10)</summary>

* `docs/build.md`
* `requirements.txt`
* `sebs.py`
* `sebs/aws/aws.py`
* `sebs/benchmark.py`
* `sebs/cache.py`
* `sebs/config.py`
* `sebs/faas/function.py`
* `sebs/local/local.py`
* `tools/build_docker_images.py`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

laurin added 2 commits March 4, 2026 14:54
Language was defined twice - once in sebs/types.py (str, Enum) and again
in sebs/faas/function.py (plain Enum, missing CPP). This caused identity
comparisons like `language == Language.NODEJS` to silently fail, requiring
the workaround of comparing `.value` instead which was removed again in this commit.

Remove the duplicate definition and update all import sites in benchmark.py
to use the canonical Language from sebs/types.py.
…) to get back to the current handling of the master repo
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
sebs/faas/function.py (2)

323-327: ⚠️ Potential issue | 🟡 Minor

Use val in the invalid-architecture error message.

Line 327 reports {member} instead of the bad input, so failures show the wrong value.

Proposed fix
-        raise Exception(f"Unknown architecture type {member}")
+        raise Exception(f"Unknown architecture type {val}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 323 - 327, In deserialize(val: str) in
function.py the raised Exception reports the last loop variable member instead
of the invalid input; change the error to include the provided val (e.g., raise
Exception(f"Unknown architecture type {val}")) so the message shows the actual
bad input when Architecture parsing fails; keep function and Architecture
references unchanged.

309-312: ⚠️ Potential issue | 🔴 Critical

Add C++ default variant mapping to prevent runtime breakage.

Line 309-312 omits Language.CPP, so Variant.for_language(...) fails for C++ in both Runtime.deserialize (Line 355) and FunctionConfig._from_benchmark (Line 374).

Proposed fix
 class Variant:
@@
     class NodeJS(Enum):
         DEFAULT = "default"
         BUN = "bun"
         LLRT = "llrt"
+
+    class CPP(Enum):
+        DEFAULT = "default"
@@
 Variant._LANG_MAP: Dict[Language, Type[Enum]] = {
     Language.PYTHON: Variant.Python,
     Language.NODEJS: Variant.NodeJS,
+    Language.CPP: Variant.CPP,
 }

Also applies to: 351-356, 371-375

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 309 - 312, The mapping Variant._LANG_MAP
is missing an entry for C++ which causes Variant.for_language(...) to fail in
Runtime.deserialize and FunctionConfig._from_benchmark; add Language.CPP:
Variant.Cpp (or the actual C++ Variant class name used) to Variant._LANG_MAP
alongside Language.PYTHON and Language.NODEJS so Variant.for_language can
resolve C++ variants correctly, and update any other similar maps around
Runtime.deserialize and FunctionConfig._from_benchmark if duplicated.
sebs/benchmark.py (1)

262-264: ⚠️ Potential issue | 🔴 Critical

Normalize language key type usage (Enum vs string) in hashing and overlay copy paths.

Line 263 passes self.language_name (string), while Line 355 and Line 376 index enum-keyed maps; Line 475 also switches to string-key indexing. This inconsistency can break variant hashing/copy behavior.

Proposed fix
-        self._hash_value = Benchmark.hash_directory(
-            path, self._deployment_name, self.language_name, self._language_variant
-        )
+        self._hash_value = Benchmark.hash_directory(
+            path, self._deployment_name, self.language, self._language_variant
+        )
@@
-    def hash_directory(directory: str, deployment: str, language: str, variant: str = "default"):
+    def hash_directory(
+        directory: str, deployment: str, language: Language, variant: str = "default"
+    ):
@@
             wrappers = project_absolute_path(
-                "benchmarks", "wrappers", deployment, language, wrapper
+                "benchmarks", "wrappers", deployment, language.value, wrapper
             )
@@
-                for file_type in FILES[self.language_name]:
+                for file_type in FILES[self.language]:

Also applies to: 341-355, 377-379, 475-476

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 262 - 264, The code mixes enum keys and
string keys when building hashes and accessing language-indexed maps: normalize
to a single key type by converting language_name to the Enum (or vice versa)
before use; update Benchmark.hash_directory calls and any places that index
language-keyed dicts (e.g., places that use self.language_name, overlay copy
logic, and any map lookups referenced around hash computation and copy paths) to
consistently use the Enum member (or consistently use its .name/.value) across
Benchmark.hash_directory, the overlay copy routines, and all map accesses so the
same key type is used everywhere.
🧹 Nitpick comments (1)
sebs/benchmark.py (1)

50-51: Return a copy from LanguageSpec.variants to protect internal state.

Exposing the backing list allows callers to mutate benchmark config state implicitly.

Proposed refactor
     `@property`
     def variants(self) -> List[str]:
-        return self._variants
+        return list(self._variants)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 50 - 51, LanguageSpec.variants currently
returns the internal list self._variants directly, exposing mutable internal
state; change the method to return a shallow copy (e.g., return
list(self._variants) or self._variants.copy()) so callers cannot mutate the
LanguageSpec._variants backing list, and ensure the return type remains
List[str] to preserve the signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sebs/faas/function.py`:
- Line 315: A new local enum class named Architecture (class Architecture(Enum))
conflicts with an imported Architecture symbol (imported at the top of the
file), causing an F811 redefinition; fix by either removing the imported
Architecture from the imports and updating any external references accordingly,
or rename the local enum (e.g., LocalArchitecture or BackendArchitecture) and
update all local references in this file (constructors, type hints, comparisons,
and any usages of Architecture in sebs/faas/function.py) to the new name so
there is no name collision.

---

Duplicate comments:
In `@sebs/benchmark.py`:
- Around line 262-264: The code mixes enum keys and string keys when building
hashes and accessing language-indexed maps: normalize to a single key type by
converting language_name to the Enum (or vice versa) before use; update
Benchmark.hash_directory calls and any places that index language-keyed dicts
(e.g., places that use self.language_name, overlay copy logic, and any map
lookups referenced around hash computation and copy paths) to consistently use
the Enum member (or consistently use its .name/.value) across
Benchmark.hash_directory, the overlay copy routines, and all map accesses so the
same key type is used everywhere.

In `@sebs/faas/function.py`:
- Around line 323-327: In deserialize(val: str) in function.py the raised
Exception reports the last loop variable member instead of the invalid input;
change the error to include the provided val (e.g., raise Exception(f"Unknown
architecture type {val}")) so the message shows the actual bad input when
Architecture parsing fails; keep function and Architecture references unchanged.
- Around line 309-312: The mapping Variant._LANG_MAP is missing an entry for C++
which causes Variant.for_language(...) to fail in Runtime.deserialize and
FunctionConfig._from_benchmark; add Language.CPP: Variant.Cpp (or the actual C++
Variant class name used) to Variant._LANG_MAP alongside Language.PYTHON and
Language.NODEJS so Variant.for_language can resolve C++ variants correctly, and
update any other similar maps around Runtime.deserialize and
FunctionConfig._from_benchmark if duplicated.

---

Nitpick comments:
In `@sebs/benchmark.py`:
- Around line 50-51: LanguageSpec.variants currently returns the internal list
self._variants directly, exposing mutable internal state; change the method to
return a shallow copy (e.g., return list(self._variants) or
self._variants.copy()) so callers cannot mutate the LanguageSpec._variants
backing list, and ensure the return type remains List[str] to preserve the
signature.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 399467d7-c174-4ceb-b76c-b7fdbd23409b

📥 Commits

Reviewing files that changed from the base of the PR and between 2aba9ea and 44b54b4.

📒 Files selected for processing (3)
  • sebs/aws/aws.py
  • sebs/benchmark.py
  • sebs/faas/function.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • sebs/aws/aws.py

@userlaurin userlaurin requested a review from mcopik March 4, 2026 20:49
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sebs/benchmark.py (1)

633-635: ⚠️ Potential issue | 🟠 Major

Variant cache hash misses versioned Node package files.

package.json.<version> is used in overlay variants (Line [789]-Line [793]), but the hash input patterns (Line [634], Line [659]) exclude it. Cache entries can stay “valid” after variant dependency changes.

Proposed fix
         FILES = {
             Language.PYTHON: ["*.py", "requirements.txt*"],
-            Language.NODEJS: ["*.js", "package.json"],
+            Language.NODEJS: ["*.js", "package.json*"],
             Language.CPP: ["*.cpp", "*.hpp", "dependencies.json"],
         }

Also applies to: 649-662, 789-793

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 633 - 635, The NodeJS hash input patterns
omit versioned package files (e.g., package.json.<version>) so variant cache
hashes miss changes; update the Language.NODEJS pattern lists used for hashing
to include versioned package files by adding "package.json.*" (or
"package.json*") alongside "package.json" wherever the hash input patterns are
defined (references: Language.NODEJS and the lists around the existing ["*.js",
"package.json"] entries and the later overlay/variant handling that uses
package.json.<version>); apply the same change to each occurrence of the NodeJS
pattern lists so variant dependency changes invalidate the cache correctly.
♻️ Duplicate comments (3)
sebs/faas/function.py (3)

567-570: ⚠️ Potential issue | 🔴 Critical

C++ flows break because Variant has no Language.CPP mapping.

Line [607] and Line [678] both rely on Variant.for_language(...); with the current map, any C++ benchmark can fail with ValueError("No variants defined for language ...").

Proposed fix
 class Variant:
@@
     class NodeJS(Enum):
         DEFAULT = "default"
         BUN = "bun"
         LLRT = "llrt"
+
+    class Cpp(Enum):
+        DEFAULT = "default"
@@
 Variant._LANG_MAP: Dict[Language, Type[Enum]] = {
     Language.PYTHON: Variant.Python,
     Language.NODEJS: Variant.NodeJS,
+    Language.CPP: Variant.Cpp,
 }

Also applies to: 605-607, 675-679

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 567 - 570, The _LANG_MAP in Variant is
missing an entry for Language.CPP causing Variant.for_language(...) to raise
ValueError for C++ benchmarks; add a mapping for Language.CPP to the appropriate
enum class (e.g., Variant.Cpp or the correct C++ variant Enum) inside
Variant._LANG_MAP so that Variant.for_language(Language.CPP) succeeds and C++
flows on lines referencing Variant.for_language no longer fail.

581-585: ⚠️ Potential issue | 🟡 Minor

Use the invalid input value in architecture deserialization errors.

Line [585] currently reports member instead of the actual invalid val, which makes debugging harder.

Proposed fix
-        raise Exception(f"Unknown architecture type {member}")
+        raise Exception(f"Unknown architecture type {val}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` around lines 581 - 585, The deserialize function in
sebs/faas/function.py returns an error that references the loop variable member
instead of the invalid input; update the exception in deserialize(val: str) to
include the actual invalid input val (e.g., raise Exception(f"Unknown
architecture type {val}")) so errors correctly report the value that failed to
deserialize; ensure the change is made in the deserialize function and preserves
the existing loop over Architecture members.

26-26: ⚠️ Potential issue | 🟠 Major

Resolve Architecture redefinition (Ruff F811).

Line [26] imports Architecture, then Line [573] redefines it in the same module. This is an avoidable redefinition error and makes the active type ambiguous.

Proposed fix
-from sebs.types import Language, Architecture
+from sebs.types import Language

Also applies to: 573-573

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/faas/function.py` at line 26, The module currently imports Architecture
from sebs.types and also defines a local Architecture (redefinition causing Ruff
F811); pick one source and remove the other: if the file should use the shared
type, delete the local Architecture definition (the one around the later
redefinition) and update any local references to use the imported Architecture;
if the local definition is intentional and distinct, rename that local symbol
(e.g., LocalArchitecture or FunctionArchitecture) and update all references in
this file to that new name so there is no name collision with the imported
Architecture.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/build.md`:
- Around line 118-120: Several fenced code blocks in docs/build.md are unlabeled
(MD040); update each triple-backtick block that currently contains examples like
"benchmarks/<id>/<language>/<variant>/", the longer benchmark trees (e.g.,
"benchmarks/100.webapps/110.dynamic-html/..."), the dockerfile paths
("dockerfiles/<system>/<language>/<variant>/Dockerfile.run",
"dockerfiles/local/nodejs/bun/Dockerfile.run", etc.), and the image tag example
("<docker_repository>:run.<system>.<language>.<variant>.<version>-<SeBS_version>")
to include an explicit language after the opening ``` (use text for plain
paths/trees, bash for shell examples, json where appropriate) so each fenced
block is labeled and satisfies markdownlint MD040; locate the blocks by their
content strings and add the appropriate language identifiers.

In `@sebs/benchmark.py`:
- Around line 785-787: FILES is keyed by the Language enum but the loop uses
self.language_name (a string), causing a KeyError for overlay variants; change
the lookup to use the enum value (e.g., self.language or convert the string to
Language) when iterating FILES in the block that copies files from variant_dir
to output_dir so the key types match (ensure you reference FILES,
self.language_name, and Language enum when making the fix).

---

Outside diff comments:
In `@sebs/benchmark.py`:
- Around line 633-635: The NodeJS hash input patterns omit versioned package
files (e.g., package.json.<version>) so variant cache hashes miss changes;
update the Language.NODEJS pattern lists used for hashing to include versioned
package files by adding "package.json.*" (or "package.json*") alongside
"package.json" wherever the hash input patterns are defined (references:
Language.NODEJS and the lists around the existing ["*.js", "package.json"]
entries and the later overlay/variant handling that uses
package.json.<version>); apply the same change to each occurrence of the NodeJS
pattern lists so variant dependency changes invalidate the cache correctly.

---

Duplicate comments:
In `@sebs/faas/function.py`:
- Around line 567-570: The _LANG_MAP in Variant is missing an entry for
Language.CPP causing Variant.for_language(...) to raise ValueError for C++
benchmarks; add a mapping for Language.CPP to the appropriate enum class (e.g.,
Variant.Cpp or the correct C++ variant Enum) inside Variant._LANG_MAP so that
Variant.for_language(Language.CPP) succeeds and C++ flows on lines referencing
Variant.for_language no longer fail.
- Around line 581-585: The deserialize function in sebs/faas/function.py returns
an error that references the loop variable member instead of the invalid input;
update the exception in deserialize(val: str) to include the actual invalid
input val (e.g., raise Exception(f"Unknown architecture type {val}")) so errors
correctly report the value that failed to deserialize; ensure the change is made
in the deserialize function and preserves the existing loop over Architecture
members.
- Line 26: The module currently imports Architecture from sebs.types and also
defines a local Architecture (redefinition causing Ruff F811); pick one source
and remove the other: if the file should use the shared type, delete the local
Architecture definition (the one around the later redefinition) and update any
local references to use the imported Architecture; if the local definition is
intentional and distinct, rename that local symbol (e.g., LocalArchitecture or
FunctionArchitecture) and update all references in this file to that new name so
there is no name collision with the imported Architecture.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8e024faa-2760-43e6-8f3b-83acb18a2858

📥 Commits

Reviewing files that changed from the base of the PR and between 44b54b4 and 762ba1d.

📒 Files selected for processing (8)
  • docs/build.md
  • requirements.txt
  • sebs/aws/aws.py
  • sebs/benchmark.py
  • sebs/cache.py
  • sebs/config.py
  • sebs/faas/function.py
  • sebs/local/local.py
✅ Files skipped from review due to trivial changes (1)
  • requirements.txt
🚧 Files skipped from review as they are similar to previous changes (3)
  • sebs/aws/aws.py
  • sebs/config.py
  • sebs/local/local.py

Comment on lines +118 to +120
```
benchmarks/<id>/<language>/<variant>/
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced code blocks to satisfy markdownlint.

Line [118], Line [129], Line [148], Line [196], Line [202], and Line [210] use unlabeled fenced blocks (MD040). Add explicit languages (e.g., text, json, bash) for consistent rendering and lint compliance.

Suggested doc fix
-```
+```text
 benchmarks/<id>/<language>/<variant>/

- +text
benchmarks/100.webapps/110.dynamic-html/
nodejs/
function.js ← default implementation
package.json
llrt/
patch.diff ← unified diff applied over the default files


-```
+```text
benchmarks/100.webapps/110.dynamic-html/
  nodejs/
    function.js               ← default implementation
    package.json
    bun/
      package.json            ← replaces the default package.json for bun

- +text
dockerfiles////Dockerfile.run


-```
+```text
dockerfiles/local/nodejs/bun/Dockerfile.run
dockerfiles/local/nodejs/llrt/Dockerfile.run
dockerfiles/local/python/pypy/Dockerfile.run

- +text
<docker_repository>:run....-<SeBS_version>

Also applies to: 129-136, 148-155, 196-198, 202-206, 210-212

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 118-118: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/build.md` around lines 118 - 120, Several fenced code blocks in
docs/build.md are unlabeled (MD040); update each triple-backtick block that
currently contains examples like "benchmarks/<id>/<language>/<variant>/", the
longer benchmark trees (e.g., "benchmarks/100.webapps/110.dynamic-html/..."),
the dockerfile paths
("dockerfiles/<system>/<language>/<variant>/Dockerfile.run",
"dockerfiles/local/nodejs/bun/Dockerfile.run", etc.), and the image tag example
("<docker_repository>:run.<system>.<language>.<variant>.<version>-<SeBS_version>")
to include an explicit language after the opening ``` (use text for plain
paths/trees, bash for shell examples, json where appropriate) so each fenced
block is labeled and satisfies markdownlint MD040; locate the blocks by their
content strings and add the appropriate language identifiers.

Comment on lines +785 to +787
for file_type in FILES[self.language_name]:
for f in glob.glob(os.path.join(variant_dir, file_type)):
shutil.copy2(f, output_dir)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Overlay variant copy will crash due to enum/string key mismatch.

Line [785] uses self.language_name (string) to index FILES, but the dictionary is keyed by Language enum values. This raises KeyError for overlay variants.

Proposed fix
-                for file_type in FILES[self.language_name]:
+                for file_type in FILES[self.language]:
                     for f in glob.glob(os.path.join(variant_dir, file_type)):
                         shutil.copy2(f, output_dir)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
for file_type in FILES[self.language_name]:
for f in glob.glob(os.path.join(variant_dir, file_type)):
shutil.copy2(f, output_dir)
for file_type in FILES[self.language]:
for f in glob.glob(os.path.join(variant_dir, file_type)):
shutil.copy2(f, output_dir)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sebs/benchmark.py` around lines 785 - 787, FILES is keyed by the Language
enum but the loop uses self.language_name (a string), causing a KeyError for
overlay variants; change the lookup to use the enum value (e.g., self.language
or convert the string to Language) when iterating FILES in the block that copies
files from variant_dir to output_dir so the key types match (ensure you
reference FILES, self.language_name, and Language enum when making the fix).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants