Skip to content

[Feature] Add grammar bundle generation API for PPL language features#5162

Open
mengweieric wants to merge 20 commits intoopensearch-project:mainfrom
mengweieric:feature/grammar-bundle
Open

[Feature] Add grammar bundle generation API for PPL language features#5162
mengweieric wants to merge 20 commits intoopensearch-project:mainfrom
mengweieric:feature/grammar-bundle

Conversation

@mengweieric
Copy link
Collaborator

@mengweieric mengweieric commented Feb 20, 2026

Description

Implements the backend grammar metadata API for PPL autocomplete support.

This endpoint serves a versioned grammar bundle containing ANTLR metadata required for downstream consumers (for example OpenSearch Dashboards) to reconstruct a functional PPL lexer/parser at runtime using antlr4ng interpreter APIs. This enables full client-side parsing/autocomplete with zero per-keystroke server calls, while keeping backend grammar as the source of truth.

What the bundle contains:

  • Serialized lexer and parser ATNs via ATNSerializer.serialize().toArray(), compatible with antlr4ng ATNDeserializer.deserialize()
  • Token vocabulary (literal/symbolic names), lexer/parser rule names, channel names, and mode names for interpreter reconstruction
  • tokenDictionary, ignoredTokens, and rulesToVisit for autocomplete behavior
  • grammarHash (ATNs + lexer/parser rule names + literal/symbolic vocabulary + ANTLR version) for client-side change detection
  • bundleVersion and antlrVersion for compatibility validation

Backend behavior:

  • Synchronized lazy initialization (bundle built on first request and cached for node lifecycle)
  • Deterministic output (same plugin JAR -> same bundle shape/content)
  • Endpoint marked @ExperimentalApi
  • Grammar endpoint now performs authorization via PPL transport action before serving bundle data (aligns with existing PPL permission model)

Also included:

  • REST API spec registration for ppl.grammar + YAML REST response-shape test
  • Unit tests for grammar REST handler (including authorization-failure path)
  • Security integration tests for grammar endpoint access with/without PPL permission
  • THIRD-PARTY updated to reflect ANTLR 4.13.2

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 20, 2026

📝 Walkthrough

Walkthrough

Adds a new GET endpoint /_plugins/_ppl/_grammar that serves a cached, serialized ANTLR grammar bundle; introduces GrammarBundle and PPLGrammarBundleBuilder, a RestPPLGrammarAction handler with tests, unit tests for the builder, and updates ANTLR third‑party metadata to 4.13.2.

Changes

Cohort / File(s) Summary
Third‑party notices
THIRD-PARTY
Updated ANTLR version from 4.7.1 to 4.13.2 and copyright year range from 2012-2017 to 2012-2024 in license headers.
Plugin registration
plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java
Imported and registered RestPPLGrammarAction to expose the new PPL grammar endpoint.
REST handler
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java
New REST handler serving GET /_plugins/_ppl/_grammar with lazy thread‑safe caching (double‑checked locking), JSON serialization via XContentBuilder, error handling (500), and test hooks (buildBundle, invalidateCache).
REST handler tests
plugin/src/test/java/org/opensearch/sql/plugin/rest/RestPPLGrammarActionTest.java
Unit tests for route/name, successful response structure, caching behavior, cache invalidation, and error path (500). Includes MockRestChannel to capture responses.
Grammar data model
ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java
New immutable Lombok @Value/@Builder container for serialized ANTLR grammar data (bundleVersion, antlrVersion, grammarHash, lexer/parser ATNs, rule names, channels, modes, vocabulary).
Bundle builder
ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java
New builder that instantiates ANTLR lexer/parser, serializes ATNs, extracts token/rule/channel/mode names, computes SHA‑256 grammar hash, and constructs GrammarBundle.
Builder tests
ppl/src/test/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilderTest.java
Tests validating bundle contents, deterministic builds, hash format, start rule resolution, and non‑empty ATNs/names.
Manifest
pom.xml
Minor manifest changes related to dependency/metadata updates (ANTLR bump).

Sequence Diagram

sequenceDiagram
    participant Client
    participant Handler as RestPPLGrammarAction
    participant Cache
    participant Builder as PPLGrammarBundleBuilder
    participant Serializer as XContentBuilder

    Client->>Handler: GET /_plugins/_ppl/_grammar
    Handler->>Cache: check cached bundle
    alt cache hit
        Cache-->>Handler: return Bundle
    else cache miss
        Handler->>Builder: buildBundle()
        Builder->>Builder: inspect lexer/parser, serialize ATNs, compute hash
        Builder-->>Handler: Bundle
        Handler->>Cache: store Bundle
    end
    Handler->>Serializer: serialize Bundle to JSON
    Serializer-->>Handler: JSON payload
    Handler-->>Client: HTTP 200 + JSON
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • ps48
  • kavithacm
  • derek-ho
  • joshuali925
  • penghuo
  • GumpacG
  • Swiddis
  • dai-chen
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly describes the main feature being added: a grammar bundle generation API for PPL language autocomplete support, which aligns with the primary changeset focus.
Description check ✅ Passed The pull request description clearly outlines the implementation of a backend grammar metadata API for PPL autocomplete support, detailing bundle contents, backend behavior, and related updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mengweieric mengweieric changed the title Feature/grammar bundle [Feature] Add grammar bundle generation for PPL autocomplete features Feb 20, 2026
@mengweieric mengweieric force-pushed the feature/grammar-bundle branch from d288392 to eabe8ec Compare February 20, 2026 23:00
@mengweieric mengweieric added PPL Piped processing language feature labels Feb 20, 2026
@mengweieric mengweieric changed the title [Feature] Add grammar bundle generation for PPL autocomplete features [Feature] Add grammar bundle generation API for PPL language features Feb 20, 2026
@mengweieric mengweieric force-pushed the feature/grammar-bundle branch 3 times, most recently from 3f36846 to c838750 Compare February 23, 2026 06:49
@mengweieric mengweieric marked this pull request as ready for review February 23, 2026 07:02
@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 76e9c78.

PathLineSeverityDescription
plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java35lowThe new GET /_plugins/_ppl/_grammar endpoint exposes serialized ANTLR ATN data, token vocabularies, and parser rule structures without any explicit access control checks. While the grammar is open-source and this data is intended for client-side autocomplete, the endpoint provides detailed parser internals (rule indices, token type IDs, serialized automata) that could assist in crafting targeted injection payloads. No malicious intent detected — this appears to be a deliberate design choice consistent with other OpenSearch plugin endpoints — but the lack of role/permission checks is anomalous compared to sensitive REST endpoints in the same codebase.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

PR Reviewer Guide 🔍

(Review updated until commit da140ef)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: PPL Grammar Bundle data model and builder

Relevant files:

  • ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java
  • ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java
  • ppl/src/test/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilderTest.java
  • THIRD-PARTY

Sub-PR theme: Grammar REST endpoint, transport integration, and API spec

Relevant files:

  • plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java
  • plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java
  • plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryRequest.java
  • plugin/src/main/java/org/opensearch/sql/plugin/SQLPlugin.java
  • plugin/src/test/java/org/opensearch/sql/plugin/rest/RestPPLGrammarActionTest.java
  • integ-test/src/yamlRestTest/resources/rest-api-spec/api/ppl.grammar.json
  • integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml
  • docs/user/ppl/interfaces/endpoint.md

Sub-PR theme: Security integration tests for grammar endpoint

Relevant files:

  • integ-test/src/test/java/org/opensearch/sql/security/PPLPermissionsIT.java

⚡ Recommended focus areas for review

Authorization Bypass

The authorization mechanism reuses the PPL transport action by sending an empty query string with the grammar endpoint path. The transport action then checks isGrammarRequest() and short-circuits with a successful response. This means the actual security check is performed by the OpenSearch security plugin intercepting the transport action, not by any explicit permission check in the grammar handler itself. If the security plugin is not installed, any user can access the grammar endpoint without restriction. This design should be validated to ensure it correctly enforces permissions in all deployment configurations.

  return channel -> {
    try {
      authorizeRequest(
          client,
          new ActionListener<>() {
            @Override
            public void onResponse(TransportPPLQueryResponse ignored) {
              try {
                GrammarBundle bundle = getBundle();
                XContentBuilder builder = channel.newBuilder();
                serializeBundle(builder, bundle);
                channel.sendResponse(new BytesRestResponse(RestStatus.OK, builder));
              } catch (Exception e) {
                log.error("Error building or serializing PPL grammar", e);
                sendErrorResponse(channel, e);
              }
            }

            @Override
            public void onFailure(Exception e) {
              log.error("PPL grammar authorization failed", e);
              sendErrorResponse(channel, e);
            }
          });
    } catch (Exception e) {
      log.error("Error authorizing PPL grammar request", e);
      sendErrorResponse(channel, e);
    }
  };
}

@VisibleForTesting
protected void authorizeRequest(
    NodeClient client, ActionListener<TransportPPLQueryResponse> listener) {
  client.execute(
      PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, ENDPOINT_PATH), listener);
}
Grammar Short-Circuit

The grammar request short-circuit in doExecute calls listener.onResponse immediately after authorization passes, bypassing all metrics, query context setup, and PPL service invocation. While intentional, this means grammar requests are not counted in PPL metrics (PPL_REQ_TOTAL, PPL_REQ_COUNT_TOTAL), which may be unexpected for observability. Consider whether grammar requests should be tracked separately or excluded intentionally.

TransportPPLQueryRequest transportRequest = TransportPPLQueryRequest.fromActionRequest(request);
if (transportRequest.isGrammarRequest()) {
  // Authorization is enforced by this transport action before returning grammar metadata in
  // REST.
  listener.onResponse(new TransportPPLQueryResponse("{}"));
  return;
}
Null Safety in Builder

The custom GrammarBundleBuilder.build() method calls copy() on fields like lexerSerializedATN, parserSerializedATN, etc. without null checks. If any of these fields are not set before build() is called, Arrays.copyOf will throw a NullPointerException. The @NonNull Lombok annotation only enforces non-null on the generated all-args constructor, not on the builder's intermediate state. Consider adding explicit null guards or relying solely on Lombok's @NonNull validation.

public static class GrammarBundleBuilder {
  public GrammarBundle build() {
    return new GrammarBundle(
        bundleVersion,
        antlrVersion,
        grammarHash,
        copy(lexerSerializedATN),
        copy(lexerRuleNames),
        copy(channelNames),
        copy(modeNames),
        copy(parserSerializedATN),
        copy(parserRuleNames),
        startRuleIndex,
        copy(literalNames),
        copy(symbolicNames),
        Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
        copy(ignoredTokens),
        copy(rulesToVisit));
  }
Typo in Rule Name

The rule name "renameClasue" in buildRulesToVisit appears to be a typo (should likely be "renameClause"). If the actual grammar rule is also misspelled, this will work but is fragile. If the grammar rule is ever corrected, this will throw an IllegalStateException at runtime on first request, taking down the grammar endpoint.

"renameClasue",
Missing Error Assertion

In testUserWithoutPPLPermissionCannotAccessGrammarEndpoint, the executeGrammarAsUser method asserts a 200 status code before returning. If the server returns 403, performRequest will throw a ResponseException before the assertion is reached, so the 200 assertion in executeGrammarAsUser is never evaluated for the failure case. However, if the server unexpectedly returns a non-403 error (e.g., 500), the test will fail with a misleading error from executeGrammarAsUser rather than a clear assertion failure.

public void testUserWithoutPPLPermissionCannotAccessGrammarEndpoint() throws IOException {
  try {
    executeGrammarAsUser(NO_PPL_USER);
    fail("Expected security exception for user without PPL permission");
  } catch (ResponseException e) {
    assertEquals(403, e.getResponse().getStatusLine().getStatusCode());
    String responseBody =
        org.opensearch.sql.legacy.TestUtils.getResponseBody(e.getResponse(), false);
    assertTrue(
        "Response should contain permission error message",
        responseBody.contains("no permissions")
            || responseBody.contains("Forbidden")
            || responseBody.contains("cluster:admin/opensearch/ppl"));
  }
}

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

PR Code Suggestions ✨

Latest suggestions up to da140ef

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Fix fragile array element assertions in YAML test

The assertions gt: {ignoredTokens.0: 0} and gt: {rulesToVisit.0: 0} only verify that
the first element is greater than 0, but token type 0 is typically EOF or an
internal token, so the first ignored token index could legitimately be 0. This could
cause false test failures. Consider using is_true or checking array length instead.

integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml [17-18]

-- gt: {ignoredTokens.0: 0}
-- gt: {rulesToVisit.0: 0}
+- is_true: ignoredTokens
+- is_true: rulesToVisit
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that gt: {ignoredTokens.0: 0} could fail if the first token type index is 0 (e.g., EOF). Using is_true to verify the arrays are non-empty is a more robust assertion that avoids false failures based on specific token type values.

Low
Use non-empty sentinel value for grammar auth request

The authorization mechanism sends an empty PPL query ("") through the full
PPLQueryAction transport pipeline. If isGrammarRequest() check is bypassed or the
path matching fails, this empty query could be processed as a real PPL query.
Consider using a dedicated transport action for authorization rather than reusing
the PPL query transport with a sentinel empty string.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [89-90]

 client.execute(
-    PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, ENDPOINT_PATH), listener);
+    PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("/* grammar auth */", null, ENDPOINT_PATH), listener);
Suggestion importance[1-10]: 3

__

Why: The suggestion proposes using a comment string as a sentinel, but the isGrammarRequest() check in TransportPPLQueryAction already short-circuits before any query processing occurs. The improved code only changes the empty string to a comment, which doesn't meaningfully improve safety since the path-based check is what matters.

Low
Possible issue
Add null validation in builder before defensive copies

The GrammarBundleBuilder.build() method does not perform null checks before calling
copy() on fields like tokenDictionary, which would throw a NullPointerException if
not set. Lombok's @NonNull on the fields only validates at construction time via the
all-args constructor, but the builder's custom build() bypasses Lombok's generated
null checks. Add explicit null validation before the copy calls.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java [131-150]

 public static class GrammarBundleBuilder {
   public GrammarBundle build() {
+    Objects.requireNonNull(bundleVersion, "bundleVersion is required");
+    Objects.requireNonNull(antlrVersion, "antlrVersion is required");
+    Objects.requireNonNull(grammarHash, "grammarHash is required");
+    Objects.requireNonNull(lexerSerializedATN, "lexerSerializedATN is required");
+    Objects.requireNonNull(lexerRuleNames, "lexerRuleNames is required");
+    Objects.requireNonNull(channelNames, "channelNames is required");
+    Objects.requireNonNull(modeNames, "modeNames is required");
+    Objects.requireNonNull(parserSerializedATN, "parserSerializedATN is required");
+    Objects.requireNonNull(parserRuleNames, "parserRuleNames is required");
+    Objects.requireNonNull(literalNames, "literalNames is required");
+    Objects.requireNonNull(symbolicNames, "symbolicNames is required");
+    Objects.requireNonNull(tokenDictionary, "tokenDictionary is required");
+    Objects.requireNonNull(ignoredTokens, "ignoredTokens is required");
+    Objects.requireNonNull(rulesToVisit, "rulesToVisit is required");
     return new GrammarBundle(
         bundleVersion,
         antlrVersion,
         grammarHash,
         copy(lexerSerializedATN),
         copy(lexerRuleNames),
         copy(channelNames),
         copy(modeNames),
         copy(parserSerializedATN),
         copy(parserRuleNames),
         startRuleIndex,
         copy(literalNames),
         copy(symbolicNames),
         Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
         copy(ignoredTokens),
         copy(rulesToVisit));
   }
 }
Suggestion importance[1-10]: 4

__

Why: The concern about @NonNull not being enforced in the custom build() method is valid — Lombok's @NonNull generates null checks in the all-args constructor, so the custom build() does actually invoke those checks when calling new GrammarBundle(...). However, the NPE would occur at copy() before reaching the constructor, so adding explicit null checks before copy() calls is a reasonable defensive improvement.

Low

Previous suggestions

Suggestions up to commit 2c5b14e
CategorySuggestion                                                                                                                                    Impact
General
Authorization reuses query action with empty query

The authorization mechanism sends an empty PPL query to the transport action, which
relies on the grammar path check to short-circuit execution. This is a fragile
approach that couples authorization to query execution logic. If the
isGrammarRequest() check is ever missed or the path changes, a real (empty) query
would be executed. Consider using a dedicated transport action or a more explicit
authorization mechanism for the grammar endpoint.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [92-93]

+// Consider creating a dedicated TransportPPLGrammarAction or using a cluster permission check
+// instead of piggybacking on PPLQueryAction with an empty query.
 client.execute(
     PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, ENDPOINT_PATH), listener);
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a design concern where the grammar endpoint reuses PPLQueryAction with an empty query and relies on isGrammarRequest() path check to short-circuit. This is a valid architectural concern, but the 'improved_code' only adds a comment without actually changing the implementation, making it more of a design note than an actionable fix.

Low
Missing volatile on lazily-initialized cached field

The cachedBundle field is an instance variable on a BaseRestHandler subclass.
OpenSearch REST handlers are registered as singletons, so this works correctly in
practice. However, if the handler were ever instantiated multiple times (e.g., in
tests), each instance would have its own cache. More importantly, the field is not
declared volatile, which means that without the synchronized keyword on
getOrBuildBundle, a thread could see a stale null. While synchronized prevents this,
declaring the field volatile would make the intent explicit and allow a more
efficient double-checked locking pattern.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [41-42]

 // Lazy-initialized singleton bundle (built once per JVM lifecycle)
-private GrammarBundle cachedBundle;
+private volatile GrammarBundle cachedBundle;
Suggestion importance[1-10]: 4

__

Why: Adding volatile to cachedBundle is a valid thread-safety improvement for double-checked locking patterns. However, since getOrBuildBundle() is already synchronized, the current implementation is thread-safe, making this a minor improvement for clarity rather than a correctness fix.

Low
Fragile array index assertions in YAML test

The test asserts ignoredTokens.0 > 0 and rulesToVisit.0 > 0, but token type IDs
start at 1 (token type 0 is EOF). If the first element of either array happens to be
token type 1 (which is > 0), the test passes, but if the grammar changes and the
first ignored token has a higher index, the test still passes vacuously. More
critically, if the arrays are empty, accessing index 0 would fail with a different
error rather than a clear assertion failure. Consider using is_true checks on the
arrays themselves or asserting array length.

integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml [17-18]

+- is_true: ignoredTokens
+- is_true: rulesToVisit
 - gt: {ignoredTokens.0: 0}
 - gt: {rulesToVisit.0: 0}
Suggestion importance[1-10]: 3

__

Why: The suggestion to add is_true checks on the arrays before index assertions is a minor improvement for test robustness. However, the improved_code keeps the original gt assertions and just adds is_true checks, which doesn't fully address the concern about empty arrays causing different errors.

Low
Possible issue
Null pointer risk in builder before validation

The GrammarBundleBuilder.build() method calls copy() on fields like
lexerSerializedATN and tokenDictionary without null-checking them first. If any
@NonNull-annotated field is not set before calling build(), the copy() call will
throw a NullPointerException before Lombok's @NonNull validation in the constructor
can produce a meaningful error. The tokenDictionary field specifically will throw
NPE in new LinkedHashMap<>(tokenDictionary) if null.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java [131-150]

 public static class GrammarBundleBuilder {
   public GrammarBundle build() {
+    // Lombok @NonNull will validate required fields in the constructor.
+    // Null checks here ensure copy() doesn't NPE before that validation.
     return new GrammarBundle(
         bundleVersion,
         antlrVersion,
         grammarHash,
-        copy(lexerSerializedATN),
-        copy(lexerRuleNames),
-        copy(channelNames),
-        copy(modeNames),
-        copy(parserSerializedATN),
-        copy(parserRuleNames),
+        lexerSerializedATN != null ? copy(lexerSerializedATN) : null,
+        lexerRuleNames != null ? copy(lexerRuleNames) : null,
+        channelNames != null ? copy(channelNames) : null,
+        modeNames != null ? copy(modeNames) : null,
+        parserSerializedATN != null ? copy(parserSerializedATN) : null,
+        parserRuleNames != null ? copy(parserRuleNames) : null,
         startRuleIndex,
-        copy(literalNames),
-        copy(symbolicNames),
-        Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
-        copy(ignoredTokens),
-        copy(rulesToVisit));
+        literalNames != null ? copy(literalNames) : null,
+        symbolicNames != null ? copy(symbolicNames) : null,
+        tokenDictionary != null ? Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)) : null,
+        ignoredTokens != null ? copy(ignoredTokens) : null,
+        rulesToVisit != null ? copy(rulesToVisit) : null);
   }
 }
Suggestion importance[1-10]: 3

__

Why: While technically correct that copy() could NPE before Lombok's @NonNull validation, the improved code passes null values through to the constructor which would then fail with a Lombok @NonNull NullPointerException anyway. The suggestion trades one NPE for another, providing minimal practical benefit.

Low
Suggestions up to commit 2c5b14e
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix incorrect zero-value assertions for token arrays

The assertions gt: {ignoredTokens.0: 0} and gt: {rulesToVisit.0: 0} check that the
first element is greater than 0, but token type 0 is a valid and common value (e.g.,
EOF or the first token type). This could cause false test failures if the first
ignored token or rule index happens to be 0. Consider using is_true or checking
array length instead.

integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml [17-18]

-- gt: {ignoredTokens.0: 0}
-- gt: {rulesToVisit.0: 0}
+- is_true: ignoredTokens
+- is_true: rulesToVisit
Suggestion importance[1-10]: 6

__

Why: The assertion gt: {ignoredTokens.0: 0} could fail if the first ignored token type is 0 (e.g., EOF), making the test brittle. Switching to is_true checks for array presence is a more robust approach for this kind of validation.

Low
Prevent NullPointerException in builder copy calls

The GrammarBundleBuilder.build() method calls copy() on arrays that may be null if
not set by the builder, which would cause a NullPointerException since Arrays.copyOf
does not handle null inputs. The @NonNull Lombok annotation only validates at
construction time via the generated constructor, but the builder's custom build()
bypasses Lombok's null checks. Add explicit null checks before copying, or validate
fields before calling copy().

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java [131-150]

 public static class GrammarBundleBuilder {
   public GrammarBundle build() {
+    Objects.requireNonNull(lexerSerializedATN, "lexerSerializedATN must not be null");
+    Objects.requireNonNull(parserSerializedATN, "parserSerializedATN must not be null");
+    Objects.requireNonNull(ignoredTokens, "ignoredTokens must not be null");
+    Objects.requireNonNull(rulesToVisit, "rulesToVisit must not be null");
+    Objects.requireNonNull(tokenDictionary, "tokenDictionary must not be null");
     return new GrammarBundle(
         bundleVersion,
         antlrVersion,
         grammarHash,
         copy(lexerSerializedATN),
-        ...
+        copy(lexerRuleNames),
+        copy(channelNames),
+        copy(modeNames),
+        copy(parserSerializedATN),
+        copy(parserRuleNames),
+        startRuleIndex,
+        copy(literalNames),
+        copy(symbolicNames),
         Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
         copy(ignoredTokens),
         copy(rulesToVisit));
   }
 }
Suggestion importance[1-10]: 5

__

Why: The concern about NullPointerException when copy() is called on null arrays is valid since the custom build() method bypasses Lombok's @NonNull checks. However, the existing_code snippet uses ... as a placeholder which doesn't match the actual PR code exactly, and the @NonNull annotations on the fields do provide some protection via the generated all-args constructor.

Low
General
Fragile authorization via empty query reuse

The authorization mechanism reuses the PPL query transport action with an empty
query string, which will attempt to parse and execute an empty PPL query after the
grammar check short-circuits. While isGrammarRequest() does short-circuit in
TransportPPLQueryAction, this approach is fragile and couples grammar authorization
to query execution infrastructure. If the path check ever fails or changes, an empty
query could be executed. Consider adding a dedicated authorization check or at
minimum verifying this path is robust against future changes.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [92-93]

 client.execute(
-    PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, ENDPOINT_PATH), listener);
+    PPLQueryAction.INSTANCE, new TransportPPLQueryRequest("", null, "/_plugins/_ppl/_grammar"), listener);
Suggestion importance[1-10]: 3

__

Why: The suggestion points out a valid design concern about coupling grammar authorization to the PPL query transport action, but the improved_code is identical to the existing_code — it just changes the string constant to the same value already used (ENDPOINT_PATH). The suggestion doesn't actually propose a concrete fix, making it low value.

Low
Suggestions up to commit 94f29ef
CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard against null return from bundle builder

If buildBundle() throws an exception, cachedBundle remains null, which is correct.
However, if buildBundle() returns null (as tested in
testGetGrammar_NullBundle_Returns500), cachedBundle is set to null and every
subsequent request will attempt to rebuild — but the null return will be passed to
serializeBundle() causing a NullPointerException rather than a clean error. Add a
null check after buildBundle() to throw a descriptive exception immediately.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [67-72]

     private synchronized GrammarBundle getOrBuildBundle() {
       if (cachedBundle == null) {
-        cachedBundle = buildBundle();
+        GrammarBundle built = buildBundle();
+        if (built == null) {
+          throw new IllegalStateException("buildBundle() returned null");
+        }
+        cachedBundle = built;
       }
       return cachedBundle;
     }
Suggestion importance[1-10]: 6

__

Why: If buildBundle() returns null, cachedBundle is set to null and serializeBundle() will throw a NullPointerException on every subsequent request rather than a clean error. The test testGetGrammar_NullBundle_Returns500 expects a 500 response, but the current code would produce an NPE in serializeBundle rather than a controlled error path, making the null-return case poorly handled.

Low
Add null validation before defensive copies in builder

The custom build() method in GrammarBundleBuilder does not validate that required
@NonNull fields (e.g., bundleVersion, antlrVersion, tokenDictionary) are non-null
before constructing the object. If any of these fields are null, a
NullPointerException will be thrown inside new LinkedHashMap<>(tokenDictionary) or
Arrays.copyOf(...) without a clear error message. Add explicit null checks or rely
on Lombok's @NonNull validation by calling the generated Lombok builder's build()
instead of overriding it.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java [131-150]

     public static class GrammarBundleBuilder {
       public GrammarBundle build() {
+        Objects.requireNonNull(bundleVersion, "bundleVersion must not be null");
+        Objects.requireNonNull(antlrVersion, "antlrVersion must not be null");
+        Objects.requireNonNull(grammarHash, "grammarHash must not be null");
+        Objects.requireNonNull(lexerSerializedATN, "lexerSerializedATN must not be null");
+        Objects.requireNonNull(lexerRuleNames, "lexerRuleNames must not be null");
+        Objects.requireNonNull(channelNames, "channelNames must not be null");
+        Objects.requireNonNull(modeNames, "modeNames must not be null");
+        Objects.requireNonNull(parserSerializedATN, "parserSerializedATN must not be null");
+        Objects.requireNonNull(parserRuleNames, "parserRuleNames must not be null");
+        Objects.requireNonNull(literalNames, "literalNames must not be null");
+        Objects.requireNonNull(symbolicNames, "symbolicNames must not be null");
+        Objects.requireNonNull(tokenDictionary, "tokenDictionary must not be null");
+        Objects.requireNonNull(ignoredTokens, "ignoredTokens must not be null");
+        Objects.requireNonNull(rulesToVisit, "rulesToVisit must not be null");
         return new GrammarBundle(
             bundleVersion,
             antlrVersion,
             grammarHash,
             copy(lexerSerializedATN),
-            ...
+            copy(lexerRuleNames),
+            copy(channelNames),
+            copy(modeNames),
+            copy(parserSerializedATN),
+            copy(parserRuleNames),
+            startRuleIndex,
+            copy(literalNames),
+            copy(symbolicNames),
             Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
             copy(ignoredTokens),
             copy(rulesToVisit));
       }
     }
Suggestion importance[1-10]: 3

__

Why: The @NonNull annotations on the fields already cause Lombok to generate null checks in the all-args constructor, so the NPE would still have a clear origin. The custom builder overrides Lombok's generated build(), but the null checks would fire when new LinkedHashMap<>(null) or Arrays.copyOf(null, ...) is called. Adding explicit checks improves clarity but is not critical.

Low
General
Fix fragile array value assertions in integration test

The test asserts ignoredTokens.0 > 0 and rulesToVisit.0 > 0, but token type 0 is
typically EOF or an internal token whose ID could legitimately be 0. This assertion
would fail if the first element happens to be 0. Use is_true or check array length
instead of a gt: 0 value assertion on the first element.

integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml [17-18]

-  - gt: {ignoredTokens.0: 0}
-  - gt: {rulesToVisit.0: 0}
+  - is_true: ignoredTokens
+  - is_true: rulesToVisit
Suggestion importance[1-10]: 5

__

Why: Asserting gt: 0 on the first element of ignoredTokens and rulesToVisit is fragile because token type IDs can legitimately be 0. Using is_true on the arrays themselves would be a more robust check for non-empty presence.

Low
Fail explicitly when start rule is missing

If the "root" rule is not found, indexOf returns -1 and Math.max(-1, 0) silently
returns 0, which may point to the wrong rule. This could cause incorrect
autocomplete behavior without any warning. It would be safer to throw an
IllegalStateException when the root rule is not found, similar to how
buildRulesToVisit handles missing rules.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java [183-186]

     private static int resolveStartRuleIndex(String[] ruleNames) {
       int idx = Arrays.asList(ruleNames).indexOf("root");
-      return Math.max(idx, 0);
+      if (idx < 0) {
+        throw new IllegalStateException(
+            "Parser rule 'root' not found in grammar — "
+                + "was it renamed or removed from OpenSearchPPLParser.g4?");
+      }
+      return idx;
     }
Suggestion importance[1-10]: 4

__

Why: Silently falling back to index 0 when "root" is not found could cause incorrect autocomplete behavior without any diagnostic. Throwing an IllegalStateException (consistent with buildRulesToVisit) would make grammar regressions immediately visible, improving maintainability.

Low
Suggestions up to commit 1040600
CategorySuggestion                                                                                                                                    Impact
General
Fail fast when start rule is missing

If the "root" rule is not found, indexOf returns -1 and Math.max(-1, 0) silently
returns 0, which may point to the wrong rule. This could cause incorrect
autocomplete behavior without any warning. Similar to buildRulesToVisit, this should
throw an IllegalStateException if the rule is not found, to catch grammar renames
early.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java [195-198]

 private static int resolveStartRuleIndex(String[] ruleNames) {
   int idx = Arrays.asList(ruleNames).indexOf("root");
-  return Math.max(idx, 0);
+  if (idx < 0) {
+    throw new IllegalStateException(
+        "Parser rule 'root' not found in grammar — "
+            + "was it renamed or removed from OpenSearchPPLParser.g4?");
+  }
+  return idx;
 }
Suggestion importance[1-10]: 5

__

Why: The silent fallback to index 0 when "root" is not found could cause subtle autocomplete bugs without any diagnostic information. Throwing an IllegalStateException is consistent with the approach used in buildRulesToVisit and would catch grammar renames early, making this a reasonable defensive improvement.

Low
Ensure visibility of cached field across threads

The cachedBundle field is accessed via synchronized methods, but the field itself is
not volatile. While the synchronized keyword on getOrBuildBundle() and
invalidateCache() provides mutual exclusion, declaring the field volatile ensures
visibility guarantees across threads in all JVM implementations, which is a best
practice for lazily-initialized fields in concurrent contexts.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [37]

 // Lazy-initialized singleton bundle (built once per JVM lifecycle)
-private GrammarBundle cachedBundle;
+private volatile GrammarBundle cachedBundle;
Suggestion importance[1-10]: 3

__

Why: While volatile can improve visibility guarantees, the synchronized keyword on both getOrBuildBundle() and invalidateCache() already provides full memory visibility through the happens-before relationship. Adding volatile is redundant when all accesses are synchronized, making this a low-impact suggestion.

Low
Assert arrays are non-empty before index access

The test asserts ignoredTokens.0 > 0 and rulesToVisit.0 > 0, but these checks only
validate that the first element is a positive integer. They do not verify that the
arrays are non-empty (i.e., have more than one element). If the arrays are empty,
accessing index 0 would fail with a different error rather than a meaningful
assertion failure. Consider adding is_true: ignoredTokens and is_true: rulesToVisit
checks to explicitly assert the arrays are non-empty.

integ-test/src/yamlRestTest/resources/rest-api-spec/test/api/ppl.grammar.yml [17-18]

+- is_true: ignoredTokens
 - gt: {ignoredTokens.0: 0}
+- is_true: rulesToVisit
 - gt: {rulesToVisit.0: 0}
Suggestion importance[1-10]: 3

__

Why: Adding is_true checks for ignoredTokens and rulesToVisit before the gt index checks would provide clearer failure messages if the arrays are empty, but the gt check on index 0 would already fail if the array is empty, making this a minor test quality improvement.

Low
Possible issue
Add null check before copying map field

The custom GrammarBundleBuilder.build() method does not validate that required
@NonNull fields (like tokenDictionary) are non-null before calling new
LinkedHashMap<>(tokenDictionary), which would throw a NullPointerException if
tokenDictionary is not set. Lombok's @Builder with @NonNull performs null checks in
the constructor, but the custom build() bypasses Lombok's generated null-check logic
for the copy operations. Add explicit null checks before copying, or rely on the
constructor's @NonNull validation by passing the fields directly without
pre-copying.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/GrammarBundle.java [131-150]

 public static class GrammarBundleBuilder {
   public GrammarBundle build() {
+    Objects.requireNonNull(tokenDictionary, "tokenDictionary is required");
     return new GrammarBundle(
         bundleVersion,
         antlrVersion,
         grammarHash,
         copy(lexerSerializedATN),
-        ...
+        copy(lexerRuleNames),
+        copy(channelNames),
+        copy(modeNames),
+        copy(parserSerializedATN),
+        copy(parserRuleNames),
+        startRuleIndex,
+        copy(literalNames),
+        copy(symbolicNames),
         Collections.unmodifiableMap(new LinkedHashMap<>(tokenDictionary)),
         copy(ignoredTokens),
         copy(rulesToVisit));
   }
 }
Suggestion importance[1-10]: 4

__

Why: The @NonNull annotation on tokenDictionary in the constructor will already throw a NullPointerException if null is passed, but the NPE would occur at new LinkedHashMap<>(tokenDictionary) before reaching the constructor check. Adding an explicit null check improves the error message clarity, but this is a minor defensive improvement since the builder pattern typically ensures fields are set before calling build().

Low
Suggestions up to commit 76e9c78
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix typo in parser rule name

The rule name "renameClasue" appears to be a typo — it should likely be
"renameClause". If the actual grammar rule is named "renameClause", this will cause
an IllegalStateException at runtime when the bundle is first built. Verify the exact
rule name in OpenSearchPPLParser.g4 and correct the spelling.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java [141-148]

 private static int[] buildRulesToVisit(String[] ruleNames) {
     List<String> ruleNamesToVisit = Arrays.asList(
         "statsFunctionName", "takeAggFunction", "integerLiteral", "decimalLiteral",
-        "keywordsCanBeId", "renameClasue", "qualifiedName", "tableQualifiedName",
+        "keywordsCanBeId", "renameClause", "qualifiedName", "tableQualifiedName",
         "wcQualifiedName", "positionFunctionName", "searchableKeyWord", "stringLiteral",
         "searchCommand", "searchComparisonOperator", "comparisonOperator", "sqlLikeJoinType"
     );
Suggestion importance[1-10]: 7

__

Why: The rule name "renameClasue" appears to be a typo for "renameClause". If the actual grammar rule uses the correct spelling, this would cause an IllegalStateException at runtime when the bundle is first built, making this a potentially critical bug worth fixing.

Medium
Guard against failure when sending error response

The BytesRestResponse constructor that accepts a RestChannel and Exception typically
calls channel.newErrorBuilder() internally, which may itself throw an IOException.
If that secondary failure occurs, it will propagate as an unhandled exception from
within the catch block, potentially leaving the channel without a response. Consider
wrapping the error response construction in its own try-catch to ensure a response
is always sent.

plugin/src/main/java/org/opensearch/sql/plugin/rest/RestPPLGrammarAction.java [62]

-channel.sendResponse(new BytesRestResponse(channel, RestStatus.INTERNAL_SERVER_ERROR, e));
+try {
+  channel.sendResponse(new BytesRestResponse(channel, RestStatus.INTERNAL_SERVER_ERROR, e));
+} catch (Exception inner) {
+  log.error("Failed to send error response", inner);
+}
Suggestion importance[1-10]: 4

__

Why: While the concern about secondary failures in error response construction is valid, this is an edge case that adds defensive boilerplate. The BytesRestResponse constructor with a channel and exception is a standard OpenSearch pattern, and wrapping it adds complexity for a rare failure scenario.

Low
General
Remove or utilize dead code method

The resolveTokenType method is defined but never called anywhere in the class. Dead
code can cause confusion and maintenance burden. Either remove it or use it in
buildTokenDictionary instead of the hardcoded OpenSearchPPLLexer constants, which
would make the dictionary more robust to grammar changes.

ppl/src/main/java/org/opensearch/sql/ppl/autocomplete/PPLGrammarBundleBuilder.java [165-172]

-/** Resolve a token type ID from the vocabulary by symbolic name. Returns -1 if not found. */
-private static int resolveTokenType(Vocabulary vocabulary, String name) {
-  for (int i = 0; i <= vocabulary.getMaxTokenType(); i++) {
-    if (name.equals(vocabulary.getSymbolicName(i))) {
-      return i;
-    }
-  }
-  return -1;
+// Remove the unused method, or replace hardcoded constants in buildTokenDictionary:
+private static Map<String, Integer> buildTokenDictionary(Vocabulary vocabulary) {
+  Map<String, Integer> dict = new LinkedHashMap<>();
+  dict.put("WHITESPACE", resolveTokenType(vocabulary, "WHITESPACE"));
+  dict.put("FROM", resolveTokenType(vocabulary, "FROM"));
+  dict.put("OPENING_BRACKET", resolveTokenType(vocabulary, "LT_PRTHS"));
+  dict.put("CLOSING_BRACKET", resolveTokenType(vocabulary, "RT_PRTHS"));
+  dict.put("SEARCH", resolveTokenType(vocabulary, "SEARCH"));
+  dict.put("SOURCE", resolveTokenType(vocabulary, "SOURCE"));
+  dict.put("PIPE", resolveTokenType(vocabulary, "PIPE"));
+  dict.put("ID", resolveTokenType(vocabulary, "ID"));
+  dict.put("EQUAL", resolveTokenType(vocabulary, "EQUAL"));
+  dict.put("IN", resolveTokenType(vocabulary, "IN"));
+  dict.put("COMMA", resolveTokenType(vocabulary, "COMMA"));
+  dict.put("BACKTICK_QUOTE", resolveTokenType(vocabulary, "BQUOTA_STRING"));
+  dict.put("DOT", resolveTokenType(vocabulary, "DOT"));
+  return dict;
 }
Suggestion importance[1-10]: 5

__

Why: The resolveTokenType method is indeed unused dead code. The suggestion to use it in buildTokenDictionary instead of hardcoded OpenSearchPPLLexer constants would make the code more robust to grammar changes, though the improved_code changes a different method than the existing_code snippet.

Low

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 1040600

@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 94f29ef

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
…y, and tests

- Hash full 32-bit ints in grammarHash to avoid collisions with ANTLR 4.13.2 ATN serialization
- Use RuntimeMetaData.getRuntimeVersion() instead of unreliable JAR manifest lookup
- Make GrammarBundle immutable with @value instead of @DaTa
- Update THIRD-PARTY to reflect ANTLR 4.13.2
- Harden tests with JSON parsing and add antlrVersion assertion

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
- Assert ATN serialization version 4 for both lexer and parser to enforce
  antlr4ng compatibility contract
- Resolve startRuleIndex by looking up "root" rule name instead of hardcoding 0
- Fix MockRestChannel.bytesOutput() to return real BytesStreamOutput
- Document nullable elements in literalNames/symbolicNames Javadoc
- Rename test methods to follow testXxx() convention per ppl/plugin modules

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Consistent with buildBundle() which is also @VisibleForTesting protected.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
@mengweieric mengweieric force-pushed the feature/grammar-bundle branch from e85a38b to 2c5b14e Compare March 3, 2026 17:06
@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 2c5b14e

1 similar comment
@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit 2c5b14e

Comment on lines +104 to +122
// Thread-safe lazy initialization.
private synchronized GrammarBundle getOrBuildBundle() {
if (cachedBundle == null) {
cachedBundle = buildBundle();
}
return cachedBundle;
}

/** Constructs the grammar bundle. Override in tests to inject a custom or failing builder. */
@VisibleForTesting
protected GrammarBundle buildBundle() {
return new PPLGrammarBundleBuilder().build();
}

/** Invalidate the cached bundle, forcing a rebuild on the next request. */
@VisibleForTesting
protected synchronized void invalidateCache() {
cachedBundle = null;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PPLGrammarBundleBuilder has no constructor and no instance state — its build() method creates a lexer/parser locally, does all the work, and returns a GrammarBundle. So it's effectively a stateless utility class.

Option : Static holder pattern (lazy, thread-safe, no synchronization on hot path)

public class PPLGrammarBundleBuilder {

  private static class BundleHolder {
      static final GrammarBundle INSTANCE = new PPLGrammarBundleBuilder().build();
  }

  public static GrammarBundle getBundle() {
      return BundleHolder.INSTANCE;
  }

  // existing build logic stays private
  private GrammarBundle build() { ... }

}

Caller: PPLGrammarBundleBuilder.getBundle()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented, thanks for the suggestion.

I refactored PPLGrammarBundleBuilder to use a static holder singleton (getBundle()), made it stateless with a private constructor, and updated RestPPLGrammarAction to consume it directly (removed handler-level cache/synchronization). Tests were updated and all relevant checks are passing.

Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 3, 2026

Persistent review updated to latest commit da140ef

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants