[Feature] Add grammar bundle generation API for PPL language features#5162
mengweieric wants to merge 20 commits into opensearch-project:main
Walkthrough
Adds a new GET endpoint /_plugins/_ppl/_grammar that serves a cached, serialized ANTLR grammar bundle; introduces GrammarBundle and PPLGrammarBundleBuilder, a RestPPLGrammarAction handler with tests, unit tests for the builder, and updates ANTLR third-party metadata to 4.13.2.
Sequence Diagram
sequenceDiagram
participant Client
participant Handler as RestPPLGrammarAction
participant Cache
participant Builder as PPLGrammarBundleBuilder
participant Serializer as XContentBuilder
Client->>Handler: GET /_plugins/_ppl/_grammar
Handler->>Cache: check cached bundle
alt cache hit
Cache-->>Handler: return Bundle
else cache miss
Handler->>Builder: buildBundle()
Builder->>Builder: inspect lexer/parser, serialize ATNs, compute hash
Builder-->>Handler: Bundle
Handler->>Cache: store Bundle
end
Handler->>Serializer: serialize Bundle to JSON
Serializer-->>Handler: JSON payload
Handler-->>Client: HTTP 200 + JSON
Estimated code review effort: 3 (Moderate) | ~25 minutes
Force-pushed from d288392 to eabe8ec
Force-pushed from 3f36846 to c838750
PR Code Analyzer: AI-powered 'Code-Diff-Analyzer' found issues on commit 76e9c78.
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
…y, and tests
- Hash full 32-bit ints in grammarHash to avoid collisions with ANTLR 4.13.2 ATN serialization
- Use RuntimeMetaData.getRuntimeVersion() instead of unreliable JAR manifest lookup
- Make GrammarBundle immutable with @Value instead of @Data
- Update THIRD-PARTY to reflect ANTLR 4.13.2
- Harden tests with JSON parsing and add antlrVersion assertion
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
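To illustrate the first bullet of this commit: ANTLR 4.13.2 serializes an ATN as an array of ints, so a hash that feeds only one byte per entry into the digest can map different grammars to the same value. The sketch below contrasts the two approaches. The class and method names are hypothetical, and SHA-256 is an assumption; the PR text does not say which digest grammarHash uses.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration; not the PR's actual grammarHash code. */
final class GrammarHashSketch {

  /** Feeds every ATN entry into the digest as a full 4-byte big-endian int. */
  static String hashFullInts(int[] atn) {
    MessageDigest digest = sha256();
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
    for (int value : atn) {
      buf.clear();
      buf.putInt(value);
      digest.update(buf.array());
    }
    return toHex(digest.digest());
  }

  /** Lossy variant shown for contrast: keeps only the low 8 bits of each entry. */
  static String hashLowBytes(int[] atn) {
    MessageDigest digest = sha256();
    for (int value : atn) {
      digest.update((byte) value);
    }
    return toHex(digest.digest());
  }

  private static MessageDigest sha256() {
    try {
      return MessageDigest.getInstance("SHA-256");
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required on every Java platform, so this should not happen.
      throw new IllegalStateException(e);
    }
  }

  private static String toHex(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * 2);
    for (byte b : bytes) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    int[] a = {4, 0, 1};
    int[] b = {4, 0, 257}; // 257 and 1 share the same low byte
    System.out.println(hashLowBytes(a).equals(hashLowBytes(b))); // true: collision
    System.out.println(hashFullInts(a).equals(hashFullInts(b))); // false
  }
}
```

The arrays in `main` are made-up values chosen so that the truncated hash collides; real serialized ATNs are much longer.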
- Assert ATN serialization version 4 for both lexer and parser to enforce antlr4ng compatibility contract
- Resolve startRuleIndex by looking up "root" rule name instead of hardcoding 0
- Fix MockRestChannel.bytesOutput() to return real BytesStreamOutput
- Document nullable elements in literalNames/symbolicNames Javadoc
- Rename test methods to follow testXxx() convention per ppl/plugin modules
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
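The first two bullets of this commit can be sketched roughly as follows. The helper names are hypothetical, and the sketch assumes the rule names come from something like the parser's `getRuleNames()` array and that the first element of the serialized ATN is its version number. Apart from "root", the rule names in `main` are made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical helpers; not the PR's actual code. */
final class StartRuleSketch {

  /** Looks up a rule by name instead of assuming it sits at index 0. */
  static int findRuleIndex(String[] ruleNames, String ruleName) {
    int index = Arrays.asList(ruleNames).indexOf(ruleName);
    if (index < 0) {
      throw new IllegalStateException("Rule not found in grammar: " + ruleName);
    }
    return index;
  }

  /** Fails fast if the serialized ATN does not start with the expected version. */
  static void requireAtnVersion(int[] serializedAtn, int expectedVersion) {
    if (serializedAtn.length == 0 || serializedAtn[0] != expectedVersion) {
      throw new IllegalStateException(
          "Unexpected ATN serialization version, expected " + expectedVersion);
    }
  }

  public static void main(String[] args) {
    String[] ruleNames = {"pplStatement", "root", "commands"}; // made-up ordering
    System.out.println(findRuleIndex(ruleNames, "root")); // 1
    requireAtnVersion(new int[] {4, 0, 1}, 4); // passes silently
  }
}
```

Looking the rule up by name keeps the bundle correct if the grammar is later reordered so that "root" is no longer the first rule.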
Consistent with buildBundle(), which is also @VisibleForTesting protected.
Signed-off-by: Eric Wei <mengwei.eric@gmail.com>
Force-pushed from e85a38b to 2c5b14e
  // Thread-safe lazy initialization.
  private synchronized GrammarBundle getOrBuildBundle() {
    if (cachedBundle == null) {
      cachedBundle = buildBundle();
    }
    return cachedBundle;
  }

  /** Constructs the grammar bundle. Override in tests to inject a custom or failing builder. */
  @VisibleForTesting
  protected GrammarBundle buildBundle() {
    return new PPLGrammarBundleBuilder().build();
  }

  /** Invalidate the cached bundle, forcing a rebuild on the next request. */
  @VisibleForTesting
  protected synchronized void invalidateCache() {
    cachedBundle = null;
  }
PPLGrammarBundleBuilder has no constructor and no instance state — its build() method creates a lexer/parser locally, does all the work, and returns a GrammarBundle. So it's effectively a stateless utility class.
Option: static holder pattern (lazy, thread-safe, no synchronization on hot path)
  public class PPLGrammarBundleBuilder {
    private static class BundleHolder {
      static final GrammarBundle INSTANCE = new PPLGrammarBundleBuilder().build();
    }

    public static GrammarBundle getBundle() {
      return BundleHolder.INSTANCE;
    }

    // existing build logic stays private
    private GrammarBundle build() { ... }
  }
Caller: PPLGrammarBundleBuilder.getBundle()
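For readers unfamiliar with the idiom, here is a self-contained sketch of the static holder pattern suggested above, using stand-in types so it runs without the plugin code. `Bundle`, `BUILD_COUNT`, and the placeholder hash string are hypothetical; only the holder structure mirrors the suggestion.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-alone illustration of the initialization-on-demand holder idiom. */
final class HolderIdiomSketch {

  /** Counts how often build() runs; exists only to make the behavior observable. */
  static final AtomicInteger BUILD_COUNT = new AtomicInteger();

  /** Hypothetical stand-in for GrammarBundle. */
  static final class Bundle {
    final String grammarHash;

    Bundle(String grammarHash) {
      this.grammarHash = grammarHash;
    }
  }

  private HolderIdiomSketch() {}

  /** The JVM initializes this class on first access, exactly once, thread-safely. */
  private static final class BundleHolder {
    static final Bundle INSTANCE = build();
  }

  static Bundle getBundle() {
    return BundleHolder.INSTANCE;
  }

  private static Bundle build() {
    BUILD_COUNT.incrementAndGet();
    return new Bundle("placeholder-hash");
  }

  public static void main(String[] args) throws InterruptedException {
    Thread[] threads = new Thread[8];
    for (int i = 0; i < threads.length; i++) {
      threads[i] = new Thread(HolderIdiomSketch::getBundle);
      threads[i].start();
    }
    for (Thread t : threads) {
      t.join();
    }
    System.out.println(BUILD_COUNT.get()); // 1: build() ran once despite 8 callers
  }
}
```

One trade-off worth noting: if `build()` throws during holder initialization, the first caller sees an `ExceptionInInitializerError` and later callers see a `NoClassDefFoundError`, so there is no retry. The synchronized `getOrBuildBundle()` version would attempt the build again on the next request.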
Implemented, thanks for the suggestion.
I refactored PPLGrammarBundleBuilder to use a static holder singleton (getBundle()), made it stateless with a private constructor, and updated RestPPLGrammarAction to consume it directly (removed handler-level cache/synchronization). Tests were updated and all relevant checks are passing.
Description
Implements the backend grammar metadata API for PPL autocomplete support.
This endpoint serves a versioned grammar bundle containing ANTLR metadata required for downstream consumers (for example OpenSearch Dashboards) to reconstruct a functional PPL lexer/parser at runtime using antlr4ng interpreter APIs. This enables full client-side parsing/autocomplete with zero per-keystroke server calls, while keeping backend grammar as the source of truth.
What the bundle contains:
- ATNSerializer.serialize().toArray(), compatible with antlr4ng ATNDeserializer.deserialize()
- tokenDictionary, ignoredTokens, and rulesToVisit for autocomplete behavior
- grammarHash (ATNs + lexer/parser rule names + literal/symbolic vocabulary + ANTLR version) for client-side change detection
- bundleVersion and antlrVersion for compatibility validation
Backend behavior:
- @ExperimentalApi
Also included:
- ppl.grammar + YAML REST response-shape test
- THIRD-PARTY updated to reflect ANTLR 4.13.2
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
- --signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.