Use Calcite's validation system for type checking & coercion #4892

Open
yuancu wants to merge 112 commits into opensearch-project:main from yuancu:issues/4636

Conversation

@yuancu yuancu commented Dec 2, 2025

Description

This PR migrates from our custom type checking mechanism to Apache Calcite's native validation system by introducing a SqlNode validation layer. This addresses the lack of a proper SQL validation phase and resolves compatibility issues with user-defined types (UDTs).

This implements Approach 1 described in #3865.

How It Works

The PR introduces a validation layer in the query execution pipeline:

RelNode → SqlNode → [Validation & Type Coercion] → SqlNode → RelNode

This approach leverages Calcite's robust type system while preserving OpenSearch's custom type semantics through careful type mapping and restoration.

Benefits

  • Comprehensive type checking: Access to Calcite's full type checker suite (variable-length operands, same-operand constraints, OR-relationship type checkers) vs. our limited family-based checking
  • Function optimization: Ability to rewrite functions at the SqlNode level (e.g., sqrt(x) → pow(x, 0.5)) before generating physical plans
  • Automatic type coercion: Implicit casting where type-safe and appropriate, improving query compatibility
  • Reduced maintenance: Leverage Calcite's battle-tested validation logic instead of maintaining custom implementations

Work Items / Implementation Details

Core Infrastructure

  • Introduced SqlNode validation layer with round-trip conversion (RelNode ↔ SqlNode)
  • Extended Calcite validator to support OpenSearch UDTs via type substitution
  • Implemented custom deriveType, coerceOperandType, and type inference methods
  • Made validator thread-safe with proper synchronization

Type System Enhancements

  • UDT Support: To support OpenSearch-specific UDTs (which Calcite doesn't natively support), we bridge Calcite's SqlTypeName enum with OpenSearch UDTs through dynamic type mapping:
    1. Convert UDTs to standard SQL types (e.g., EXPR_TIME → SqlTypeName.TIME) before validation
    2. Apply Calcite's validation and type coercion
    3. Restore UDTs after validation to maintain plan correctness
  • Cross-type Comparisons: Override commonTypeForBinaryComparison to enable datetime cross-type comparisons (DATE vs TIME, etc.)
  • IP Type Handling: Emulated IP type using standard SQL types for validation, with special handling in operators
  • Composite Types: Extended SQL-UDT conversion to handle nested/composite types
  • Type Coercion: Use SAFE_CAST for string-to-number conversions to tolerate malformatted data; use CAST for literal numbers
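The three-step substitution above can be sketched in plain Java. The class and enum names below are illustrative stand-ins, not the actual OpenSearchTypeFactory API:

```java
import java.util.Map;

public class UdtBridge {
    enum ExprUDT { EXPR_DATE, EXPR_TIME, EXPR_TIMESTAMP, EXPR_IP }
    enum SqlType { DATE, TIME, TIMESTAMP, VARCHAR }

    // Step 1: substitute each UDT with a standard SQL type before validation
    static final Map<ExprUDT, SqlType> TO_SQL = Map.of(
        ExprUDT.EXPR_DATE, SqlType.DATE,
        ExprUDT.EXPR_TIME, SqlType.TIME,
        ExprUDT.EXPR_TIMESTAMP, SqlType.TIMESTAMP,
        ExprUDT.EXPR_IP, SqlType.VARCHAR); // IP is emulated with a string type

    static SqlType toSqlType(ExprUDT udt) {
        return TO_SQL.get(udt);
    }

    // Step 3: after validation, restore the UDT. If the validator coerced the
    // node to a different SQL type (e.g. DATE -> TIMESTAMP in a mixed
    // datetime comparison), the restored UDT follows the coerced type.
    static ExprUDT restore(SqlType validated, ExprUDT original) {
        if (toSqlType(original) == validated) {
            return original; // type unchanged by validation
        }
        return validated == SqlType.TIMESTAMP ? ExprUDT.EXPR_TIMESTAMP : original;
    }
}
```

In the real implementation the restore step also has to consult the original plan, since several UDTs erase to the same SQL type (IP and plain strings both validate as VARCHAR, which is why GEOIP needs string overrides).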

Function & Operator Handling

  • Defined operand type checkers for all built-in functions:
    • Array functions (array_slice, reduce, mvappend, etc.) with proper type inference
    • JSON functions (json_extract, json_set, etc.)
    • Mathematical functions (corrected atan overloading, percentile approximations)
    • Aggregation functions (DISTINCT_COUNT_APPROX, COUNT(*) rewriting, etc.)
  • Function overloading support:
    • ADD operator: String concatenation vs numeric addition
    • ATAN: Single-operand vs two-operand versions
    • GEOIP: String overrides due to UDT erasure
  • Arithmetic operations between strings and numerics (implicit coercion)
  • Define SqlKind for DIVIDE and MOD UDFs
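The overload cases above boil down to checkers keyed on operand count and type family. A minimal stand-in, not the actual Calcite SqlOperandTypeChecker/OperandTypes API:

```java
import java.util.List;

public class OverloadSketch {
    enum Family { NUMERIC, STRING }

    // ATAN accepts either atan(x) or atan(y, x); both forms are all-numeric,
    // so the checker distinguishes the overloads purely by operand count.
    static boolean checkAtanOperands(List<Family> operands) {
        return (operands.size() == 1 || operands.size() == 2)
            && operands.stream().allMatch(f -> f == Family.NUMERIC);
    }

    // ADD is overloaded between numeric addition and string concatenation;
    // mixed string/numeric operands fall back to PLUS with implicit coercion
    // of the string side to a number.
    static String resolveAdd(Family left, Family right) {
        return (left == Family.STRING && right == Family.STRING) ? "CONCAT" : "PLUS";
    }
}
```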

Query Construct Support

  • Preserve null ordering and collation through SqlNode round-trips
  • Support SEMI and ANTI joins in SQL conversion
  • Preserve sort orders in subqueries
  • Pass RelHint through conversions (added SqlHint for LogicalAggregate)
  • Handle windowed aggregates and bucket_nullable flags
  • Support IN/NOT IN with tuple inputs via row constructor rewriting
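The row-constructor rewriting for tuple IN can be illustrated as an expansion into an OR of row equalities. The SQL rendering below is a simplified sketch of the idea, not the converter's actual output:

```java
import java.util.List;
import java.util.stream.Collectors;

public class TupleInRewrite {
    // Rewrites (a, b) IN ((1, 2), (3, 4)) into a disjunction of
    // row-constructor equalities, which the validator can type-check per row.
    static String rewrite(List<String> lhs, List<List<String>> tuples) {
        String row = "ROW(" + String.join(", ", lhs) + ")";
        return tuples.stream()
            .map(t -> row + " = ROW(" + String.join(", ", t) + ")")
            .collect(Collectors.joining(" OR "));
    }
}
```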

Dialect & Compatibility

  • Extended OpenSearch Spark SQL dialect for custom syntax
  • Fixed interval semantics mismatch between SQL and PPL
  • Fixed quarter interval bug in Calcite
  • Handle identifier expansion in aggregate functions
  • Properly handle LogicalValues for empty row generation

Edge Cases & Fixes

  • Skip validation for unsupported patterns:
    • Bin-on-timestamp operations (not yet implemented)
    • Group-by window functions (return original plan gracefully)
    • Specific aggregation patterns with LogicalValues
  • Fixed nullability attribute preservation through SAFE_CAST
  • Trim unused fields after SQL→Rel conversion
  • Ensure RelToSqlConverter is instantiated per-use (stateful component)
  • Fixed float literal handling with explicit casts
  • Remove JSON_TYPE operator insertion where inappropriate

Test Fixes

  • Fixed integration tests across multiple suites:
    • CalcitePPLDateTimeBuiltinFunctionIT: Interval semantics
    • CalcitePPLBuiltinFunctionIT: LOG function, sarg deserialization
    • CalciteArrayFunctionIT: Type checkers, reduce function inference
    • CalciteMathematicalFunctionIT, CalcitePPLAggregationIT
    • CalciteBinCommandIT: Timestamp operations, windowed aggregates in GROUP BY
    • CalciteStreamstatsCommandIT: Sort columns, bucket_nullable
    • CalcitePPLJsonBuiltinFunctionIT: String conversion
  • Updated all explain integration tests for plan changes
  • Fixed YAML tests, doctests, and unit tests
  • Updated ClickBench query plans

Code Cleanup

  • Removed legacy type checking from PPLFuncImpTable
  • Deprecated UDFOperandMetadata.wrapUDT interface
  • Removed unused classes: CalciteFuncSignature, PPLTypeChecker
  • Consolidated EnhancedCoalesce into built-in Coalesce
  • Removed type checkers from operator registration (now handled by Calcite)

Optimizations

  • Eliminated SAFE_CAST on non-string literal numbers (use CAST for better performance)
  • Investigated and addressed deduplication (dedup) optimization issues
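The CAST vs SAFE_CAST split behaves roughly like this sketch: literals take the strict, cheaper cast while string data takes the tolerant one (illustrative Java, not the engine's generated code):

```java
public class CastStrategy {
    // Strict cast: malformed input raises, matching CAST semantics.
    static double strictCast(String s) {
        return Double.parseDouble(s);
    }

    // Tolerant cast: malformed input yields NULL, matching SAFE_CAST semantics.
    static Double safeCast(String s) {
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Literals are already known to be well-formed at plan time, so the
    // cheaper strict CAST is used; string fields may hold malformed data,
    // so they get SAFE_CAST.
    static Double castOperand(String value, boolean isLiteral) {
        return isLiteral ? strictCast(value) : safeCast(value);
    }
}
```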

Performance Impact

| Dataset    | Scenario           | Analyze (ms)   | Optimize (ms)  | Execute (ms)   | Total Avg (ms) |
|------------|--------------------|----------------|----------------|----------------|----------------|
| clickbench | Without Validation | 7.93           | 8.29           | 0.95           | 17.53          |
| clickbench | With Validation    | 9.59           | 8.72           | 1.00           | 19.69          |
| clickbench | Difference         | +1.66 (+20.9%) | +0.43 (+5.2%)  | +0.05 (+5.3%)  | +2.16 (+12.3%) |
| big5       | Without Validation | 3.52           | 4.27           | 0.80           | 8.99           |
| big5       | With Validation    | 3.37           | 3.80           | 0.72           | 8.22           |
| big5       | Difference         | -0.15 (-4.3%)  | -0.47 (-11.0%) | -0.08 (-10.0%) | -0.77 (-8.6%)  |
| tpch       | Without Validation | 8.65           | 692.00         | 183.15         | 1028.80        |
| tpch       | With Validation    | 7.91           | 689.16         | 175.63         | 1008.77        |
| tpch       | Difference         | -0.74 (-8.6%)  | -2.84 (-0.4%)  | -7.52 (-4.1%)  | -20.03 (-1.9%) |

Profiled on a personal laptop; each test was run twice and the results averaged.

Conclusion: No significant performance degradation. ClickBench shows modest overhead from validation (+20.9% in the analyze phase, +12.3% total), while big5 and TPCH show small improvements, likely from better query optimization enabled by proper type information.

Related Issues

Resolves #4636, resolves #3865

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@opensearch-project opensearch-project deleted a comment from coderabbitai bot Dec 2, 2025
coderabbitai bot commented Dec 3, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Calcite-based validation with implicit type coercion and LEFT SEMI/ANTI JOIN support; expanded IP and datetime casting; stronger UDF operand validation and broader variadic support.
  • Bug Fixes
    • More reliable window/ORDER BY behavior, float/interval literal handling, mixed-type IN/BETWEEN predicates, datetime binning, and safer dynamic-field handling in spath.
  • Documentation
    • Explain outputs moved to YAML; several examples marked non-executable (ignore) pending fixes.


Walkthrough

Introduces extensive Calcite-based PPL validation and type-coercion plumbing, new PPL-specific validator/convertlets/coercion rules, UDT/type utilities, many UDF operand-metadata updates, Rex/Rel shuttles and Rel↔Sql converters, removal of legacy coercion/type-checker code, and large test/expected-output updates across integ and unit tests.

Changes

Cohort / File(s) Summary
Calcite validation core & providers
core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java, core/src/main/java/org/opensearch/sql/calcite/validate/PplValidator.java, core/src/main/java/org/opensearch/sql/calcite/validate/SqlOperatorTableProvider.java, core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.java, core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRule.java, core/src/main/java/org/opensearch/sql/calcite/validate/PplConvertletTable.java
Add provider interface, new PplValidator, PPL-specific TypeCoercion/Rule and convertlet table; expose validator creation and operator-table injection.
Rel/Sql conversion & shuttles
core/src/main/java/org/opensearch/sql/calcite/validate/converters/OpenSearchRelToSqlConverter.java, .../OpenSearchSqlToRelConverter.java, core/src/main/java/org/opensearch/sql/calcite/validate/shuttles/PplRelToSqlRelShuttle.java, .../SqlRewriteShuttle.java, .../SkipRelValidationShuttle.java
New Rel↔Sql converters and shuttles to translate RelNode→SqlNode→validate→RelNode, handle joins/hints, literal fixes, and selective validation skipping.
QueryService validation flow
core/src/main/java/org/opensearch/sql/executor/QueryService.java
Integrates new validation step that converts RelNode→SqlNode, validates via Calcite, converts back; supports tolerant fallback and configurable skip.
Type utilities & factory changes
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeUtil.java, .../OpenSearchTypeFactory.java, .../OpenSearchTypeFactory.leastRestrictive, core/src/main/java/org/opensearch/sql/calcite/validate/ValidationUtils.java
Add OpenSearchTypeUtil helpers, move UDT handling there, implement leastRestrictive override and utilities for syncing attributes/UDT creation.
Operand/type coercion & operator table refactor
core/src/main/java/org/opensearch/sql/calcite/utils/PPLOperandTypes.java, core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java, core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
Replace composite/family wrappers with direct OperandTypes usage, introduce SCALAR checkers, migrate many operators from SqlOperator to SqlFunction, and simplify function/agg registries to single-implementation maps.
Rex/Rel visitors & builders
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java, .../CalciteRexNodeVisitor.java, .../ExtendedRexBuilder.java, core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java, core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java
Embed plan collations into RexOver windows, simplify window construction, adjust casting/type-derivation for mixed numeric/char, and small plan utility replacements to use OpenSearchTypeUtil.
UDF operand metadata & nullability annotations
many files under core/src/main/java/org/opensearch/sql/expression/function/... (e.g., CollectionUDF/*, jsonUDF/*, udf/*, udf/datetime/*, udf/math/*, udf/ip/*, UserDefinedFunctionBuilder.java, UDFOperandMetadata.java, UserDefinedFunctionBuilder.java)
Make numerous getOperandMetadata() returns non-null, replace nulls with UDFOperandMetadata.wrap(...), migrate many operand checks to Calcite SqlOperandTypeChecker/OperandTypes, add @NonNull annotations, and introduce new custom checkers (e.g., IP, MAP repeats, transform/variadic semantics).
Removed legacy type-coercion framework
core/src/main/java/org/opensearch/sql/expression/function/CoercionUtils.java, core/src/main/java/org/opensearch/sql/expression/function/PPLTypeChecker.java, core/src/main/java/org/opensearch/sql/expression/function/CalciteFuncSignature.java
Remove legacy CoercionUtils and PPLTypeChecker/CalciteFuncSignature artifacts; their responsibilities moved into Calcite validation/coercion paths.
New/changed validation helpers & convertlets
core/src/main/java/org/opensearch/sql/calcite/validate/PplRelToSqlRelShuttle.java, core/src/main/java/org/opensearch/sql/calcite/validate/PplConvertletTable.java
Add convertlet table and rel→sql shuttle for literal normalization and operator remapping (e.g., IP operators).
Tests & expected outputs
test additions/changes under core/src/test/..., integ-test/src/test/..., integ-test/src/test/resources/expectedOutput/calcite/*, api/src/test/...
Large test surface updates: many new/updated unit tests for coercion/validation/shuttles/type utils; numerous integration expected-plan YAML changes (many JSON→YAML resources replaced/added); update transpiler test import path.
Docs & examples
docs/user/ppl/...
Minor docs/example updates (make code blocks ignore execution or formatting tweaks).

Sequence Diagram(s)

sequenceDiagram
    participant Rel as RelNode (PPL plan)
    participant Shuttle as PplRelToSqlRelShuttle
    participant RelToSql as OpenSearchRelToSqlConverter
    participant Validator as SqlValidator / PplValidator
    participant SqlToRel as OpenSearchSqlToRelConverter
    participant Planner as QueryService (convertToCalcitePlan)

    Rel->>Shuttle: traverse & normalize RexLiterals
    Shuttle->>RelToSql: convert RelNode -> SqlNode
    RelToSql->>Validator: validate SqlNode (type coercion, implicit casts)
    Validator-->>SqlToRel: produce validated SqlNode
    SqlToRel->>Planner: convert validated SqlNode -> RelNode (validated RelNode)
    Planner->>Planner: continue planning / optimize using validated RelNode

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Suggested labels

backport 2.19-dev

Suggested reviewers

  • ps48
  • kavithacm
  • derek-ho
  • joshuali925
  • dai-chen
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 19.72%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4)
  • Title check: Title concisely and accurately describes the primary change: adding Calcite-based validation for type checking and coercion.
  • Description check: Description clearly documents the SqlNode validation round-trip, UDT mapping, and type-coercion work and matches the changeset.
  • Linked Issues check: Implements SqlNode round-trip, PplValidator, PplTypeCoercion/Rule, UDT mapping and implicit coercion per linked objectives #4636 and #3865.
  • Out of Scope Changes check: Changes align with the migration to Calcite validation/coercion; refactors, removals, and test updates appear directly related and in-scope.



@yuancu yuancu added the enhancement New feature or request label Dec 3, 2025
@yuancu yuancu force-pushed the issues/4636 branch 5 times, most recently from e085f81 to fc6dd27 on December 15, 2025 at 13:40
@coderabbitai coderabbitai bot left a comment
Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)

471-492: Add the missing import for UDFOperandMetadata.

UDFOperandMetadata is referenced here but isn’t imported, which will fail compilation unless it’s in the same package. Add the import (or fully qualify the reference).

🛠️ Suggested fix
 import org.opensearch.sql.calcite.utils.PPLOperandTypes;
 import org.opensearch.sql.calcite.utils.PPLReturnTypes;
+import org.opensearch.sql.calcite.utils.UDFOperandMetadata;
 import org.opensearch.sql.calcite.utils.UserDefinedFunctionUtils;
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (2)

399-425: Add null guards for external operator registrations.

Fail fast if operator / aggFunction is null to avoid deferred NPEs during planning.

🔧 Proposed fix
 public void registerExternalOperator(BuiltinFunctionName functionName, SqlOperator operator) {
+  if (operator == null) {
+    throw new IllegalArgumentException(
+        String.format(Locale.ROOT, "Operator cannot be null for function %s", functionName));
+  }
   if (externalFunctionRegistry.containsKey(functionName)) {
     logger.warn(
         String.format(Locale.ROOT, "Function %s is registered multiple times", functionName));
   }
   externalFunctionRegistry.put(functionName, (builder, args) -> builder.makeCall(operator, args));
 }

 public void registerExternalAggOperator(
     BuiltinFunctionName functionName, SqlUserDefinedAggFunction aggFunction) {
+  if (aggFunction == null) {
+    throw new IllegalArgumentException(
+        String.format(Locale.ROOT, "Agg function cannot be null for %s", functionName));
+  }
   if (aggExternalFunctionRegistry.containsKey(functionName)) {
     logger.warn(
         String.format(
             Locale.ROOT, "Aggregate function %s is registered multiple times", functionName));
   }
   AggHandler handler =
       (distinct, field, argList, ctx) ->
           UserDefinedFunctionUtils.makeAggregateCall(
               aggFunction, List.of(field), argList, ctx.relBuilder);
   aggExternalFunctionRegistry.put(functionName, handler);
 }

428-475: Add JavaDoc for the public resolve methods.

These public APIs currently lack JavaDoc with @param/@return (and @throws if applicable).
As per coding guidelines, Public methods MUST have JavaDoc with @param, @return, and @throws.

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 1860-1917: The makeOver call in embedExistingCollationsIntoOver is
missing the required RexWindowExclusion argument; update the call in method
embedExistingCollationsIntoOver (where context.rexBuilder.makeOver is invoked
for RexOver) to insert RexWindowExclusion.EXCLUDE_NO_OTHER between
window.getUpperBound() and window.isRows(), preserving the rest of the
parameters (type, agg operator, operands, partitionKeys,
orderKeys/rexCollations, lowerBound, upperBound, exclude, isRows,
allowPartial(true), isNullWhenCountZero(false), over.isDistinct(),
over.ignoreNulls()) so the parameter ordering matches RexBuilder.makeOver.

In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`:
- Around line 306-313: Add a unit test that exercises
DynamicFieldsHelper.createEmptyMap to cover the empty-keys map creation path and
assert the returned RexNode is a non-null NULL literal of MAP type (i.e., not an
unexpected Java null and its type is a MAP of VARCHAR->VARCHAR). Create a test
in the corresponding test class (e.g., DynamicFieldsHelperTest) that constructs
or mocks a CalcitePlanContext with a RexBuilder/TypeFactory (or reuse existing
test helpers), calls createEmptyMap(context), and asserts the result is not null
and result.getType().isMap() (and that the map key/value types are VARCHAR).
Ensure the new test method name clearly indicates it verifies empty-keys map
creation (e.g., testCreateEmptyMapReturnsMapNullLiteral).
- Around line 293-303: Add unit tests covering the two new private helpers
createEmptyMap() and getFieldsAsMap(): write tests that exercise
createEmptyMap() returning a NULL-cast-to-MAP literal and verify consumer
behavior (e.g., MapAppendFunctionImpl.mapAppend() preserves non-null maps when
one arg is NULL), and tests for getFieldsAsMap() producing the expected
MAP_FROM_ARRAYS RexNode for given keys; also update the JavaDoc on
createEmptyMap() to clarify it returns a NULL literal cast to MAP (not a
concrete empty map) so the comment matches behavior. Ensure tests live alongside
other core Java tests and reference the helper methods by name (createEmptyMap,
getFieldsAsMap) when locating/covering the code.

In
`@core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java`:
- Around line 370-388: In OpenSearchTypeFactory, add an explicit early check for
the all-NULL case before the current conditional chain: if nullCount ==
types.size() return the NULL SQL type (or delegate to the existing
leastRestrictive behavior) instead of falling through to the date/time/ip
checks; update the method that builds udt (the block that calls
createUDT(ExprUDT.EXPR_DATE, ...), createUDT(...EXPR_TIME...),
createUDT(...EXPR_IP...), createUDT(...EXPR_BINARY...), and
createSqlType(SqlTypeName.VARCHAR, ...)) to handle the all-null case first so an
all-NULL input does not incorrectly yield ExprUDT.EXPR_DATE.
🧹 Nitpick comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java (1)

160-165: Make the dynamic-field test assert extracted values (not just 0 rows).

verifyNumOfRows(result, 0) can pass even if dynamic-field extraction regresses, and it’s brittle if the logs fixture changes. Consider asserting at least one known row/value (or inserting a deterministic fixture) so the test validates actual SPATH extraction behavior.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

1839-1846: Track the new TODO with an issue or reference.

This will make the push‑down regression easier to triage later. I can open a follow‑up issue if you want.

core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)

332-397: Add JavaDoc for the public leastRestrictive override.

Per coding guidelines, public methods should have JavaDoc with @param, @return, and @throws tags. This method implements important OpenSearch UDT semantics that differ from standard Calcite behavior and would benefit from documentation.

Additionally, the timestamp promotion logic on line 384 (binaryCount == 0 && ipCount == 0) includes cases where dateCount + timeCount + timestampCount + nullCount == types.size(), but timestampCount is never tracked. This appears intentional (any mix of date/time UDTs becomes timestamp), but documenting this behavior would improve clarity.

📝 Suggested JavaDoc
   @Override
+  /**
+   * Returns the least restrictive type among the given types, with special handling for OpenSearch
+   * UDTs.
+   *
+   * <p>For UDTs:
+   * <ul>
+   *   <li>Homogeneous DATE/TIME/IP/BINARY types (plus NULLs) return the corresponding UDT</li>
+   *   <li>Mixed DATE/TIME/TIMESTAMP types are promoted to TIMESTAMP</li>
+   *   <li>Mixed UDT types that cannot be unified return VARCHAR</li>
+   * </ul>
+   *
+   * <p>For standard SQL types, delegates to Calcite with PPL coercion rules, converting CHAR to
+   * VARCHAR to avoid result padding.
+   *
+   * @param types the list of types to find the least restrictive common type for
+   * @return the least restrictive type, or null if none exists
+   */
   public @Nullable RelDataType leastRestrictive(List<RelDataType> types) {

Comment on lines +1860 to +1917
  /**
   * Embed existing collation into window function's over clauses.
   *
   * <p>Window functions with frame specifications like {@code ROWS n PRECEDING} require ORDER BY to
   * determine row order. Without it, results are non-deterministic.
   *
   * <p>Without this fix, the initial plan has ORDER BY separate from window functions:
   *
   * <pre>
   * LogicalProject(SUM($5) OVER (ROWS 1 PRECEDING)) ← Missing ORDER BY
   * LogicalSort(sort0=[$5])
   * </pre>
   *
   * <p>This causes problems during validation as the order is not bound to the window. With this
   * fix, sort collations are embedded into each {@code RexOver} window:
   *
   * <pre>
   * LogicalProject(SUM($5) OVER (ORDER BY $5 ROWS 1 PRECEDING)) ← ORDER BY embedded
   * </pre>
   *
   * @param overExpressions Window function expressions (may contain nested {@link RexOver})
   * @param context Plan context for building RexNodes
   * @return Expressions with ORDER BY embedded in all window specifications
   */
  private List<RexNode> embedExistingCollationsIntoOver(
      List<RexNode> overExpressions, CalcitePlanContext context) {
    RelCollation existingCollation = context.relBuilder.peek().getTraitSet().getCollation();
    List<@NonNull RelFieldCollation> relCollations =
        existingCollation == null ? List.of() : existingCollation.getFieldCollations();
    ImmutableList<@NonNull RexFieldCollation> rexCollations =
        relCollations.stream()
            .map(f -> relCollationToRexCollation(f, context.relBuilder))
            .collect(ImmutableList.toImmutableList());
    return overExpressions.stream()
        .map(
            n ->
                n.accept(
                    new RexShuttle() {
                      @Override
                      public RexNode visitOver(RexOver over) {
                        RexWindow window = over.getWindow();
                        return context.rexBuilder.makeOver(
                            over.getType(),
                            over.getAggOperator(),
                            over.getOperands(),
                            window.partitionKeys,
                            rexCollations,
                            window.getLowerBound(),
                            window.getUpperBound(),
                            window.isRows(),
                            true,
                            false,
                            over.isDistinct(),
                            over.ignoreNulls());
                      }
                    }))
        .collect(Collectors.toList());
  }
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain: a series of shell verification scripts searched the repository (RexBuilder.makeOver call sites, RexOver usages, PlanUtils.makeOver, and the project's Calcite dependency declarations) to confirm the parameter mismatch described above; the script listings and their outputs are elided here.

Length of output: 168


🏁 Script executed:

#!/bin/bash
# Look for any usage of RexStandardizer which imports RexOver
sed -n '1,150p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java | head -80

Repository: opensearch-project/sql

Length of output: 3279


🏁 Script executed:

#!/bin/bash
# Search for how RexOver is used elsewhere in the codebase to understand available methods
rg -A5 "visitOver|RexOver" core/src/main/java/org/opensearch/sql/calcite/ | grep -E "over\.|RexOver" | head -40

Repository: opensearch-project/sql

Length of output: 1447


🏁 Script executed:

#!/bin/bash
# Search for tests of embedExistingCollationsIntoOver
rg -n "embedExistingCollationsIntoOver" --type=java

Repository: opensearch-project/sql

Length of output: 359


🏁 Script executed:

#!/bin/bash
# Check what parameters RexBuilder.makeOver accepts in Calcite 1.41.0
# by looking at all makeOver invocations with their parameter counts
rg -A20 "\.makeOver\(" core/src/main/java/org/opensearch/sql/calcite/ | grep -E "makeOver\(|,\s*\)|over\.|window\." | head -60

Repository: opensearch-project/sql

Length of output: 2140


🏁 Script executed:

#!/bin/bash
# Look for any documentation or comments about what allowPartial and isNullWhenCountZero mean
rg -B5 -A5 "allowPartial|nullWhenCountZero" . --type=java 2>/dev/null | head -40

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check if RexOver provides any getter methods beyond what we've already seen
rg -n "over\.get|over\.is" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -v "getWindow\|getType\|getAggOperator\|getOperands\|isDistinct\|ignoreNulls"

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Look more carefully at the exact makeOver signature by checking line counts
sed -n '1901,1913p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | cat -n

Repository: opensearch-project/sql

Length of output: 766


🏁 Script executed:

#!/bin/bash
# Check if there's an exclude parameter being used elsewhere
rg -n "RexWindowExclusion\|getExclude" --type=java | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Count the parameters in the ExtendedRelJson makeOver call more carefully
sed -n '448,465p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | cat -n

Repository: opensearch-project/sql

Length of output: 814


🏁 Script executed:

#!/bin/bash
# Search for documentation or comments about what parameters 9 and 10 represent
rg -B10 "true,\s*false," core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "^\s*\*|makeOver|true.*false" | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# The two makeOver calls have different signatures. Let's search for documentation or other usages
# to determine which is correct. Check if exclude is a newer addition
rg -n "exclude.*EXCLUDE" --type=java | head -20

Repository: opensearch-project/sql

Length of output: 218


🏁 Script executed:

#!/bin/bash
# Search for RexWindowExclusion usage
rg -n "RexWindowExclusion" --type=java

Repository: opensearch-project/sql

Length of output: 1408


🏁 Script executed:

#!/bin/bash
# Look at the imports in ExtendedRelJson to see if it uses exclude
head -50 opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | grep -E "import|RexWindow"

Repository: opensearch-project/sql

Length of output: 1888


🏁 Script executed:

#!/bin/bash
# Count total parameters by looking at ExtendedRelJson more carefully with all makeOver calls
rg -B5 -A20 "rexBuilder.makeOver" opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | head -80

Repository: opensearch-project/sql

Length of output: 1048


🏁 Script executed:

#!/bin/bash
# Check what RexWindow exposes - does it have getExclude or similar?
rg -n "window\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "window\.(get|is)" | head -20

Repository: opensearch-project/sql

Length of output: 229


🏁 Script executed:

#!/bin/bash
# Compare the parameter order and count: ExtendedRelJson has 13 params, CalciteRelNodeVisitor has 12
# Let's check if window has an exclude field
rg -n "getExclude\|exclusion" --type=java | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Let's look at what the original RexOver contains - does it have exclude info?
rg -B5 -A10 "class RexWindow\|RexWindow(" opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | head -40

Repository: opensearch-project/sql

Length of output: 170


🏁 Script executed:

#!/bin/bash
# Critical: Check if parameter mismatch - ExtendedRelJson has exclude param, CalciteRelNodeVisitor doesn't
# Let's see ALL the parameters being passed to understand the order
echo "=== CalciteRelNodeVisitor version ===" && sed -n '1901,1913p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java && echo "" && echo "=== ExtendedRelJson version ===" && sed -n '448,461p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java

Repository: opensearch-project/sql

Length of output: 1204


🏁 Script executed:

#!/bin/bash
# This is critical - the parameter lists don't match. Let's see if RexWindow has getExclude
rg -n "\.getExclude|Exclusion" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

Repository: opensearch-project/sql

Length of output: 223


🏁 Script executed:

#!/bin/bash
# Check if RexWindow exposes exclusion information
sed -n '1900,1915p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "window\."

Repository: opensearch-project/sql

Length of output: 264


🏁 Script executed:

#!/bin/bash
# Look for RexWindow.getExclude or similar in the whole codebase
rg -n "getExclude" --type=java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# The signatures don't match - CalciteRelNodeVisitor is missing the exclude parameter
# Let's verify the exact Calcite RexBuilder.makeOver signature by checking javadoc or other hints
# Search for how many parameters makeOver should have in different contexts
rg -n "makeOver" --type=java -A1 | grep -E "makeOver.*\(" | wc -l

Repository: opensearch-project/sql

Length of output: 68


🏁 Script executed:

#!/bin/bash
# Search online Calcite documentation or check if there's version info about when exclude was added
# For now, let's verify whether window object has exclusion info
rg -n "window\.get|RexWindow\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -20

Repository: opensearch-project/sql

Length of output: 179


🏁 Script executed:

#!/bin/bash
# Check if there are overloaded versions of makeOver in Calcite
# by looking at how it's being called in different ways
rg -B2 -A15 "rexBuilder\.makeOver" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -50

Repository: opensearch-project/sql

Length of output: 886


🏁 Script executed:

#!/bin/bash
# Check the git history to see if this code was recently added or if there's a comment about it
git log --oneline -20 core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java 2>/dev/null | head -20

Repository: opensearch-project/sql

Length of output: 134


🏁 Script executed:

#!/bin/bash
# Search for any TODO or FIXME comments near the embedExistingCollationsIntoOver method
sed -n '1850,1920p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "TODO|FIXME|BUG|XXX"

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check if the code compiles without errors by looking for any compilation markers
rg -n "PARAM_MISMATCH|parameter.*mismatch" --type=java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

web_search
Apache Calcite 1.41.0 RexBuilder.makeOver method signature parameters

Repository: opensearch-project/sql

Length of output: 161


🏁 Script executed:

#!/bin/bash
# Let's try to use web_search properly with the capability
# For now, check if there are any recent changes or PRs related to this
git log --all --oneline --grep="makeOver\|window\|collation" 2>/dev/null | head -20

Repository: opensearch-project/sql

Length of output: 1400


🏁 Script executed:

#!/bin/bash
# Check if RexWindow has an exclusion getter that should be used
rg -n "RexWindow\|class.*Window" core/src/main/java/org/opensearch/sql/calcite/ --type=java | grep -v ".class" | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Critical check: Does RexWindow from over.getWindow() have an exclusion?
# by checking what RexWindow methods are available
rg -n "over\.getWindow\(\)" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java -A5 | head -30

Repository: opensearch-project/sql

Length of output: 409


🏁 Script executed:

#!/bin/bash
# Let's check if there's a variant of makeOver that takes fewer parameters
# by looking at how PlanUtils helper methods wrap it
sed -n '200,250p' core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java | grep -E "makeOver|relBuilder\."

Repository: opensearch-project/sql

Length of output: 395


🏁 Script executed:

#!/bin/bash
# Critical discovery: the parameter lists differ significantly
# ExtendedRelJson has 14 params, CalciteRelNodeVisitor has 12
# AND ExtendedRelJson includes RexWindowExclusion which is missing from CalciteRelNodeVisitor
# Let's verify if the code even compiles by checking for any build errors
find . -name "*.log" -o -name "*error*" 2>/dev/null | head -10

Repository: opensearch-project/sql

Length of output: 454


🏁 Script executed:

#!/bin/bash
# Let's check the git commit that added embedExistingCollationsIntoOver
git log --all -p --follow -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -B20 -A20 "embedExistingCollationsIntoOver" | head -100

Repository: opensearch-project/sql

Length of output: 152


🏁 Script executed:

#!/bin/bash
# Check if there's any test that validates this method works correctly
find . -name "*Test.java" -type f | xargs grep -l "embedExistingCollationsIntoOver\|window.*collation" 2>/dev/null

Repository: opensearch-project/sql

Length of output: 50


🏁 Script executed:

#!/antml:tool_start
```web_search
Calcite 1.41.0 RexBuilder makeOver signature documentation
```
</tool_start>

Repository: opensearch-project/sql

Length of output: 267


🏁 Script executed:

#!/bin/bash
# Let me check the actual failing scenario more carefully
# The parameter mismatch is concerning - let's see if CalciteRelNodeVisitor was recently modified
git log --all --oneline -5 -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

Repository: opensearch-project/sql

Length of output: 367


🏁 Script executed:

#!/bin/bash
# Check the PR that added this method
git log --all -p -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -B50 "embedExistingCollationsIntoOver" | grep "^commit\|^Author\|^Date" | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# The critical issue: parameter 8 in CalciteRelNodeVisitor is window.isRows()
# but in ExtendedRelJson it's exclude (RexWindowExclusion)
# Let's verify what RexBuilder.makeOver actually expects
# by checking if window has an exclusion property
rg -n "getExclusion\|getExclude\|exclusion\|Exclusion" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/antml:thinking_start
Looking at the parameter discrepancy:

CalciteRelNodeVisitor version (12 params):
1. type
2. operator  
3. operands
4. partitionKeys
5. orderKeys (as rexCollations)
6. lowerBound
7. upperBound
8. isRows (boolean)
9. true
10. false
11. distinct
12. ignoreNulls

ExtendedRelJson version (13 params):
1. type
2. operator
3. operands
4. partitionKeys
5. orderKeys (as orderKeys)
6. lowerBound
7. upperBound
8. exclude (RexWindowExclusion)
9. physical (boolean)
10. true
11. false
12. distinct
13. false

The issue is clear: CalciteRelNodeVisitor is missing the `exclude` parameter and has the parameter types wrong. Parameter 8 should be RexWindowExclusion, not isRows().

This looks like a genuine bug in the embedExistingCollationsIntoOver method - it's missing the exclude parameter which should come from window.getExclusion() or default to EXCLUDE_NO_OTHER.

Let me search for whether RexWindow has a getExclusion method.
</thinking_start>

Repository: opensearch-project/sql

Length of output: 241


🏁 Script executed:

#!/antml:thinking_end

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check if RexWindow class has getExclusion method by looking at how it's used elsewhere
rg -n "window\.getExclusion|RexWindow" core/src/main/java/org/opensearch/sql/calcite/ --type=java | grep -i "exclusion" | head -20

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

#!/antml:thinking_start
The user hasn't provided results yet. Let me think about what I know:

1. ExtendedRelJson passes a RexWindowExclusion parameter to makeOver
2. It gets this from window.get("exclude") or defaults to RexWindowExclusion.EXCLUDE_NO_OTHER
3. CalciteRelNodeVisitor doesn't pass any RexWindowExclusion parameter
4. The parameter at position 8 in CalciteRelNodeVisitor is window.isRows() which seems wrong

This is a real bug - the makeOver signature requires RexWindowExclusion but the code doesn't provide it. The question is whether window.getExclusion() exists or if we need to default to EXCLUDE_NO_OTHER like ExtendedRelJson does.
</thinking_end>

Repository: opensearch-project/sql

Length of output: 287


Fix missing RexWindowExclusion parameter in makeOver call.

The makeOver call is missing a required parameter. Comparing with ExtendedRelJson.java (line 448), the Calcite RexBuilder.makeOver signature expects RexWindowExclusion at parameter position 8, but the current code passes window.isRows() instead.

The correct order should be:

  • type
  • operator
  • operands
  • partitionKeys
  • orderKeys
  • lowerBound
  • upperBound
  • exclude (RexWindowExclusion) — currently missing
  • followed by remaining parameters

Add the missing exclude parameter (defaulting to RexWindowExclusion.EXCLUDE_NO_OTHER like in ExtendedRelJson):

return context.rexBuilder.makeOver(
    over.getType(),
    over.getAggOperator(),
    over.getOperands(),
    window.partitionKeys,
    rexCollations,
    window.getLowerBound(),
    window.getUpperBound(),
    RexWindowExclusion.EXCLUDE_NO_OTHER,  // Add this
    window.isRows(),
    true,
    false,
    over.isDistinct(),
    over.ignoreNulls());

The original concern about preserving allowPartial and isNullWhenCountZero from the source RexOver is not actionable: the RexOver API does not expose these properties as getters, so hardcoding them is acceptable.

🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
around lines 1860 - 1917, The makeOver call in embedExistingCollationsIntoOver
is missing the required RexWindowExclusion argument; update the call in method
embedExistingCollationsIntoOver (where context.rexBuilder.makeOver is invoked
for RexOver) to insert RexWindowExclusion.EXCLUDE_NO_OTHER between
window.getUpperBound() and window.isRows(), preserving the rest of the
parameters (type, agg operator, operands, partitionKeys,
orderKeys/rexCollations, lowerBound, upperBound, exclude, isRows,
allowPartial(true), isNullWhenCountZero(false), over.isDistinct(),
over.ignoreNulls()) so the parameter ordering matches RexBuilder.makeOver.

Comment on lines 293 to 303
    // Handle empty keys case - Calcite ARRAY[] requires at least 1 element
    if (keys.isEmpty()) {
      return createEmptyMap(context);
    }
    RexNode keysArray = getStringLiteralArray(keys, context);
    List<RexNode> values =
        keys.stream().map(key -> context.relBuilder.field(key)).collect(Collectors.toList());
    keys.stream()
        .map(key -> context.relBuilder.cast(context.relBuilder.field(key), SqlTypeName.VARCHAR))
        .collect(Collectors.toList());
    RexNode valuesArray = makeArray(values, context);
    return context.rexBuilder.makeCall(BuiltinFunctionName.MAP_FROM_ARRAYS, keysArray, valuesArray);
⚠️ Potential issue | 🟠 Major


Add unit tests for new helper functions per coding guidelines.

The coding guidelines require that new functions in core/src/main/java/**/*.java must have unit tests in the same commit. Both createEmptyMap() and getFieldsAsMap() are new private utilities and need test coverage.

Additionally, note that createEmptyMap() returns a NULL literal (not an actual empty map), which is semantically different from its JavaDoc comment "Create an empty map by casting NULL to MAP type". However, this is not a functional bug: MAP_APPEND has explicit null-handling logic (see MapAppendFunctionImpl.mapAppend() lines 69-81) that returns the non-null map when one argument is NULL, so the dynamic fields map will not be wiped out. The existing test suite confirms this behavior (MapAppendFunctionIT lines 87-113). Still, consider updating the comment for clarity.

🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`
around lines 293 - 303, Add unit tests covering the two new private helpers
createEmptyMap() and getFieldsAsMap(): write tests that exercise
createEmptyMap() returning a NULL-cast-to-MAP literal and verify consumer
behavior (e.g., MapAppendFunctionImpl.mapAppend() preserves non-null maps when
one arg is NULL), and tests for getFieldsAsMap() producing the expected
MAP_FROM_ARRAYS RexNode for given keys; also update the JavaDoc on
createEmptyMap() to clarify it returns a NULL literal cast to MAP (not a
concrete empty map) so the comment matches behavior. Ensure tests live alongside
other core Java tests and reference the helper methods by name (createEmptyMap,
getFieldsAsMap) when locating/covering the code.
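The null-handling contract described above can be illustrated with a plain-Java sketch. The method below mirrors the behavior attributed to MapAppendFunctionImpl.mapAppend() (return the non-null side when one argument is NULL), but it is a standalone assumption for illustration, not the actual UDF code:

```java
import java.util.HashMap;
import java.util.Map;

public class MapAppendSketch {
  // Hypothetical stand-in for the UDF's null handling: when one side is null,
  // return the other side unchanged, so a NULL "empty map" literal does not
  // wipe out the accumulated dynamic-fields map.
  static Map<String, String> mapAppend(Map<String, String> left, Map<String, String> right) {
    if (left == null) {
      return right;
    }
    if (right == null) {
      return left;
    }
    Map<String, String> merged = new HashMap<>(left);
    merged.putAll(right);
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> fields = new HashMap<>();
    fields.put("status", "200");

    // Appending the NULL "empty map" leaves the existing map intact.
    System.out.println(mapAppend(fields, null));  // prints {status=200}
    System.out.println(mapAppend(null, fields));  // prints {status=200}
    System.out.println(mapAppend(fields, Map.of("k", "v")).size());  // prints 2
  }
}
```

This is exactly the property a unit test for createEmptyMap()'s consumers should pin down: merging with the NULL literal must be an identity operation, not a reset.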

Comment on lines +306 to +313
/** Create an empty map by casting NULL to MAP type */
private static RexNode createEmptyMap(CalcitePlanContext context) {
RelDataType varcharType =
context.rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR);
RelDataType mapType =
context.rexBuilder.getTypeFactory().createMapType(varcharType, varcharType);
return context.rexBuilder.makeNullLiteral(mapType);
}

⚠️ Potential issue | 🟠 Major

Add unit test coverage for empty-keys map creation.

A new helper function was added to core/src/main/java; please add a unit test that exercises the empty-keys path (and ensures _MAP is not unexpectedly NULL).

As per coding guidelines: New functions MUST have unit tests in the same commit.

🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`
around lines 306 - 313, Add a unit test that exercises
DynamicFieldsHelper.createEmptyMap to cover the empty-keys map creation path and
assert the returned RexNode is a non-null NULL literal of MAP type (i.e., not an
unexpected Java null and its type is a MAP of VARCHAR->VARCHAR). Create a test
in the corresponding test class (e.g., DynamicFieldsHelperTest) that constructs
or mocks a CalcitePlanContext with a RexBuilder/TypeFactory (or reuse existing
test helpers), calls createEmptyMap(context), and asserts the result is not null
and result.getType().isMap() (and that the map key/value types are VARCHAR).
Ensure the new test method name clearly indicates it verifies empty-keys map
creation (e.g., testCreateEmptyMapReturnsMapNullLiteral).

Comment on lines +370 to +388
if (dateCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_DATE, nullable);
} else if (timeCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_TIME, nullable);
}
// There are cases where UDT IP interleaves with its intermediate SQL type for validation
// OTHER, we check otherCount to patch such cases
else if (ipCount + nullCount == types.size() || otherCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_IP, nullable);
} else if (binaryCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_BINARY, nullable);
}
// There exists a mix of time, date, and timestamp (and optionally null)
else if (binaryCount == 0 && ipCount == 0) {
udt = createUDT(ExprUDT.EXPR_TIMESTAMP, nullable);
} else {
udt = createSqlType(SqlTypeName.VARCHAR, nullable);
}
return udt;
@coderabbitai coderabbitai bot Jan 29, 2026

⚠️ Potential issue | 🟡 Minor

Edge case: all-NULL input may incorrectly return DATE UDT.

When types contains only NULL types, nullCount == types.size(), so dateCount + nullCount == types.size() evaluates to true (since dateCount == 0) and the method returns EXPR_DATE. This is unlikely to be the intended behavior: an all-NULL list should probably return the NULL type or delegate to the standard leastRestrictive.
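The arithmetic behind this edge case is easy to reproduce in isolation. The counts below are hypothetical stand-ins for the locals in the method under review, not the actual code:

```java
public class AllNullEdgeCase {
  public static void main(String[] args) {
    // Simulate an input list containing only NULL types.
    int size = 3;       // types.size()
    int nullCount = 3;  // every element is NULL
    int dateCount = 0;  // no DATE element at all

    // The unguarded condition matches even though no DATE is present,
    // so the method would pick EXPR_DATE for an all-NULL list.
    boolean unguarded = dateCount + nullCount == size;

    // Requiring at least one DATE element closes the hole.
    boolean guarded = dateCount > 0 && dateCount + nullCount == size;

    System.out.println(unguarded);  // prints true
    System.out.println(guarded);    // prints false
  }
}
```

The same `count > 0` guard applies to each branch of the chain, which is what the proposed fix below adds.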

🐛 Proposed fix
       if (anyCount == 0) {
         RelDataType udt;
         boolean nullable = nullableCount > 0 || nullCount > 0;
-        if (dateCount + nullCount == types.size()) {
+        if (dateCount > 0 && dateCount + nullCount == types.size()) {
           udt = createUDT(ExprUDT.EXPR_DATE, nullable);
-        } else if (timeCount + nullCount == types.size()) {
+        } else if (timeCount > 0 && timeCount + nullCount == types.size()) {
           udt = createUDT(ExprUDT.EXPR_TIME, nullable);
         }
         // There are cases where UDT IP interleaves with its intermediate SQL type for validation
         // OTHER, we check otherCount to patch such cases
-        else if (ipCount + nullCount == types.size() || otherCount + nullCount == types.size()) {
+        else if ((ipCount > 0 && ipCount + nullCount == types.size()) || (otherCount > 0 && otherCount + nullCount == types.size())) {
           udt = createUDT(ExprUDT.EXPR_IP, nullable);
-        } else if (binaryCount + nullCount == types.size()) {
+        } else if (binaryCount > 0 && binaryCount + nullCount == types.size()) {
           udt = createUDT(ExprUDT.EXPR_BINARY, nullable);
         }
🤖 Prompt for AI Agents
In
`@core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java`
around lines 370 - 388, In OpenSearchTypeFactory, add an explicit early check
for the all-NULL case before the current conditional chain: if nullCount ==
types.size() return the NULL SQL type (or delegate to the existing
leastRestrictive behavior) instead of falling through to the date/time/ip
checks; update the method that builds udt (the block that calls
createUDT(ExprUDT.EXPR_DATE, ...), createUDT(...EXPR_TIME...),
createUDT(...EXPR_IP...), createUDT(...EXPR_BINARY...), and
createSqlType(SqlTypeName.VARCHAR, ...)) to handle the all-null case first so an
all-NULL input does not incorrectly yield ExprUDT.EXPR_DATE.

Collaborator Author

@yuancu yuancu Jan 29, 2026


@coderabbitai nope, all-null cases won't pass the initial condition if (types.stream().anyMatch(OpenSearchTypeUtil::isUserDefinedType)), thus won't fall into the date UDT case.

Contributor


@yuancu You’re right—thanks for the correction. Because the UDT branch in OpenSearchTypeFactory.leastRestrictive() executes only when at least one input type is a UDT (via OpenSearchTypeUtil.isUserDefinedType), an all-NULL list won’t enter that block and will fall back to the standard leastRestrictive(types, PplTypeCoercionRule.assignmentInstance()) path. The earlier suggestion about EXPR_DATE on all-NULL was incorrect—no change needed.

As a minor guardrail, consider a unit test asserting the all-NULL case takes the standard path, but otherwise this looks good.


✏️ Learnings added
Learnt from: yuancu
Repo: opensearch-project/sql PR: 4892
File: core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java:370-388
Timestamp: 2026-01-29T06:05:23.585Z
Learning: OpenSearchTypeFactory.leastRestrictive(List<RelDataType>) enters the UDT-specific logic only if any input is a user-defined type (OpenSearchTypeUtil.isUserDefinedType). An all-NULL input list does not enter this branch and uses the standard leastRestrictive with PplTypeCoercionRule.assignmentInstance() instead.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/calcite/**/*.java : Ensure version compatibility with Calcite in integration code

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/calcite/**/*.java : Document Calcite-specific workarounds in code comments

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/*.java : Maintain Java 11 compatibility when possible for OpenSearch 2.x support

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)

393-413: Add complete JavaDoc for these public APIs.

resolve/resolveAgg lack JavaDoc, and the registerExternal* methods don’t include full tag coverage. Please add @param, @return, and @throws as required.
As per coding guidelines: Public methods MUST have JavaDoc with @param, @return, and @throws.

Also applies to: 428-475

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 1899-1913: The current embedExistingCollationsIntoOver logic
always overwrites window ORDER BY keys (rexCollations) when building RexOver,
dropping any explicit window ordering; change the visitOver in
embedExistingCollationsIntoOver (the code that calls context.rexBuilder.makeOver
with rexCollations) to first inspect the RexWindow (window) for existing order
keys and only use the plan collation (rexCollations) when the window has no
explicit ordering, otherwise pass the window's existing order keys through
unchanged so explicit window ORDER BY is preserved.

In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java`:
- Around line 477-483: The file references UDFOperandMetadata in the
INTERNAL_PATTERN declaration inside PPLBuiltinOperators
(createUserDefinedAggFunction call for LogPatternAggFunction "pattern") but
missing its import causes compilation failure; add the proper import for
UDFOperandMetadata at the top of the file (matching the package where
UDFOperandMetadata is defined) so the reference in the INTERNAL_PATTERN constant
resolves.
- Around line 494-520: Add unit tests covering the MAP_FROM_ARRAYS function
declared as PPLBuiltinOperators.MAP_FROM_ARRAYS: create tests that (1) verify
correct map output when given two arrays of matching length (e.g., keys and
values), (2) assert behavior with empty arrays, (3) assert handling of null
inputs and null elements, and (4) validate behavior when key/value arrays have
mismatched lengths (expected error or defined behavior). Put tests in the same
test suite that covers other PPLBuiltinOperators (reuse existing test utilities
used for ATAN/REGEXP/MAP_CONCAT) and invoke the function through the same
evaluation path used by other function tests so the SqlFunction resolution and
type handling (SqlLibraryOperators -> PPLBuiltinOperators.MAP_FROM_ARRAYS) are
exercised.

In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java`:
- Around line 414-425: registerExternalAggOperator currently calls
UserDefinedFunctionUtils.makeAggregateCall with List.of(field) which NPEs for
zero-arg aggregates when field is null; modify the AggHandler lambda in
registerExternalAggOperator to pass an empty list (e.g., Collections.emptyList()
or List.of()) when field==null and List.of(field) only when field!=null, then
call makeAggregateCall(aggFunction, fieldList, argList, ctx.relBuilder) so
zero-arg external aggregates (like count()) are handled safely; keep the same
functionName->handler registration in aggExternalFunctionRegistry.

Comment on lines +1899 to +1913
public RexNode visitOver(RexOver over) {
RexWindow window = over.getWindow();
return context.rexBuilder.makeOver(
over.getType(),
over.getAggOperator(),
over.getOperands(),
window.partitionKeys,
rexCollations,
window.getLowerBound(),
window.getUpperBound(),
window.isRows(),
true,
false,
over.isDistinct(),
over.ignoreNulls());

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full context around the visitOver method
cat -n core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | sed -n '1880,1930p'

Repository: opensearch-project/sql

Length of output: 2823


🏁 Script executed:

# Search for where rexCollations is defined in the visitOver context
grep -n "rexCollations" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -20

Repository: opensearch-project/sql

Length of output: 180


🏁 Script executed:

# Look for the broader method context to understand how rexCollations is populated
ast-grep --pattern 'visitOver($$$) {
  $$$
}'

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

# Search for RexWindow.orderKeys references
rg "window\.orderKeys" core/src/main/java/org/opensearch/sql/calcite/

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

# Search for RexWindow creation or usage patterns
rg "RexWindow" core/src/main/java/org/opensearch/sql/calcite/ -A 3 -B 1

Repository: opensearch-project/sql

Length of output: 14719


🏁 Script executed:

# Look for similar patterns of window handling elsewhere in the file
rg "makeOver\(" core/src/main/java/org/opensearch/sql/calcite/ -B 5 -A 5

Repository: opensearch-project/sql

Length of output: 12259


🏁 Script executed:

# Check if there are any imports or usage of RexWindow that show field names
grep -n "import.*RexWindow\|RexWindow\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

Repository: opensearch-project/sql

Length of output: 159


🏁 Script executed:

# Search for window.orderKeys usage in the codebase
rg "\.orderKeys" core/src/main/java/ -t java

Repository: opensearch-project/sql

Length of output: 48


🏁 Script executed:

# Search for window field access patterns
rg "window\.(partition|order|collation)" core/src/main/java/ -t java

Repository: opensearch-project/sql

Length of output: 188


🏁 Script executed:

# Check pom.xml for Calcite version
cat pom.xml | grep -A 2 "calcite" | head -20

Repository: opensearch-project/sql

Length of output: 105


🌐 Web query:

Apache Calcite RexWindow class API orderKeys field structure

💡 Result:

In org.apache.calcite.rex.RexWindow, the orderKeys field is:

  • public final ImmutableList<RexFieldCollation> orderKeys — an ordered list describing the window’s ORDER BY items. [1]

Each element (RexFieldCollation) is structured as:

  • RexFieldCollation extends Pair<RexNode, ImmutableSet<SqlKind>>
    • left (RexNode): the sort expression (e.g., an input ref, computed expression, etc.)
    • right (Set<SqlKind>): sort flags such as DESCENDING and null-direction flags (e.g., NULLS FIRST/LAST) [2]

Convenience accessors on RexFieldCollation interpret those flags as enums:

  • getDirection()RelFieldCollation.Direction (ASC/DESC)
  • getNullDirection()RelFieldCollation.NullDirection [2]

Sources: [1] [2]


🏁 Script executed:

# Search for RexBuilder.makeOver method to see its signature
rg "makeOver\(" --type java -B 2 -A 8 | head -100

Repository: opensearch-project/sql

Length of output: 10225


🏁 Script executed:

# Look for patterns where window.orderKeys is checked or used
find . -name "*.java" -type f | xargs grep -l "orderKeys" 2>/dev/null | head -10

Repository: opensearch-project/sql

Length of output: 322


🏁 Script executed:

# Search for the method name "embedExistingCollationsIntoOver" and surrounding context
rg "embedExistingCollationsIntoOver" -B 5 -A 30

Repository: opensearch-project/sql

Length of output: 8562


🌐 Web query:

RexBuilder.makeOver signature parameter orderKeys Calcite

💡 Result:

In Apache Calcite, RexBuilder.makeOver takes orderKeys after partitionKeys.

Signature (current javadocAggregate):
makeOver(RelDataType type, SqlAggFunction operator, List<RexNode> exprs, List<RexNode> partitionKeys, ImmutableList<RexFieldCollation> orderKeys, RexWindowBound lowerBound, RexWindowBound upperBound, RexWindowExclusion exclude, boolean rows, boolean allowPartial, boolean nullWhenCountZero, boolean distinct, boolean ignoreNulls) [1]

Older/shorter overload (no exclude, no ignoreNulls):
makeOver(RelDataType type, SqlAggFunction operator, List<RexNode> exprs, List<RexNode> partitionKeys, ImmutableList<RexFieldCollation> orderKeys, RexWindowBound lowerBound, RexWindowBound upperBound, boolean physical, boolean allowPartial, boolean nullWhenCountZero, boolean distinct) [2]

So the relevant order is: exprs, partitionKeys, orderKeys, then bounds/flags. [1][2]


Preserve existing window ORDER BY when embedding plan collations.

The embedExistingCollationsIntoOver method unconditionally replaces all RexOver order keys with the plan collation, dropping any explicit window ORDER BY. This can change query semantics. Only inject the plan collation when the window lacks existing order keys.
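The fallback rule can be sketched standalone with plain strings standing in for Calcite's RexFieldCollation (a simplification, not the actual visitor code): inject the plan collation only when the window carries no explicit ORDER BY of its own.

```java
import java.util.List;

// Standalone sketch of the suggested fallback: use the plan collation only
// when the window has no explicit ORDER BY; otherwise preserve the window's
// own order keys. Plain strings stand in for RexFieldCollation here.
public class OrderKeyFallbackDemo {
  static List<String> chooseOrderKeys(List<String> windowOrderKeys, List<String> planCollations) {
    return windowOrderKeys.isEmpty() ? planCollations : windowOrderKeys;
  }

  public static void main(String[] args) {
    List<String> planCollations = List.of("ts ASC");

    // Window with an explicit ORDER BY: its own keys must be preserved.
    System.out.println(chooseOrderKeys(List.of("score DESC"), planCollations));
    // Window without ORDER BY: fall back to the plan collation.
    System.out.println(chooseOrderKeys(List.of(), planCollations));
  }
}
```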

🛠️ Suggested fix
-                        return context.rexBuilder.makeOver(
+                        ImmutableList<@NonNull RexFieldCollation> orderKeys =
+                            window.orderKeys.isEmpty() ? rexCollations : window.orderKeys;
+                        return context.rexBuilder.makeOver(
                             over.getType(),
                             over.getAggOperator(),
                             over.getOperands(),
                             window.partitionKeys,
-                            rexCollations,
+                            orderKeys,
                             window.getLowerBound(),
                             window.getUpperBound(),
                             window.isRows(),
                             true,
                             false,
                             over.isDistinct(),
                             over.ignoreNulls());
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
around lines 1899 - 1913, The current embedExistingCollationsIntoOver logic
always overwrites window ORDER BY keys (rexCollations) when building RexOver,
dropping any explicit window ordering; change the visitOver in
embedExistingCollationsIntoOver (the code that calls context.rexBuilder.makeOver
with rexCollations) to first inspect the RexWindow (window) for existing order
keys and only use the plan collation (rexCollations) when the window has no
explicit ordering, otherwise pass the window's existing order keys through
unchanged so explicit window ORDER BY is preserved.

Comment on lines 414 to +425
public void registerExternalAggOperator(
BuiltinFunctionName functionName, SqlUserDefinedAggFunction aggFunction) {
PPLTypeChecker typeChecker =
wrapSqlOperandTypeChecker(aggFunction.getOperandTypeChecker(), functionName.name(), true);
CalciteFuncSignature signature = new CalciteFuncSignature(functionName.getName(), typeChecker);
if (aggExternalFunctionRegistry.containsKey(functionName)) {
logger.warn(
String.format(
Locale.ROOT, "Aggregate function %s is registered multiple times", functionName));
}
AggHandler handler =
(distinct, field, argList, ctx) ->
UserDefinedFunctionUtils.makeAggregateCall(
aggFunction, List.of(field), argList, ctx.relBuilder);
aggExternalFunctionRegistry.put(functionName, Pair.of(signature, handler));
}

public List<RexNode> validateAggFunctionSignature(
BuiltinFunctionName functionName,
RexNode field,
List<RexNode> argList,
RexBuilder rexBuilder) {
var implementation = getImplementation(functionName);
return validateFunctionArgs(implementation, functionName, field, argList, rexBuilder);
aggExternalFunctionRegistry.put(functionName, handler);

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, let's examine the file structure and locate the method in question
cd "$(git rev-parse --show-toplevel)" 2>/dev/null || cd .

# Find the file and show context around the method
fd -t f "PPLFuncImpTable.java" --exec cat -n {} \; | head -100

Repository: opensearch-project/sql

Length of output: 8647


🏁 Script executed:

#!/bin/bash
# Search for all calls to registerExternalAggOperator
rg -n "registerExternalAggOperator" --type=java -C 5

Repository: opensearch-project/sql

Length of output: 3156


🏁 Script executed:

#!/bin/bash
# Search for AggHandler usages to understand how field is passed
rg -n "AggHandler" --type=java -C 3

Repository: opensearch-project/sql

Length of output: 13249


🏁 Script executed:

#!/bin/bash
# Look for resolveAgg and similar methods that invoke handlers
rg -n "aggExternalFunctionRegistry|resolveAgg" --type=java -C 5

Repository: opensearch-project/sql

Length of output: 10778


🏁 Script executed:

#!/bin/bash
# Look at the built-in aggregate handler implementations in AggBuilder
cat -n core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java | sed -n '996,1050p'

Repository: opensearch-project/sql

Length of output: 2934


🏁 Script executed:

#!/bin/bash
# Search for calls to resolveAgg to see how field is passed
rg -n "\.resolveAgg\(" --type=java -B 2 -A 2

Repository: opensearch-project/sql

Length of output: 574


🏁 Script executed:

#!/bin/bash
# Search for zero-arg aggregate functions or COUNT() without field
rg -n "COUNT|count\(\s*\)" --type=java | head -30

Repository: opensearch-project/sql

Length of output: 4974


🏁 Script executed:

#!/bin/bash
# Look for how COUNT(*) is handled in plan building
rg -n "RexBuilder\|relBuilder.*COUNT\|COUNT.*null" --type=java -B 2 -A 2 | head -50

Repository: opensearch-project/sql

Length of output: 48


Guard against null field for external aggregates.

List.of(field) throws NPE when field is null, which occurs for zero-arg aggregates like count(). The built-in COUNT() aggregate explicitly handles this case; external aggregates should follow the same pattern to prevent planner crashes.
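The failure mode and the guard can be demonstrated standalone (a minimal sketch, independent of the actual PPLFuncImpTable code): List.of rejects null elements with a NullPointerException, so a zero-arg aggregate whose field is null needs the explicit check.

```java
import java.util.Collections;
import java.util.List;

// Standalone demonstration of why the guard is needed: List.of(x) throws
// NullPointerException for a null element, so an unguarded call would crash
// the planner for zero-arg aggregates like count().
public class NullFieldGuardDemo {
  static List<Object> guardedFieldList(Object field) {
    return field == null ? Collections.emptyList() : List.of(field);
  }

  public static void main(String[] args) {
    Object field = null; // zero-arg aggregate: no field expression
    try {
      List.of(field); // throws NullPointerException
      System.out.println("List.of(null) succeeded");
    } catch (NullPointerException e) {
      System.out.println("List.of(null) -> NullPointerException");
    }
    System.out.println("guarded -> " + guardedFieldList(field)); // guarded -> []
  }
}
```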

🛠️ Suggested fix
-    AggHandler handler =
-        (distinct, field, argList, ctx) ->
-            UserDefinedFunctionUtils.makeAggregateCall(
-                aggFunction, List.of(field), argList, ctx.relBuilder);
+    AggHandler handler =
+        (distinct, field, argList, ctx) -> {
+          List<RexNode> fields = field == null ? List.of() : List.of(field);
+          return UserDefinedFunctionUtils.makeAggregateCall(
+              aggFunction, fields, argList, ctx.relBuilder);
+        };
🤖 Prompt for AI Agents
In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java`
around lines 414 - 425, registerExternalAggOperator currently calls
UserDefinedFunctionUtils.makeAggregateCall with List.of(field) which NPEs for
zero-arg aggregates when field is null; modify the AggHandler lambda in
registerExternalAggOperator to pass an empty list (e.g., Collections.emptyList()
or List.of()) when field==null and List.of(field) only when field!=null, then
call makeAggregateCall(aggFunction, fieldList, argList, ctx.relBuilder) so
zero-arg external aggregates (like count()) are handled safely; keep the same
functionName->handler registration in aggExternalFunctionRegistry.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.java`:
- Line 47: Add unit tests for PplTypeCoercion and add missing JavaDoc: write
focused JUnit tests exercising PplTypeCoercion (class PplTypeCoercion) methods
including null inputs, error paths and boundary cases for datetime and IP
conversions, numeric widening, and SAFE_CAST behavior (including safe/unsafe
results), and assert expected types/results and exceptions; also add JavaDoc to
the PplTypeCoercion constructor and all public methods with clear descriptions
and `@param`/`@return` (and `@throws` where applicable) to match project standards so
reviewers can see intended behavior and edge-case handling.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/OpenSearchSparkSqlDialectTest.java`:
- Around line 30-41: Add two unit tests in OpenSearchSparkSqlDialectTest that
call OpenSearchSparkSqlDialect.DEFAULT.getCastSpec: one that passes a non‑IP
RelDataType (e.g., TYPE_FACTORY.createSqlType(SqlTypeName.INTEGER)) and asserts
the method returns null (or not a SqlDataTypeSpec) for unsupported types, and
another that calls getCastSpec(null) and asserts it throws a
NullPointerException (using assertThrows). Reference the existing test pattern
that inspects SqlDataTypeSpec (castSpec/SqlDataTypeSpec) so the new tests mirror
the IP test but cover non‑IP and null/boundary cases.

In
`@core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRuleTest.java`:
- Around line 40-50: The test testVarcharToOtherCoercion's comment is incorrect:
it states "coercible from OTHER" while the code checks
mapping.get(SqlTypeName.OTHER) contains VARCHAR/CHAR (i.e., VARCHAR can be
coerced to OTHER); update the inline comment above the assertion to correctly
read that VARCHAR is coercible to OTHER (or reword to "OTHER should allow
coercion to VARCHAR/CHAR") so the comment matches the logic in
PplTypeCoercionRule.instance() and mapping.get(SqlTypeName.OTHER).

In
`@core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/SkipRelValidationShuttleTest.java`:
- Around line 33-36: Add unit tests to SkipRelValidationShuttleTest to cover
NULL, empty, and invalid cases for the skip predicate logic: add tests that pass
null operands to the shuttle (e.g., simulate a RexCall or RexNode with a null
operand), tests that pass an empty operand list, and negative tests for invalid
aggregate-input shapes (e.g., a single CASE expression or a non-project input)
to assert the shuttle rejects or throws as expected; create clearly named test
methods (e.g., testSkipPredicateWithNullOperand,
testSkipPredicateWithEmptyOperands, testSkipPredicateWithInvalidAggregateInput)
and exercise the same validation paths used by the existing happy-path tests so
you get NULL/boundary/error coverage for SkipRelValidationShuttle and relevant
methods in that class.

In
`@core/src/test/java/org/opensearch/sql/calcite/validate/ValidationUtilsTest.java`:
- Around line 209-232: Add a unit test in ValidationUtilsTest to cover a
null-exception input to ValidationUtils.tolerantValidationException: create a
test method (e.g., testTolerantValidationExceptionNullException) that passes
null as the exception argument and asserts the expected boolean result (likely
false based on other tests); place it alongside the existing tests in the same
class so the null-exception case is covered per the NULL input testing
guideline.
- Around line 98-207: Add null-input unit tests for
ValidationUtils.createUDTWithAttributes: verify that passing sourceType == null
and typeName == null to both overloads (the one taking ExprUDT and the one
taking SqlTypeName) throws a NullPointerException (use
assertThrows(NullPointerException.class, ...)). Add four tests in
ValidationUtilsTest such as testCreateUDTWithAttributes_NullSourceType_ExprUDT,
testCreateUDTWithAttributes_NullTypeName_ExprUDT,
testCreateUDTWithAttributes_NullSourceType_SqlTypeName, and
testCreateUDTWithAttributes_NullTypeName_SqlTypeName that call
ValidationUtils.createUDTWithAttributes(TYPE_FACTORY or null,
null/ExprUDT.EXPR_DATE) and the SqlTypeName variant accordingly and assert the
expected exception.
🧹 Nitpick comments (3)
core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/SqlRewriteShuttleTest.java (2)

36-94: Consider adding NULL input test for identifier visit.

Per coding guidelines, NULL input tests should be included for all new functions. Consider adding a test for how shuttle.visit() handles a null SqlIdentifier input, or document that null inputs are not expected at this layer.

@Test
public void testVisitIdentifierWithNull() {
    SqlNode result = shuttle.visit((SqlIdentifier) null);
    // Assert expected behavior (null return, exception, etc.)
}

174-189: Consider adding symmetric test for NOT IN with non-SqlNodeList.

For completeness and symmetry with the IN tests, consider adding a test that verifies NOT IN with a non-SqlNodeList first operand also remains unchanged.

@Test
public void testVisitCallNotInWithNonSqlNodeListUnchanged() {
    SqlIdentifier column = new SqlIdentifier("my_column", SqlParserPos.ZERO);
    SqlIdentifier subquery = new SqlIdentifier("subquery", SqlParserPos.ZERO);
    SqlBasicCall notInCall =
        new SqlBasicCall(SqlStdOperatorTable.NOT_IN, List.of(column, subquery), SqlParserPos.ZERO);

    SqlNode result = shuttle.visit(notInCall);

    assertInstanceOf(SqlBasicCall.class, result);
    SqlBasicCall resultCall = (SqlBasicCall) result;
    assertEquals(SqlKind.NOT_IN, resultCall.getKind());
    assertInstanceOf(SqlIdentifier.class, resultCall.getOperandList().get(0));
}
core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRuleTest.java (1)

17-67: Consider adding tests for lenientInstance() and assignmentInstance().

The PplTypeCoercionRule class exposes three public factory methods: instance(), lenientInstance(), and assignmentInstance(). Currently, only instance() is tested. Adding similar tests for the other two would ensure the PPL-specific mappings are correctly merged with both SqlTypeCoercionRule.lenientInstance() and SqlTypeAssignmentRule.instance() base mappings.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java`:
- Around line 173-184: The test method
testLeastRestrictive_ipUdtAndOther_returnsIpUdt is using SqlTypeName.NULL for
the second type which duplicates the IP+NULL scenario; change the second type to
SqlTypeName.OTHER by creating RelDataType otherType =
TYPE_FACTORY.createSqlType(SqlTypeName.OTHER) and pass List.of(ipUdt, otherType)
into TYPE_FACTORY.leastRestrictive so the code exercises the IP+OTHER branch;
update assertions remain the same (use OpenSearchTypeUtil.isIp and
result.isNullable()) and ensure the variable name (otherType) replaces nullType
in that call.
- Around line 20-22: Add an integration-style test method (e.g.,
testLeastRestrictiveViaSqlPipeline) to OpenSearchTypeFactoryTest that drives
leastRestrictive through Calcite's SQL->validation->Rel conversion pipeline:
construct a test schema with the same UDT types used by the existing unit tests,
parse a simple SQL SELECT that uses expressions/columns with those UDTs, run the
SqlValidator (using the test's SqlValidator and
OpenSearchSqlOperatorTable/SqlStdOperatorTable), convert the validated SqlNode
to a RelNode via SqlToRelConverter, and assert that the resulting RelNode types
(or the validator/typeFactory outputs) resolved the common type using
leastRestrictive as expected; reference the class OpenSearchTypeFactoryTest, the
type factory instance used in existing tests, and the Calcite pipeline classes
SqlParser/SqlValidator/SqlToRelConverter to locate where to integrate this new
test.
- Around line 24-278: Add three unit tests to OpenSearchTypeFactoryTest that
cover null/empty inputs for TYPE_FACTORY.leastRestrictive: (1)
testLeastRestrictive_nullInput_returnsNull — call
TYPE_FACTORY.leastRestrictive(null) and assertNull(result); (2)
testLeastRestrictive_listWithNullElement_returnsNull — call
TYPE_FACTORY.leastRestrictive(List.of(TYPE_FACTORY.createUDT(ExprUDT.EXPR_DATE),
null)) and assertNull(result); (3) testLeastRestrictive_emptyList_returnsNull —
call TYPE_FACTORY.leastRestrictive(List.of()) and assertNull(result). Name each
test accordingly and use the existing TYPE_FACTORY and ExprUDT symbols so they
integrate with the current test suite.

In
`@core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionTest.java`:
- Around line 101-113: The test testImplicitCast_preservesNullability currently
checks nullableResult but misses asserting nonNullableResult's nullability;
update the test in PplTypeCoercionTest (method
testImplicitCast_preservesNullability) to assert that nonNullableResult is
non-nullable by adding an assertion like
assertFalse(nonNullableResult.isNullable()) after the existing checks on
nonNullableResult, using the same typeCoercion.implicitCast invocation to verify
non-nullable input yields non-nullable output.
🧹 Nitpick comments (5)
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeUtilTest.java (2)

19-45: Consider adding NULL input tests for type-checking methods.

Only isScalar (line 342) includes a NULL input test. Per the coding guidelines, NULL input tests should be included for all new functions. Consider adding NULL input tests for isUserDefinedType, isNumericOrCharacter, isDatetime, isDate, isTimestamp, isTime, isCharacter, isIp, and isBinary to verify behavior when passed null.

Example NULL input tests to add
@Test
public void testIsUserDefinedType_withNull_returnsFalse() {
    assertFalse(OpenSearchTypeUtil.isUserDefinedType(null));
}

@Test
public void testIsNumericOrCharacter_withNull_returnsFalse() {
    assertFalse(OpenSearchTypeUtil.isNumericOrCharacter(null));
}

@Test
public void testIsDatetime_withNull_returnsFalse() {
    assertFalse(OpenSearchTypeUtil.isDatetime(null));
}

// Similar tests for isDate, isTimestamp, isTime, isCharacter, isIp, isBinary

As per coding guidelines: "NULL input tests must be included for all new functions".


8-10: Add missing import for java.util.List.

java.util.List.of is used inline at lines 369 and 372. Consider adding an import for consistency with the other imports.

Suggested import
 import static org.junit.jupiter.api.Assertions.assertFalse;
 import static org.junit.jupiter.api.Assertions.assertTrue;
 import static org.opensearch.sql.calcite.utils.OpenSearchTypeFactory.TYPE_FACTORY;

+import java.util.List;
 import org.apache.calcite.rel.type.RelDataType;

Then update lines 369-372:

-            java.util.List.of(
+            List.of(
                 TYPE_FACTORY.createSqlType(SqlTypeName.INTEGER),
                 TYPE_FACTORY.createSqlType(SqlTypeName.VARCHAR)),
-            java.util.List.of("id", "name"));
+            List.of("id", "name"));
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java (1)

24-25: Align test method names with the required pattern.

Current names use underscores; the guideline expects test<Functionality><Condition> (camel-cased) naming. Please rename consistently (e.g., testLeastRestrictiveDateUdtsOnlyReturnsDateUdt).

As per coding guidelines: "Test naming should follow pattern test&lt;Functionality&gt;&lt;Condition&gt;".

core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/PplRelToSqlRelShuttleTest.java (2)

94-201: Consider adding boundary condition tests for interval literals.

Per testing guidelines, boundary condition tests (min/max values, zero, negative inputs) should be included for new functions. The interval tests use specific positive values but don't cover edge cases:

  • Zero interval value (e.g., INTERVAL 0 DAY)
  • Negative interval value (if supported)
  • Very large interval values near overflow boundaries

These edge cases could reveal issues with arithmetic operations in the transformation logic.
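Assuming interval values get normalized to milliseconds somewhere in the rewrite (the helper below is illustrative only, not the actual PplRelToSqlRelShuttle code), a minimal sketch of the boundary values worth covering:

```java
import java.util.concurrent.TimeUnit;

public class IntervalBoundarySketch {
    // Hypothetical normalization of a day interval to milliseconds;
    // Math.multiplyExact throws on long overflow instead of wrapping.
    static long daysToMillis(long days) {
        return Math.multiplyExact(days, TimeUnit.DAYS.toMillis(1));
    }

    public static void main(String[] args) {
        System.out.println(daysToMillis(0));   // zero interval: prints 0
        System.out.println(daysToMillis(-1));  // negative interval: prints -86400000

        // Largest day count whose millisecond value still fits in a long.
        long maxSafeDays = Long.MAX_VALUE / TimeUnit.DAYS.toMillis(1);
        System.out.println(daysToMillis(maxSafeDays));

        try {
            daysToMillis(maxSafeDays + 1); // one past the boundary
        } catch (ArithmeticException e) {
            System.out.println("overflow"); // prints overflow
        }
    }
}
```

Tests pinned to these four values (zero, negative, max-safe, max-safe + 1) would catch wrap-around or exception-handling bugs in the interval arithmetic.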


259-294: Test helper duplicates logic rather than testing actual class.

The createRexShuttle helper reimplements the exact transformation logic from PplRelToSqlRelShuttle's internal visitLiteral method rather than delegating to the actual class. While this approach isolates and tests the algorithm correctly, it doesn't verify that PplRelToSqlRelShuttle itself functions as intended—if the production implementation diverges from this test copy, tests would still pass.

Consider refactoring to instantiate PplRelToSqlRelShuttle and invoke its behavior directly, rather than reimplementing its logic inline in tests.
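As a generic illustration of the anti-pattern being flagged (all names below are hypothetical; the real shuttle operates on RexNode trees, not ints):

```java
// Stand-in for the production transformation; class name, method,
// and int-based logic are illustrative only.
class LiteralShuttle {
    static int rewriteLiteral(int literal) {
        return literal * 2; // placeholder for the real rewrite logic
    }
}

public class ShuttleDelegationSketch {
    public static void main(String[] args) {
        // Anti-pattern: the expected value is derived by copying the
        // production logic, so the same bug in both places still passes.
        int expectedFromCopy = 21 * 2;

        // Recommended: invoke the production class directly, so the test
        // fails if the real implementation diverges.
        int actual = LiteralShuttle.rewriteLiteral(21);

        System.out.println(actual == expectedFromCopy); // prints true
    }
}
```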

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@yuancu yuancu requested a review from LantaoJin January 30, 2026 05:19
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@LantaoJin
Member

LantaoJin commented Feb 6, 2026

This PR is too large to review; is it possible to separate it into multiple smaller ones?
In the future, if we want to merge such a large change, I think we'd better create a feature branch and review PRs targeting that feature branch; then we can merge the feature branch back to main without review.

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 2 weeks with no activity.

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 5723ced.

'Diff too large, requires skip by maintainers after manual review'


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Contributor

github-actions bot commented Mar 4, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 3fd9714.

'Diff too large, requires skip by maintainers after manual review'



@github-actions
Contributor

github-actions bot commented Mar 5, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 5723ced.

'Diff too large, requires skip by maintainers after manual review'



@github-actions
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 2c831f0.

'Diff too large, requires skip by maintainers after manual review'



yuancu added 4 commits March 6, 2026 15:02
…path PRs

Merge the validation branch with origin/main, resolving conflicts caused
by the revert of dynamic column support (opensearch-project#5139). Spath-related files
(DynamicFieldsHelper, AppendFunctionImpl, spath explain YAMLs) are
deleted to align with the revert. All other conflicts resolved to
preserve the validation round-trip changes from the branch.

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
- Add missing import for OpenSearchSparkSqlDialect in CalcitePPLNoMvTest
  (class moved from ppl to core/validate package)
- Update CalcitePPLSpathTest.testSpathAutoExtractModeWithEval expected plan
  to match validation-deferred behavior (no eager SAFE_CAST)
- Update CalcitePPLNoMvTest.testNoMvNonExistentField to accept
  "inferred array element type" error message

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
The validation round-trip (RelNode → SQL → validate → RelNode) introduces
plan changes such as extra LogicalProject nodes, type annotations on MAP
keys, and reordered aggregate operands. Update YAML expected outputs and
test assertions to match.

- Catch Throwable (not just Exception) in validate() for AssertionError
  from unsupported RelNodes like LogicalGraphLookup
- Update 17 YAML expected output files for calcite and calcite_no_pushdown
- Add 4 new alternative YAML files for non-deterministic plan variants
- Add separate YAML for boolean string literal filter test
- Accept "inferred array element type" error in NoMv missing field test

Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
@github-actions
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 84f53df.

'Diff too large, requires skip by maintainers after manual review'




Labels

calcite (calcite migration related), enhancement (New feature or request)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Type Conversion - Leverage RelDataType from Calcite in V3 engine.
[RFC] Implicit Type Coercion with Calcite

4 participants