Use Calcite's validation system for type checking & coercion#4892
Use Calcite's validation system for type checking & coercion#4892yuancu wants to merge 112 commits intoopensearch-project:mainfrom
Conversation
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughIntroduces extensive Calcite-based PPL validation and type-coercion plumbing, new PPL-specific validator/convertlets/coercion rules, UDT/type utilities, many UDF operand-metadata updates, Rex/Rel shuttles and Rel↔Sql converters, removal of legacy coercion/type-checker code, and large test/expected-output updates across integ and unit tests. Changes
Sequence Diagram(s)sequenceDiagram
participant Rel as RelNode (PPL plan)
participant Shuttle as PplRelToSqlRelShuttle
participant RelToSql as OpenSearchRelToSqlConverter
participant Validator as SqlValidator / PplValidator
participant SqlToRel as OpenSearchSqlToRelConverter
participant Planner as QueryService (convertToCalcitePlan)
Rel->>Shuttle: traverse & normalize RexLiterals
Shuttle->>RelToSql: convert RelNode -> SqlNode
RelToSql->>Validator: validate SqlNode (type coercion, implicit casts)
Validator-->>SqlToRel: produce validated SqlNode
SqlToRel->>Planner: convert validated SqlNode -> RelNode (validated RelNode)
Planner->>Planner: continue planning / optimize using validated RelNode
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
e085f81 to
fc6dd27
Compare
…checking Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> # Conflicts: # core/src/main/java/org/opensearch/sql/executor/QueryService.java # Conflicts: # core/src/main/java/org/opensearch/sql/executor/QueryService.java
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
… logics Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
- 2 more ITs passed in PPLBuiltinFunctionIT Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
- this fix testRand, where desrialization of sarg does not restore its type - todo: update the toRex in ExtendedRelJson to the align with the latest version Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…estamp; (time, timestamp) -> timestamp (1240/1599) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…2/1872) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
- allow type cast - rewrite call to sql compare to custom ip comapre Signed-off-by: Yuanchun Shen <yuanchu@amazon.com> # Conflicts: # core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…1356/1599 | 1476/1915) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…d in mvindex's implementation (1580/2015) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…iting (1579/2015) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…pe hint Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
…e inference (1701/2015) Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (1)
471-492: Add the missing import forUDFOperandMetadata.
UDFOperandMetadatais referenced here but isn’t imported, which will fail compilation unless it’s in the same package. Add the import (or fully qualify the reference).🛠️ Suggested fix
import org.opensearch.sql.calcite.utils.PPLOperandTypes; import org.opensearch.sql.calcite.utils.PPLReturnTypes; +import org.opensearch.sql.calcite.utils.UDFOperandMetadata; import org.opensearch.sql.calcite.utils.UserDefinedFunctionUtils;core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (2)
399-425: Add null guards for external operator registrations.Fail fast if
operator/aggFunctionis null to avoid deferred NPEs during planning.🔧 Proposed fix
public void registerExternalOperator(BuiltinFunctionName functionName, SqlOperator operator) { + if (operator == null) { + throw new IllegalArgumentException( + String.format(Locale.ROOT, "Operator cannot be null for function %s", functionName)); + } if (externalFunctionRegistry.containsKey(functionName)) { logger.warn( String.format(Locale.ROOT, "Function %s is registered multiple times", functionName)); } externalFunctionRegistry.put(functionName, (builder, args) -> builder.makeCall(operator, args)); } public void registerExternalAggOperator( BuiltinFunctionName functionName, SqlUserDefinedAggFunction aggFunction) { + if (aggFunction == null) { + throw new IllegalArgumentException( + String.format(Locale.ROOT, "Agg function cannot be null for %s", functionName)); + } if (aggExternalFunctionRegistry.containsKey(functionName)) { logger.warn( String.format( Locale.ROOT, "Aggregate function %s is registered multiple times", functionName)); } AggHandler handler = (distinct, field, argList, ctx) -> UserDefinedFunctionUtils.makeAggregateCall( aggFunction, List.of(field), argList, ctx.relBuilder); aggExternalFunctionRegistry.put(functionName, handler); }
428-475: Add JavaDoc for the public resolve methods.These public APIs currently lack JavaDoc with
@param/@return(and@throwsif applicable).
As per coding guidelines, Public methods MUST have JavaDoc with@param,@return, and@throws.
🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 1860-1917: The makeOver call in embedExistingCollationsIntoOver is
missing the required RexWindowExclusion argument; update the call in method
embedExistingCollationsIntoOver (where context.rexBuilder.makeOver is invoked
for RexOver) to insert RexWindowExclusion.EXCLUDE_NO_OTHER between
window.getUpperBound() and window.isRows(), preserving the rest of the
parameters (type, agg operator, operands, partitionKeys,
orderKeys/rexCollations, lowerBound, upperBound, exclude, isRows,
allowPartial(true), isNullWhenCountZero(false), over.isDistinct(),
over.ignoreNulls()) so the parameter ordering matches RexBuilder.makeOver.
In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`:
- Around line 306-313: Add a unit test that exercises
DynamicFieldsHelper.createEmptyMap to cover the empty-keys map creation path and
assert the returned RexNode is a non-null NULL literal of MAP type (i.e., not an
unexpected Java null and its type is a MAP of VARCHAR->VARCHAR). Create a test
in the corresponding test class (e.g., DynamicFieldsHelperTest) that constructs
or mocks a CalcitePlanContext with a RexBuilder/TypeFactory (or reuse existing
test helpers), calls createEmptyMap(context), and asserts the result is not null
and result.getType().isMap() (and that the map key/value types are VARCHAR).
Ensure the new test method name clearly indicates it verifies empty-keys map
creation (e.g., testCreateEmptyMapReturnsMapNullLiteral).
- Around line 293-303: Add unit tests covering the two new private helpers
createEmptyMap() and getFieldsAsMap(): write tests that exercise
createEmptyMap() returning a NULL-cast-to-MAP literal and verify consumer
behavior (e.g., MapAppendFunctionImpl.mapAppend() preserves non-null maps when
one arg is NULL), and tests for getFieldsAsMap() producing the expected
MAP_FROM_ARRAYS RexNode for given keys; also update the JavaDoc on
createEmptyMap() to clarify it returns a NULL literal cast to MAP (not a
concrete empty map) so the comment matches behavior. Ensure tests live alongside
other core Java tests and reference the helper methods by name (createEmptyMap,
getFieldsAsMap) when locating/covering the code.
In
`@core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java`:
- Around line 370-388: In OpenSearchTypeFactory, add an explicit early check for
the all-NULL case before the current conditional chain: if nullCount ==
types.size() return the NULL SQL type (or delegate to the existing
leastRestrictive behavior) instead of falling through to the date/time/ip
checks; update the method that builds udt (the block that calls
createUDT(ExprUDT.EXPR_DATE, ...), createUDT(...EXPR_TIME...),
createUDT(...EXPR_IP...), createUDT(...EXPR_BINARY...), and
createSqlType(SqlTypeName.VARCHAR, ...)) to handle the all-null case first so an
all-NULL input does not incorrectly yield ExprUDT.EXPR_DATE.
🧹 Nitpick comments (3)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java (1)
160-165: Make the dynamic-field test assert extracted values (not just 0 rows).
verifyNumOfRows(result, 0)can pass even if dynamic-field extraction regresses, and it’s brittle if the logs fixture changes. Consider asserting at least one known row/value (or inserting a deterministic fixture) so the test validates actual SPATH extraction behavior.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
1839-1846: Track the new TODO with an issue or reference.This will make the push‑down regression easier to triage later. I can open a follow‑up issue if you want.
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)
332-397: Add JavaDoc for the publicleastRestrictiveoverride.Per coding guidelines, public methods should have JavaDoc with
@param,@return, and@throwstags. This method implements important OpenSearch UDT semantics that differ from standard Calcite behavior and would benefit from documentation.Additionally, the timestamp promotion logic on line 384 (
binaryCount == 0 && ipCount == 0) includes cases wheredateCount + timeCount + timestampCount + nullCount == types.size(), buttimestampCountis never tracked. This appears intentional (any mix of date/time UDTs becomes timestamp), but documenting this behavior would improve clarity.📝 Suggested JavaDoc
`@Override` + /** + * Returns the least restrictive type among the given types, with special handling for OpenSearch + * UDTs. + * + * <p>For UDTs: + * <ul> + * <li>Homogeneous DATE/TIME/IP/BINARY types (plus NULLs) return the corresponding UDT</li> + * <li>Mixed DATE/TIME/TIMESTAMP types are promoted to TIMESTAMP</li> + * <li>Mixed UDT types that cannot be unified return VARCHAR</li> + * </ul> + * + * <p>For standard SQL types, delegates to Calcite with PPL coercion rules, converting CHAR to + * VARCHAR to avoid result padding. + * + * `@param` types the list of types to find the least restrictive common type for + * `@return` the least restrictive type, or null if none exists + */ public `@Nullable` RelDataType leastRestrictive(List<RelDataType> types) {
| /** | ||
| * Embed existing collation into window function's over clauses. | ||
| * | ||
| * <p>Window functions with frame specifications like {@code ROWS n PRECEDING} require ORDER BY to | ||
| * determine row order. Without it, results are non-deterministic. | ||
| * | ||
| * <p>Without this fix, the initial plan has ORDER BY separate from window functions: | ||
| * | ||
| * <pre> | ||
| * LogicalProject(SUM($5) OVER (ROWS 1 PRECEDING)) ← Missing ORDER BY | ||
| * LogicalSort(sort0=[$5]) | ||
| * </pre> | ||
| * | ||
| * <p>This causes problems during validation as the order is not bound to the window. With this | ||
| * fix, sort collations are embeded into each {@code RexOver} window: | ||
| * | ||
| * <pre> | ||
| * LogicalProject(SUM($5) OVER (ORDER BY $5 ROWS 1 PRECEDING)) ← ORDER BY embedded | ||
| * </pre> | ||
| * | ||
| * @param overExpressions Window function expressions (may contain nested {@link RexOver}) | ||
| * @param context Plan context for building RexNodes | ||
| * @return Expressions with ORDER BY embedded in all window specifications | ||
| */ | ||
| private List<RexNode> embedExistingCollationsIntoOver( | ||
| List<RexNode> overExpressions, CalcitePlanContext context) { | ||
| RelCollation existingCollation = context.relBuilder.peek().getTraitSet().getCollation(); | ||
| List<@NonNull RelFieldCollation> relCollations = | ||
| existingCollation == null ? List.of() : existingCollation.getFieldCollations(); | ||
| ImmutableList<@NonNull RexFieldCollation> rexCollations = | ||
| relCollations.stream() | ||
| .map(f -> relCollationToRexCollation(f, context.relBuilder)) | ||
| .collect(ImmutableList.toImmutableList()); | ||
| return overExpressions.stream() | ||
| .map( | ||
| n -> | ||
| n.accept( | ||
| new RexShuttle() { | ||
| @Override | ||
| public RexNode visitOver(RexOver over) { | ||
| RexWindow window = over.getWindow(); | ||
| return context.rexBuilder.makeOver( | ||
| over.getType(), | ||
| over.getAggOperator(), | ||
| over.getOperands(), | ||
| window.partitionKeys, | ||
| rexCollations, | ||
| window.getLowerBound(), | ||
| window.getUpperBound(), | ||
| window.isRows(), | ||
| true, | ||
| false, | ||
| over.isDistinct(), | ||
| over.ignoreNulls()); | ||
| } | ||
| })) | ||
| .collect(Collectors.toList()); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for RexOver class definition and its methods
rg -n "class RexOver" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Search for makeOver method signature to understand all parameters
rg -n "makeOver\(" --type=java -A 10 | grep -E "makeOver|allowPartial|nullWhenCountZero" | head -50Repository: opensearch-project/sql
Length of output: 1319
🏁 Script executed:
#!/bin/bash
# Check if RexOver has getters for allowPartial or isNullWhenCountZero
rg -n "getAllowPartial|getNullWhenCountZero|isAllowPartial|isNullWhenCountZero|allowPartial|nullWhenCountZero" --type=java | head -40Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Look at the specific method implementation in CalciteRelNodeVisitor
sed -n '1860,1917p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 2631
🏁 Script executed:
#!/bin/bash
# Check PlanUtils.makeOver method signature and implementation
sed -n '157,200p' core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaRepository: opensearch-project/sql
Length of output: 2000
🏁 Script executed:
#!/bin/bash
# Compare other makeOver calls to see parameter patterns
rg -B2 -A12 "rexBuilder.makeOver|PlanUtils.makeOver" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -80Repository: opensearch-project/sql
Length of output: 2920
🏁 Script executed:
#!/bin/bash
# Check ExtendedRelJson.java usage of makeOver (line 448) for reference
sed -n '440,465p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.javaRepository: opensearch-project/sql
Length of output: 1011
🏁 Script executed:
#!/bin/bash
# Search for methods called on RexOver objects to see what properties are available
rg -n "over\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "over\.[a-zA-Z]+" | head -30Repository: opensearch-project/sql
Length of output: 395
🏁 Script executed:
#!/bin/bash
# Check if there are imports or references to Calcite's RexOver methods
rg -n "isAllowPartial|isNullWhenCountZero|allowPartial|nullWhenCountZero" core/src/main/java/ opensearch/src/main/java/Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Look at the Calcite version constraints to understand API compatibility
cat pom.xml | grep -A 5 -B 5 "calcite" | head -50Repository: opensearch-project/sql
Length of output: 105
🏁 Script executed:
#!/bin/bash
# Check if there's any documentation or comments about makeOver parameters
rg -B5 "makeOver" core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java | grep -E "makeOver|//|/\*" | head -40Repository: opensearch-project/sql
Length of output: 92
🏁 Script executed:
#!/bin/bash
# Search for allowPartial and nullWhenCountZero in the entire codebase
find . -name "*.java" -type f | xargs grep -l "allowPartial\|nullWhenCountZero" 2>/dev/nullRepository: opensearch-project/sql
Length of output: 50
🏁 Script executed:
#!/bin/bash
# Look at the actual Calcite RexBuilder.makeOver signatures in any comments or docs
rg -B20 "rexBuilder\.makeOver" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "^[0-9]+-.*(\*|makeOver)" | head -40Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check if other usages in the codebase try to preserve these values from source RexOver
rg -B5 -A15 "makeOver.*over\." core/src/main/java/org/opensearch/sql/calcite/ | grep -E "over\.|makeOver|allowPartial|nullWhenCountZero" | head -50Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Look for the full PlanUtils method to see the sumOver and countOver helper functions
sed -n '157,300p' core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaRepository: opensearch-project/sql
Length of output: 5371
🏁 Script executed:
#!/bin/bash
# Search for helper methods that might extract these properties from RexOver
rg -n "getallowPartial|getNullWhenCountZero|getAllowPartial|isNullWhenCountZero" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check if PlanUtils has a method that wraps makeOver calls with parameter handling
sed -n '200,250p' core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.javaRepository: opensearch-project/sql
Length of output: 1577
🏁 Script executed:
#!/bin/bash
# Look for comments or documentation about the makeOver parameters, especially those two
rg -B10 "true.*false" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "(1910|1911|true|false|allow|null|makeOver)" | head -30Repository: opensearch-project/sql
Length of output: 546
🏁 Script executed:
#!/bin/bash
# Check what Calcite version this project uses - may help understand the API
find . -name "pom.xml" -type f | xargs grep -h "calcite" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Look at all pom.xml files in the repo to find Calcite version
find . -type f -name "pom.xml" | xargs grep -l "calcite" | head -5Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check the gradle files if any
find . -type f -name "build.gradle" -o -name "build.gradle.kts" | xargs grep -l "calcite" 2>/dev/null | head -5Repository: opensearch-project/sql
Length of output: 175
🏁 Script executed:
#!/bin/bash
# Search for RexBuilder in imports to understand which Calcite version is used
rg -n "import.*RexBuilder|import.*RexOver" --type=java | head -20Repository: opensearch-project/sql
Length of output: 2761
🏁 Script executed:
#!/bin/bash
# Look at the complete line context for the makeOver call to see if there are inline comments
sed -n '1895,1920p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 1128
🏁 Script executed:
#!/bin/bash
# Check the core build.gradle for Calcite version
sed -n '1,100p' core/build.gradle | grep -i calciteRepository: opensearch-project/sql
Length of output: 168
🏁 Script executed:
#!/bin/bash
# Look for any usage of RexStandardizer which imports RexOver
sed -n '1,150p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/RexStandardizer.java | head -80Repository: opensearch-project/sql
Length of output: 3279
🏁 Script executed:
#!/bin/bash
# Search for how RexOver is used elsewhere in the codebase to understand available methods
rg -A5 "visitOver|RexOver" core/src/main/java/org/opensearch/sql/calcite/ | grep -E "over\.|RexOver" | head -40Repository: opensearch-project/sql
Length of output: 1447
🏁 Script executed:
#!/bin/bash
# Search for tests of embedExistingCollationsIntoOver
rg -n "embedExistingCollationsIntoOver" --type=javaRepository: opensearch-project/sql
Length of output: 359
🏁 Script executed:
#!/bin/bash
# Check what parameters RexBuilder.makeOver accepts in Calcite 1.41.0
# by looking at all makeOver invocations with their parameter counts
rg -A20 "\.makeOver\(" core/src/main/java/org/opensearch/sql/calcite/ | grep -E "makeOver\(|,\s*\)|over\.|window\." | head -60Repository: opensearch-project/sql
Length of output: 2140
🏁 Script executed:
#!/bin/bash
# Look for any documentation or comments about what allowPartial and isNullWhenCountZero mean
rg -B5 -A5 "allowPartial|nullWhenCountZero" . --type=java 2>/dev/null | head -40Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check if RexOver provides any getter methods beyond what we've already seen
rg -n "over\.get|over\.is" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -v "getWindow\|getType\|getAggOperator\|getOperands\|isDistinct\|ignoreNulls"Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Look more carefully at the exact makeOver signature by checking line counts
sed -n '1901,1913p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | cat -nRepository: opensearch-project/sql
Length of output: 766
🏁 Script executed:
#!/bin/bash
# Check if there's an exclude parameter being used elsewhere
rg -n "RexWindowExclusion\|getExclude" --type=java | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Count the parameters in the ExtendedRelJson makeOver call more carefully
sed -n '448,465p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | cat -nRepository: opensearch-project/sql
Length of output: 814
🏁 Script executed:
#!/bin/bash
# Search for documentation or comments about what parameters 9 and 10 represent
rg -B10 "true,\s*false," core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "^\s*\*|makeOver|true.*false" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# The two makeOver calls have different signatures. Let's search for documentation or other usages
# to determine which is correct. Check if exclude is a newer addition
rg -n "exclude.*EXCLUDE" --type=java | head -20Repository: opensearch-project/sql
Length of output: 218
🏁 Script executed:
#!/bin/bash
# Search for RexWindowExclusion usage
rg -n "RexWindowExclusion" --type=javaRepository: opensearch-project/sql
Length of output: 1408
🏁 Script executed:
#!/bin/bash
# Look at the imports in ExtendedRelJson to see if it uses exclude
head -50 opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | grep -E "import|RexWindow"Repository: opensearch-project/sql
Length of output: 1888
🏁 Script executed:
#!/bin/bash
# Count total parameters by looking at ExtendedRelJson more carefully with all makeOver calls
rg -B5 -A20 "rexBuilder.makeOver" opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | head -80Repository: opensearch-project/sql
Length of output: 1048
🏁 Script executed:
#!/bin/bash
# Check what RexWindow exposes - does it have getExclude or similar?
rg -n "window\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "window\.(get|is)" | head -20Repository: opensearch-project/sql
Length of output: 229
🏁 Script executed:
#!/bin/bash
# Compare the parameter order and count: ExtendedRelJson has 13 params, CalciteRelNodeVisitor has 12
# Let's check if window has an exclude field
rg -n "getExclude\|exclusion" --type=java | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Let's look at what the original RexOver contains - does it have exclude info?
rg -B5 -A10 "class RexWindow\|RexWindow(" opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.java | head -40Repository: opensearch-project/sql
Length of output: 170
🏁 Script executed:
#!/bin/bash
# Critical: Check if parameter mismatch - ExtendedRelJson has exclude param, CalciteRelNodeVisitor doesn't
# Let's see ALL the parameters being passed to understand the order
echo "=== CalciteRelNodeVisitor version ===" && sed -n '1901,1913p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java && echo "" && echo "=== ExtendedRelJson version ===" && sed -n '448,461p' opensearch/src/main/java/org/opensearch/sql/opensearch/storage/serde/ExtendedRelJson.javaRepository: opensearch-project/sql
Length of output: 1204
🏁 Script executed:
#!/bin/bash
# This is critical - the parameter lists don't match. Let's see if RexWindow has getExclude
rg -n "\.getExclude|Exclusion" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 223
🏁 Script executed:
#!/bin/bash
# Check if RexWindow exposes exclusion information
sed -n '1900,1915p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "window\."Repository: opensearch-project/sql
Length of output: 264
🏁 Script executed:
#!/bin/bash
# Look for RexWindow.getExclude or similar in the whole codebase
rg -n "getExclude" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# The signatures don't match - CalciteRelNodeVisitor is missing the exclude parameter
# Let's verify the exact Calcite RexBuilder.makeOver signature by checking javadoc or other hints
# Search for how many parameters makeOver should have in different contexts
rg -n "makeOver" --type=java -A1 | grep -E "makeOver.*\(" | wc -lRepository: opensearch-project/sql
Length of output: 68
🏁 Script executed:
#!/bin/bash
# Search online Calcite documentation or check if there's version info about when exclude was added
# For now, let's verify whether window object has exclusion info
rg -n "window\.get|RexWindow\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -20Repository: opensearch-project/sql
Length of output: 179
🏁 Script executed:
#!/bin/bash
# Check if there are overloaded versions of makeOver in Calcite
# by looking at how it's being called in different ways
rg -B2 -A15 "rexBuilder\.makeOver" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -50Repository: opensearch-project/sql
Length of output: 886
🏁 Script executed:
#!/bin/bash
# Check the git history to see if this code was recently added or if there's a comment about it
git log --oneline -20 core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java 2>/dev/null | head -20Repository: opensearch-project/sql
Length of output: 134
🏁 Script executed:
#!/bin/bash
# Search for any TODO or FIXME comments near the embedExistingCollationsIntoOver method
sed -n '1850,1920p' core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -E "TODO|FIXME|BUG|XXX"Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check if the code compiles without errors by looking for any compilation markers
rg -n "PARAM_MISMATCH|parameter.*mismatch" --type=javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
web_search
Apache Calcite 1.41.0 RexBuilder.makeOver method signature parametersRepository: opensearch-project/sql
Length of output: 161
🏁 Script executed:
#!/bin/bash
# Let's try to use web_search properly with the capability
# For now, check if there are any recent changes or PRs related to this
git log --all --oneline --grep="makeOver\|window\|collation" 2>/dev/null | head -20Repository: opensearch-project/sql
Length of output: 1400
🏁 Script executed:
#!/bin/bash
# Check if RexWindow has an exclusion getter that should be used
rg -n "RexWindow\|class.*Window" core/src/main/java/org/opensearch/sql/calcite/ --type=java | grep -v ".class" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Critical check: Does RexWindow from over.getWindow() have an exclusion?
# by checking what RexWindow methods are available
rg -n "over\.getWindow\(\)" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java -A5 | head -30Repository: opensearch-project/sql
Length of output: 409
🏁 Script executed:
#!/bin/bash
# Let's check if there's a variant of makeOver that takes fewer parameters
# by looking at how PlanUtils helper methods wrap it
sed -n '200,250p' core/src/main/java/org/opensearch/sql/calcite/utils/PlanUtils.java | grep -E "makeOver|relBuilder\."Repository: opensearch-project/sql
Length of output: 395
🏁 Script executed:
#!/bin/bash
# Critical discovery: the parameter lists differ significantly
# ExtendedRelJson has 14 params, CalciteRelNodeVisitor has 12
# AND ExtendedRelJson includes RexWindowExclusion which is missing from CalciteRelNodeVisitor
# Let's verify if the code even compiles by checking for any build errors
find . -name "*.log" -o -name "*error*" 2>/dev/null | head -10Repository: opensearch-project/sql
Length of output: 454
🏁 Script executed:
#!/bin/bash
# Let's check the git commit that added embedExistingCollationsIntoOver
git log --all -p --follow -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -B20 -A20 "embedExistingCollationsIntoOver" | head -100Repository: opensearch-project/sql
Length of output: 152
🏁 Script executed:
#!/bin/bash
# Check if there's any test that validates this method works correctly
find . -name "*Test.java" -type f | xargs grep -l "embedExistingCollationsIntoOver\|window.*collation" 2>/dev/nullRepository: opensearch-project/sql
Length of output: 50
🏁 Script executed:
#!/antml:tool_start
```web_search
Calcite 1.41.0 RexBuilder makeOver signature documentation
```
</tool_start>Repository: opensearch-project/sql
Length of output: 267
🏁 Script executed:
#!/bin/bash
# Let me check the actual failing scenario more carefully
# The parameter mismatch is concerning - let's see if CalciteRelNodeVisitor was recently modified
git log --all --oneline -5 -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 367
🏁 Script executed:
#!/bin/bash
# Check the PR that added this method
git log --all -p -- core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | grep -B50 "embedExistingCollationsIntoOver" | grep "^commit\|^Author\|^Date" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# The critical issue: parameter 8 in CalciteRelNodeVisitor is window.isRows()
# but in ExtendedRelJson it's exclude (RexWindowExclusion)
# Let's verify what RexBuilder.makeOver actually expects
# by checking if window has an exclusion property
rg -n "getExclusion\|getExclude\|exclusion\|Exclusion" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/antml:thinking_start
Looking at the parameter discrepancy:
CalciteRelNodeVisitor version (12 params):
1. type
2. operator
3. operands
4. partitionKeys
5. orderKeys (as rexCollations)
6. lowerBound
7. upperBound
8. isRows (boolean)
9. true
10. false
11. distinct
12. ignoreNulls
ExtendedRelJson version (13 params):
1. type
2. operator
3. operands
4. partitionKeys
5. orderKeys (as orderKeys)
6. lowerBound
7. upperBound
8. exclude (RexWindowExclusion)
9. physical (boolean)
10. true
11. false
12. distinct
13. false
The issue is clear: CalciteRelNodeVisitor is missing the `exclude` parameter and has the parameter types wrong. Parameter 8 should be RexWindowExclusion, not isRows().
This looks like a genuine bug in the embedExistingCollationsIntoOver method - it's missing the exclude parameter which should come from window.getExclusion() or default to EXCLUDE_NO_OTHER.
Let me search for whether RexWindow has a getExclusion method.
</thinking_start>Repository: opensearch-project/sql
Length of output: 241
🏁 Script executed:
#!/antml:thinking_endRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/bin/bash
# Check if RexWindow class has getExclusion method by looking at how it's used elsewhere
rg -n "window\.getExclusion|RexWindow" core/src/main/java/org/opensearch/sql/calcite/ --type=java | grep -i "exclusion" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
#!/antml:thinking_start
The user hasn't provided results yet. Let me think about what I know:
1. ExtendedRelJson passes a RexWindowExclusion parameter to makeOver
2. It gets this from window.get("exclude") or defaults to RexWindowExclusion.EXCLUDE_NO_OTHER
3. CalciteRelNodeVisitor doesn't pass any RexWindowExclusion parameter
4. The parameter at position 8 in CalciteRelNodeVisitor is window.isRows() which seems wrong
This is a real bug - the makeOver signature requires RexWindowExclusion but the code doesn't provide it. The question is whether window.getExclusion() exists or if we need to default to EXCLUDE_NO_OTHER like ExtendedRelJson does.
</thinking_end>Repository: opensearch-project/sql
Length of output: 287
Fix missing RexWindowExclusion parameter in makeOver call.
The makeOver call is missing a required parameter. Comparing with ExtendedRelJson.java (line 448), the Calcite RexBuilder.makeOver signature expects RexWindowExclusion at parameter position 8, but the current code passes window.isRows() instead.
The correct order should be:
- type
- operator
- operands
- partitionKeys
- orderKeys
- lowerBound
- upperBound
- exclude (RexWindowExclusion) — currently missing
- followed by remaining parameters
Add the missing exclude parameter (defaulting to RexWindowExclusion.EXCLUDE_NO_OTHER like in ExtendedRelJson):
return context.rexBuilder.makeOver(
over.getType(),
over.getAggOperator(),
over.getOperands(),
window.partitionKeys,
rexCollations,
window.getLowerBound(),
window.getUpperBound(),
RexWindowExclusion.EXCLUDE_NO_OTHER, // Add this
window.isRows(),
true,
false,
over.isDistinct(),
over.ignoreNulls());The original concern about preserving allowPartial and isNullWhenCountZero from the source RexOver is not actionable—the RexOver API does not expose these properties as getters, making hardcoding acceptable.
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
around lines 1860 - 1917, The makeOver call in embedExistingCollationsIntoOver
is missing the required RexWindowExclusion argument; update the call in method
embedExistingCollationsIntoOver (where context.rexBuilder.makeOver is invoked
for RexOver) to insert RexWindowExclusion.EXCLUDE_NO_OTHER between
window.getUpperBound() and window.isRows(), preserving the rest of the
parameters (type, agg operator, operands, partitionKeys,
orderKeys/rexCollations, lowerBound, upperBound, exclude, isRows,
allowPartial(true), isNullWhenCountZero(false), over.isDistinct(),
over.ignoreNulls()) so the parameter ordering matches RexBuilder.makeOver.
| // Handle empty keys case - Calcite ARRAY[] requires at least 1 element | ||
| if (keys.isEmpty()) { | ||
| return createEmptyMap(context); | ||
| } | ||
| RexNode keysArray = getStringLiteralArray(keys, context); | ||
| List<RexNode> values = | ||
| keys.stream().map(key -> context.relBuilder.field(key)).collect(Collectors.toList()); | ||
| keys.stream() | ||
| .map(key -> context.relBuilder.cast(context.relBuilder.field(key), SqlTypeName.VARCHAR)) | ||
| .collect(Collectors.toList()); | ||
| RexNode valuesArray = makeArray(values, context); | ||
| return context.rexBuilder.makeCall(BuiltinFunctionName.MAP_FROM_ARRAYS, keysArray, valuesArray); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd /repo && find . -name "DynamicFieldsHelper.java" -type fRepository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
cd /repo && wc -l core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.javaRepository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
cd /repo && cat -n core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '280,330p'Repository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
cd /repo && cat -n core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '1,50p'Repository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
cd /repo && grep -n "createEmptyMap\|MAP_FROM_ARRAYS\|MAP_APPEND" core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.javaRepository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
cd /repo && find . -path "*/test*" -name "*DynamicFieldsHelperTest*" -type fRepository: opensearch-project/sql
Length of output: 121
🏁 Script executed:
find . -name "DynamicFieldsHelper.java" -type fRepository: opensearch-project/sql
Length of output: 138
🏁 Script executed:
wc -l ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.javaRepository: opensearch-project/sql
Length of output: 142
🏁 Script executed:
cat -n ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '280,330p'Repository: opensearch-project/sql
Length of output: 2638
🏁 Script executed:
cat -n ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '1,100p'Repository: opensearch-project/sql
Length of output: 4829
🏁 Script executed:
grep -n "createEmptyMap\|MAP_FROM_ARRAYS\|MAP_APPEND" ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.javaRepository: opensearch-project/sql
Length of output: 649
🏁 Script executed:
find . -path "*/test*" -name "*DynamicFieldsHelperTest*" -type fRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
find . -name "*DynamicFieldsHelper*" -type fRepository: opensearch-project/sql
Length of output: 138
🏁 Script executed:
find . -path "*/test/*" -name "*DynamicFields*" -type fRepository: opensearch-project/sql
Length of output: 157
🏁 Script executed:
grep -r "DynamicFieldsHelper" --include="*Test.java" | head -20Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
cat -n ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '160,250p'Repository: opensearch-project/sql
Length of output: 4505
🏁 Script executed:
grep -n "MAP_FROM_ARRAYS\|MAP_APPEND" ./core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java -B 2 -A 2Repository: opensearch-project/sql
Length of output: 1275
🏁 Script executed:
grep -r "getFieldsAsMap" --include="*.java"Repository: opensearch-project/sql
Length of output: 676
🏁 Script executed:
find . -path "*/test/*" -name "*.java" -exec grep -l "createEmptyMap\|getFieldsAsMap" {} \;Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
find . -name "pom.xml" -type f | head -5Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
cat ./pom.xml | grep -A 5 "calcite"Repository: opensearch-project/sql
Length of output: 107
🏁 Script executed:
find . -name "pom.xml" -type f -exec grep -l "calcite" {} \;Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
grep -r "MAP_APPEND\|MAP_FROM_ARRAYS" --include="*.java" | head -30Repository: opensearch-project/sql
Length of output: 3167
🏁 Script executed:
find . -path "*/test/*" -name "*.java" -exec grep -l "MAP_APPEND\|MAP_FROM_ARRAYS" {} \;Repository: opensearch-project/sql
Length of output: 155
🏁 Script executed:
grep -r "makeNullLiteral" --include="*.java" | head -20Repository: opensearch-project/sql
Length of output: 3067
🏁 Script executed:
cat -n ./integ-test/src/test/java/org/opensearch/sql/calcite/standalone/MapAppendFunctionIT.javaRepository: opensearch-project/sql
Length of output: 6404
🏁 Script executed:
cat -n ./core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MapAppendFunctionImpl.javaRepository: opensearch-project/sql
Length of output: 4823
🏁 Script executed:
find . -name "pom.xml" -type f | xargs grep -l "calcite" | head -1 | xargs cat | grep -A 5 "calcite-core\|calcite.version"Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
find . -name "*.xml" -type f | xargs grep -l "calcite" | head -1Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
find . -name "pom.xml" -type f -exec grep -l "properties\|version" {} \; | head -3Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
cat ./pom.xml 2>/dev/null || find . -maxdepth 2 -name "pom.xml" -type f -exec head -30 {} \;Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
ls -la | grep -i pomRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
find . -maxdepth 3 -name "pom.xml" -type fRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
grep -n "private static RexNode getFieldsAsMap\|private static RexNode createEmptyMap" core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java -A 1Repository: opensearch-project/sql
Length of output: 325
🏁 Script executed:
find . -path "*/test/*" -name "*.java" -exec grep -l "getFieldsAsMap\|createEmptyMap" {} \;Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
grep -r "excludeMetaFields\|isMetadataField" core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java -B 2 -A 2Repository: opensearch-project/sql
Length of output: 1003
🏁 Script executed:
find . -path "*/test/*" -name "*.java" | xargs grep -l "DynamicFieldsHelper" 2>/dev/nullRepository: opensearch-project/sql
Length of output: 50
🏁 Script executed:
grep -n "new DynamicFieldsHelper\|DynamicFieldsHelper\." core/src/main/java/org/opensearch/sql/calcite/*.java | head -20Repository: opensearch-project/sql
Length of output: 2272
🏁 Script executed:
git log --oneline -20 -- core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java 2>/dev/null | head -10Repository: opensearch-project/sql
Length of output: 134
🏁 Script executed:
git show HEAD:core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java 2>/dev/null | grep -n "createEmptyMap\|getFieldsAsMap" | head -5Repository: opensearch-project/sql
Length of output: 450
🏁 Script executed:
cat -n core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java | sed -n '287,315p'Repository: opensearch-project/sql
Length of output: 1606
Add unit tests for new helper functions per coding guidelines.
The coding guidelines require that new functions in core/src/main/java/**/*.java must have unit tests in the same commit. Both createEmptyMap() and getFieldsAsMap() are new private utilities and need test coverage.
Additionally, note that createEmptyMap() returns a NULL literal (not an actual empty map), which is semantically different from its JavaDoc comment "Create an empty map by casting NULL to MAP type". However, this is not a functional bug: MAP_APPEND has explicit null-handling logic (see MapAppendFunctionImpl.mapAppend() lines 69-81) that returns the non-null map when one argument is NULL, so the dynamic fields map will not be wiped out. The existing test suite confirms this behavior (MapAppendFunctionIT lines 87-113). Still, consider updating the comment for clarity.
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`
around lines 293 - 303, Add unit tests covering the two new private helpers
createEmptyMap() and getFieldsAsMap(): write tests that exercise
createEmptyMap() returning a NULL-cast-to-MAP literal and verify consumer
behavior (e.g., MapAppendFunctionImpl.mapAppend() preserves non-null maps when
one arg is NULL), and tests for getFieldsAsMap() producing the expected
MAP_FROM_ARRAYS RexNode for given keys; also update the JavaDoc on
createEmptyMap() to clarify it returns a NULL literal cast to MAP (not a
concrete empty map) so the comment matches behavior. Ensure tests live alongside
other core Java tests and reference the helper methods by name (createEmptyMap,
getFieldsAsMap) when locating/covering the code.
| /** Create an empty map by casting NULL to MAP type */ | ||
| private static RexNode createEmptyMap(CalcitePlanContext context) { | ||
| RelDataType varcharType = | ||
| context.rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR); | ||
| RelDataType mapType = | ||
| context.rexBuilder.getTypeFactory().createMapType(varcharType, varcharType); | ||
| return context.rexBuilder.makeNullLiteral(mapType); | ||
| } |
There was a problem hiding this comment.
Add unit test coverage for empty-keys map creation.
A new helper function was added to core/src/main/java; please add a unit test that exercises the empty-keys path (and ensures _MAP is not unexpectedly NULL).
As per coding guidelines: New functions MUST have unit tests in the same commit.
🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/DynamicFieldsHelper.java`
around lines 306 - 313, Add a unit test that exercises
DynamicFieldsHelper.createEmptyMap to cover the empty-keys map creation path and
assert the returned RexNode is a non-null NULL literal of MAP type (i.e., not an
unexpected Java null and its type is a MAP of VARCHAR->VARCHAR). Create a test
in the corresponding test class (e.g., DynamicFieldsHelperTest) that constructs
or mocks a CalcitePlanContext with a RexBuilder/TypeFactory (or reuse existing
test helpers), calls createEmptyMap(context), and asserts the result is not null
and result.getType().isMap() (and that the map key/value types are VARCHAR).
Ensure the new test method name clearly indicates it verifies empty-keys map
creation (e.g., testCreateEmptyMapReturnsMapNullLiteral).
| if (dateCount + nullCount == types.size()) { | ||
| udt = createUDT(ExprUDT.EXPR_DATE, nullable); | ||
| } else if (timeCount + nullCount == types.size()) { | ||
| udt = createUDT(ExprUDT.EXPR_TIME, nullable); | ||
| } | ||
| // There are cases where UDT IP interleaves with its intermediate SQL type for validation | ||
| // OTHER, we check otherCount to patch such cases | ||
| else if (ipCount + nullCount == types.size() || otherCount + nullCount == types.size()) { | ||
| udt = createUDT(ExprUDT.EXPR_IP, nullable); | ||
| } else if (binaryCount + nullCount == types.size()) { | ||
| udt = createUDT(ExprUDT.EXPR_BINARY, nullable); | ||
| } | ||
| // There exists a mix of time, date, and timestamp (and optionally null) | ||
| else if (binaryCount == 0 && ipCount == 0) { | ||
| udt = createUDT(ExprUDT.EXPR_TIMESTAMP, nullable); | ||
| } else { | ||
| udt = createSqlType(SqlTypeName.VARCHAR, nullable); | ||
| } | ||
| return udt; |
There was a problem hiding this comment.
Edge case: all-NULL input may incorrectly return DATE UDT.
When types contains only NULL types, nullCount == types.size() and dateCount + nullCount == types.size() evaluates to true (since dateCount == 0), causing the method to return EXPR_DATE. This may not be the intended behavior—an all-NULL list should likely return NULL type or delegate to the standard leastRestrictive.
🐛 Proposed fix
if (anyCount == 0) {
RelDataType udt;
boolean nullable = nullableCount > 0 || nullCount > 0;
- if (dateCount + nullCount == types.size()) {
+ if (dateCount > 0 && dateCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_DATE, nullable);
- } else if (timeCount + nullCount == types.size()) {
+ } else if (timeCount > 0 && timeCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_TIME, nullable);
}
// There are cases where UDT IP interleaves with its intermediate SQL type for validation
// OTHER, we check otherCount to patch such cases
- else if (ipCount + nullCount == types.size() || otherCount + nullCount == types.size()) {
+ else if ((ipCount > 0 && ipCount + nullCount == types.size()) || (otherCount > 0 && otherCount + nullCount == types.size())) {
udt = createUDT(ExprUDT.EXPR_IP, nullable);
- } else if (binaryCount + nullCount == types.size()) {
+ } else if (binaryCount > 0 && binaryCount + nullCount == types.size()) {
udt = createUDT(ExprUDT.EXPR_BINARY, nullable);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (dateCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_DATE, nullable); | |
| } else if (timeCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_TIME, nullable); | |
| } | |
| // There are cases where UDT IP interleaves with its intermediate SQL type for validation | |
| // OTHER, we check otherCount to patch such cases | |
| else if (ipCount + nullCount == types.size() || otherCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_IP, nullable); | |
| } else if (binaryCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_BINARY, nullable); | |
| } | |
| // There exists a mix of time, date, and timestamp (and optionally null) | |
| else if (binaryCount == 0 && ipCount == 0) { | |
| udt = createUDT(ExprUDT.EXPR_TIMESTAMP, nullable); | |
| } else { | |
| udt = createSqlType(SqlTypeName.VARCHAR, nullable); | |
| } | |
| return udt; | |
| if (dateCount > 0 && dateCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_DATE, nullable); | |
| } else if (timeCount > 0 && timeCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_TIME, nullable); | |
| } | |
| // There are cases where UDT IP interleaves with its intermediate SQL type for validation | |
| // OTHER, we check otherCount to patch such cases | |
| else if ((ipCount > 0 && ipCount + nullCount == types.size()) || (otherCount > 0 && otherCount + nullCount == types.size())) { | |
| udt = createUDT(ExprUDT.EXPR_IP, nullable); | |
| } else if (binaryCount > 0 && binaryCount + nullCount == types.size()) { | |
| udt = createUDT(ExprUDT.EXPR_BINARY, nullable); | |
| } | |
| // There exists a mix of time, date, and timestamp (and optionally null) | |
| else if (binaryCount == 0 && ipCount == 0) { | |
| udt = createUDT(ExprUDT.EXPR_TIMESTAMP, nullable); | |
| } else { | |
| udt = createSqlType(SqlTypeName.VARCHAR, nullable); | |
| } | |
| return udt; |
🤖 Prompt for AI Agents
In
`@core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java`
around lines 370 - 388, In OpenSearchTypeFactory, add an explicit early check
for the all-NULL case before the current conditional chain: if nullCount ==
types.size() return the NULL SQL type (or delegate to the existing
leastRestrictive behavior) instead of falling through to the date/time/ip
checks; update the method that builds udt (the block that calls
createUDT(ExprUDT.EXPR_DATE, ...), createUDT(...EXPR_TIME...),
createUDT(...EXPR_IP...), createUDT(...EXPR_BINARY...), and
createSqlType(SqlTypeName.VARCHAR, ...)) to handle the all-null case first so an
all-NULL input does not incorrectly yield ExprUDT.EXPR_DATE.
There was a problem hiding this comment.
@coderabbitai nope, all null casts won't pass the initial condition if (types.stream().anyMatch(OpenSearchTypeUtil::isUserDefinedType)), thus won't fall into the date UDT case.
There was a problem hiding this comment.
@yuancu You’re right—thanks for the correction. Because the UDT branch in OpenSearchTypeFactory.leastRestrictive() executes only when at least one input type is a UDT (via OpenSearchTypeUtil.isUserDefinedType), an all-NULL list won’t enter that block and will fall back to the standard leastRestrictive(types, PplTypeCoercionRule.assignmentInstance()) path. The earlier suggestion about EXPR_DATE on all-NULL was incorrect—no change needed.
As a minor guardrail, consider a unit test asserting the all-NULL case takes the standard path, but otherwise this looks good.
✏️ Learnings added
Learnt from: yuancu
Repo: opensearch-project/sql PR: 4892
File: core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java:370-388
Timestamp: 2026-01-29T06:05:23.585Z
Learning: OpenSearchTypeFactory.leastRestrictive(List<RelDataType>) enters the UDT-specific logic only if any input is a user-defined type (OpenSearchTypeUtil.isUserDefinedType). An all-NULL input list does not enter this branch and uses the standard leastRestrictive with PplTypeCoercionRule.assignmentInstance() instead.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/calcite/**/*.java : Ensure version compatibility with Calcite in integration code
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/calcite/**/*.java : Document Calcite-specific workarounds in code comments
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2026-01-26T22:19:54.446Z
Learning: Applies to **/*.java : Maintain Java 11 compatibility when possible for OpenSearch 2.x support
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)
393-413: Add complete JavaDoc for these public APIs.
resolve/resolveAgglack JavaDoc, and theregisterExternal*methods don’t include full tag coverage. Please add@param,@return, and@throwsas required.
As per coding guidelines: Public methods MUST have JavaDoc with@param,@return, and@throws.Also applies to: 428-475
🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 1899-1913: The current embedExistingCollationsIntoOver logic
always overwrites window ORDER BY keys (rexCollations) when building RexOver,
dropping any explicit window ordering; change the visitOver in
embedExistingCollationsIntoOver (the code that calls context.rexBuilder.makeOver
with rexCollations) to first inspect the RexWindow (window) for existing order
keys and only use the plan collation (rexCollations) when the window has no
explicit ordering, otherwise pass the window's existing order keys through
unchanged so explicit window ORDER BY is preserved.
In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java`:
- Around line 477-483: The file references UDFOperandMetadata in the
INTERNAL_PATTERN declaration inside PPLBuiltinOperators
(createUserDefinedAggFunction call for LogPatternAggFunction "pattern") but
missing its import causes compilation failure; add the proper import for
UDFOperandMetadata at the top of the file (matching the package where
UDFOperandMetadata is defined) so the reference in the INTERNAL_PATTERN constant
resolves.
- Around line 494-520: Add unit tests covering the MAP_FROM_ARRAYS function
declared as PPLBuiltinOperators.MAP_FROM_ARRAYS: create tests that (1) verify
correct map output when given two arrays of matching length (e.g., keys and
values), (2) assert behavior with empty arrays, (3) assert handling of null
inputs and null elements, and (4) validate behavior when key/value arrays have
mismatched lengths (expected error or defined behavior). Put tests in the same
test suite that covers other PPLBuiltinOperators (reuse existing test utilities
used for ATAN/REGEXP/MAP_CONCAT) and invoke the function through the same
evaluation path used by other function tests so the SqlFunction resolution and
type handling (SqlLibraryOperators -> PPLBuiltinOperators.MAP_FROM_ARRAYS) are
exercised.
In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java`:
- Around line 414-425: registerExternalAggOperator currently calls
UserDefinedFunctionUtils.makeAggregateCall with List.of(field) which NPEs for
zero-arg aggregates when field is null; modify the AggHandler lambda in
registerExternalAggOperator to pass an empty list (e.g., Collections.emptyList()
or List.of()) when field==null and List.of(field) only when field!=null, then
call makeAggregateCall(aggFunction, fieldList, argList, ctx.relBuilder) so
zero-arg external aggregates (like count()) are handled safely; keep the same
functionName->handler registration in aggExternalFunctionRegistry.
| public RexNode visitOver(RexOver over) { | ||
| RexWindow window = over.getWindow(); | ||
| return context.rexBuilder.makeOver( | ||
| over.getType(), | ||
| over.getAggOperator(), | ||
| over.getOperands(), | ||
| window.partitionKeys, | ||
| rexCollations, | ||
| window.getLowerBound(), | ||
| window.getUpperBound(), | ||
| window.isRows(), | ||
| true, | ||
| false, | ||
| over.isDistinct(), | ||
| over.ignoreNulls()); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the full context around the visitOver method
cat -n core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | sed -n '1880,1930p'Repository: opensearch-project/sql
Length of output: 2823
🏁 Script executed:
# Search for where rexCollations is defined in the visitOver context
grep -n "rexCollations" core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | head -20Repository: opensearch-project/sql
Length of output: 180
🏁 Script executed:
# Look for the broader method context to understand how rexCollations is populated
ast-grep --pattern 'visitOver($$$) {
$$$
}'Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Search for RexWindow.orderKeys references
rg "window\.orderKeys" core/src/main/java/org/opensearch/sql/calcite/Repository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Search for RexWindow creation or usage patterns
rg "RexWindow" core/src/main/java/org/opensearch/sql/calcite/ -A 3 -B 1Repository: opensearch-project/sql
Length of output: 14719
🏁 Script executed:
# Look for similar patterns of window handling elsewhere in the file
rg "makeOver\(" core/src/main/java/org/opensearch/sql/calcite/ -B 5 -A 5Repository: opensearch-project/sql
Length of output: 12259
🏁 Script executed:
# Check if there are any imports or usage of RexWindow that show field names
grep -n "import.*RexWindow\|RexWindow\." core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javaRepository: opensearch-project/sql
Length of output: 159
🏁 Script executed:
# Search for window.orderKeys usage in the codebase
rg "\.orderKeys" core/src/main/java/ -t javaRepository: opensearch-project/sql
Length of output: 48
🏁 Script executed:
# Search for window field access patterns
rg "window\.(partition|order|collation)" core/src/main/java/ -t javaRepository: opensearch-project/sql
Length of output: 188
🏁 Script executed:
# Check pom.xml for Calcite version
cat pom.xml | grep -A 2 "calcite" | head -20Repository: opensearch-project/sql
Length of output: 105
🌐 Web query:
Apache Calcite RexWindow class API orderKeys field structure
💡 Result:
In org.apache.calcite.rex.RexWindow, the orderKeys field is:
public final ImmutableList<RexFieldCollation> orderKeys— an ordered list describing the window’s ORDER BY items. [1]
Each element (RexFieldCollation) is structured as:
RexFieldCollation extends Pair<RexNode, ImmutableSet<SqlKind>>- left (
RexNode): the sort expression (e.g., an input ref, computed expression, etc.) - right (
Set<SqlKind>): sort flags such as DESCENDING and null-direction flags (e.g., NULLS FIRST/LAST) [2]
- left (
Convenience accessors on RexFieldCollation interpret those flags as enums:
getDirection()→RelFieldCollation.Direction(ASC/DESC)getNullDirection()→RelFieldCollation.NullDirection[2]
Sources: [1] [2]
🏁 Script executed:
# Search for RexBuilder.makeOver method to see its signature
rg "makeOver\(" --type java -B 2 -A 8 | head -100Repository: opensearch-project/sql
Length of output: 10225
🏁 Script executed:
# Look for patterns where window.orderKeys is checked or used
find . -name "*.java" -type f | xargs grep -l "orderKeys" 2>/dev/null | head -10Repository: opensearch-project/sql
Length of output: 322
🏁 Script executed:
# Search for the method name "embedExistingCollationsIntoOver" and surrounding context
rg "embedExistingCollationsIntoOver" -B 5 -A 30Repository: opensearch-project/sql
Length of output: 8562
🌐 Web query:
RexBuilder.makeOver signature parameter orderKeys Calcite
💡 Result:
In Apache Calcite, RexBuilder.makeOver takes orderKeys after partitionKeys.
Signature (current javadocAggregate):
makeOver(RelDataType type, SqlAggFunction operator, List<RexNode> exprs, List<RexNode> partitionKeys, ImmutableList<RexFieldCollation> orderKeys, RexWindowBound lowerBound, RexWindowBound upperBound, RexWindowExclusion exclude, boolean rows, boolean allowPartial, boolean nullWhenCountZero, boolean distinct, boolean ignoreNulls) [1]
Older/shorter overload (no exclude, no ignoreNulls):
makeOver(RelDataType type, SqlAggFunction operator, List<RexNode> exprs, List<RexNode> partitionKeys, ImmutableList<RexFieldCollation> orderKeys, RexWindowBound lowerBound, RexWindowBound upperBound, boolean physical, boolean allowPartial, boolean nullWhenCountZero, boolean distinct) [2]
So the relevant order is: exprs, partitionKeys, orderKeys, then bounds/flags. [1][2]
Preserve existing window ORDER BY when embedding plan collations.
The embedExistingCollationsIntoOver method unconditionally replaces all RexOver order keys with the plan collation, dropping any explicit window ORDER BY. This can change query semantics. Only inject the plan collation when the window lacks existing order keys.
🛠️ Suggested fix
- return context.rexBuilder.makeOver(
+ ImmutableList<@NonNull RexFieldCollation> orderKeys =
+ window.orderKeys.isEmpty() ? rexCollations : window.orderKeys;
+ return context.rexBuilder.makeOver(
over.getType(),
over.getAggOperator(),
over.getOperands(),
window.partitionKeys,
- rexCollations,
+ orderKeys,
window.getLowerBound(),
window.getUpperBound(),
window.isRows(),
true,
false,
over.isDistinct(),
over.ignoreNulls());🤖 Prompt for AI Agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
around lines 1899 - 1913, The current embedExistingCollationsIntoOver logic
always overwrites window ORDER BY keys (rexCollations) when building RexOver,
dropping any explicit window ordering; change the visitOver in
embedExistingCollationsIntoOver (the code that calls context.rexBuilder.makeOver
with rexCollations) to first inspect the RexWindow (window) for existing order
keys and only use the plan collation (rexCollations) when the window has no
explicit ordering, otherwise pass the window's existing order keys through
unchanged so explicit window ORDER BY is preserved.
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
Show resolved
Hide resolved
| public void registerExternalAggOperator( | ||
| BuiltinFunctionName functionName, SqlUserDefinedAggFunction aggFunction) { | ||
| PPLTypeChecker typeChecker = | ||
| wrapSqlOperandTypeChecker(aggFunction.getOperandTypeChecker(), functionName.name(), true); | ||
| CalciteFuncSignature signature = new CalciteFuncSignature(functionName.getName(), typeChecker); | ||
| if (aggExternalFunctionRegistry.containsKey(functionName)) { | ||
| logger.warn( | ||
| String.format( | ||
| Locale.ROOT, "Aggregate function %s is registered multiple times", functionName)); | ||
| } | ||
| AggHandler handler = | ||
| (distinct, field, argList, ctx) -> | ||
| UserDefinedFunctionUtils.makeAggregateCall( | ||
| aggFunction, List.of(field), argList, ctx.relBuilder); | ||
| aggExternalFunctionRegistry.put(functionName, Pair.of(signature, handler)); | ||
| } | ||
|
|
||
| public List<RexNode> validateAggFunctionSignature( | ||
| BuiltinFunctionName functionName, | ||
| RexNode field, | ||
| List<RexNode> argList, | ||
| RexBuilder rexBuilder) { | ||
| var implementation = getImplementation(functionName); | ||
| return validateFunctionArgs(implementation, functionName, field, argList, rexBuilder); | ||
| aggExternalFunctionRegistry.put(functionName, handler); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's examine the file structure and locate the method in question
cd "$(git rev-parse --show-toplevel)" 2>/dev/null || cd .
# Find the file and show context around the method
fd -t f "PPLFuncImpTable.java" --exec cat -n {} \; | head -100Repository: opensearch-project/sql
Length of output: 8647
🏁 Script executed:
#!/bin/bash
# Search for all calls to registerExternalAggOperator
rg -n "registerExternalAggOperator" --type=java -C 5Repository: opensearch-project/sql
Length of output: 3156
🏁 Script executed:
#!/bin/bash
# Search for AggHandler usages to understand how field is passed
rg -n "AggHandler" --type=java -C 3Repository: opensearch-project/sql
Length of output: 13249
🏁 Script executed:
#!/bin/bash
# Look for resolveAgg and similar methods that invoke handlers
rg -n "aggExternalFunctionRegistry|resolveAgg" --type=java -C 5Repository: opensearch-project/sql
Length of output: 10778
🏁 Script executed:
#!/bin/bash
# Look at the built-in aggregate handler implementations in AggBuilder
cat -n core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java | sed -n '996,1050p'Repository: opensearch-project/sql
Length of output: 2934
🏁 Script executed:
#!/bin/bash
# Search for calls to resolveAgg to see how field is passed
rg -n "\.resolveAgg\(" --type=java -B 2 -A 2Repository: opensearch-project/sql
Length of output: 574
🏁 Script executed:
#!/bin/bash
# Search for zero-arg aggregate functions or COUNT() without field
rg -n "COUNT|count\(\s*\)" --type=java | head -30Repository: opensearch-project/sql
Length of output: 4974
🏁 Script executed:
#!/bin/bash
# Look for how COUNT(*) is handled in plan building
rg -n "RexBuilder\|relBuilder.*COUNT\|COUNT.*null" --type=java -B 2 -A 2 | head -50Repository: opensearch-project/sql
Length of output: 48
Guard against null field for external aggregates.
List.of(field) throws NPE when field is null, which occurs for zero-arg aggregates like count(). The built-in COUNT() aggregate explicitly handles this case; external aggregates should follow the same pattern to prevent planner crashes.
🛠️ Suggested fix
- AggHandler handler =
- (distinct, field, argList, ctx) ->
- UserDefinedFunctionUtils.makeAggregateCall(
- aggFunction, List.of(field), argList, ctx.relBuilder);
+ AggHandler handler =
+ (distinct, field, argList, ctx) -> {
+ List<RexNode> fields = field == null ? List.of() : List.of(field);
+ return UserDefinedFunctionUtils.makeAggregateCall(
+ aggFunction, fields, argList, ctx.relBuilder);
+ };🤖 Prompt for AI Agents
In
`@core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java`
around lines 414 - 425, registerExternalAggOperator currently calls
UserDefinedFunctionUtils.makeAggregateCall with List.of(field) which NPEs for
zero-arg aggregates when field is null; modify the AggHandler lambda in
registerExternalAggOperator to pass an empty list (e.g., Collections.emptyList()
or List.of()) when field==null and List.of(field) only when field!=null, then
call makeAggregateCall(aggFunction, fieldList, argList, ctx.relBuilder) so
zero-arg external aggregates (like count()) are handled safely; keep the same
functionName->handler registration in aggExternalFunctionRegistry.
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.java`:
- Line 47: Add unit tests for PplTypeCoercion and add missing JavaDoc: write
focused JUnit tests exercising PplTypeCoercion (class PplTypeCoercion) methods
including null inputs, error paths and boundary cases for datetime and IP
conversions, numeric widening, and SAFE_CAST behavior (including safe/unsafe
results), and assert expected types/results and exceptions; also add JavaDoc to
the PplTypeCoercion constructor and all public methods with clear descriptions
and `@param/`@return (and `@throws` where applicable) to match project standards so
reviewers can see intended behavior and edge-case handling.
core/src/main/java/org/opensearch/sql/calcite/validate/PplTypeCoercion.java
Show resolved
Hide resolved
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/OpenSearchSparkSqlDialectTest.java`:
- Around line 30-41: Add two unit tests in OpenSearchSparkSqlDialectTest that
call OpenSearchSparkSqlDialect.DEFAULT.getCastSpec: one that passes a non‑IP
RelDataType (e.g., TYPE_FACTORY.createSqlType(SqlTypeName.INTEGER)) and asserts
the method returns null (or not a SqlDataTypeSpec) for unsupported types, and
another that calls getCastSpec(null) and asserts it throws a
NullPointerException (using assertThrows). Reference the existing test pattern
that inspects SqlDataTypeSpec (castSpec/SqlDataTypeSpec) so the new tests mirror
the IP test but cover non‑IP and null/boundary cases.
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRuleTest.java`:
- Around line 40-50: The test testVarcharToOtherCoercion's comment is incorrect:
it states "coercible from OTHER" while the code checks
mapping.get(SqlTypeName.OTHER) contains VARCHAR/CHAR (i.e., VARCHAR can be
coerced to OTHER); update the inline comment above the assertion to correctly
read that VARCHAR is coercible to OTHER (or reword to "OTHER should allow
coercion to VARCHAR/CHAR") so the comment matches the logic in
PplTypeCoercionRule.instance() and mapping.get(SqlTypeName.OTHER).
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/SkipRelValidationShuttleTest.java`:
- Around line 33-36: Add unit tests to SkipRelValidationShuttleTest to cover
NULL, empty, and invalid cases for the skip predicate logic: add tests that pass
null operands to the shuttle (e.g., simulate a RexCall or RexNode with a null
operand), tests that pass an empty operand list, and negative tests for invalid
aggregate-input shapes (e.g., a single CASE expression or a non-project input)
to assert the shuttle rejects or throws as expected; create clearly named test
methods (e.g., testSkipPredicateWithNullOperand,
testSkipPredicateWithEmptyOperands, testSkipPredicateWithInvalidAggregateInput)
and exercise the same validation paths used by the existing happy-path tests so
you get NULL/boundary/error coverage for SkipRelValidationShuttle and relevant
methods in that class.
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/ValidationUtilsTest.java`:
- Around line 209-232: Add a unit test in ValidationUtilsTest to cover a
null-exception input to ValidationUtils.tolerantValidationException: create a
test method (e.g., testTolerantValidationExceptionNullException) that passes
null as the exception argument and asserts the expected boolean result (likely
false based on other tests); place it alongside the existing tests in the same
class so the null-exception case is covered per the NULL input testing
guideline.
- Around line 98-207: Add null-input unit tests for
ValidationUtils.createUDTWithAttributes: verify that passing sourceType == null
and typeName == null to both overloads (the one taking ExprUDT and the one
taking SqlTypeName) throws a NullPointerException (use
assertThrows(NullPointerException.class, ...)). Add four tests in
ValidationUtilsTest such as testCreateUDTWithAttributes_NullSourceType_ExprUDT,
testCreateUDTWithAttributes_NullTypeName_ExprUDT,
testCreateUDTWithAttributes_NullSourceType_SqlTypeName, and
testCreateUDTWithAttributes_NullTypeName_SqlTypeName that call
ValidationUtils.createUDTWithAttributes(TYPE_FACTORY or null,
null/ExprUDT.EXPR_DATE) and the SqlTypeName variant accordingly and assert the
expected exception.
🧹 Nitpick comments (3)
core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/SqlRewriteShuttleTest.java (2)
36-94: Consider adding NULL input test for identifier visit.Per coding guidelines, NULL input tests should be included for all new functions. Consider adding a test for how
shuttle.visit()handles anullSqlIdentifier input, or document that null inputs are not expected at this layer.`@Test` public void testVisitIdentifierWithNull() { SqlNode result = shuttle.visit((SqlIdentifier) null); // Assert expected behavior (null return, exception, etc.) }
174-189: Consider adding symmetric test for NOT IN with non-SqlNodeList.For completeness and symmetry with the IN tests, consider adding a test that verifies NOT IN with a non-SqlNodeList first operand also remains unchanged.
`@Test` public void testVisitCallNotInWithNonSqlNodeListUnchanged() { SqlIdentifier column = new SqlIdentifier("my_column", SqlParserPos.ZERO); SqlIdentifier subquery = new SqlIdentifier("subquery", SqlParserPos.ZERO); SqlBasicCall notInCall = new SqlBasicCall(SqlStdOperatorTable.NOT_IN, List.of(column, subquery), SqlParserPos.ZERO); SqlNode result = shuttle.visit(notInCall); assertInstanceOf(SqlBasicCall.class, result); SqlBasicCall resultCall = (SqlBasicCall) result; assertEquals(SqlKind.NOT_IN, resultCall.getKind()); assertInstanceOf(SqlIdentifier.class, resultCall.getOperandList().get(0)); }core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRuleTest.java (1)
17-67: Consider adding tests forlenientInstance()andassignmentInstance().The
PplTypeCoercionRuleclass exposes three public factory methods:instance(),lenientInstance(), andassignmentInstance(). Currently, onlyinstance()is tested. Adding similar tests for the other two would ensure the PPL-specific mappings are correctly merged with bothSqlTypeCoercionRule.lenientInstance()andSqlTypeAssignmentRule.instance()base mappings.
core/src/test/java/org/opensearch/sql/calcite/validate/OpenSearchSparkSqlDialectTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionRuleTest.java
Show resolved
Hide resolved
...src/test/java/org/opensearch/sql/calcite/validate/shuttles/SkipRelValidationShuttleTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/validate/ValidationUtilsTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/validate/ValidationUtilsTest.java
Show resolved
Hide resolved
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In
`@core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java`:
- Around line 173-184: The test method
testLeastRestrictive_ipUdtAndOther_returnsIpUdt is using SqlTypeName.NULL for
the second type which duplicates the IP+NULL scenario; change the second type to
SqlTypeName.OTHER by creating RelDataType otherType =
TYPE_FACTORY.createSqlType(SqlTypeName.OTHER) and pass List.of(ipUdt, otherType)
into TYPE_FACTORY.leastRestrictive so the code exercises the IP+OTHER branch;
update assertions remain the same (use OpenSearchTypeUtil.isIp and
result.isNullable()) and ensure the variable name (otherType) replaces nullType
in that call.
- Around line 20-22: Add an integration-style test method (e.g.,
testLeastRestrictiveViaSqlPipeline) to OpenSearchTypeFactoryTest that drives
leastRestrictive through Calcite's SQL->validation->Rel conversion pipeline:
construct a test schema with the same UDT types used by the existing unit tests,
parse a simple SQL SELECT that uses expressions/columns with those UDTs, run the
SqlValidator (using the test's SqlValidator and
OpenSearchSqlOperatorTable/SqlStdOperatorTable), convert the validated SqlNode
to a RelNode via SqlToRelConverter, and assert that the resulting RelNode types
(or the validator/typeFactory outputs) resolved the common type using
leastRestrictive as expected; reference the class OpenSearchTypeFactoryTest, the
type factory instance used in existing tests, and the Calcite pipeline classes
SqlParser/SqlValidator/SqlToRelConverter to locate where to integrate this new
test.
- Around line 24-278: Add three unit tests to OpenSearchTypeFactoryTest that
cover null/empty inputs for TYPE_FACTORY.leastRestrictive: (1)
testLeastRestrictive_nullInput_returnsNull — call
TYPE_FACTORY.leastRestrictive(null) and assertNull(result); (2)
testLeastRestrictive_listWithNullElement_returnsNull — call
TYPE_FACTORY.leastRestrictive(List.of(TYPE_FACTORY.createUDT(ExprUDT.EXPR_DATE),
null)) and assertNull(result); (3) testLeastRestrictive_emptyList_returnsNull —
call TYPE_FACTORY.leastRestrictive(List.of()) and assertNull(result). Name each
test accordingly and use the existing TYPE_FACTORY and ExprUDT symbols so they
integrate with the current test suite.
In
`@core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionTest.java`:
- Around line 101-113: The test testImplicitCast_preservesNullability currently
checks nullableResult but misses asserting nonNullableResult's nullability;
update the test in PplTypeCoercionTest (method
testImplicitCast_preservesNullability) to assert that nonNullableResult is
non-nullable by adding an assertion like
assertFalse(nonNullableResult.isNullable()) after the existing checks on
nonNullableResult, using the same typeCoercion.implicitCast invocation to verify
non-nullable input yields non-nullable output.
🧹 Nitpick comments (5)
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeUtilTest.java (2)
19-45: Consider adding NULL input tests for type-checking methods.Only
isScalar(line 342) includes a NULL input test. Per the coding guidelines, NULL input tests should be included for all new functions. Consider adding NULL input tests forisUserDefinedType,isNumericOrCharacter,isDatetime,isDate,isTimestamp,isTime,isCharacter,isIp, andisBinaryto verify behavior when passednull.Example NULL input tests to add
`@Test` public void testIsUserDefinedType_withNull_returnsFalse() { assertFalse(OpenSearchTypeUtil.isUserDefinedType(null)); } `@Test` public void testIsNumericOrCharacter_withNull_returnsFalse() { assertFalse(OpenSearchTypeUtil.isNumericOrCharacter(null)); } `@Test` public void testIsDatetime_withNull_returnsFalse() { assertFalse(OpenSearchTypeUtil.isDatetime(null)); } // Similar tests for isDate, isTimestamp, isTime, isCharacter, isIp, isBinaryAs per coding guidelines: "NULL input tests must be included for all new functions".
8-10: Add missing import forjava.util.List.
java.util.List.ofis used inline at lines 369 and 372. Consider adding an import for consistency with the other imports.Suggested import
import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertTrue; import static org.opensearch.sql.calcite.utils.OpenSearchTypeFactory.TYPE_FACTORY; +import java.util.List; import org.apache.calcite.rel.type.RelDataType;Then update lines 369-372:
- java.util.List.of( + List.of( TYPE_FACTORY.createSqlType(SqlTypeName.INTEGER), TYPE_FACTORY.createSqlType(SqlTypeName.VARCHAR)), - java.util.List.of("id", "name")); + List.of("id", "name"));core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java (1)
24-25: Align test method names with the required pattern.Current names use underscores; the guideline expects
test<Functionality><Condition>(camel-cased) naming. Please rename consistently (e.g.,testLeastRestrictiveDateUdtsOnlyReturnsDateUdt).As per coding guidelines, "Test naming should follow pattern test".
core/src/test/java/org/opensearch/sql/calcite/validate/shuttles/PplRelToSqlRelShuttleTest.java (2)
94-201: Consider adding boundary condition tests for interval literals.Per testing guidelines, boundary condition tests (min/max values, zero, negative inputs) should be included for new functions. The interval tests use specific positive values but don't cover edge cases:
- Zero interval value (e.g.,
INTERVAL 0 DAY)- Negative interval value (if supported)
- Very large interval values near overflow boundaries
These edge cases could reveal issues with arithmetic operations in the transformation logic.
259-294: Test helper duplicates logic rather than testing actual class.The
createRexShuttlehelper reimplements the exact transformation logic fromPplRelToSqlRelShuttle's internalvisitLiteralmethod rather than delegating to the actual class. While this approach isolates and tests the algorithm correctly, it doesn't verify thatPplRelToSqlRelShuttleitself functions as intended—if the production implementation diverges from this test copy, tests would still pass.Consider refactoring to instantiate
PplRelToSqlRelShuttleand invoke its behavior directly, rather than reimplementing its logic inline in tests.
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactoryTest.java
Show resolved
Hide resolved
core/src/test/java/org/opensearch/sql/calcite/validate/PplTypeCoercionTest.java
Show resolved
Hide resolved
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
|
This PR is too large to review, is it possible to separate into multiple smaller ones? |
|
This PR is stalled because it has been open for 2 weeks with no activity. |
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 5723ced. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 3fd9714. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 5723ced. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 2c831f0. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
…path PRs Merge the validation branch with origin/main, resolving conflicts caused by the revert of dynamic column support (opensearch-project#5139). Spath-related files (DynamicFieldsHelper, AppendFunctionImpl, spath explain YAMLs) are deleted to align with the revert. All other conflicts resolved to preserve the validation round-trip changes from the branch. Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
- Add missing import for OpenSearchSparkSqlDialect in CalcitePPLNoMvTest (class moved from ppl to core/validate package) - Update CalcitePPLSpathTest.testSpathAutoExtractModeWithEval expected plan to match validation-deferred behavior (no eager SAFE_CAST) - Update CalcitePPLNoMvTest.testNoMvNonExistentField to accept "inferred array element type" error message Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
The validation round-trip (RelNode → SQL → validate → RelNode) introduces plan changes such as extra LogicalProject nodes, type annotations on MAP keys, and reordered aggregate operands. Update YAML expected outputs and test assertions to match. - Catch Throwable (not just Exception) in validate() for AssertionError from unsupported RelNodes like LogicalGraphLookup - Update 17 YAML expected output files for calcite and calcite_no_pushdown - Add 4 new alternative YAML files for non-deterministic plan variants - Add separate YAML for boolean string literal filter test - Accept "inferred array element type" error in NoMv missing field test Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
Signed-off-by: Yuanchun Shen <yuanchu@amazon.com>
PR Code Analyzer ❗AI-powered 'Code-Diff-Analyzer' found issues on commit 84f53df. 'Diff too large, requires skip by maintainers after manual review' Pull Requests Author(s): Please update your Pull Request according to the report above. Repository Maintainer(s): You can Thanks. |
Description
This PR migrates from our custom type checking mechanism to Apache Calcite's native validation system by introducing a SqlNode validation layer. This addresses the lack of a proper SQL validation phase and resolves compatibility issues with user-defined types (UDTs).
This implements Approach 1 described in #3865.
How It Works
The PR introduces a validation layer in the query execution pipeline:
This approach leverages Calcite's robust type system while preserving OpenSearch's custom type semantics through careful type mapping and restoration.
Benefits
sqrt(x)→pow(x, 0.5)) before generating physical plansWork Items / Implementation Details
Core Infrastructure
deriveType,coerceOperandType, and type inference methodsType System Enhancements
SqlTypeNameenum with OpenSearch UDTs through dynamic type mapping:EXPR_TIME→SqlTypeName.TIME) before validationcommonTypeForBinaryComparisonto enable datetime cross-type comparisons (DATE vs TIME, etc.)SAFE_CASTfor string-to-number conversions to tolerate malformatted data; useCASTfor literal numbersFunction & Operator Handling
array_slice,reduce,mvappend, etc.) with proper type inferencejson_extract,json_set, etc.)atanoverloading, percentile approximations)DISTINCT_COUNT_APPROX,COUNT(*)rewriting, etc.)ADDoperator: String concatenation vs numeric additionATAN: Single-operand vs two-operand versionsGEOIP: String overrides due to UDT erasureSqlKindforDIVIDEandMODUDFsQuery Construct Support
RelHintthrough conversions (addedSqlHintforLogicalAggregate)bucket_nullableflagsIN/NOT INwith tuple inputs via row constructor rewritingDialect & Compatibility
LogicalValuesfor empty row generationEdge Cases & Fixes
SAFE_CASTRelToSqlConverteris instantiated per-use (stateful component)Test Fixes
CalcitePPLDateTimeBuiltinFunctionIT: Interval semanticsCalcitePPLBuiltinFunctionIT: LOG function, sarg deserializationCalciteArrayFunctionIT: Type checkers, reduce function inferenceCalciteMathematicalFunctionIT,CalcitePPLAggregationITCalciteBinCommandIT: Timestamp operations, windowed aggregates in GROUP BYCalciteStreamstatsCommandIT: Sort columns, bucket_nullableCalcitePPLJsonBuiltinFunctionIT: String conversionCode Cleanup
PPLFuncImpTableUDFOperandMetadata.wrapUDTinterfaceCalciteFuncSignature,PPLTypeCheckerEnhancedCoalesceinto built-inCoalesceOptimizations
SAFE_CASTon non-string literal numbers (useCASTfor better performance)dedupoptimization issuesPerformance Impact
Profiled on personal laptop, each test runs twice, then averaged.
Conclusion: No significant performance degradation. ClickBench shows slight overhead (+12.3%) during analyze phase due to validation, but big5 and TPCH show improvements, likely from better query optimization enabled by proper type information.
Related Issues
Resolves #4636, resolves #3865
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.