Explicit head command should override fetch_size in PPL#5194
Explicit head command should override fetch_size in PPL#5194ahkcs wants to merge 10 commits intoopensearch-project:mainfrom
Conversation
PR Reviewer Guide 🔍(Review updated until commit 8c6d460)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to 8c6d460 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit 713b095
Suggestions up to commit 713b095
Suggestions up to commit d7a93b3
Suggestions up to commit 3039353
Suggestions up to commit 4cacb79
|
When a user's PPL query contains an explicit head command, the fetch_size parameter should not inject an additional Head node on top. Previously, fetch_size always wrapped the plan with Head(fetchSize), creating a double-Head that could cap results unexpectedly (e.g., head 100 with fetch_size=5 would only return 5 rows). This adds a containsHead() check in AstStatementBuilder to skip the Head injection when the user's query already has a head command, letting the user's explicit intent take precedence. Signed-off-by: Kai Huang <ahkcs@amazon.com>
d2b35ee to
ff1fd32
Compare
|
Persistent review updated to latest commit ff1fd32 |
Remove the outer LogicalSort(fetch=[fetchSize]) from expected explain plans when an explicit head command is present, since fetch_size no longer injects a Head node in that case. Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 7f797a9 |
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstStatementBuilder.java
Outdated
Show resolved
Hide resolved
Verifies that when a query has multiple heads (e.g., head 3 | head 500), fetch_size does not inject yet another Head node. The existing Head nodes are preserved as-is, with the inner head 3 limiting first. Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit adea087 |
|
Persistent review updated to latest commit e62de53 |
Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit a6e1216 |
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstStatementBuilder.java
Outdated
Show resolved
Hide resolved
Change from recursive all-children traversal to iterative first-child chain walk. This ensures head commands inside join subqueries or nested structures do not incorrectly suppress fetch_size injection on the outer query. Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 5841f08 |
Explain that since PPL's fetch_size has no pagination support, it acts as a default response size rather than an absolute cap. When the user specifies an explicit head command, fetch_size yields to avoid silently discarding rows with no way to retrieve them. Also document that head commands inside subqueries do not affect fetch_size for the outer query. Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 4cacb79 |
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstStatementBuilder.java
Show resolved
Hide resolved
Verifies that containsHead correctly walks past Project (from fields) to find the Head node underneath. Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 3039353 |
docs/user/interfaces/endpoint.rst
Outdated
| +-----------------------------------------------+---------------------------------------------------+ | ||
| | Query | Behavior with ``fetch_size=5`` | | ||
| +===============================================+===================================================+ | ||
| | ``source=t \| fields age`` | ``fetch_size`` applies — returns 5 rows | |
b1fe41b to
d7a93b3
Compare
Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit d7a93b3 |
d7a93b3 to
713b095
Compare
|
Persistent review updated to latest commit 713b095 |
1 similar comment
|
Persistent review updated to latest commit 713b095 |
| +===============================================+===================================================+ | ||
| | ``source=t | fields age`` | ``fetch_size`` applies — returns 5 rows | | ||
| +-----------------------------------------------+---------------------------------------------------+ | ||
| | ``source=t | head 100 | fields age`` | ``head`` takes precedence — returns 100 rows | |
There was a problem hiding this comment.
fetch_size is optional — if you don't want it, don't send it. The server should honor what the client explicitly requests. The fix belongs in the client that's injecting fetch_size by default.
From a REST API perspective, fetch_size is an explicit, optional parameter the caller chooses to send. If the caller sends fetch_size=5, they're requesting at most 5 rows — that's the API contract. Adding server-side logic to silently ignore that parameter based on query content breaks the principle of least surprise at the API layer.
There was a problem hiding this comment.
I think this is a very valid point. If we fix it server-side, we're working around a client behavior by making the API semantics inconsistent. If we fix it client-side, the API contract stays clean.
There was a problem hiding this comment.
Opened PR on OSD side: opensearch-project/OpenSearch-Dashboards#11430
Validates that head commands inside join subqueries do not suppress fetch_size on the outer query, locking down the scoping guarantee. Signed-off-by: ahkcs <austinhkcs@gmail.com> Signed-off-by: Kai Huang <ahkcs@amazon.com>
|
Persistent review updated to latest commit 8c6d460 |
Summary
Fixes bug 2 in #5191
When a user's PPL query contains an explicit
headcommand, thefetch_sizeAPI parameter should not inject an additionalHeadnode on top. Previously,AstStatementBuilder.visitPplStatement()always wrapped the plan withHead(fetchSize), creating a double-Head that capped results unexpectedly.Example:
source=index | head 100withfetch_size=5would only return 5 rows instead of the expected 100 rows, because the outerHead(5)silently overrode the user's explicithead 100.Test plan
head 5+fetch_size=10→ 5 rows ✓head 8+fetch_size=3→ 8 rows ✓ (head overrides fetch_size)fetch_size=3(no head) → 3 rows ✓ (fetch_size still works when no head)