Apply docs website feedback to ppl functions by ritvibhatt · Pull Request #5207 · opensearch-project/sql

ritvibhatt · 2026-03-05T23:20:15Z

Description

Apply feedback from PR for adding PPL functions documentation to documentation website to keep sql repo in sync (Update PPL functions documentation documentation-website#11747)
Update exporter script to add backtick wrapping for standalone %-prefixed format specifiers in exported CLI tables and ensured index.md gets nav_order: 1 in the functions directory.

Related Issues

Check List

New functionality includes testing.
New functionality has been documented.
New functionality has javadoc added.
New functionality has a user manual doc added.
New PPL command checklist all confirmed.
API changes companion pull request created.
Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

github-actions · 2026-03-05T23:23:45Z

PR Reviewer Guide 🔍

(Review updated until commit `93f92d5`)

Here are some key observations to aid the review process:

🧪 No relevant tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes Sub-PR theme: Update docs exporter script with backtick wrapping and index.md nav ordering Relevant files: scripts/docs_exporter/export_to_docs_website.py Sub-PR theme: Apply documentation website feedback to PPL function docs Relevant files: docs/user/ppl/functions/math.md docs/user/ppl/functions/statistical.md docs/user/ppl/functions/condition.md
⚡ Recommended focus areas for review Nested Function Definition The `backtick_percent` function is defined inside a loop body (within the list comprehension context), causing it to be redefined on every iteration of the outer loop. It should be moved outside the loop to avoid unnecessary repeated function object creation. def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell if re.match(r'^%\S+$', cell): return f'`{cell}`' return cell cells = [backtick_percent(c) for c in cells] Missing Reference Link The `EARLIEST` function description removed the reference link to the detailed relative timestamp documentation ("Read more details here") that was present in the old version. This may reduce discoverability of important documentation for users who need more detail on relative time formats. Relative string formats: 1. `"now"` or `"now()"`: Uses the current system time. 2. Absolute format (`MM/dd/yyyy:HH:mm:ss` or `yyyy-MM-dd HH:mm:ss`): Converts the string to a timestamp and compares it against the field value. 3. Relative format: `(+\|-)<time_integer><time_unit>[+<...>]@<snap_unit>` Steps to specify a relative time: - Time offset: Indicate the offset from the current time using `+` or `-`. - Time amount: Provide a numeric value followed by a time unit (`s`, `m`, `h`, `d`, `w`, `M`, `y`). - Snap to unit: Optionally, specify a snap unit using `@<unit>` to round the result down to the nearest unit (for example, hour, day, month). Examples (assuming current time is `2025-05-28 14:28:34`): - `-3d+2y` → `2027-05-25 14:28:34`. - `+1d@m` → `2025-05-29 14:28:00`. - `-3M+1y@M` → `2026-02-01 00:00:00`. EXISTS Section Accuracy The `EXISTS` section header now says "Use `isnull(field)` or `isnotnull(field)` to test field existence" as its usage, which is not a function signature but a recommendation. This may be confusing as it doesn't follow the same pattern as other function entries and could mislead users into thinking `EXISTS` is a callable function. ## EXISTS Usage: Use `isnull(field)` or `isnotnull(field)` to test field existence Since OpenSearch doesn't differentiate between null and missing values, functions like `ismissing`/`isnotmissing` are not available. Use `isnull`/`isnotnull` to test field existence instead. MULTIPLY NULL Behavior The old documentation for `MULTIPLY` noted "If y equals to 0, then returns NULL" in the return type description. This behavior note has been removed in the new version. If this is still the actual behavior, it should be documented to avoid user confusion. ## MULTIPLY Usage: `MULTIPLY(x, y)` Calculates the product of `x` and `y`. Parameters: - `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. - `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. Return type: The wider numeric type between `x` and `y` Synonyms: Multiplication Symbol (`*`)

github-actions · 2026-03-05T23:24:53Z

PR Code Suggestions ✨

Latest suggestions up to 93f92d5

Explore these optional code suggestions:

Category	Suggestion	Impact
General	Fix incorrect return type documentation The return type for `ROUND` with `FLOAT/DOUBLE` input is incorrectly documented as `LONG`. When rounding a `FLOAT` or `DOUBLE` with a decimal place argument, the result should remain `DOUBLE`, not `LONG`. This is a documentation accuracy issue that could mislead users. docs/user/ppl/functions/math.md [1184-1186] Return type: - `(INTEGER/LONG [,INTEGER])` -> `LONG`. -- `(FLOAT/DOUBLE [,INTEGER])` -> `LONG`. +- `(FLOAT/DOUBLE [,INTEGER])` -> `DOUBLE`. Suggestion importance[1-10]: 7 __ Why: The `ROUND` function's return type for `FLOAT/DOUBLE` inputs is documented as `LONG`, which matches the old documentation. However, this could be a genuine documentation error if the actual behavior returns `DOUBLE` for decimal rounding. This is a potentially important accuracy issue for users relying on the documentation.	Medium
	Fix inconsistent code formatting in description The backtick formatting is inconsistent — `multi_match query` should be formatted as `multi_match` query (with only the function name in backticks, not the word "query"). This is a documentation accuracy issue that could confuse readers. docs/user/ppl/functions/relevance.md [211] -Maps to the `multi_match query` in the OpenSearch engine. +Maps to the `multi_match` query in the OpenSearch engine. Suggestion importance[1-10]: 5 __ Why: The backtick formatting wraps `multi_match query` as a single code span, but only `multi_match` is the function name. The word "query" should be outside the backticks for consistency with how other functions are described in the same file.	Low
	Move helper function outside the loop Defining a function inside a loop body (which runs for every table row) is inefficient. The `backtick_percent` function should be defined outside the loop, ideally at module level or at least outside the `for line in lines` loop, to avoid redefining it on every iteration. scripts/docs_exporter/export_to_docs_website.py [108-114] def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell if re.match(r'^%\S+$', cell): return f'`{cell}`' return cell + +# ... (inside the loop) cells = [backtick_percent(c) for c in cells] Suggestion importance[1-10]: 4 __ Why: The suggestion is valid - defining `backtick_percent` inside the loop body causes it to be redefined on every iteration, which is inefficient. However, the performance impact is minor for typical documentation processing, and the `improved_code` doesn't clearly show where the function should be moved to.	Low
	Restore missing NULL behavior documentation The `MULTIPLY` function description previously noted that if `y` equals 0, it returns `NULL`, but this behavior note was removed in the new documentation. This is important behavioral information that users need to know. Consider adding a note about the behavior when multiplying by zero. docs/user/ppl/functions/math.md [47-49] Return type: The wider numeric type between `x` and `y` -Synonyms: Addition Symbol (`+`) +Synonyms: Multiplication Symbol (``) +> Note*: If `y` equals `0`, the function returns `NULL`. + Suggestion importance[1-10]: 4 __ Why: The `improved_code` incorrectly applies the note to the `ADD` function section (which has `Addition Symbol (+)` synonym) rather than the `MULTIPLY` function section. The `existing_code` snippet matches the `ADD` function, not `MULTIPLY`, making this suggestion misplaced.	Low
	Use consistent callout/note formatting throughout document The blockquote format used here (`>`) is inconsistent with the note format used elsewhere in the same file (e.g., `{: .note}`). Using a consistent callout style throughout the document improves readability and ensures proper rendering in the documentation framework. docs/user/ppl/functions/aggregations.md [787-790] -> The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: -> - The default value is 0, which returns an unlimited number of values. -> - Setting this to any positive integer limits the number of unique values. -> - See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details +The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: +- The default value is `0`, which returns an unlimited number of values. +- Setting this to any positive integer limits the number of unique values. +- See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details. +{: .note} Suggestion importance[1-10]: 4 __ Why: The blockquote format (`>`) is inconsistent with the `{: .note}` callout style used elsewhere in the file. Consistent formatting improves readability and ensures proper rendering in the documentation framework.	Low
	Narrow regex to avoid wrapping compound specifiers The regex `r'^%\S+$'` will also match cells like `%V%m` (compound format specifiers) or `%%` (literal percent), wrapping them in backticks even though they may already be handled or should not be wrapped. Consider whether compound specifiers like `%s%f` or `%%` should be excluded from this transformation to avoid double-wrapping or incorrect rendering. scripts/docs_exporter/export_to_docs_website.py [111-112] -if re.match(r'^%\S+$', cell): +if re.match(r'^%[A-Za-z%]$', cell) or re.match(r'^%[A-Za-z]{1,2}$', cell): return f'`{cell}`' Suggestion importance[1-10]: 3 __ Why: The concern about compound specifiers like `%s%f` or `%%` being wrapped is valid, but looking at the PR context, the table cells in the markdown already have backtick-wrapped specifiers (e.g., `%a`), so the check `if '`' in cell`would prevent double-wrapping. The`improved_code`uses a regex that would miss valid single-character specifiers like`%:z`,` %::z`,` %:::z` that appear in the STRFTIME table, making it less correct than the original.	Low
	Remove inconsistent extra blank lines in examples There is an extra blank line between the closing code fence and the result table for the `regexp_match` examples. This inconsistency in formatting differs from all other examples in the document and may render unexpectedly. Remove the extra blank line to maintain consistent formatting. docs/user/ppl/functions/condition.md [845-852] ```ppl source=logs \| where regexp_match(message, 'ERROR\|WARN\|FATAL') \| fields timestamp, message \| timestamp \| message \| +\| --- \| --- \| +\| 2024-01-15 10:23:45 \| ERROR: Connection timeout to database \| +\| 2024-01-15 10:24:12 \| WARN: High memory usage detected \| +\| 2024-01-15 10:25:33 \| FATAL: System crashed unexpectedly \| <details><summary>Suggestion importance[1-10]: 3</summary> __ Why: The extra blank line between the code fence and result table is a minor formatting inconsistency. The `improved_code` also reformats the table structure, which goes beyond just removing the blank line. </details></details></td><td align=center>Low </td></tr><tr><td> <details><summary>Add missing newline at end of file</summary> ___ The file ends without a newline character, which can cause issues with some Markdown <br>renderers and version control systems. Add a newline at the end of the file to <br>follow standard file formatting conventions. [docs/user/ppl/functions/condition.md [898]](https://github.com/opensearch-project/sql/pull/5207/files#diff-d96327aa4c8968cb04d38b138e14622bb2b5370283579c26145c2b0dc1adc164R898-R898) ```diff -\| Desktop Computer Tower \| Computing\| -+------------------------+----------+ -``` +\| Desktop Computer Tower \| Computing \| +\| --- \| --- \| +\| Dell Laptop XPS \| Computing \| +\| iPhone 15 Pro \| Mobile \| +\| Wireless Mouse \| Other \| +\| Desktop Computer Tower \| Computing \| Suggestion importance[1-10]: 2 __ Why: The `improved_code` doesn't match the `existing_code` - it replaces the old-style table with a new-style table rather than simply adding a newline. The suggestion is also minor formatting concern with low impact.	Low
	Verify anchor link resolves to correct section The anchor link was changed from `#percentile-or-percentile_approx` to `#percentile-percentile_approx`, but the section heading in `aggregations.md` is `###` `PERCENTILE, PERCENTILE_APPROX`. Markdown typically generates anchors by lowercasing and replacing spaces/special characters with hyphens, so the comma and space would produce `#percentile-percentile_approx`. Verify that this anchor correctly resolves to the intended section in the rendered documentation. docs/user/ppl/functions/index.md [21] +- [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-percentile_approx). - Suggestion importance[1-10]: 2 __ Why: The `existing_code` and `improved_code` are identical, making this purely a verification suggestion. The anchor `#percentile-percentile_approx` is likely correct for the heading `### PERCENTILE, PERCENTILE_APPROX` in standard Markdown rendering.	Low
	Possible issue	Verify return type change from INTEGER to BOOLEAN The old documentation listed the return type of `LIKE` as `INTEGER`, but the new documentation correctly changes it to `BOOLEAN`. However, `ILIKE` also had `INTEGER` as its return type in the old docs and is now changed to `BOOLEAN` — this is correct. Verify that both `LIKE` and `ILIKE` actually return `BOOLEAN` (not `INTEGER`) in the implementation to ensure the documentation matches the actual behavior. docs/user/ppl/functions/string.md [120-128] +Return type: `BOOLEAN` +### Example + +```ppl +source=people +\| eval `LIKE('hello world', '_ello%')` = LIKE('hello world', '_ello%'), `LIKE('hello world', '_ELLo%', true)` = LIKE('hello world', '_ELLo%', true), `LIKE('hello world', '_ELLo%', false)` = LIKE('hello world', '_ELLo%', false) +\| fields `LIKE('hello world', '_ello%')`, `LIKE('hello world', '_ELLo%', true)`, `LIKE('hello world', '_ELLo%', false)` +``` Suggestion importance[1-10]: 3 __ Why: The `existing_code` and `improved_code` are identical, making this a verification suggestion rather than an actual code change. The suggestion asks to verify implementation behavior, which is valid but doesn't warrant a high score.	Low

Previous suggestions

Suggestions up to commit c578e42

Category	Suggestion	Impact
Possible issue	Fix trailing percent in format specifier The format string for `DAY_HOUR` appears to have a stray `%` at the end: `%d%H%`. Based on the pattern of other entries and the original content, it should be `%d%H` (without the trailing `%`). docs/user/ppl/functions/datetime.md [1097] -\| `DAY_HOUR` \| `%d%H%` \| +\| `DAY_HOUR` \| `%d%H` \| Suggestion importance[1-10]: 7 __ Why: The `DAY_HOUR` format string `%d%H%` has a stray trailing `%` that appears to be a typo. The original old hunk also had `%d%H%` (same bug), but the correct format should be `%d%H` based on the pattern of other entries. This is a documentation accuracy issue.	Medium
	Fix incorrect return type for float/double inputs The return type for `FLOAT/DOUBLE` inputs to `ROUND` is listed as `LONG`, but rounding a float/double should return a `DOUBLE` (or at minimum `FLOAT`), not a `LONG`. The original documentation also listed this as `LONG`, but this appears to be a documentation inaccuracy that should be corrected. docs/user/ppl/functions/math.md [1184-1186] Return type: - `(INTEGER/LONG [,INTEGER])` -> `LONG`. -- `(FLOAT/DOUBLE [,INTEGER])` -> `LONG`. +- `(FLOAT/DOUBLE [,INTEGER])` -> `DOUBLE`. Suggestion importance[1-10]: 5 __ Why: The return type for `FLOAT/DOUBLE` inputs to `ROUND` is listed as `LONG`, which appears incorrect since rounding a float/double with decimal places should return a `DOUBLE`. This is a potential documentation inaccuracy that could mislead users about the actual behavior of the function.	Low
	Verify anchor link matches actual heading ID The anchor link for `PERCENTILE, PERCENTILE_APPROX` was changed from `#percentile-or-percentile_approx` to `#percentile-percentile_approx`. Verify that the new anchor ID matches the actual heading in `aggregations.md`, which reads `###` `PERCENTILE, PERCENTILE_APPROX`, to avoid a broken link. docs/user/ppl/functions/index.md [21] +- [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-percentile_approx). - Suggestion importance[1-10]: 3 __ Why: The `existing_code` and `improved_code` are identical, making this only a verification request. The anchor `#percentile-percentile_approx` should match the heading in `aggregations.md`, but no actual code change is proposed.	Low
General	Fix inconsistent inline code formatting The backtick formatting is inconsistent — `multi_match query` should be formatted as two separate elements or as `multi_match` query. The current formatting wraps both the function name and the word "query" together in a code span, which is misleading. docs/user/ppl/functions/relevance.md [211] -Maps to the `multi_match query` in the OpenSearch engine. +Maps to the `multi_match` query in the OpenSearch engine. Suggestion importance[1-10]: 5 __ Why: The formatting `multi_match query` wraps both the function name and the word "query" in a single code span, which is misleading. The improved code correctly formats only `multi_match` as code, making the documentation clearer.	Low
	Move helper function outside of loop Defining a function inside a loop body is inefficient since it gets redefined on every iteration. Move the `backtick_percent` function definition outside of the loop (or at module level) so it is only created once. scripts/docs_exporter/export_to_docs_website.py [108-114] def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell if re.match(r'^%\S+$', cell): return f'`{cell}`' return cell + +# ... (inside the loop) cells = [backtick_percent(c) for c in cells] Suggestion importance[1-10]: 4 __ Why: The `backtick_percent` function is defined inside the loop body and gets redefined on every iteration, which is slightly inefficient. Moving it outside the loop is a valid improvement, though the performance impact is minimal for typical documentation processing.	Low
	Restore missing type compatibility warning The old documentation noted that if `expr1` and `expr2` are different types, a semantic check failure occurs. The new documentation drops this important constraint warning, which could mislead users into thinking mixed types are always safe. Consider adding a note about type compatibility requirements. docs/user/ppl/functions/condition.md [293-303] Usage: `if(condition, expr1, expr2)` Returns `expr1` if the condition is `true`, otherwise returns `expr2`. Parameters: - `condition` (Required): The Boolean expression to evaluate. - `expr1` (Required): The value to return if the condition is `true`. - `expr2` (Required): The value to return if the condition is `false`. Return type: Least restrictive common type of `expr1` and `expr2` +> Note: If `expr1` and `expr2` have incompatible types, the query will fail a semantic check. + Suggestion importance[1-10]: 4 __ Why: The old documentation warned that incompatible types for `expr1` and `expr2` would fail a semantic check. Removing this constraint warning could mislead users, but the new `Return type: Least restrictive common type` phrasing partially addresses this by implying type coercion occurs rather than failure.	Low
	Restore missing type compatibility constraint warning The old documentation noted that if the two parameters have different types, a semantic check failure occurs. This important constraint was removed in the new version, which could mislead users. Consider adding a note about type compatibility requirements for `ifnull`. docs/user/ppl/functions/condition.md [196-205] Usage: `ifnull(field1, field2)` Returns `field2` if `field1` is `NULL`. Parameters: - `field1` (Required): The field to check for `NULL` values. - `field2` (Required): The value to return if `field1` is `NULL`. Return type: Any (matches input types) +> Note: If `field1` and `field2` have different types, the query will fail a semantic check. + Suggestion importance[1-10]: 4 __ Why: The old documentation warned that different types for `field1` and `field2` would fail a semantic check. This constraint was removed in the new version, which could mislead users. Adding this note back would improve accuracy, though the `Return type: Any (matches input types)` phrasing partially conveys the expectation of matching types.	Low
	Fix inconsistent callout formatting The blockquote formatting is inconsistent with the rest of the document, which uses `{: .note}` callout syntax for similar informational notes (as seen in the `LIST` and `PERCENTILE` sections). Using a blockquote here creates visual inconsistency. docs/user/ppl/functions/aggregations.md [787-790] -> The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: -> - The default value is 0, which returns an unlimited number of values. -> - Setting this to any positive integer limits the number of unique values. -> - See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details +The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: +- The default value is `0`, which returns an unlimited number of values. +- Setting this to any positive integer limits the number of unique values. +- See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details. +{: .note} Suggestion importance[1-10]: 4 __ Why: The blockquote formatting is inconsistent with the `{: .note}` callout syntax used elsewhere in the document (e.g., in `LIST` and `PERCENTILE` sections). Standardizing to `{: .note}` would improve visual consistency across the documentation.	Low
	Fix missing table row and trailing newline The file ends without a newline character, which can cause issues with some Markdown renderers and version control systems. The last row of the table is also missing the final entry `\| Desktop Computer Tower \| Computing \|`. Add the missing table row and a trailing newline. docs/user/ppl/functions/condition.md [898] -\| Desktop Computer Tower \| Computing\| -+------------------------+----------+ -``` +\| Desktop Computer Tower \| Computing \| Suggestion importance[1-10]: 3 __ Why: The suggestion points out a missing table row for `Desktop Computer Tower` and a trailing newline. Looking at the PR diff, the new format uses a Markdown table without the `+---+` borders, and the `Desktop Computer Tower` row is indeed missing from the new format. However, the `improved_code` only adds the missing row without addressing the trailing newline concern, making it a partial fix with moderate impact.	Low
	Verify corrected return types match implementation The `LIKE` function's documented return type was changed from `INTEGER` to `BOOLEAN` in the new code, which is actually a correction. However, the `ILIKE` function's return type was also changed from `INTEGER` to `BOOLEAN`. Verify that both `LIKE` and `ILIKE` actually return `BOOLEAN` and not `INTEGER` to ensure the documentation matches the implementation. docs/user/ppl/functions/string.md [120] +Return type: `BOOLEAN` - Suggestion importance[1-10]: 2 __ Why: The `existing_code` and `improved_code` are identical, so this is just a verification request. The suggestion asks to confirm the return type change from `INTEGER` to `BOOLEAN` matches the implementation, but doesn't propose any actual code change.	Low

Suggestions up to commit a650ebf

Category	Suggestion	Impact
Possible issue	Fix malformed format specifier in table The format string for `DAY_HOUR` appears to have a stray `%` at the end (`%d%H%`), which looks like a copy-paste error. It should likely be `%d%H` to match the pattern of the other combined specifiers. docs/user/ppl/functions/datetime.md [1097] -\| `DAY_HOUR` \| `%d%H%` \| +\| `DAY_HOUR` \| `%d%H` \| Suggestion importance[1-10]: 7 __ Why: The `DAY_HOUR` format specifier `%d%H%` has a trailing `%` that appears to be a typo, inconsistent with the pattern of all other combined specifiers in the table. This is a documentation accuracy issue that could mislead users.	Medium
	Missing table row in final example The last example in the `REGEXP_MATCH` section is missing the final row "Desktop Computer Tower" in the new Markdown table format. The table ends with "Wireless Mouse" but the original output included a fourth row. Additionally, the file is missing a trailing newline, which can cause issues with some Markdown renderers and linters. docs/user/ppl/functions/condition.md [895-898] -\| Desktop Computer Tower \| Computing\| -+------------------------+----------+ -``` +\| name \| category \| +\| --- \| --- \| +\| Dell Laptop XPS \| Computing \| +\| iPhone 15 Pro \| Mobile \| +\| Wireless Mouse \| Other \| +\| Desktop Computer Tower \| Computing \| Suggestion importance[1-10]: 6 __ Why: The new Markdown table format for the last `REGEXP_MATCH` example is missing the "Desktop Computer Tower" row that was present in the original documentation, resulting in incomplete example output.	Low
	Fix incorrect return type for ROUND with float input The return type for `ROUND` with `FLOAT/DOUBLE` input is listed as `LONG`, but rounding a float/double to decimal places should return a `DOUBLE`, not a `LONG`. The original documentation also listed this as `LONG`, but this appears to be a documentation inaccuracy that should be corrected in the new version. docs/user/ppl/functions/math.md [1184-1186] Return type: - `(INTEGER/LONG [,INTEGER])` -> `LONG`. -- `(FLOAT/DOUBLE [,INTEGER])` -> `LONG`. +- `(FLOAT/DOUBLE [,INTEGER])` -> `DOUBLE`. Suggestion importance[1-10]: 5 __ Why: The return type `LONG` for `FLOAT/DOUBLE` input to `ROUND` appears incorrect when decimal places are specified; however, this was also in the original docs, so it may reflect actual implementation behavior rather than a documentation error.	Low
General	Fix inconsistent inline code formatting The backtick formatting is inconsistent — `multi_match query` should be formatted as two separate elements or as `multi_match` query. The current formatting wraps both the function name and the word "query" together in a code span, which is misleading. docs/user/ppl/functions/relevance.md [211] -Maps to the `multi_match query` in the OpenSearch engine. +Maps to the `multi_match` query in the OpenSearch engine. Suggestion importance[1-10]: 5 __ Why: The formatting `multi_match query` wraps both the function name and the word "query" in a single code span, which is misleading. The fix to `multi_match` query is accurate and improves clarity.	Low
	Avoid redefining function inside loop Defining a function inside a loop body is inefficient since it gets redefined on every iteration of the outer loop. Move `backtick_percent` outside the loop (or at module level) so it is defined only once. scripts/docs_exporter/export_to_docs_website.py [108-114] +# Define outside the loop, e.g. at module level or before the loop: def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell if re.match(r'^%\S+$', cell): return f'`{cell}`' return cell + +# Inside the loop, just use it: cells = [backtick_percent(c) for c in cells] Suggestion importance[1-10]: 4 __ Why: The `backtick_percent` function is defined inside a loop body, causing it to be redefined on every iteration. Moving it outside the loop is a valid optimization, though the performance impact is minimal for typical documentation processing workloads.	Low
	Restore NULL return behavior for MULTIPLY by zero The `MULTIPLY` function description previously noted that if `y` equals 0, it returns `NULL`, but this important behavior detail was removed in the new documentation. This is a meaningful behavioral note that users should be aware of. docs/user/ppl/functions/math.md [115-117] -Return type: The wider numeric type between `x` and `y` +Return type: The wider numeric type between `x` and `y`. If `y` equals `0`, returns `NULL`. -Synonyms: Addition Symbol (`+`) +Synonyms: Multiplication Symbol (`*`) Suggestion importance[1-10]: 4 __ Why: The suggestion correctly identifies that the NULL-on-zero behavior note was removed from `MULTIPLY`, but the `improved_code` incorrectly applies this to the `ADD` function's return type section rather than the `MULTIPLY` section, and also incorrectly changes the synonym symbol.	Low
	Restore type mismatch warning for IF function The original documentation noted that if `expr1` and `expr2` have different types, a semantic check failure occurs. The new text says "Least restrictive common type" but omits the important warning about type mismatch causing a semantic check failure, which is useful information for users. docs/user/ppl/functions/condition.md [293-303] Usage: `if(condition, expr1, expr2)` -... -Return type: Least restrictive common type of `expr1` and `expr2` +Returns `expr1` if the condition is `true`, otherwise returns `expr2`. + +Parameters: + +- `condition` (Required): The Boolean expression to evaluate. +- `expr1` (Required): The value to return if the condition is `true`. +- `expr2` (Required): The value to return if the condition is `false`. + +Return type: Least restrictive common type of `expr1` and `expr2`. If `expr1` and `expr2` have incompatible types, a semantic check error occurs. + Suggestion importance[1-10]: 3 __ Why: The suggestion to add a semantic check warning for type mismatches in `IF` is a minor documentation improvement, but the new "Least restrictive common type" wording already implies type coercion behavior, making this a low-impact change.	Low
	Standardize callout/note formatting style The blockquote formatting used here is inconsistent with the note formatting used elsewhere in the same file (for example, the `LIST` function uses `{: .note}` callout syntax). Using mixed formatting styles for similar advisory content reduces visual consistency in the documentation. docs/user/ppl/functions/aggregations.md [787-790] -> The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: -> - The default value is 0, which returns an unlimited number of values. -> - Setting this to any positive integer limits the number of unique values. -> - See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details +The `plugins.ppl.values.max.limit` setting controls the maximum number of unique values returned: +- The default value is 0, which returns an unlimited number of values. +- Setting this to any positive integer limits the number of unique values. +- See the [PPL Settings](../admin/settings.md#plugins-ppl-values-max-limit) documentation for more details. +{: .note} Suggestion importance[1-10]: 3 __ Why: The blockquote formatting is inconsistent with the `{: .note}` callout syntax used elsewhere in the file (e.g., for `LIST`). Standardizing improves visual consistency, but this is a minor style issue.	Low
	Verify corrected return types match implementation The `LIKE` function's documented return type was changed from `INTEGER` to `BOOLEAN` in the new code, which is actually a correction. However, the `ILIKE` function's return type was also changed from `INTEGER` to `BOOLEAN`. Verify that both `LIKE` and `ILIKE` actually return `BOOLEAN` and not `INTEGER` to ensure the documentation matches the implementation. docs/user/ppl/functions/string.md [120] +Return type: `BOOLEAN` - Suggestion importance[1-10]: 2 __ Why: The suggestion asks to verify that `LIKE` and `ILIKE` return `BOOLEAN` rather than `INTEGER`, but the `existing_code` and `improved_code` are identical, making this a verification request rather than an actionable code change.	Low
	Verify anchor link resolves to correct section The anchor link `#percentile-percentile_approx` was changed from the original `#percentile-or-percentile_approx`. The section heading in `aggregations.md` is `###` `PERCENTILE, PERCENTILE_APPROX`, which would generate the anchor `#percentile-percentile_approx`. Verify that this anchor correctly resolves to the intended section, as the comma in the heading may generate a different anchor depending on the Markdown renderer. docs/user/ppl/functions/index.md [21] +- [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-percentile_approx). - Suggestion importance[1-10]: 2 __ Why: The `existing_code` and `improved_code` are identical, making this only a verification request. The anchor `#percentile-percentile_approx` should work correctly for the heading `### PERCENTILE, PERCENTILE_APPROX` in most Markdown renderers.	Low

Suggestions up to commit 2306832

Category	Suggestion	Impact
Possible issue	Fix incorrect return type for ROUND function The return type for `ROUND` with `FLOAT/DOUBLE` input is incorrectly listed as `LONG`. When rounding a `FLOAT` or `DOUBLE` with a decimal precision argument, the result should be `DOUBLE`, not `LONG`. This is a documentation accuracy issue that could mislead users. docs/user/ppl/functions/math.md [1184-1186] Return type: - `(INTEGER/LONG [,INTEGER])` -> `LONG`. -- `(FLOAT/DOUBLE [,INTEGER])` -> `LONG`. +- `(FLOAT/DOUBLE [,INTEGER])` -> `DOUBLE`. Suggestion importance[1-10]: 7 __ Why: The return type for `ROUND` with `FLOAT/DOUBLE` input is listed as `LONG`, which appears to be incorrect - rounding a float/double with decimal precision should return `DOUBLE`. This is a documentation accuracy issue that could mislead users about function behavior.	Medium
	Fix incorrect return type for LIKE function The `LIKE` function's documented return type is listed as `INTEGER`, but the function description says it returns `TRUE` or `FALSE`. The return type should be `BOOLEAN` to match the actual behavior and the description. The original documentation also incorrectly listed it as `INTEGER`. docs/user/ppl/functions/string.md [120] -Return type: `INTEGER` +Return type: `BOOLEAN` Suggestion importance[1-10]: 7 __ Why: The `LIKE` function description clearly states it returns `TRUE` or `FALSE`, but the documented return type is `INTEGER`. This is a factual error in the documentation that should be corrected to `BOOLEAN`.	Medium
	Fix incorrect return type for ILIKE function The `ILIKE` function's return type is listed as `INTEGER`, but the function description states it returns `TRUE` or `FALSE`. The return type should be `BOOLEAN` to match the actual behavior and the description. docs/user/ppl/functions/string.md [158-162] -Return type: `INTEGER` +Return type: `BOOLEAN` ### Example ```ppl source=people \| eval `ILIKE('hello world', '_ELLo%')` = ILIKE('hello world', '_ELLo%') Suggestion importance[1-10]: 7 __ Why: The `ILIKE` function description states it returns `TRUE` or `FALSE`, but the documented return type is `INTEGER`. This is a factual error that should be corrected to `BOOLEAN`, consistent with the function's described behavior.	Medium
	Fix missing table row and trailing newline The file ends without a newline character, which can cause issues with some Markdown renderers and version control systems. The last row of the table is also missing the `Desktop Computer Tower` entry in the new table format. Add the missing row and a trailing newline. docs/user/ppl/functions/condition.md [895-898] -\| Desktop Computer Tower \| Computing\| -+------------------------+----------+ -``` +\| name \| category \| +\| --- \| --- \| +\| Dell Laptop XPS \| Computing \| +\| iPhone 15 Pro \| Mobile \| +\| Wireless Mouse \| Other \| +\| Desktop Computer Tower \| Computing \| Suggestion importance[1-10]: 5 __ Why: The new table format for `REGEXP_MATCH` examples is missing the `Desktop Computer Tower` row that existed in the old format. This is a valid content omission that reduces documentation completeness.	Low
General	Restrict regex to match only single format specifiers The regex `^%\S+$` matches any non-whitespace sequence after `%`, including compound format strings like `%Y-%m-%d`. If the intent is only to wrap single format specifiers (like `%a`, `%b`), the pattern should be more restrictive. Consider using `^%[A-Za-z]$` or `^%[^%\s]+$` depending on the intended scope. scripts/docs_exporter/export_to_docs_website.py [108-113] def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell - if re.match(r'^%\S+$', cell): + if re.match(r'^%[A-Za-z]$', cell): return f'`{cell}`' return cell Suggestion importance[1-10]: 6 __ Why: The current regex `^%\S+$` would match compound format strings like `%Y-%m-%d`, wrapping them in backticks unintentionally. Restricting to `^%[A-Za-z]$` better targets single format specifiers like `%a`, `%b`, etc., which appear in the DATE_FORMAT table being processed.	Low
	Fix inconsistent inline code formatting The backtick formatting is inconsistent — `multi_match query` should be formatted as two separate elements or as `multi_match` query. The current formatting wraps both the function name and the word "query" together in a code span, which is incorrect. docs/user/ppl/functions/relevance.md [211] -Maps to the `multi_match query` in the OpenSearch engine. +Maps to the `multi_match` query in the OpenSearch engine. Suggestion importance[1-10]: 6 __ Why: The formatting `multi_match query` wraps both the function name and the word "query" in a single code span, which is inconsistent with how other functions are documented (e.g., `match` query, `match_phrase` query). This is a valid documentation quality fix.	Low
	Restore important type constraint note for IFNULL The documentation for `IFNULL` removed the important note that both parameters must have the same type or the semantic check will fail. This is a behavioral constraint that users need to know to avoid errors. The note should be preserved. docs/user/ppl/functions/condition.md [196-205] Usage: `ifnull(field1, field2)` Returns `field2` if `field1` is `NULL`. Parameters: - `field1` (Required): The field to check for `NULL` values. -- `field2` (Required): The value to return if `field1` is `NULL`. +- `field2` (Required): The value to return if `field1` is `NULL`. Must be the same type as `field1`. Return type: Any (matches input types) +> Note: If the two parameters have different types, the query will fail the semantic check. + Suggestion importance[1-10]: 5 __ Why: The PR removed the note that `ifnull` parameters must have the same type or the semantic check will fail. This is a behavioral constraint users need to know to avoid errors, and its removal reduces documentation accuracy.	Low
	Avoid redefining function inside a loop Defining a function inside a loop body (which runs for every table row) is inefficient. The `backtick_percent` function should be defined outside the loop to avoid redefining it on every iteration. Move the function definition to the module or class level, or at least outside the loop. scripts/docs_exporter/export_to_docs_website.py [108-114] +# Define outside the loop (e.g., at module level or before the loop) def backtick_percent(cell): if "'" in cell or '"' in cell or '`' in cell: return cell if re.match(r'^%\S+$', cell): return f'`{cell}`' return cell + +# Inside the loop: cells = [backtick_percent(c) for c in cells] Suggestion importance[1-10]: 4 __ Why: The `backtick_percent` function is defined inside the loop body, causing it to be redefined on every iteration. Moving it outside the loop is a valid optimization, though the performance impact is minor in practice for typical document sizes.	Low
	Restore NULL return behavior for zero divisor The original documentation for `MULTIPLY` included an important note that if `y` equals `0`, the function returns `NULL`. This behavior was removed in the new version and should be preserved in the return type description to avoid user confusion. docs/user/ppl/functions/math.md [104-117] ## MULTIPLY Usage: `MULTIPLY(x, y)` Calculates the product of `x` and `y`. -... -Return type: The wider numeric type between `x` and `y` + +Parameters: + +- `x` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. +- `y` (Required): An `INTEGER`, `LONG`, `FLOAT`, or `DOUBLE` value. + +Return type: The wider numeric type between `x` and `y`. If `y` equals `0`, returns `NULL`. Synonyms: Multiplication Symbol (`*`) Suggestion importance[1-10]: 4 __ Why: The original documentation noted that `MULTIPLY` returns `NULL` when `y` equals `0`, but this behavior was removed in the new version. However, this behavior is unusual for multiplication (unlike division), so it may have been intentionally removed if the behavior changed. The suggestion is valid but uncertain.	Low
	Verify anchor link resolves correctly The anchor link for `PERCENTILE, PERCENTILE_APPROX` was changed from `#percentile-or-percentile_approx` to `#percentile-percentile_approx`, but the section heading in `aggregations.md` is `### PERCENTILE, PERCENTILE_APPROX`. Markdown typically generates anchors by lowercasing and replacing spaces/commas with hyphens, which would produce `#percentile-percentile_approx`. Verify that this anchor correctly resolves to the intended section in the rendered documentation. docs/user/ppl/functions/index.md [21] +- [PERCENTILE, PERCENTILE_APPROX](aggregations.md/#percentile-percentile_approx). - Suggestion importance[1-10]: 3 __ Why: The suggestion asks to verify that the anchor `#percentile-percentile_approx` resolves correctly. The `existing_code` and `improved_code` are identical, so this is just a verification request with no actual code change, warranting a low score.	Low

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

github-actions · 2026-03-05T23:33:41Z

Persistent review updated to latest commit a650ebf

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

github-actions · 2026-03-05T23:57:05Z

Persistent review updated to latest commit c578e42

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

github-actions · 2026-03-06T00:02:26Z

Persistent review updated to latest commit 93f92d5

ritvibhatt added 2 commits March 5, 2026 14:54

apply doc website feedback to ppl functions

97f5ea6

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

take out comments

2306832

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

fix json_append example

a650ebf

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

fix json_append example

c578e42

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

fix links

93f92d5

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>

ritvibhatt marked this pull request as ready for review March 6, 2026 17:04

ritvibhatt requested review from GumpacG, LantaoJin, MaxKsyunz, Swiddis, YANG-DB, Yury-Fridlyand, acarbonetto, anirudha, dai-chen, derek-ho, forestmvey, joshuali925, kavithacm, mengweieric, penghuo, ps48, seankao-az, vamsimanohar and ykmr1224 as code owners March 6, 2026 17:04

ritvibhatt requested review from RyanL1997, noCharger, qianheng-aws and yuancu as code owners March 6, 2026 17:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Apply docs website feedback to ppl functions#5207

Apply docs website feedback to ppl functions#5207
ritvibhatt wants to merge 5 commits intoopensearch-project:mainfrom
ritvibhatt:ppl-functions-updates

ritvibhatt commented Mar 5, 2026

Uh oh!

github-actions bot commented Mar 5, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 5, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 5, 2026

Uh oh!

github-actions bot commented Mar 5, 2026

Uh oh!

github-actions bot commented Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ritvibhatt commented Mar 5, 2026

Description

Related Issues

Check List

Uh oh!

github-actions bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Reviewer Guide 🔍

(Review updated until commit 93f92d5)

Uh oh!

github-actions bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Code Suggestions ✨

Previous suggestions

Uh oh!

github-actions bot commented Mar 5, 2026

Uh oh!

github-actions bot commented Mar 5, 2026

Uh oh!

github-actions bot commented Mar 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions bot commented Mar 5, 2026 •

edited

Loading

(Review updated until commit `93f92d5`)

github-actions bot commented Mar 5, 2026 •

edited

Loading