Skip to content

Add time conversion functions for convert command#5210

Draft
ritvibhatt wants to merge 13 commits intoopensearch-project:mainfrom
ritvibhatt:convert-time-functions
Draft

Add time conversion functions for convert command#5210
ritvibhatt wants to merge 13 commits intoopensearch-project:mainfrom
ritvibhatt:convert-time-functions

Conversation

@ritvibhatt
Copy link
Contributor

@ritvibhatt ritvibhatt commented Mar 6, 2026

Description

Implement ctime, mktime, mstime, and dur2sec conversion functions for the PPL convert command, along with timeformat parameter support.

Changes

  • ctime(field) — converts UNIX epoch timestamps to human-readable time strings
  • mktime(field) — converts time strings to UNIX epoch timestamps
  • mstime(field) — converts [MM:]SS.SSS time strings to total seconds
  • dur2sec(field) — converts HH:MM:SS duration strings to total seconds
  • timeformat parameter — allows ctime and mktime to use custom strftime format specifiers (default: %m/%d/%Y %H:%M:%S)
  • Register all four functions in PPLFuncImpTable
  • Preserve timeformat parameter in PPLQueryDataAnonymizer output
  • Add ANY_OPTIONAL_STRING operand type for functions accepting 1 or 2 arguments
  • Add toJavaPattern() utility to convert strftime specifiers to Java DateTimeFormatter patterns

Testing

  • Unit tests for all four conversion functions
  • PPL parser tests for convert with timeformat
  • Anonymizer tests for convert command variants
  • Integration tests for all functions with default and custom timeformat

Documentation

  • Updated docs/user/ppl/cmd/convert.md with new functions, timeformat parameter, and examples

Related Issues

Related to #5031

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Reviewer Guide 🔍

(Review updated until commit 215b59b)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add ctime/mktime functions with timeformat parameter support

Relevant files:

  • core/src/main/java/org/opensearch/sql/expression/function/udf/CTimeConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/MkTimeConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java
  • core/src/main/java/org/opensearch/sql/calcite/utils/PPLOperandTypes.java
  • core/src/main/java/org/opensearch/sql/ast/tree/Convert.java
  • ppl/src/main/antlr/OpenSearchPPLParser.g4
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

Sub-PR theme: Add dur2sec and mstime conversion functions

Relevant files:

  • core/src/main/java/org/opensearch/sql/expression/function/udf/Dur2SecConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/udf/MsTimeConvertFunction.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java

⚡ Recommended focus areas for review

Incomplete strftime mapping

The STRFTIME_TO_JAVA_PARSE map is missing many common strftime specifiers (e.g., %I for 12-hour, %p for AM/PM, %j for day of year, %Z/%z for timezone, %b/%B for month names, %e for day, %n for newline, etc.). Users specifying these in timeformat will silently get incorrect Java patterns rather than an error, potentially producing wrong results.

private static final Map<String, String> STRFTIME_TO_JAVA_PARSE =
    ImmutableMap.<String, String>builder()
        .put("%Y", "yyyy")
        .put("%m", "MM")
        .put("%d", "dd")
        .put("%H", "HH")
        .put("%M", "mm")
        .put("%S", "ss")
        .put("%T", "HH:mm:ss")
        .put("%F", "yyyy-MM-dd")
        .put("%%", "'%'")
        .build();

/**
 * Convert a strftime format string to a Java DateTimeFormatter pattern suitable for parsing.
 *
 * @param strftimeFormat the strftime-style format string (e.g. {@code %Y-%m-%d %H:%M:%S})
 * @return a Java DateTimeFormatter pattern (e.g. {@code yyyy-MM-dd HH:mm:ss})
 */
public static String toJavaPattern(String strftimeFormat) {
  String result = strftimeFormat;
  for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
    String specifier = entry.getKey();
    if (result.contains(specifier)) {
      result = result.replace(specifier, entry.getValue());
    }
  }
  return result;
}
Missing minutes validation

The mstime pattern allows minutes values of 60 or more (e.g., 61:01 returns 3661.0 as shown in the test). The Splunk mstime function treats the first group as minutes, so values like 99:59 would be accepted. However, there is no documented upper bound for minutes, and the test at line 381 explicitly asserts 3661.0 for "61:01". This should be clarified or validated to match the intended Splunk behavior.

int minutes = matcher.group(1) != null ? Integer.parseInt(matcher.group(1)) : 0;
int seconds = Integer.parseInt(matcher.group(2));

if (seconds >= 60) {
  return null;
}
Hardcoded UTC timezone

ctime() always uses ZoneId.of("UTC") when converting epoch timestamps to human-readable strings. Splunk's ctime() typically uses the user's configured timezone. This may produce unexpected results for users in non-UTC timezones and differs from Splunk's behavior.

ZonedDateTime zdt = ZonedDateTime.ofInstant(instant, ZoneId.of("UTC"));
return StrftimeFormatterUtil.formatZonedDateTime(zdt, format).stringValue();
Hardcoded UTC offset

mktime() uses ZoneOffset.UTC when parsing date strings to epoch seconds. If the user's data contains timestamps in a local timezone, the resulting epoch values will be incorrect. This should either be configurable or documented clearly.

return (double) dateTime.toEpochSecond(ZoneOffset.UTC);
timeformat applied to non-time functions

In resolveConvertFunction, the timeFormat is only passed to ctime and mktime. However, when timeformat is specified alongside non-time functions (e.g., auto, num), the format is silently ignored without any warning or error. This may confuse users who mistakenly specify timeformat with incompatible functions.

private RexNode resolveConvertFunction(
    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {

  // Time functions that support timeformat parameter
  Set<String> timeFunctions = Set.of("ctime", "mktime");

  if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
    // For time functions with custom timeformat, pass the format as a second parameter
    RexNode timeFormatLiteral = context.rexBuilder.makeLiteral(timeFormat);
    return PPLFuncImpTable.INSTANCE.resolve(
        context.rexBuilder, functionName, sourceField, timeFormatLiteral);
  } else {
    // Regular conversion functions or time functions without custom format
    return PPLFuncImpTable.INSTANCE.resolve(context.rexBuilder, functionName, sourceField);
  }
}

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Suggestions ✨

Latest suggestions up to 215b59b

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix empty timeformat falling back to default

When timeFormatObj is not null but its trimmed value is empty, the function returns
null instead of falling back to the default format. This is inconsistent with
MkTimeConvertFunction.convertWithFormat, which also returns null for empty format.
However, the test testCtimeWithCustomTimeformat expects that an empty timeformat
falls back to the default. The condition should use the default format when the
trimmed format string is empty.

core/src/main/java/org/opensearch/sql/expression/function/udf/CTimeConvertFunction.java [73-76]

-String format = (timeFormatObj != null) ? timeFormatObj.toString().trim() : DEFAULT_FORMAT;
-if (format.isEmpty()) {
-  return null;
-}
+String format = (timeFormatObj != null && !timeFormatObj.toString().trim().isEmpty())
+    ? timeFormatObj.toString().trim()
+    : DEFAULT_FORMAT;
Suggestion importance[1-10]: 8

__

Why: The test testCtimeWithCustomTimeformat expects that an empty timeformat falls back to the default format, but the current code returns null when the trimmed format is empty. The improved code correctly handles this by using the default format when the trimmed string is empty, making behavior consistent with the test expectations.

Medium
Fix escape sequence ordering in format conversion

The %% escape sequence must be replaced before other specifiers to avoid
double-substitution. Since ImmutableMap preserves insertion order, %% is last in the
map, meaning %Y, %m, etc. are replaced first. If the input contains %%Y, it would
first replace %Y (within %%Y) incorrectly. The %% entry should be processed first,
or the replacement logic should handle ordering explicitly.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [273-282]

 public static String toJavaPattern(String strftimeFormat) {
-  String result = strftimeFormat;
+  // Replace %% first to avoid double-substitution
+  String result = strftimeFormat.replace("%%", "'%'");
   for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
     String specifier = entry.getKey();
+    if (specifier.equals("%%")) continue;
     if (result.contains(specifier)) {
       result = result.replace(specifier, entry.getValue());
     }
   }
   return result;
 }
Suggestion importance[1-10]: 6

__

Why: The %% escape sequence is placed last in the ImmutableMap, meaning other specifiers like %Y could be incorrectly matched within %%Y before %% is processed. Processing %% first prevents this double-substitution issue, which is a correctness concern for edge-case inputs.

Low
General
Add validation for minutes upper bound

The validation only checks that seconds >= 60 but does not validate the minutes
value. The test testMstimeConvertInvalid expects "25:70" to return null (seconds >=
60 is caught), but there is no upper bound check on minutes. While the test passes
for "25:70", there is no constraint preventing arbitrarily large minute values,
which may be unintended behavior.

core/src/main/java/org/opensearch/sql/expression/function/udf/MsTimeConvertFunction.java [47-49]

-if (seconds >= 60) {
+if (minutes >= 60 || seconds >= 60) {
   return null;
 }
Suggestion importance[1-10]: 5

__

Why: The minutes field has no upper bound validation, which could allow values like 99:30 to be treated as valid. While the test testMstimeConvertInvalid doesn't explicitly test for invalid minutes, adding a minutes >= 60 check would make the validation more robust and consistent with Dur2SecConvertFunction.

Low
Extract repeated set creation to static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static constant to avoid repeated object allocation, especially since it
is called in a loop over all conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1140-1142]

-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+
Suggestion importance[1-10]: 4

__

Why: The timeFunctions set is recreated on every invocation of resolveConvertFunction, which is called in a loop. Extracting it to a private static final constant avoids repeated object allocation and is a minor performance and code quality improvement.

Low

Previous suggestions

Suggestions up to commit 5ac49be
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix empty timeformat falling back to default

When timeFormatObj is an empty string, the code returns null instead of falling back
to the default format. This is inconsistent with the test expectation
testCtimeWithCustomTimeformat which asserts that an empty timeformat falls back to
the default %m/%d/%Y %H:%M:%S. The logic should use the default format when the
trimmed format string is empty.

core/src/main/java/org/opensearch/sql/expression/function/udf/CTimeConvertFunction.java [75-77]

 if (format.isEmpty()) {
-  return null;
+  format = DEFAULT_FORMAT;
 }
Suggestion importance[1-10]: 8

__

Why: The test testCtimeWithCustomTimeformat asserts that an empty string timeformat falls back to the default format (%m/%d/%Y %H:%M:%S), but the current code returns null when format.isEmpty(). This is a real behavioral inconsistency between the implementation and the test expectations.

Medium
Fix ordering of escape sequence substitution

The %% escape sequence must be replaced before other specifiers to avoid
double-substitution. Since ImmutableMap does not guarantee iteration order, %% might
be processed after %Y, %m, etc., causing incorrect results. The %% replacement
should be handled first, separately from the loop.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [273-282]

 public static String toJavaPattern(String strftimeFormat) {
-  String result = strftimeFormat;
+  // Handle %% first to avoid interfering with other specifier replacements
+  String result = strftimeFormat.replace("%%", "'%'");
   for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
     String specifier = entry.getKey();
+    if (specifier.equals("%%")) continue;
     if (result.contains(specifier)) {
       result = result.replace(specifier, entry.getValue());
     }
   }
   return result;
 }
Suggestion importance[1-10]: 7

__

Why: The %% escape sequence needs to be processed before other specifiers to avoid double-substitution. Since ImmutableMap does not guarantee iteration order, %% could be processed after %Y, %m, etc., leading to incorrect pattern generation. The fix is valid and addresses a real ordering issue.

Medium
General
Extract repeated set creation to static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static constant to avoid repeated object allocation, especially when
processing many conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1088-1090]

-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+
Suggestion importance[1-10]: 4

__

Why: The timeFunctions set is indeed recreated on every call to resolveConvertFunction, and extracting it to a static constant is a valid performance and style improvement. However, the impact is minor since this is not a hot path.

Low
Clarify hour validation for duration format

The validation rejects hours >= 24, but the dur2sec format [D+]HH:MM:SS is intended
to support durations where hours can exceed 23 when no days prefix is used (e.g.,
25:00:00 could be a valid 25-hour duration). The documentation says "Hours must be
less than 24", but the test testMstimeConvertBasic shows 61:01 is valid for mstime.
Please verify whether dur2sec should also allow hours >= 24 when no days prefix is
present, and update the validation accordingly.

core/src/main/java/org/opensearch/sql/expression/function/udf/Dur2SecConvertFunction.java [46-48]

-if (hours >= 24 || minutes >= 60 || seconds >= 60) {
+// Only enforce hours < 24 when days prefix is present; otherwise allow any hour value
+if (matcher.group(1) != null && hours >= 24) {
+  return null;
+}
+if (minutes >= 60 || seconds >= 60) {
   return null;
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion raises a valid design question about whether hours >= 24 should be allowed without a days prefix, but the documentation explicitly states "Hours must be less than 24" and the test cases don't test hours >= 24 without a days prefix. This is more of a design clarification than a clear bug.

Low
Suggestions up to commit da6cfeb
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix empty timeformat falling back to default

When timeFormatObj is not null but its trimmed value is empty, the function returns
null. However, the test testCtimeWithCustomTimeformat expects that an empty
timeformat falls back to the default format. The current logic is inconsistent with
the test expectation and should fall back to DEFAULT_FORMAT when the trimmed format
string is empty.

core/src/main/java/org/opensearch/sql/expression/function/udf/CTimeConvertFunction.java [73-77]

 String format =
-    (timeFormatObj != null) ? timeFormatObj.toString().trim() : DEFAULT_FORMAT;
-if (format.isEmpty()) {
-  return null;
-}
+    (timeFormatObj != null && !timeFormatObj.toString().trim().isEmpty())
+        ? timeFormatObj.toString().trim()
+        : DEFAULT_FORMAT;
Suggestion importance[1-10]: 8

__

Why: The test testCtimeWithCustomTimeformat explicitly expects that an empty timeformat string falls back to the default format ("10/18/2003 20:07:13"), but the current code returns null when the trimmed format is empty. This is a real bug where the implementation contradicts the test expectations.

Medium
Fix ordering of escape sequence replacement

The %% escape sequence must be replaced before other specifiers to avoid
double-replacement. Since ImmutableMap does not guarantee iteration order, %% might
be processed after %Y, %m, etc., causing incorrect substitutions. The %% replacement
should be handled first, separately from the loop.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [273-282]

 public static String toJavaPattern(String strftimeFormat) {
-  String result = strftimeFormat;
+  // Handle %% escape first to avoid double-replacement
+  String result = strftimeFormat.replace("%%", "\u0000");
   for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
     String specifier = entry.getKey();
+    if (specifier.equals("%%")) continue;
     if (result.contains(specifier)) {
       result = result.replace(specifier, entry.getValue());
     }
   }
-  return result;
+  return result.replace("\u0000", "'%'");
 }
Suggestion importance[1-10]: 7

__

Why: ImmutableMap does not guarantee iteration order, so %% could be processed after other specifiers, causing double-replacement issues (e.g., %%Y could become '%'Y then '%'yyyy). This is a valid correctness concern that could cause subtle bugs with escaped percent signs in format strings.

Medium
General
Extract repeated set creation to static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static constant to avoid repeated object allocation, especially since it
is called in a loop over all conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1088-1090]

-// Time functions that support timeformat parameter
-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+
Suggestion importance[1-10]: 4

__

Why: The timeFunctions set is recreated on every invocation of resolveConvertFunction, which is called in a loop. Moving it to a static constant is a valid performance and code quality improvement, though the impact is minor for typical query sizes.

Low
Fix hour validation when no days component present

The dur2sec format supports an optional days prefix (e.g., 2+12:30:15), which means
hours can legitimately exceed 23 when days are not provided. However, the test
testMstimeConvertBasic shows "61:01" returning 3661.0 for mstime, while dur2sec with
"01:23:45" validates hours < 24. The current validation rejects inputs like
"25:00:00" even without a days prefix, which may be intentional, but the validation
should only apply when no days component is present to allow large hour values in
day-less durations.

core/src/main/java/org/opensearch/sql/expression/function/udf/Dur2SecConvertFunction.java [46-48]

-if (hours >= 24 || minutes >= 60 || seconds >= 60) {
+boolean hasDays = matcher.group(1) != null;
+if (!hasDays && hours >= 24 || minutes >= 60 || seconds >= 60) {
   return null;
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion proposes allowing hours >= 24 when no days prefix is present, but the existing test testDur2secConvertInvalid expects "25:70:80" to return null, and the format is documented as [D+]HH:MM:SS where HH is 1-2 digits. The current validation appears intentional per the spec, making this suggestion questionable.

Low
Suggestions up to commit fdc0b67
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent double-substitution of escaped percent specifier

The current implementation processes %% (escaped percent) last only by coincidence
of map iteration order, but ImmutableMap preserves insertion order. However, if %%
is processed before other specifiers, the % it produces could be re-matched by
subsequent specifiers (e.g., %%Y%Yyyyy). The %% entry should be processed
last, or the replacement should be done in a single pass to avoid
double-substitution.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [273-282]

 public static String toJavaPattern(String strftimeFormat) {
   String result = strftimeFormat;
   for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
     String specifier = entry.getKey();
+    if (specifier.equals("%%")) continue; // handle last
     if (result.contains(specifier)) {
       result = result.replace(specifier, entry.getValue());
     }
   }
+  // Handle escaped percent last to avoid double-substitution
+  result = result.replace("%%", "'%'");
   return result;
 }
Suggestion importance[1-10]: 7

__

Why: This is a real bug: since ImmutableMap preserves insertion order and %% is inserted last, it happens to work correctly now, but the logic is fragile. If %% were processed before other specifiers, %%Y would become %Y then yyyy, causing incorrect behavior. Making the fix explicit improves correctness and maintainability.

Medium
General
Add missing minutes range validation

The validation only checks that seconds >= 60 but does not validate that minutes is
within a reasonable range. According to the documentation, the format is
[MM:]SS.SSS, but there is no upper bound check on minutes, allowing values like
99:59 to pass. A validation for minutes should be added if the format implies a
bounded range.

core/src/main/java/org/opensearch/sql/expression/function/udf/MsTimeConvertFunction.java [47-49]

-if (seconds >= 60) {
+if (minutes >= 60 || seconds >= 60) {
   return null;
 }
Suggestion importance[1-10]: 4

__

Why: The test testMstimeConvertBasic explicitly asserts assertEquals(3661.0, MsTimeConvertFunction.convert("61:01")), meaning minutes are intentionally unbounded in mstime. Adding a minutes >= 60 check would break this existing test and contradict the intended behavior.

Low
Extract repeated set allocation to static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static final constant to avoid repeated object allocation, especially
since it is called in a loop over all conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1088-1090]

-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+    RexNode timeFormatLiteral = context.rexBuilder.makeLiteral(timeFormat);
+    return PPLFuncImpTable.INSTANCE.resolve(
+        context.rexBuilder, functionName, sourceField, timeFormatLiteral);
+  } else {
+    return PPLFuncImpTable.INSTANCE.resolve(context.rexBuilder, functionName, sourceField);
+  }
+}
+
Suggestion importance[1-10]: 4

__

Why: The timeFunctions set is recreated on every call to resolveConvertFunction. Moving it to a static final field is a minor but valid optimization and improves code clarity.

Low
Suggestions up to commit 80b16a9
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix potential double-replacement in format conversion

The current implementation iterates over the map and replaces specifiers one by one,
but this can cause double-replacement issues. For example, %Y contains %, and if %%
is processed after %Y, the literal '%' replacement won't be affected, but if %% is
processed first, the % in %% could be matched by subsequent specifiers like %Y. The
%% escape should be handled last, or the replacement should be done in a single pass
to avoid such conflicts.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [273-282]

 public static String toJavaPattern(String strftimeFormat) {
-  String result = strftimeFormat;
-  for (Map.Entry<String, String> entry : STRFTIME_TO_JAVA_PARSE.entrySet()) {
-    String specifier = entry.getKey();
-    if (result.contains(specifier)) {
-      result = result.replace(specifier, entry.getValue());
+  StringBuilder result = new StringBuilder();
+  int i = 0;
+  while (i < strftimeFormat.length()) {
+    if (strftimeFormat.charAt(i) == '%' && i + 1 < strftimeFormat.length()) {
+      String specifier = strftimeFormat.substring(i, i + 2);
+      String replacement = STRFTIME_TO_JAVA_PARSE.get(specifier);
+      if (replacement != null) {
+        result.append(replacement);
+        i += 2;
+      } else {
+        result.append(strftimeFormat.charAt(i));
+        i++;
+      }
+    } else {
+      result.append(strftimeFormat.charAt(i));
+      i++;
     }
   }
-  return result;
+  return result.toString();
 }
Suggestion importance[1-10]: 7

__

Why: The current map-based replacement approach can cause issues when specifiers share characters (e.g., %% and %Y both contain %). A single-pass character-by-character approach is more robust and avoids potential double-replacement bugs. This is a valid correctness concern.

Medium
General
Extract repeated set creation to static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static constant to avoid repeated object allocation, especially since
it's called in a loop over all conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1087-1090]

-// Time functions that support timeformat parameter
-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+    RexNode timeFormatLiteral = context.rexBuilder.makeLiteral(timeFormat);
+    return PPLFuncImpTable.INSTANCE.resolve(
+        context.rexBuilder, functionName, sourceField, timeFormatLiteral);
+  } else {
+    return PPLFuncImpTable.INSTANCE.resolve(context.rexBuilder, functionName, sourceField);
+  }
+}
+
Suggestion importance[1-10]: 5

__

Why: The timeFunctions set is recreated on every invocation of resolveConvertFunction, which is called in a loop. Extracting it to a static final constant is a straightforward performance and code quality improvement.

Low
Add minutes range validation for consistency

The MsTimeConvertFunction does not validate that the minutes value is within a
reasonable range (e.g., < 60), unlike Dur2SecConvertFunction which validates hours,
minutes, and seconds. The test even asserts that "61:01" (61 minutes) returns
3661.0, which seems intentionally permissive, but this inconsistency should be
documented or the validation should be consistent across time conversion functions.

core/src/main/java/org/opensearch/sql/expression/function/udf/MsTimeConvertFunction.java [47-49]

-assertEquals(3661.0, MsTimeConvertFunction.convert("61:01"));
+// If minutes should be validated (< 60), add:
+if (minutes >= 60 || seconds >= 60) {
+  return null;
+}
Suggestion importance[1-10]: 3

__

Why: The suggestion points out an inconsistency between MsTimeConvertFunction (no minutes validation) and Dur2SecConvertFunction (validates hours/minutes/seconds). However, the test explicitly asserts "61:01" returns 3661.0, indicating this permissive behavior is intentional. The improved_code would break existing tests, making this a low-impact style/design note rather than a bug fix.

Low
Suggestions up to commit 93f7768
CategorySuggestion                                                                                                                                    Impact
Possible issue
Escape single-quote characters in Java DateTimeFormatter pattern

When a literal single-quote character ' appears in the strftime format string, the
current code appends it as-is (since ' is not a letter). In Java's
DateTimeFormatter, a single quote is the escape character, so an unescaped ' in the
pattern will cause a DateTimeParseException. The single-quote character should be
escaped as '' in the Java pattern.

core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java [301-306]

 // Escape Java pattern letters as literals
 if (Character.isLetter(c)) {
   result.append("'").append(c).append("'");
+} else if (c == '\'') {
+  result.append("''");
 } else {
   result.append(c);
 }
Suggestion importance[1-10]: 6

__

Why: This is a valid bug fix — an unescaped single-quote ' in a Java DateTimeFormatter pattern would cause a parse exception. The improved_code correctly handles this edge case by doubling the single-quote character.

Low
General
Extract repeated set creation to a static constant

The timeFunctions set is recreated on every call to resolveConvertFunction. This set
should be a static constant to avoid repeated object allocation, especially since
it's called in a loop over all conversions.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java [1087-1090]

-// Time functions that support timeformat parameter
-Set<String> timeFunctions = Set.of("ctime", "mktime");
+private static final Set<String> TIME_FUNCTIONS = Set.of("ctime", "mktime");
 
-if (timeFunctions.contains(functionName.toLowerCase()) && timeFormat != null) {
+private RexNode resolveConvertFunction(
+    String functionName, RexNode sourceField, String timeFormat, CalcitePlanContext context) {
 
+  if (TIME_FUNCTIONS.contains(functionName.toLowerCase()) && timeFormat != null) {
+    // For time functions with custom timeformat, pass the format as a second parameter
+    RexNode timeFormatLiteral = context.rexBuilder.makeLiteral(timeFormat);
+    return PPLFuncImpTable.INSTANCE.resolve(
+        context.rexBuilder, functionName, sourceField, timeFormatLiteral);
+  } else {
+    // Regular conversion functions or time functions without custom format
+    return PPLFuncImpTable.INSTANCE.resolve(context.rexBuilder, functionName, sourceField);
+  }
+}
+
Suggestion importance[1-10]: 4

__

Why: Moving the timeFunctions set to a static constant is a valid performance and code quality improvement, avoiding repeated object allocation on each call. The improved_code accurately reflects the change.

Low
Add upper bound validation for minutes field

The mstime function validates that seconds are less than 60, but does not validate
that minutes are within a reasonable range. According to the test
assertEquals(3661.0, MsTimeConvertFunction.convert("61:01")), minutes above 59 are
allowed, but there is no upper bound check. However, the test
assertNull(MsTimeConvertFunction.convert("25:70")) expects null for seconds >= 70,
which is already handled. Consider also checking that minutes don't exceed a
reasonable maximum (e.g., 99 given the pattern allows up to 2 digits) to prevent
unexpected large values, or at minimum document the intentional lack of minutes
validation.

core/src/main/java/org/opensearch/sql/expression/function/udf/MsTimeConvertFunction.java [47-49]

-if (seconds >= 60) {
+if (minutes > 99 || seconds >= 60) {
   return null;
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion adds a minutes upper bound check, but the test assertEquals(3661.0, MsTimeConvertFunction.convert("61:01")) explicitly expects minutes > 59 to be valid, so adding minutes > 99 would be a behavioral change that contradicts the existing test expectations. The suggestion has limited impact.

Low

@ritvibhatt ritvibhatt changed the title Convert time functions Add time conversion functions for convert command Mar 6, 2026
Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit eae36db

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 73f4088

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 82fd69f.

PathLineSeverityDescription
core/src/main/java/org/opensearch/sql/expression/function/udf/MkTimeConvertFunction.java100lowUser-controlled timeformat string is passed without length or complexity bounds directly to StrftimeFormatterUtil.toJavaPattern() and then to DateTimeFormatter.ofPattern(). Complex or deeply nested Java DateTimeFormatter patterns can be slow to compile, creating a minor resource-exhaustion vector if users can supply arbitrarily long format strings through PPL queries. Exceptions are caught and return null, so there is no crash risk, but no input length guard exists.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 82fd69f

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit b45ff6d

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit becd73e.

PathLineSeverityDescription
core/src/main/java/org/opensearch/sql/expression/datetime/StrftimeFormatterUtil.java290lowThe toJavaPattern() method does not escape single-quote characters in user-supplied strftime format strings before passing them to DateTimeFormatter.ofPattern(). A literal single quote in the input would be passed through unescaped, potentially producing malformed Java formatter patterns (e.g., unclosed literal sequences). Exceptions are caught and return null, so there is no exploitable impact, but it is an unvalidated user-controlled input path that warrants attention.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit becd73e

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 93f7768.

PathLineSeverityDescription
.github/workflows/sql-cli-integration-test.yml71lowThe `./gradlew clean` step runs immediately after `./gradlew publishToMavenLocal`, deleting compiled build artifacts after they have been published. While this is a common disk-space-saving practice, the sequencing (clean after publish rather than before) means local build artifacts are removed post-publication, which slightly reduces auditability of what was published. No clear malicious intent, and the pattern has a plausible legitimate explanation.

The table above displays the top 10 most important findings.

Total: 1 | Critical: 0 | High: 0 | Medium: 0 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 93f7768

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 80b16a9

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit fdc0b67

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2026

Persistent review updated to latest commit da6cfeb

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2026

Persistent review updated to latest commit 5ac49be

Signed-off-by: Ritvi Bhatt <ribhatt@amazon.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 7, 2026

Persistent review updated to latest commit 215b59b

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant