Skip to content

Conversation

@doripo
Copy link
Contributor

@doripo doripo commented Feb 2, 2026

Add exp.Trunc expression class for numeric truncation, filling a gap alongside existing Round, Floor, and Ceil functions (in AST since 2021).

Implementation:

  • New exp.Trunc class with _sql_names = ["TRUNC", "TRUNCATE"]
  • Shared build_trunc in dialect.py for dialects with overloaded TRUNC (date vs numeric):
    • DateTrunc: first arg is TEMPORAL_TYPES and (second arg exists or default_date_trunc_unit set), or second arg is TEXT_TYPES
    • Trunc: first arg is NUMERIC_TYPES, or second arg is NUMERIC_TYPES, or (date_trunc_requires_part=False and no second arg)
    • Anonymous: otherwise
  • Oracle: Uses shared builder with unabbreviate=False, default_date_trunc_unit="DD"
  • Exasol: Uses shared builder; handles DATETIME (no default unit, single-arg temporal falls back to Anonymous)
  • Snowflake: Uses shared builder with date_trunc_requires_part=False (single-arg TRUNC is always numeric)
  • MySQL: Generates TRUNCATE(...); TRUNC alias normalizes to TRUNCATE
  • T-SQL: Generates ROUND(x, n, 1)
  • SQLite: Single-arg only; decimals arg warns via @unsupported_args
  • Other dialects (PostgreSQL, DuckDB, BigQuery): Auto-supported via _sql_names
  • Date-only dialects (Hive, Spark): Unchanged

Tests:

  • Oracle: type inference tests (5 discrimination cases for shared build_trunc), identity and cross-dialect transpilation
  • Exasol: Type assertion, DATETIME handling, single-arg temporal fallback to Anonymous, cross-dialect transpilation
  • Snowflake: Numeric TRUNC identity, single-arg untyped is numeric, type assertions, cross-dialect transpilation, Anonymous fallback
  • MySQL: TRUNCATE identity and TRUNC alias normalization
  • T-SQL: ROUND(x, n, 1) generation
  • SQLite: Single-arg identity, decimals warning test
  • Spark/Hive: Assertions that TRUNC remains DateTrunc/TimestampTrunc
  • Cross-dialect date/timestamp TRUNC: validate_all tests for various units / dialects:
    • Units DAY, WEEK, MONTH, QUARTER, YEAR, HOUR, MINUTE, SECOND
    • Across Oracle, Snowflake, Postgres, BigQuery, DuckDB, TSQL, Spark

Out of scope / potential follow-up:

  • T-SQL ROUND(x, n, mode): third arg still passed through verbatim (pre-existing), producing invalid SQL in most dialects. Literal ,1 could now parse as Trunc, literal ,0 could simplify to 2-arg Round; non-literal / invalid mode would need a warning.

Add exp.Trunc expression class for numeric truncation, filling a gap
alongside existing Round, Floor, and Ceil functions (in AST since 2021).

Implementation:
- New exp.Trunc class with _sql_names = ["TRUNC", "TRUNCATE"]
- Oracle: Type-annotation-aware builder (DATE/TIMESTAMP/DATETIME -> DateTrunc,
  numeric -> Trunc)
- Exasol: Updated to use exp.Trunc instead of Anonymous; fixed to handle
  DATETIME type and single-arg date TRUNC (defaults to 'DD' like Oracle)
- MySQL: Generates TRUNCATE(...); TRUNC alias normalizes to TRUNCATE
- T-SQL: Generates ROUND(x, n, 1)
- Other dialects (PostgreSQL, Snowflake, DuckDB, BigQuery): Auto-supported
  via _sql_names
- Date-only dialects (Hive, Spark) unchanged

Tests:
- Oracle: exp.Trunc identity and type assertion (vs DateTrunc), single-arg TRUNC
- MySQL: TRUNCATE identity and TRUNC alias normalization
- T-SQL: ROUND(x, n, 1) generation from other dialects' TRUNC
- Exasol: exp.Trunc type assertion, DATETIME handling, single-arg date TRUNC

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
SQLite's TRUNC() only accepts one argument. This change:
- Simplifies TRUNC(x, 0) to TRUNC(x) for SQLite
- Warns when non-zero decimals are used (unsupported)
Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @doripo, thanks for the contribution!

  1. Can we increase test coverage e.g in dialects like Spark to ensure that TRUNC(foo) remains exp.DateTrunc?
  2. Snowflake's TRUNC is also overloaded, so we'd need to approach it similar to Oracle & Exasol. See this comment

- SQLite: Use @unsupported_args("decimals") for best-effort transpilation,
  move trunc_sql inside Generator for auto-discovery, update tests
- Spark/Hive: Add tests to verify TRUNC remains DateTrunc/TimestampTrunc

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
- Add build_trunc to dialect.py for Oracle/Exasol/Snowflake
- Uses type annotation to distinguish date vs numeric truncation
- Returns Anonymous if type cannot be determined
- Add Snowflake support for overloaded TRUNC/TRUNCATE
- Add comprehensive Snowflake TRUNC tests

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
Add validate_all tests for:
- Oracle: Date truncation (DAY, WEEK, MONTH, QUARTER, YEAR),
  Timestamp truncation (HOUR, MINUTE, SECOND, DAY, MONTH, YEAR)
- Exasol: Date/time truncation with Exasol-specific units
  (YYYY, MM, DD, HH, MI, SS, WW, Q)
- Cross-dialect: Oracle, Snowflake, Postgres, BigQuery, DuckDB, TSQL, Spark

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick & accurate iterations! Leaving a few more comments your way, should be close to merging soon (also, appreciate the a-b sorting of the imports as a cherry on top 😆)

doripo and others added 4 commits February 3, 2026 14:48
add .assert_is(exp.Trunc) as chained function call

Co-authored-by: Vaggelis Danias <daniasevangelos@gmail.com>
- Use TEMPORAL_TYPES instead of DATE/TIMESTAMP/DATETIME to correctly
  handle all temporal types (e.g. TIMESTAMPTZ, TIMESTAMPLTZ)
- Group build_trunc conditions by return type (DateTrunc/Trunc/Anonymous)
- Chain .assert_is() to validate_identity calls
- Merge redundant validate_all tests
- Add fallback tests for Anonymous case
- Remove duplicate Spark TRUNC test

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
- test_trunc_type_inference: tests build_trunc discrimination logic
  (shared across Oracle, Exasol, Snowflake) - 5 cases covering
  temporal+?, ?+string, numeric+?, ?+int, ?+? fallback
- test_trunc: Oracle-specific identity and transpilation tests

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
@doripo doripo requested a review from VaggelisD February 3, 2026 14:43
@doripo
Copy link
Contributor Author

doripo commented Feb 3, 2026

@VaggelisD modified according to your suggestions, thanks, please also let me know if the trailing comment in the main PR description ("Out of scope / potential follow-up:") is appropriate or better remove it from the commit message.

@VaggelisD
Copy link
Collaborator

Thank you!

please also let me know if the trailing comment in the main PR description ("Out of scope / potential follow-up:") is appropriate or better remove it from the commit message.

It is fine for future reference, thanks for documenting it; The PR has already grown so it's great if we can now limit its scope to what's already there

Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @doripo, I believe I have one final comment and a few nits left, really appreciate your cooperation. If anything else comes up after that we'll take it to the finish line by merging this PR and applying the fixes afterwards

@fivetran-amrutabhimsenayachit
Copy link
Collaborator

fivetran-amrutabhimsenayachit commented Feb 3, 2026

Hey @doripo, I believe I have one final comment and a few nits left, really appreciate your cooperation. If anything else comes up after that we'll take it to the finish line by merging this PR and applying the fixes afterwards

I agree, we can definitely increase the test coverage by adding more TRUNC specific tests in oracle, snowflake & exasol files. We can add it as part of a separate PR since this is a big PR.

Something like this for Oracle:

# Year variants
    for unit in ["YEAR", "YYYY", "YY", "Y", "SYYYY", "SYEAR"]:
        self.validate_identity(f"TRUNC(SYSDATE, '{unit}')")
    
    # Quarter
    for unit in ["Q", "QUARTER"]:
        self.validate_identity(f"TRUNC(SYSDATE, '{unit}')")
    
    # Month variants
    for unit in ["MONTH", "MM", "MON"]:
        self.validate_identity(f"TRUNC(SYSDATE, '{unit}')")
        
        ... and so on for all the date/time parameters.

Similarly we can do the same for snowflake (https://docs.snowflake.com/en/sql-reference/functions-date-time#label-supported-date-time-parts)

date_units = [
            ("YEAR", ["YEAR", "YEARS", "Y", "YY", "YYY", "YYYY", "YR", "YRS"]),
            ("QUARTER", ["QUARTER", "QUARTERS", "Q", "QTR", "QTRS"]),
            ("MONTH", ["MONTH", "MONTHS", "MM", "MON", "MONS"]),
            ("WEEK", ["WEEK", "W", "WK", "WEEKOFYEAR", "WOY", "WY"]),
            ("DAY", ["DAY", "DAYS", "D", "DD", "DAYOFMONTH"]),
        ]

        time_units = [
            ("HOUR", ["HOUR", "HOURS", "H", "HH", "HR", "HRS"]),
            ("MINUTE", ["MINUTE", "MINUTES", "M", "MI", "MIN", "MINS"]),
            ("SECOND", ["SECOND", "SECONDS", "S", "SEC", "SECS"]),
            ("MILLISECOND", ["MILLISECOND", "MILLISECONDS", "MS", "MSEC"]),
            ("MICROSECOND", ["MICROSECOND", "MICROSECONDS", "US", "USEC"]),
            (
                "NANOSECOND",
                [
                    "NANOSECOND",
                    "NANOSECONDS",
                    "NANOSECS",
                    "NS",
                    "NSEC",
                    "NANOSEC",
                    "NSECOND",
                    "NSECONDS",
                ],
            ),
        ]

Exasol:

# Basic numeric truncation
    self.validate_all(
        "TRUNC(123.456, 2)",
        write={
            "exasol": "TRUNC(123.456, 2)",
            "oracle": "TRUNC(123.456, 2)",
            "postgres": "TRUNC(123.456, 2)",
            "mysql": "TRUNCATE(123.456, 2)",
        }
    )

- Snowflake: date_trunc_requires_part=False (single-arg is numeric)
- Oracle: default_date_trunc_unit="DD" (single-arg temporal uses DD)
- Update tests for new behavior
- Use validate_identity where SQL round-trips

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
- Use "truncation" (concept) instead of "TRUNC" (function name)
- Consistent patterns: "Numeric truncation identity",
  "Date truncation with typed column and unit",
  "Cross-dialect numeric truncation transpilation"

Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
Signed-off-by: Dori Polotsky <doripo@riverpool.ai>
@VaggelisD
Copy link
Collaborator

VaggelisD commented Feb 4, 2026

Copy link
Collaborator

@georgesittas georgesittas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work

Copy link
Collaborator

@VaggelisD VaggelisD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final comments from me too, thank you!

Comment on lines +599 to +600
# Fallback to Anonymous (Exasol requires unit for date truncation)
self.validate_identity("TRUNC(CAST(x AS DATE))").assert_is(exp.Anonymous)
Copy link
Collaborator

@VaggelisD VaggelisD Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to Exasol docs unit is optional so we could infer that this is date truncation here, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VaggelisD

I looked into Exasol's TRUNC default behavior. I can't say that the docs are 100% explicit, but:

  • TRUNC/TRUNCATE docs say format is optional and "if a format is not specified, the value is truncated to days"
  • DATE_TRUNC is described as "PostgreSQL compatible"
  • The ROUND function docs reference "days from noon" as the rounding boundary, implying day-level granularity

This suggests DD (day truncation to midnight) matches Oracle's default, though not stated verbatim. Are you comfortable adding default_date_trunc_unit="DD" for Exasol, or would you prefer keeping the Anonymous fallback until we have firmer confirmation?

I have the fix ready (4 changed lines) just wanted to double check.

(continuing 77830a3#r2759813244)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related: opened new issue #6959 for DD / DAY unit transpilation

@doripo
Copy link
Contributor Author

doripo commented Feb 4, 2026

@VaggelisD that's a very good reference! I believe that these minor corrections to your table apply:

  • SQLite => instead of TRUNCATE(numeric_expr, n), should be:
    TRUNC(x) - no decimals parameter, no TRUNCATE function
  • ClickHouse => instead of ❌ (none), should be:
    Actually has trunc(x[, N]) with alias truncate
  • Presto/Trino => Correct, but note: returns double, not integer
  • Spark/DBX => Correct, but note: may be worked around e.g.:
    IF(x >= 0, FLOOR(x, d), CEIL(x, d))

Based on my understanding. the current gaps are therefore quite small (about 20 lines of code, not including the tests):

Dialect Current Correct Status
DuckDB TRUNC(x, d) TRUNC(x) only Needs @unsupported_args("decimals")
Presto/Trino TRUNC(x, d) TRUNCATE(x, d) Needs rename_func("TRUNCATE")
ClickHouse Not handled trunc(x, d) Needs rename_func("trunc")
Spark/Hive Not handled No native support Needs IF(x>=0, FLOOR(x,d), CEIL(x,d)) workaround

I can address them either now or in a follow-up PR, as they are incremental improvements.

@georgesittas
Copy link
Collaborator

Let's get any minor fixes in and then improve the parsing / transpilation in a follow-up to avoid increasing the scope more.

@doripo
Copy link
Contributor Author

doripo commented Feb 4, 2026

Or, for Spark/Hive probably better to just cast to BIGINT with unsupported_args("decimals") - the 'smart' workaround is somewhat weird..

If I do that now, then all the code fixes are really minor - and just a few more tests. @georgesittas does that seem acceptable at this stage of the PR?

@georgesittas
Copy link
Collaborator

Let's do it in a follow-up PR @doripo, we can apply any last one-liners here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Style fixups

Co-authored-by: Vaggelis Danias <daniasevangelos@gmail.com>
@georgesittas georgesittas merged commit 8725010 into tobymao:main Feb 4, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants