feat(datafusion): support isnan predicate pushdown to Iceberg #2142

Open
charlesdong1991 wants to merge 3 commits into apache:main from charlesdong1991:implement-is-nan
charlesdong1991 commented Feb 15, 2026

Which issue does this PR close?

Part of the ongoing effort to improve predicate pushdown coverage in the DataFusion integration.

What changes are included in this PR?

This PR adds support for pushing down isnan() predicates from DataFusion to Iceberg's native IsNan / NotNan predicate operators.

In DataFusion, isnan() is represented as a scalar function (Expr::ScalarFunction) rather than a dedicated Expr variant (unlike IsNull / IsNotNull). This PR introduces a new scalar_function_to_iceberg_predicate helper in expr_to_predicate.rs that matches scalar functions by name at runtime and converts isnan(col) into Predicate::Unary(IsNan, col).
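The name-based matching described above could look roughly like the sketch below. It is not the code from this PR: `Expr`, `Predicate`, and `UnaryOp` here are toy stand-ins for the DataFusion and Iceberg types, and `scalar_function_to_predicate` is a hypothetical, simplified analogue of the new helper. Returning `None` for anything unsupported is also an assumption of the sketch.

```rust
// Toy stand-ins for DataFusion's `Expr` and Iceberg's `Predicate`.
// The real types have many more variants; these exist only for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    ScalarFunction { name: String, args: Vec<Expr> },
    Not(Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum UnaryOp {
    IsNan,
}

#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Unary(UnaryOp, String),
    Not(Box<Predicate>),
}

/// Hypothetical analogue of the helper: scalar functions have no dedicated
/// variant, so they are matched by name. Only `isnan(<column>)` converts.
fn scalar_function_to_predicate(name: &str, args: &[Expr]) -> Option<Predicate> {
    match (name, args) {
        ("isnan", [Expr::Column(col)]) => {
            Some(Predicate::Unary(UnaryOp::IsNan, col.clone()))
        }
        // Any other function, arity, or argument shape is not pushed down.
        _ => None,
    }
}

/// Minimal conversion entry point covering only the arms relevant here.
fn to_predicate(expr: &Expr) -> Option<Predicate> {
    match expr {
        Expr::ScalarFunction { name, args } => scalar_function_to_predicate(name, args),
        // NOT simply wraps whatever the inner expression converted to.
        Expr::Not(inner) => to_predicate(inner).map(|p| Predicate::Not(Box::new(p))),
        Expr::Column(_) => None,
    }
}
```

In this sketch, `isnan(x)` converts to `Predicate::Unary(UnaryOp::IsNan, "x")`, while `isnan` applied to anything other than a bare column yields `None` and is left for the query engine to evaluate.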

Negation (NOT isnan(col)) is handled automatically: the existing Expr::Not arm wraps the result in Predicate::Not(...), and Iceberg's downstream rewrite_not visitor normalizes it into Predicate::Unary(NotNan, col).
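To illustrate why no dedicated NOT isnan handling is needed, here is a minimal sketch of a NOT-elimination pass over a toy predicate tree. The types and the free functions `rewrite_not` and `negate` are simplified assumptions written for this example; they mirror the idea of Iceberg's rewrite, not its actual implementation.

```rust
// Toy predicate model; an assumption for illustration, not Iceberg's real types.
#[derive(Debug, Clone, PartialEq)]
enum UnaryOp {
    IsNan,
    NotNan,
    IsNull,
    NotNull,
}

impl UnaryOp {
    /// Each unary operator has an exact logical complement.
    fn negate(&self) -> UnaryOp {
        match self {
            UnaryOp::IsNan => UnaryOp::NotNan,
            UnaryOp::NotNan => UnaryOp::IsNan,
            UnaryOp::IsNull => UnaryOp::NotNull,
            UnaryOp::NotNull => UnaryOp::IsNull,
        }
    }
}

#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Unary(UnaryOp, String),
    Not(Box<Predicate>),
    And(Box<Predicate>, Box<Predicate>),
    Or(Box<Predicate>, Box<Predicate>),
}

/// Removes every `Not` node by pushing the negation down to the leaves.
fn rewrite_not(p: Predicate) -> Predicate {
    match p {
        Predicate::Not(inner) => negate(*inner),
        Predicate::And(l, r) => {
            Predicate::And(Box::new(rewrite_not(*l)), Box::new(rewrite_not(*r)))
        }
        Predicate::Or(l, r) => {
            Predicate::Or(Box::new(rewrite_not(*l)), Box::new(rewrite_not(*r)))
        }
        leaf @ Predicate::Unary(..) => leaf,
    }
}

/// Returns the `Not`-free negation of `p` (De Morgan's laws on And/Or).
fn negate(p: Predicate) -> Predicate {
    match p {
        Predicate::Unary(op, col) => Predicate::Unary(op.negate(), col),
        // A double negation cancels out.
        Predicate::Not(inner) => rewrite_not(*inner),
        Predicate::And(l, r) => Predicate::Or(Box::new(negate(*l)), Box::new(negate(*r))),
        Predicate::Or(l, r) => Predicate::And(Box::new(negate(*l)), Box::new(negate(*r))),
    }
}
```

Under this model, `Not(Unary(IsNan, "x"))` rewrites to `Unary(NotNan, "x")`, which is the normalization the paragraph above relies on.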

This enables file pruning using nan_value_counts in manifest metadata for float/double columns, as defined in the Iceberg spec (Manifest Files: field summaries and column statistics).
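As a rough illustration of how NaN counts can drive pruning, the sketch below decides whether a data file might contain matching rows. `FileStats` and both functions are hypothetical and trimmed down for this example; they are not the evaluator code in iceberg-rust. The sketch assumes that value_counts includes null and NaN values, and that missing statistics must never cause a file to be skipped.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down per-file column statistics keyed by field id.
#[derive(Debug, Default)]
struct FileStats {
    /// Number of values per column (assumed to include nulls and NaNs).
    value_counts: HashMap<i32, u64>,
    /// Number of NaN values per column.
    nan_value_counts: HashMap<i32, u64>,
}

/// Could any row in the file satisfy `isnan(col)`?
/// The file is skipped only when the stats prove there are zero NaNs.
fn may_match_is_nan(stats: &FileStats, field_id: i32) -> bool {
    stats.nan_value_counts.get(&field_id).copied() != Some(0)
}

/// Could any row in the file satisfy `NOT isnan(col)`?
/// The file is skipped only when every recorded value is NaN.
fn may_match_not_nan(stats: &FileStats, field_id: i32) -> bool {
    let values = stats.value_counts.get(&field_id).copied();
    let nans = stats.nan_value_counts.get(&field_id).copied();
    match (values, nans) {
        (Some(v), Some(n)) if v == n => false,
        // Missing or inconclusive stats: conservatively keep the file.
        _ => true,
    }
}
```

Both checks are inclusive: they may keep a file that turns out to have no matching rows, but they never drop one that could match, so the predicate still has to be evaluated on the rows that are read.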

Are these changes tested?

Yes

CTTY (Collaborator) left a comment

LGTM!

}

#[test]
fn test_predicate_conversion_with_isnan_unsupported_arg() {
CTTY (Collaborator) commented:

Could you create another issue to track complex numeric expressions and leave some TODO comments? A unary expression for isnan feels too limited to be useful.

charlesdong1991 (Author) replied:

Thanks for the review, @CTTY!

I added a TODO comment to the function definition and created an issue: #2154

Successfully merging this pull request may close these issues.

Add isnan predicate support from Datafusion