Docs: Add informational properties section for table comment #15367
yguy-ryft wants to merge 2 commits into apache:main from
docs/docs/configuration.md
### Informational properties

Informational properties are not used by Iceberg operations, but can be set by engines to provide additional context about a table.
nit: an alternative
Informational properties can be set to provide additional context about a table. They can be useful for documentation, discovery, and integration with external tools. They do not affect read/write behavior or query semantics.
docs/docs/configuration.md
| Property | Default   | Description |
| -------- | --------- | ----------- |
| comment  | (not set) | A human-readable description of the table. Engines like Spark and Flink set this via `COMMENT` in create table DDL. |
> Engines like Spark and Flink set this via `COMMENT` in create table DDL.

I am not sure we need to call this out. It would be good to just remove it.

> A human-readable description of the table.

The doc is not just for humans to read. The semantic context for LLMs might be the more important use case nowadays :)
A table-level description that documents the business meaning and usage context.
I am also debating if we need to add more details like these. I guess we probably should leave those details out.
It can cover information like business purpose, data source/pipeline, granularity, ownership, update frequency, SLA/freshness, common query patterns, relationships, disambiguation, domain-specific context, etc.
| -------------- | -------- | ----------- |
| format-version | 2        | Table's format version as defined in the [Spec](../../spec.md#format-versioning). Defaults to 2 since version 1.4.0. |

### Informational properties
nit: we might want to call out the side effect of storing informational properties in the table metadata. One that I'm thinking of is that updating comments will create a new table version (new snapshot). This can lead to operational complexity since it follows the table commit path.
There will be no new snapshot, but there will be a new metadata.json file with a new commit.
@kevinjqliu this is true for all properties though, so I'm not sure we should call it out specifically here. WDYT?
Theoretically we can put this as a disclaimer for the entire page, but I would separate that to a different PR regardless.
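
To make the distinction in this thread concrete, here is a minimal toy sketch of the behavior described above: a property update goes through a commit that produces a new metadata file, while the list of snapshots stays unchanged. The `TableMetadata` class, the `set_property` helper, and the `vN.metadata.json` naming are all hypothetical and exist only for illustration; they are not Iceberg's actual classes or file layout.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class TableMetadata:
    """Toy stand-in for a table metadata file; not Iceberg's real classes."""
    version: int
    properties: Dict[str, str] = field(default_factory=dict)
    snapshot_ids: Tuple[int, ...] = ()

    @property
    def file_name(self) -> str:
        # Hypothetical naming scheme, used here for illustration only.
        return f"v{self.version}.metadata.json"


def set_property(current: TableMetadata, key: str, value: str) -> TableMetadata:
    """Model a property commit: new metadata version, snapshots untouched."""
    new_props = {**current.properties, key: value}
    return replace(current, version=current.version + 1, properties=new_props)


base = TableMetadata(version=1, snapshot_ids=(1001,))
updated = set_property(base, "comment", "Daily order facts")

print(base.file_name, "->", updated.file_name)
print("snapshots unchanged:", updated.snapshot_ids == base.snapshot_ids)
```

In this model the same holds for any property key, not just `comment`, which matches the point that the commit cost is not specific to informational properties.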
Updated the description of informational properties and the comment property for clarity.
@stevenzwu your suggestions were great, I replaced the texts with them - thanks!
`comment` property, which engines like Spark and Flink set via `COMMENT` in CREATE TABLE DDL
`comment` that provide semantic context about a table - https://lists.apache.org/thread/5q92y3dlnc3mb5b2cj72hzs6xmy3xtfl