BLDX-434 | Migrate to msgspec.Struct models #811
Draft
Aryamanz29 wants to merge 13 commits into main
@greptile review
Too many files changed for review.
@greptile re-review
- Remove dead code: admin/, checkpoint.py, exceptions.py, py.typed
- Delete old single-file client.py (replaced by client/ package later)
- Rename models/ → model/assets/ for consistency with legacy pyatlan/model/assets/
- Move infrastructure files (conversion_utils.py, serde.py, transform.py) to model/
- Add model/__init__.py for package-level re-exports
- Update all import paths (pyatlan_v9.models → pyatlan_v9.model.assets)
- Update all model test imports to match new paths
Migrate all legacy pyatlan model files (AtlanObject/Pydantic BaseModel) to pyatlan_v9 msgspec.Struct equivalents:

Infrastructure:
- core.py: AtlanObject base, AtlanTag, AtlanField helpers
- structs.py: SourceTagAttachment, BadgeCondition, etc.
- translators.py/retranslators.py: Tag name translation pipeline

Models (28 new files):
- search.py: DSL, IndexSearchRequest, Query types
- typedef.py: EnumDef, StructDef, AtlanTagDef, CustomMetadataDef, etc.
- lineage.py: LineageListRequest, FluentLineage, LineageResponse, etc.
- audit.py, search_log.py: AuditSearchRequest, SearchLogRequest
- response.py: AssetMutationResponse, AssetResponse
- group.py, user.py, role.py: GroupRequest, AtlanUser, AtlanRole
- credential.py, oauth_client.py, sso.py, api_tokens.py
- events.py, keycloak_events.py: AtlanEvent, KeycloakEvent
- query.py, task.py, workflow.py, suggestions.py
- aggregation.py, atlan_image.py, contract.py, custom_metadata.py
- data_mesh.py, dq_rule_conditions.py, file.py, internal.py, lineage_ref.py

Assets:
- purpose.py: Purpose model with tag translation support
- snowflake_dynamic_table.py: SnowflakeDynamicTable model

All models use msgspec conventions: kw_only=True, UNSET/UnsetType, rename='camel' where needed, and proper serialization methods.
Convert AtlanClient from a plain Python class to msgspec.Struct:
- AtlanClient(msgspec.Struct, kw_only=True): base_url, api_key, proxy, verify, retry config, and httpx session management
- PyatlanSyncTransport/PyatlanAsyncTransport: custom httpx transports with configurable retry logic
- Delegates to legacy pyatlan sub-clients (AssetClient, GroupClient, etc.) while maintaining the same interface
- Uses __post_init__ for session initialization and header configuration
- Supports proxy and SSL verification configuration via constructor args or environment variables
Port non-model tests from legacy tests/unit/ to tests_v9/unit/:

Test files ported:
- test_client.py: 200 tests (full parity with legacy 89, deprecated excluded)
  Covers: terms operations, find operations, error handling, batch, bulk request, proxy/SSL config, pagination, validation, DQ rules
- test_typedef_model.py: 47 tests (EnumDef, StructDef, AtlanTagDef, etc.)
- test_search_model.py: 231 tests (DSL, queries, sort, pagination)
- test_atlan_tag_name.py: 6 tests (tag name resolution)
- test_core.py: 12 tests (AtlanObject, AtlanTag, Announcement)
- test_structs.py: 1 test (SourceTagAttachment)
- test_utils.py: 17 tests (utility functions)

Infrastructure:
- conftest.py: Pydantic v9-compat layer that allows legacy client methods (using @validate_arguments) to accept msgspec.Struct instances by converting them to legacy Pydantic models on the fly. Also patches Pydantic JSON encoder for msgspec.Struct serialization.
- constants.py: Shared test constants

Key patterns:
- Tests use v9 models (pyatlan_v9.model.assets) where possible
- Legacy Pydantic models used for BulkRequest/Batch tests (Pydantic internals)
- Client-returned objects checked by type name (legacy deserialisation)

Total v9 test suite: 1540 passed, 2 skipped
@claude /review
Implement a framework-agnostic validate_arguments decorator in pyatlan/client/common/validate.py that replaces pydantic.v1's @validate_arguments.

This decorator:
- Validates function arguments against type annotations
- Supports both Pydantic BaseModel and msgspec.Struct models
- Handles basic types (str, int, bool, float, Enum)
- Handles container types (List, Set, Dict, Tuple)
- Handles Optional[X], Union[X, Y], Type[X], Callable
- Handles Pydantic constrained types (constr, StrictStr, StrictBool, StrictInt)
- Handles TypeVar resolution for bound types
- Supports enum coercion from string values
- Matches Pydantic v1 error message format and ValueError exceptions

Replace all pydantic.v1 validate_arguments imports across:
- 19 sync client files (pyatlan/client/*.py)
- 19 async client files (pyatlan/client/aio/*.py)
- 3 model files (lineage.py, search.py, ui.py)
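To make the shape of such a decorator concrete, here is a stdlib-only sketch. It is not the implementation in this PR: it covers only plain classes, Enum coercion, Optional/Union, and List, and its error messages merely imitate the style quoted in the commit messages. The `Color` and `paint` names are hypothetical.

```python
import functools
import inspect
from enum import Enum
from typing import Any, List, Optional, Union, get_args, get_origin, get_type_hints


def _check(value: Any, annotation: Any, name: str) -> Any:
    """Return the (possibly coerced) value, or raise ValueError."""
    if annotation is Any:
        return value
    origin = get_origin(annotation)
    if origin is Union:  # Optional[X] is Union[X, None]
        for option in get_args(annotation):
            try:
                return _check(value, option, name)
            except ValueError:
                continue
        raise ValueError(f"{name}: value matches none of {get_args(annotation)}")
    if origin is list:
        if not isinstance(value, list):
            raise ValueError(f"{name}: value is not a valid list")
        args = get_args(annotation)
        item_type = args[0] if args else Any
        return [_check(item, item_type, name) for item in value]
    if annotation is type(None):
        if value is not None:
            raise ValueError(f"{name}: None expected")
        return None
    if isinstance(annotation, type) and issubclass(annotation, Enum):
        if isinstance(value, annotation):
            return value
        try:
            return annotation(value)  # coerce e.g. "red" to Color.RED
        except ValueError:
            raise ValueError(f"{name}: value is not a valid enumeration member") from None
    if isinstance(annotation, type) and not isinstance(value, annotation):
        raise ValueError(f"{name}: instance of {annotation.__name__} expected")
    return value  # matching type, or an annotation this sketch does not handle


def validate_arguments(func):
    """Validate call arguments against the function's type annotations."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in list(bound.arguments.items()):
            if name in hints:
                bound.arguments[name] = _check(value, hints[name], name)
        return func(*bound.args, **bound.kwargs)

    return wrapper


class Color(Enum):
    RED = "red"


@validate_arguments
def paint(color: Color, coats: int = 1, tags: Optional[List[str]] = None) -> str:
    return f"{color.name}:{coats}:{tags}"
```

Calling `paint("red", tags=["a"])` coerces the string to `Color.RED`, while `paint("red", coats="two")` raises ValueError, mirroring the switch from ValidationError to ValueError that the test updates in this PR rely on.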
Update all test files to work with the custom validate_arguments decorator's error format:
- Replace pytest.raises(ValidationError) with pytest.raises(ValueError) since the custom decorator raises ValueError directly
- Remove unused pydantic.v1 ValidationError imports
- Update expected error messages in tests/unit/constants.py to match the custom decorator's output format:
  - Remove Pydantic-specific (type=...) suffixes
  - Update error counts for Union/list validation (1 vs N)
  - Match 'instance of X expected' for non-builtin types
  - Match 'str type expected', 'none is not an allowed value' etc.
- Update credential, SSO, query, task, workflow, file, UI test files with corrected error message formats
- Remove trailing spaces from error message constants
…v9 test failures
- Moved pyatlan/client/common/validate.py → pyatlan/validate.py to break circular import chain: search.py → client.common.__init__ → asset → model.assets → atlan_fields
- Updated all 41 import paths from pyatlan.client.common.validate to pyatlan.validate
- Added Mock spec-class support to _is_model_instance for v9 Batch tests
- Changed v9 test_client.py to catch ValueError instead of ValidationError
- All tests pass: 5798 legacy + 1540 v9
…c models
- Replace isinstance checks in client/common/asset.py with _is_model_instance for dual-model compatibility (9 call sites)
- Replace isinstance in client/asset.py and aio/batch.py for AtlasGlossaryTerm
- Make BulkRequest.process_attributes skip msgspec models (they handle relationship categorization in their own serialization pipeline)
- Use _is_model_instance in BulkRequest.process_relationship_attributes
- Register msgspec JSON encoder in pyatlan/model/core.py using model's own to_json(nested=True) for proper nested API format serialization
- Make Asset._convert_to_real_type_ accept v9 msgspec models via _is_model_instance
- Remove all monkey-patches from tests_v9/unit/conftest.py (Patch 1-4 no longer needed, since dual-model support is now in production code)

All tests pass: 5798 legacy + 1540 v9
- search.py: delete 25 duplicated dataclass/ABC/Enum classes and ABC registration block; re-export from pyatlan.model.search. Keep only msgspec DSL/IndexSearchRequest/IndexSearchRequestMetadata + v9 helpers.
- lineage.py: delete duplicated DirectedPair/LineageGraph; re-export from pyatlan.model.lineage.
- audit.py: delete duplicated AuditActionType; re-export from pyatlan.model.audit.
- pyatlan/model/search.py: TermAttributes/TextAttributes use plain str/bool/float instead of Pydantic StrictStr/StrictBool/StrictFloat.
- pyatlan/validate.py: add _is_model_instance helper for cross-boundary Pydantic/msgspec isinstance checks.
- pyatlan/client/common/asset.py: register msgspec.Struct in Pydantic ENCODERS_BY_TYPE for JSON serialization.
- core.py: add to_dict() on BulkRequest for nested serialization.
- entity.py: add semantic field to Entity base class.
- asset.py: ref_by_guid/ref_by_qualified_name accept semantic param.
- tests: update VALUES_BY_TYPE to plain types, use _is_model_instance for cross-boundary assertions, clean up test_client.py imports.
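The commit messages do not show how _is_model_instance is implemented. One plausible approach, given that the legacy and v9 packages define same-named classes in different hierarchies, is to try a normal isinstance check first and then fall back to matching class names along the MRO. The sketch below assumes that approach; `is_model_instance_sketch` and both `Table` classes are hypothetical.

```python
from typing import Any, Tuple, Type, Union


def is_model_instance_sketch(obj: Any, expected: Union[Type, Tuple[Type, ...]]) -> bool:
    """Hypothetical cross-hierarchy check: isinstance, then class-name match."""
    candidates = expected if isinstance(expected, tuple) else (expected,)
    if isinstance(obj, candidates):
        return True
    # Fall back to comparing names so a v9 Table matches a legacy Table.
    expected_names = {cls.__name__ for cls in candidates}
    return any(base.__name__ in expected_names for base in type(obj).__mro__)


# Two unrelated classes that share a name, standing in for the
# legacy Pydantic model and its v9 msgspec counterpart.
LegacyTable = type("Table", (), {})
V9Table = type("Table", (), {})

matches_across_hierarchies = is_model_instance_sketch(V9Table(), LegacyTable)
rejects_unrelated = is_model_instance_sketch("not a model", LegacyTable)
```

Name matching is deliberately loose: any unrelated class that happens to share a name would also pass, so a production helper would likely need stricter rules than this sketch.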
…t parity
- Add v9 TaskSearchRequest (msgspec.Struct) with json() method
- Add v9 FluentTasks that produces v9 DSL and TaskSearchRequest
- Add from_yaml()/to_yaml() to v9 DataContractSpec
- Update test_task_client.py to use v9 models
- Update open_lineage_test.py to use v9 FluentTasks
- Update data_contract_test.py to use v9 DataContractSpec
- Fix imports in atlan_fields_test, data_quality_rule_test, workflow_client
- Document test_packages.py legacy Asset import (ClassVar fields)
- Port v9 test files: credential, custom relationships, workflow, etc.
- Client layer: validate_arguments migration, msgspec Struct support
- Formatting changes

All 7650 tests pass (1852 v9 + 5798 legacy)
✨ Description
https://linear.app/atlan-epd/issue/BLDX-434/plan-productionization-of-msgspecstruct-models
🧩 Type of change
Select all that apply:
📋 Checklist