Draft
Each assertion type (Resolves, DoesNotResolve, ResolvesWith, ResolvesWithType, SearchByName, Needed) is now a self-contained class in src/babel_validation/assertions/, grouped by which service it targets (nodenorm.py, nameres.py, common.py). A central ASSERTION_HANDLERS registry in __init__.py maps lowercase assertion names to handler instances, and NodeNormAssertion / NameResAssertion marker base classes allow isinstance() checks for applicability. GitHubIssueTest.test_with_nodenorm() and test_with_nameres() are now 3-line dispatchers. A README.md documents all supported assertion types with examples in both wiki and YAML syntax. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
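For reviewers, a minimal sketch of the registry-plus-marker pattern this commit describes. Only `ASSERTION_HANDLERS`, `NodeNormAssertion`, and `NameResAssertion` come from the commit message; the `AssertionHandler` base, the `NAME` attribute, the concrete handler class names, and `handlers_for()` are hypothetical and may not match the real code.

```python
class AssertionHandler:
    """Hypothetical common base class; the real one may look different."""
    NAME = ""


class NodeNormAssertion:
    """Marker base class: the handler applies to NodeNorm."""


class NameResAssertion:
    """Marker base class: the handler applies to NameRes."""


class ResolvesHandler(AssertionHandler, NodeNormAssertion):
    NAME = "Resolves"


class SearchByNameHandler(AssertionHandler, NameResAssertion):
    NAME = "SearchByName"


# Registry keyed by lowercase assertion name, holding handler instances.
ASSERTION_HANDLERS = {
    handler.NAME.lower(): handler
    for handler in (ResolvesHandler(), SearchByNameHandler())
}


def handlers_for(service_marker):
    """Return the names of handlers applicable to one service, via isinstance()."""
    return sorted(
        name
        for name, handler in ASSERTION_HANDLERS.items()
        if isinstance(handler, service_marker)
    )


print(handlers_for(NodeNormAssertion))
print(handlers_for(NameResAssertion))
```

With this shape a dispatcher only has to look up the lowercase name and check the marker, which is presumably what keeps the dispatchers down to a few lines.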
…pendently. Bumps pytest to >=9.0 (which includes built-in subtests support). Each TestResult is now evaluated in its own subtest block, so a failure no longer short-circuits the rest. Adds post-loop state-consistency subtests: a closed issue with failing tests fails with a "consider reopening" message, and an open issue where all tests pass emits an xfail "consider closing" hint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, get_github_issue_id() and GitHubIssueTest.__str__() both resolved the org/repo name via github_issue.repository.organization.name, which triggers lazy PyGitHub API calls. Parse org/repo from html_url instead (always present in the issue JSON, no extra round-trip needed). Also resolves the TODO comment in get_test_issues_from_issue(). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
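A rough sketch of parsing org/repo from `html_url` without touching the API. The function name `parse_issue_url` and the example URL are made up for illustration; the actual helper in this PR may handle more URL shapes.

```python
from urllib.parse import urlparse


def parse_issue_url(html_url: str) -> tuple[str, str, int]:
    """Split an issue html_url into (org, repo, issue number).

    Assumes the usual https://github.com/<org>/<repo>/issues/<n> layout.
    """
    parts = urlparse(html_url).path.strip("/").split("/")
    if len(parts) != 4 or parts[2] != "issues":
        raise ValueError(f"Not a GitHub issue URL: {html_url!r}")
    org, repo, _, number = parts
    return org, repo, int(number)


# Hypothetical URL, used only to exercise the parser.
print(parse_issue_url("https://github.com/example-org/example-repo/issues/42"))
```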
Promote NodeNormAssertion and NameResAssertion from empty marker classes to full framework base classes (renamed NodeNormTest / NameResTest). All boilerplate (empty-param guards, CURIE pre-warming, yielded-values tracking) now lives in the base class test_with_nodenorm / test_with_nameres methods, which dispatch to a new test_param_set() hook. Handler authors now only override test_param_set() and call self.passed() / self.failed() — no TestResult/TestStatus construction, no loop scaffolding, no return-inside-generator bugs. Also fixes a latent bug where `return [TestResult(...)]` inside a generator body silently discarded the result (the list went to StopIteration.value instead of being yielded to the caller). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
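The latent generator bug mentioned above can be reproduced in isolation. This is a standalone sketch with plain strings standing in for TestResult objects; the function names are illustrative, not the handlers from this PR.

```python
def buggy_handler(params):
    """Generator with the bug: the early `return [...]` value is never yielded."""
    if not params:
        # Because this function contains `yield`, it is a generator, so this
        # list ends up in StopIteration.value and the caller never sees it.
        return ["FAILED: no parameters"]
    for param in params:
        yield f"PASSED: {param}"


def fixed_handler(params):
    """Same logic, but the early result is yielded before returning."""
    if not params:
        yield "FAILED: no parameters"
        return
    for param in params:
        yield f"PASSED: {param}"


print(list(buggy_handler([])))   # the failure result is silently lost
print(list(fixed_handler([])))   # the failure result reaches the caller
```

Moving the loop and the empty-param guard into the base class removes the opportunity to write the first form in a handler at all.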
Add PARAMETERS, WIKI_EXAMPLES, and YAML_PARAMS doc attributes to every assertion handler class, making the class body the single source of truth for documentation. Add gen_docs.py to render README.md from those attributes, and a freshness test that fails if README.md is out of date with the handlers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Move parametrization from module-level @pytest.mark.parametrize to a pytest_generate_tests hook in tests/github_issues/conftest.py. This enables two fast paths: --issue fetches only named issues directly (near-instant), and the full run pre-filters issues with BabelTest syntax at collection time rather than creating test items for all issues. Also fix get_test_issues_from_issue() to avoid a separate GitHub API call per issue by using repository.full_name instead of repository.organization.name. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add pytest-xdist[psutil] and filelock dependencies. Refactor github_issues conftest to parametrize with picklable string IDs instead of PyGithub Issue objects, add a filelock-guarded file cache for the full issue list (1-hour TTL), and hydrate Issue objects in a per-test fixture so xdist workers avoid redundant GitHub API scans. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add a filelock-guarded file cache (TTL 1 hour) in GoogleSheetTestCases.__init__ so that when pytest-xdist workers reimport the module they read from the local temp file instead of each making a redundant HTTP request to Google Sheets. This enables parallel execution of gsheet tests with -n auto. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
IDENTIFIERS_WITH_DESCRIPTIONS and IDENTIFIERS_WITHOUT_DESCRIPTIONS were Python sets, which have unpredictable iteration order across processes. Workers collected tests in different orders, triggering xdist's "Different tests were collected" fatal error. Changed both to lists so parametrize order is stable across all workers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
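A small sketch of the ordering issue, assuming string identifiers. String hashes are randomized per Python process unless PYTHONHASHSEED is fixed, so a set of strings can iterate in a different order in each xdist worker. The identifier values and the `stable_params()` helper below are illustrative only.

```python
# Illustrative values; the real constants hold different identifiers.
IDS_AS_SET = {"MONDO:0005015", "CHEBI:15365", "HGNC:11998"}
IDS_AS_LIST = ["MONDO:0005015", "CHEBI:15365", "HGNC:11998"]


def stable_params(ids):
    """Return identifiers in an order that is the same in every process."""
    # A list already has a fixed order; anything else (for example a set)
    # is sorted so that every worker collects the same sequence.
    return list(ids) if isinstance(ids, list) else sorted(ids)


print(stable_params(IDS_AS_LIST))  # preserves the declared order
print(stable_params(IDS_AS_SET))   # sorted, so identical across workers
```

Switching the constants to lists, as this commit does, is the simpler fix; sorting is an alternative when the data has to stay a set.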
Replace the 1-hour TTL with per-run invalidation:
- pytest_configure hook in conftest.py deletes the gsheet CSV cache at the start of each run (controller only, not xdist workers).
- GoogleSheetTestCases.__init__ now just checks whether the cache file exists; no more time-based TTL.
Result: the controller downloads once and workers share that file, but the next pytest invocation always starts fresh, so edits to the Google Sheet are immediately reflected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Apply pytest.mark.xfail(strict=False) dynamically to each test when the GitHub issue is still open:
- Open issue + any subtest fails → XFAIL (x): known outstanding issue
- Open issue + all subtests pass → XPASS (X): issue ready to be closed
- Closed issue + any subtest fails → FAIL (F): regression, needs attention
- Closed issue + all subtests pass → PASS (.): as expected
Removes the ad-hoc issue-state consistency subtests at the bottom, which are now fully covered by the xfail mechanism. Also drops the unused `from github import Issue` import.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
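The four outcomes listed in this commit reduce to a two-input decision. A sketch as a pure function, handy for checking the mapping in review; `expected_outcome()` is a hypothetical helper and does not exist in the PR.

```python
def expected_outcome(issue_open: bool, any_subtest_failed: bool) -> str:
    """Map issue state and subtest results to the reported pytest outcome."""
    if issue_open:
        # Open issues carry a non-strict xfail marker.
        return "XFAIL" if any_subtest_failed else "XPASS"
    # Closed issues are reported as ordinary tests.
    return "FAIL" if any_subtest_failed else "PASS"


for issue_open in (True, False):
    for failed in (True, False):
        print(issue_open, failed, expected_outcome(issue_open, failed))
```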
Same pattern as the Google Sheet cache: the pytest_configure hook in tests/conftest.py now also deletes babel_validation_issues_cache.json on startup (controller only, not xdist workers). The TTL check in _get_all_test_issue_ids() is replaced with a plain existence check, so workers share the file the controller wrote without re-fetching. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
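A sketch of the controller-only cache deletion, assuming the usual pytest-xdist convention that worker processes expose a `workerinput` attribute on the config object. `clear_cache_on_controller()` and the stand-in config objects are hypothetical; the real hook lives in `pytest_configure`.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def is_xdist_worker(config) -> bool:
    """xdist workers carry a `workerinput` attribute; the controller does not."""
    return hasattr(config, "workerinput")


def clear_cache_on_controller(config, cache_path: Path) -> bool:
    """Delete the cache file at startup, but only in the controller process."""
    if is_xdist_worker(config):
        return False  # workers must reuse the file the controller writes
    cache_path.unlink(missing_ok=True)
    return True


# Stand-ins for pytest config objects, for demonstration only.
cache_file = Path(tempfile.mkdtemp()) / "babel_validation_issues_cache.json"
cache_file.write_text("[]")

print(clear_cache_on_controller(SimpleNamespace(workerinput={}), cache_file))
print(cache_file.exists())   # still present after the worker call
print(clear_cache_on_controller(SimpleNamespace(), cache_file))
print(cache_file.exists())   # gone after the controller call
```

After the deletion, a plain existence check is enough for readers of the cache: whoever finds the file missing fetches and writes it, and everyone else reads it.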
Documents and verifies that unknown assertion names are silently parsed but raise ValueError at execution time for both wiki and YAML syntaxes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
068380f to 74c5e35
When an issue is open, individual failing subtests now call pytest.xfail() imperatively so they show as x (XFAIL) rather than F (FAIL) in the output. A parent-level pytest.xfail() is also called after all subtests complete (when any xfailed) to ensure the parent shows as XFAIL rather than XPASS. This preserves the XPASS signal for the case where all subtests pass on an open issue, indicating the issue may be ready to close. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace full-repo pagination with GitHub Search API in get_issues_with_tests(), reducing collection from O(all issues) fetches to O(matching issues). Add a module-level _fetched_issues_cache to avoid re-fetching issues already retrieved during collection. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When a GitHub issue embeds a BabelTest with an unrecognised assertion type, the test previously showed as XFAIL (misleading — implies the service is at fault, not the test itself). Now a pre-flight check runs before the xfail marker is applied: if any assertion type is unknown, pytest.fail() is called immediately, producing a hard FAIL with a clear message listing the bad type(s) and valid alternatives. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
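A sketch of the pre-flight check, with the message returned instead of passed to `pytest.fail()` so it can run standalone. The set of known names is taken from the assertion types listed earlier in this PR and may be incomplete; `unknown_assertions()` and `preflight_message()` are hypothetical helper names.

```python
# Assertion names mentioned earlier in this PR, lowercased as in the registry.
KNOWN_ASSERTIONS = {
    "resolves", "doesnotresolve", "resolveswith",
    "resolveswithtype", "searchbyname", "needed",
}


def unknown_assertions(names):
    """Return the assertion names that have no registered handler."""
    return sorted({name for name in names if name.lower() not in KNOWN_ASSERTIONS})


def preflight_message(names):
    """Build the failure message, or return None when every name is known."""
    bad = unknown_assertions(names)
    if not bad:
        return None
    valid = ", ".join(sorted(KNOWN_ASSERTIONS))
    return f"Unknown assertion type(s): {', '.join(bad)}. Valid types: {valid}"


print(preflight_message(["Resolves", "Resolvez"]))
print(preflight_message(["Resolves", "Needed"]))
```

Running this before the xfail marker is applied is what turns a typo in an issue into a hard FAIL rather than a misleading XFAIL.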
Adds DoesNotResolveWithHandler, the inverse of ResolvesWith: asserts that a set of CURIEs do NOT all normalize to the same canonical identifier in NodeNorm. Also extracts _compare_resolutions() helper shared by both handlers, and fixes both ResolvesWith and DoesNotResolveWith to compare by result['id']['identifier'] rather than full JSON dumps — which is more robust and semantically correct (equality of the canonical identifier, not byte-for-byte JSON equality). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
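A sketch of comparing by canonical identifier rather than by JSON dump. Only the `result['id']['identifier']` path comes from the commit message; the helper names and the response data below are made up, and the sketch treats an unresolved CURIE as a `None` result, which is an assumption.

```python
def canonical_ids(results: dict) -> dict:
    """Map each queried CURIE to its canonical identifier (None if unresolved)."""
    return {
        curie: (result["id"]["identifier"] if result else None)
        for curie, result in results.items()
    }


def all_resolve_together(results: dict) -> bool:
    """True when every CURIE resolved, and all to the same canonical identifier."""
    ids = set(canonical_ids(results).values())
    return None not in ids and len(ids) == 1


# Illustrative responses: same canonical identifier, different surrounding JSON.
same = {
    "CHEBI:15365": {"id": {"identifier": "CHEBI:15365", "label": "aspirin"}},
    "PUBCHEM.COMPOUND:2244": {"id": {"identifier": "CHEBI:15365", "label": "Aspirin"}},
}
different = {
    "CHEBI:15365": {"id": {"identifier": "CHEBI:15365"}},
    "MONDO:0005015": {"id": {"identifier": "MONDO:0005015"}},
}

print(all_resolve_together(same))
print(all_resolve_together(different))
```

In the `same` example a byte-for-byte JSON comparison would report a mismatch because the labels differ, even though both CURIEs normalize to one identifier.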
Adds a new `HasLabel` assertion that checks a CURIE resolves to a
specific primary label in NodeNorm (id.label), enabling label regression
tests. Syntax: {{BabelTest|HasLabel|CHEBI:15365|aspirin}}.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
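A sketch of how the wiki syntax above might be parsed and checked. The regular expression, `parse_wiki_tests()`, and `has_label()` are hypothetical; the exact-match comparison is an assumption (the real handler may normalize case), and the sample result is illustrative.

```python
import re

# Matches {{BabelTest|<Assertion>|<param>|<param>...}}
BABEL_TEST_RE = re.compile(r"\{\{BabelTest\|([^|}]+)\|([^}]*)\}\}")


def parse_wiki_tests(text: str):
    """Return (assertion name, parameter list) pairs found in the text."""
    return [
        (match.group(1), match.group(2).split("|"))
        for match in BABEL_TEST_RE.finditer(text)
    ]


def has_label(result, expected_label: str) -> bool:
    """Check the primary label (id.label) of one NodeNorm-style result."""
    return bool(result) and result.get("id", {}).get("label") == expected_label


tests = parse_wiki_tests("Label check: {{BabelTest|HasLabel|CHEBI:15365|aspirin}}")
print(tests)

# Illustrative result, not fetched from the service.
sample_result = {"id": {"identifier": "CHEBI:15365", "label": "aspirin"}}
print(has_label(sample_result, tests[0][1][1]))
```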
WIP
Next steps: