Fix dspy task 8563: update TOOL_CALL_TEST_CASES in features 2-6 test patches #42

Open
AlienKevin wants to merge 1 commit into cooperbench:main from AlienKevin:fix/dspy-t8563-test-expectations

@AlienKevin
Contributor

Summary

  • The combined.patch changes ToolCalls.format() return type from list[dict] to dict when no metadata is present
  • Feature 1's test patch correctly updates TOOL_CALL_TEST_CASES expectations and test_tool_calls_format_from_dict_list assertions to match the new dict format
  • Features 2–6 test patches do not include this update
  • Each feature's tests run against the full merged implementation (via runner.sh tests_patch merged.patch), so the pre-existing test_tool_calls_format_basic[tool_calls_data0-expected0] fails whenever the tests for features 2–6 run: the test still expects the old list format, but format() now returns a dict
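
The mismatch can be illustrated with a minimal sketch. The two functions below are hypothetical stand-ins for the old and new behaviour of ToolCalls.format() in the no-metadata case, not the dspy implementation; the payload of each individual call is also an assumption made for illustration.

```python
from typing import Any

# Hypothetical stand-in for the OLD behaviour: a one-element list
# wrapping the tool calls.
def format_old(tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
    return [{"type": "tool_calls", "tool_calls": tool_calls}]

# Hypothetical stand-in for the NEW behaviour when no metadata is present:
# a plain dict.
def format_new(tool_calls: list[dict[str, Any]]) -> dict[str, Any]:
    return {"tool_calls": tool_calls}

# Illustrative payload; the real per-call structure may differ.
calls = [{"type": "function", "function": {"name": "search", "arguments": {"query": "hello"}}}]

old_expected = format_old(calls)  # what the stale test case still expects
new_result = format_new(calls)    # what the merged implementation returns

# A list never compares equal to a dict, so the stale expectation fails.
stale_expectation_passes = new_result == old_expected

# The inner tool calls are unchanged; only the wrapper differs.
same_inner_calls = old_expected[0]["tool_calls"] == new_result["tool_calls"]
```

Here stale_expectation_passes is False while same_inner_calls is True, which is why only the expected values and the indexing in the assertions need to change.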

Fix

Add the following updates from feature 1's test patch to features 2–6:

  1. Import update: add convert_input_schema_to_tool_args to the import line
  2. TOOL_CALL_TEST_CASES data update: [{"type": "tool_calls", ...}] → {"tool_calls": [...]}
  3. test_tool_calls_format_from_dict_list assertion update: result[0]["tool_calls"] → result["tool_calls"]
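
One way to audit whether a given test patch already carries these updates is sketched below. The helper names and the sample diff (including its file path) are hypothetical, and the check assumes the updates appear as literal strings on added lines of a unified diff. It covers only items 1 and 3; the TOOL_CALL_TEST_CASES data change is harder to detect by substring and is left out.

```python
def added_lines(patch_text: str) -> list[str]:
    """Return the content of lines a unified diff adds, skipping '+++' file headers."""
    return [
        line[1:]
        for line in patch_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]

def missing_updates(patch_text: str) -> list[str]:
    """List which of the two checked updates are absent from the added lines."""
    added = "\n".join(added_lines(patch_text))
    missing = []
    if "convert_input_schema_to_tool_args" not in added:
        missing.append("import update")
    if 'result["tool_calls"]' not in added:
        missing.append("assertion update")
    return missing

# Hypothetical diff fragment: it has the assertion update but not the import update.
sample_patch = "\n".join([
    "--- a/tests/example_test.py",
    "+++ b/tests/example_test.py",
    '-    assert len(result[0]["tool_calls"]) == 2',
    '+    assert len(result["tool_calls"]) == 2',
])

report = missing_updates(sample_patch)
```

For the sample fragment, report contains only "import update", since the assertion change is present on an added line.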

Test plan

  • Oracle test passes for all dspy task 8563 feature pairs, including f2-4, which previously failed
  • Verified on Modal with Harbor evaluation framework

…patches

The combined.patch changes ToolCalls.format() return type from list to
dict when no metadata is present. Feature 1's test patch correctly
updates TOOL_CALL_TEST_CASES and format_from_dict_list assertions, but
features 2-6 test patches do not.

Since each feature's tests run against the full merged implementation
(which includes the format() change), features 2-6 tests fail on the
pre-existing test_tool_calls_format_basic[tool_calls_data0-expected0].

Add the same expectation updates (import fix, TOOL_CALL_TEST_CASES dict
format, format_from_dict_list assertions) to all 5 remaining feature
test patches.