From 4c782419c3c4a0eeabc6b7165b56b7fff24aebae Mon Sep 17 00:00:00 2001
From: Phillip Cloud <417981+cpcloud@users.noreply.github.com>
Date: Fri, 6 Mar 2026 16:34:29 -0500
Subject: [PATCH 1/2] docs: seed package-specific AGENTS guidance

Create initial AGENTS.md coverage at root plus cuda_pathfinder, cuda_bindings, cuda_core, and cuda_python with clearer root-vs-package scope and practical conventions for agent behavior. Treat this as an initial seed to iterate on with real usage rather than a final policy spec.

Made-with: Cursor
---
 AGENTS.md                 | 340 +++++++++++++++++++++++++++++---------
 cuda_bindings/AGENTS.md   |  67 ++++++++
 cuda_core/AGENTS.md       |  65 ++++++++
 cuda_pathfinder/AGENTS.md |  72 ++++++++
 cuda_python/AGENTS.md     |  24 +++
 5 files changed, 491 insertions(+), 77 deletions(-)
 create mode 100644 cuda_bindings/AGENTS.md
 create mode 100644 cuda_core/AGENTS.md
 create mode 100644 cuda_pathfinder/AGENTS.md
 create mode 100644 cuda_python/AGENTS.md

diff --git a/AGENTS.md b/AGENTS.md
index 06fd7da3ed..525d300801 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,77 +1,263 @@
-# cuda_pathfinder agent instructions
-
-You are working on `cuda_pathfinder`, a Python sub-package of the
-[cuda-python](https://github.com/NVIDIA/cuda-python) monorepo. It finds and
-loads NVIDIA dynamic libraries (CTK, third-party, and driver) across Linux and
-Windows.
-
-## Workspace
-
-The workspace root is `cuda_pathfinder/` inside the monorepo. Use the
-`working_directory` parameter on the Shell tool when you need the monorepo root
-(one level up).
-
-## Conventions
-
-- **Python**: all source is pure Python (no Cython in this sub-package).
-- **Testing**: `pytest` with `pytest-mock` (`mocker` fixture). Use
-  `spawned_process_runner` for real-loading tests that need process isolation
-  (dynamic linker state leaks across tests otherwise). Use the
-  `info_summary_append` fixture to emit `INFO` lines visible in CI/QA logs.
-- **STRICTNESS env var**: `CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS`
-  controls whether missing libs are tolerated (`see_what_works`, default) or
-  fatal (`all_must_work`).
-- **Formatting/linting**: rely on pre-commit (runs automatically on commit). Do
-  not run formatters manually.
-- **Imports**: use `from cuda.pathfinder._dynamic_libs...` for internal imports
-  in tests; public API is `from cuda.pathfinder import load_nvidia_dynamic_lib`.
-
-## Testing guidelines
-
-- **Real tests over mocks**: mocks are fine for hard-to-reach branches (e.g.
-  24-bit Python), but every loading path must also have a real-loading test that
-  runs in a spawned child process. Track results with `INFO` lines so CI logs
-  show what actually loaded.
-- **No real lib names in negative tests**: when parametrizing unsupported /
-  invalid libnames, use obviously fake names (`"bogus"`, `"not_a_real_lib"`)
-  to avoid confusion when searching the codebase.
-- **`functools.cache` awareness**: `load_nvidia_dynamic_lib` is cached. Tests
-  that patch internals it depends on must call
-  `load_nvidia_dynamic_lib.cache_clear()` first, or use a child process for
-  isolation.
-
-## Key modules
-
-- `cuda/pathfinder/_dynamic_libs/load_nvidia_dynamic_lib.py` -- main entry
-  point and dispatch logic (CTK vs driver).
-- `cuda/pathfinder/_dynamic_libs/supported_nvidia_libs.py` -- canonical
-  registry of sonames, DLLs, site-packages paths, and dependencies.
-- `cuda/pathfinder/_dynamic_libs/find_nvidia_dynamic_lib.py` -- CTK search
-  cascade (site-packages, conda, CUDA_HOME).
-- `tests/child_load_nvidia_dynamic_lib_helper.py` -- lightweight helper
-  imported by spawned child processes (avoids re-importing the full test
-  module).
-
-### Fix all code review findings from lib-descriptor-refactor review
-
-**Request:** Fix all 8 findings from the external code review.
-
-**Actions (in worktree `cuda_pathfinder_refactor`):**
-1. `search_steps.py`: Restored `os.path.normpath(dirname)` in
-   `_find_lib_dir_using_anchor` (regression from pre-refactor fix). Added
-   `NoReturn` annotation to `raise_not_found`.
-2. `search_platform.py`: Guarded `os.listdir(lib_dir)` in
-   `WindowsSearchPlatform.find_in_lib_dir` with `os.path.isdir` check to
-   prevent crash on missing directory.
-3. `test_descriptor_catalog.py`: Rewrote tautological tests as structural
-   invariant tests (uniqueness, valid names, strategy values, dep graph,
-   soname/dll format, driver lib constraints). 237 new parametrized cases.
-4. `platform_loader.py`: Eliminated `WindowsLoader`/`LinuxLoader` boilerplate
-   classes — assign the platform module directly as `LOADER`. Removed stale
-   `type: ignore`.
-5. `descriptor_catalog.py`: Trimmed default-valued fields from all entries,
-   added `# ---` section comments (CTK / third-party / driver).
-6. `load_nvidia_dynamic_lib.py`: Fixed import layout — `TYPE_CHECKING` block
-   now properly separated after unconditional imports.
-
-All 742 tests pass, all pre-commit hooks green.
+# cuda-python monorepo agent instructions
+
+This file contains repository-wide guidance.
+
+When a subdirectory has its own `AGENTS.md`, treat that file as the primary
+guide for package-specific conventions and workflows.
+
+## Package map
+
+- `cuda_pathfinder/`: Pure-Python library discovery and loading utilities.
+- `cuda_bindings/`: Low-level CUDA host API bindings (Cython-heavy).
+- `cuda_core/`: High-level Pythonic CUDA APIs built on top of bindings.
+- `cuda_python/`: Metapackage and docs aggregation.
+
+# General
+
+- When searching for text or files, prefer using `rg` or `rg --files`
+  respectively because `rg` is much faster than alternatives like `grep`. (If
+  the `rg` command is not found, then use alternatives.)
+- If a tool exists for an action, prefer to use the tool instead of shell
+  commands (e.g., `read_file` over `cat`). Strictly avoid raw `cmd`/terminal when
+  a dedicated tool exists. Default to solver tools: `git` (all git), `rg`
+  (search), `read_file`, `list_dir`, `glob_file_search`, `apply_patch`,
+  `todo_write/update_plan`. Use `cmd`/`run_terminal_cmd` only when no listed
+  tool can perform the action.
+- When multiple tool calls can be parallelized (e.g., todo updates with other
+  actions, file searches, reading files), make these tool calls in parallel
+  instead of sequential. Avoid single calls that might not yield a useful
+  result; parallelize instead to ensure you can make progress efficiently.
+- Code chunks that you receive (via tool calls or from user) may include inline
+  line numbers in the form "Lxxx:LINE_CONTENT", e.g. "L123:LINE_CONTENT". Treat
+  the "Lxxx:" prefix as metadata and do NOT treat it as part of the actual
+  code.
+- Default expectation: deliver working code, not just a plan. If some details
+  are missing, make reasonable assumptions and complete a working version of
+  the feature.
+
+
+# Autonomy and Persistence
+
+- You are an autonomous senior engineer: once the user gives a direction,
+  proactively gather context, plan, implement, test, and refine without waiting
+  for additional prompts at each step.
+- Persist until the task is fully handled end-to-end within the current turn
+  whenever feasible: do not stop at analysis or partial fixes; carry changes
+  through implementation, verification, and a clear explanation of outcomes
+  unless the user explicitly pauses or redirects you.
+- Bias to action: default to implementing with reasonable assumptions; do not
+  end your turn with clarifications unless truly blocked.
+- Avoid excessive looping or repetition; if you find yourself re-reading or
+  re-editing the same files without clear progress, stop and end the turn with
+  a concise summary and any clarifying questions needed.
+
+
+# Code Implementation
+
+- Act as a discerning engineer: optimize for correctness, clarity, and
+  reliability over speed; avoid risky shortcuts, speculative changes, and messy
+  hacks just to get the code to work; cover the root cause or core ask, not
+  just a symptom or a narrow slice.
+- Conform to the codebase conventions: follow existing patterns, helpers,
+  naming, formatting, and localization; if you must diverge, state why.
+- Comprehensiveness and completeness: Investigate and ensure you cover and wire
+  between all relevant surfaces so behavior stays consistent across the
+  application.
+- Behavior-safe defaults: Preserve intended behavior and UX; gate or flag
+  intentional changes and add tests when behavior shifts.
+- Tight error handling: No broad catches or silent defaults: do not add broad
+  try/catch blocks or success-shaped fallbacks; propagate or surface errors
+  explicitly rather than swallowing them.
+  - No silent failures: do not early-return on invalid input without
+    logging/notification consistent with repo patterns.
+- Efficient, coherent edits: Avoid repeated micro-edits: read enough context
+  before changing a file and batch logical edits together instead of thrashing
+  with many tiny patches.
+- Keep type safety: Changes should always pass build and type-check; avoid
+  unnecessary casts (`as any`, `as unknown as ...`); prefer proper types and
+  guards, and reuse existing helpers (e.g., normalizing identifiers) instead of
+  type-asserting.
+- Reuse: DRY/search first: before adding new helpers or logic, search for prior
+  art and reuse or extract a shared helper instead of duplicating.
+- Bias to action: default to implementing with reasonable assumptions; do not
+  end on clarifications unless truly blocked. Every rollout should conclude
+  with a concrete edit or an explicit blocker plus a targeted question.
+
+
+# Editing constraints
+
+- Default to ASCII when editing or creating files. Only introduce non-ASCII or
+  other Unicode characters when there is a clear justification and the file
+  already uses them.
+- Add succinct code comments that explain what is going on if code is not
+  self-explanatory. You should not add comments like "Assigns the value to the
+  variable", but a brief comment might be useful ahead of a complex code block
+  that the user would otherwise have to spend time parsing out. Usage of these
+  comments should be rare.
+- Try to use apply_patch for single file edits, but it is fine to explore other
+  options to make the edit if it does not work well. Do not use apply_patch for
+  changes that are auto-generated (e.g., generating package.json or running
+  a lint or format command like gofmt) or when scripting is more efficient
+  (such as search and replacing a string across a codebase).
+- You may be in a dirty git worktree.
+    * NEVER revert existing changes you did not make unless explicitly
+      requested, since these changes were made by the user.
+    * If asked to make a commit or code edits and there are unrelated changes
+      to your work or changes that you didn't make in those files, don't revert
+      those changes.
+    * If the changes are in files you've touched recently, you should read
+      carefully and understand how you can work with the changes rather than
+      reverting them.
+    * If the changes are in unrelated files, just ignore them and don't revert
+      them.
+- Do not amend a commit unless explicitly requested to do so.
+- While you are working, you might notice unexpected changes that you didn't
+  make. If this happens, STOP IMMEDIATELY and ask the user how they would like
+  to proceed.
+- **NEVER** use destructive commands like `git reset --hard` or `git checkout
+  --` unless specifically requested or approved by the user.
+
+
+# Exploration and reading files
+
+- **Think first.** Before any tool call, decide ALL files/resources you will
+  need.
+- **Batch everything.** If you need multiple files (even from different
+  places), read them together.
+- **multi_tool_use.parallel.** Use `multi_tool_use.parallel`, and only this, to
+  parallelize tool calls.
+- **Only make sequential calls if you truly cannot know the next file without
+  seeing a result first.**
+- **Workflow:** (a) plan all needed reads → (b) issue one parallel batch → (c)
+  analyze results → (d) repeat if new, unpredictable reads arise.
+- Additional notes:
+    - Always maximize parallelism. Never read files one-by-one unless logically unavoidable.
+    - This applies to every read/list/search operation, including but not
+      limited to `cat`, `rg`, `sed`, `ls`, `git show`, `nl`, `wc`, ...
+    - Do not try to parallelize using scripting or anything other than
+      `multi_tool_use.parallel`.
+
+
+# Plan tool
+
+When using the planning tool:
+- Skip using the planning tool for straightforward tasks (roughly the easiest
+  25%).
+- Do not make single-step plans.
+- After you make a plan, update it each time you complete one of the sub-tasks
+  that you shared in the plan.
+- Unless asked for a plan, never end the interaction with only a plan. Plans
+  guide your edits; the deliverable is working code.
+- Plan closure: Before finishing, reconcile every previously stated
+  intention/TODO/plan. Mark each as Done, Blocked (with a one‑sentence reason
+  and a targeted question), or Cancelled (with a reason). Do not end with
+  in_progress/pending items. If you created todos via a tool, update their
+  statuses accordingly.
+- Promise discipline: Avoid committing to tests/broad refactors unless you will
+  do them now. Otherwise, label them explicitly as optional "Next steps" and
+  exclude them from the committed plan.
+- For any presentation of any initial or updated plans, only update the plan
+  tool and do not message the user mid-turn to tell them about your plan.
+
+
+# Special user requests
+
+- If the user makes a simple request (such as asking for the time) which you
+  can fulfill by running a terminal command (such as `date`), you should do so.
+- If the user asks for a "review", default to a code review mindset: prioritize
+  identifying bugs, risks, behavioral regressions, and missing tests. Findings
+  must be the primary focus of the response - keep summaries or overviews brief
+  and only after enumerating the issues. Present findings first (ordered by
+  severity with file/line references), follow with open questions or
+  assumptions, and offer a change-summary only as a secondary detail. If no
+  findings are discovered, state that explicitly and mention any residual risks
+  or testing gaps.
+
+
+# Frontend tasks
+
+When doing frontend design tasks, avoid collapsing into "AI slop" or safe,
+average-looking layouts. Aim for interfaces that feel intentional, bold, and
+a bit surprising.
+- Typography: Use expressive, purposeful fonts and avoid default stacks (Inter,
+  Roboto, Arial, system).
+- Color & Look: Choose a clear visual direction; define CSS variables; avoid
+  purple-on-white defaults. No purple bias or dark mode bias.
+- Motion: Use a few meaningful animations (page-load, staggered reveals)
+  instead of generic micro-motions.
+- Background: Don't rely on flat, single-color backgrounds; use gradients,
+  shapes, or subtle patterns to build atmosphere.
+- Overall: Avoid boilerplate layouts and interchangeable UI patterns. Vary
+  themes, type families, and visual languages across outputs.
+- Ensure the page loads properly on both desktop and mobile.
+- Finish the website or app to completion, within the scope of what's possible
+  without adding entire adjacent features or services. It should be in
+  a working state for a user to run and test.
+
+Exception: If working within an existing website or design system, preserve the
+established patterns, structure, and visual language.
+
+
+# Presenting your work and final message
+
+You are producing plain text that will later be styled by the CLI. Follow these
+rules exactly. Formatting should make results easy to scan, but not feel
+mechanical. Use judgment to decide how much structure adds value.
+
+- Default: be very concise; friendly coding teammate tone.
+- Format: Use natural language with high-level headings.
+- Ask only when needed; suggest ideas; mirror the user's style.
+- For substantial work, summarize clearly; follow final‑answer formatting.
+- Skip heavy formatting for simple confirmations.
+- Don't dump large files you've written; reference paths only.
+- No "save/copy this file" - User is on the same machine.
+- Offer logical next steps (tests, commits, build) briefly; add verify steps if
+  you couldn't do something.
+- For code changes:
+  * Lead with a quick explanation of the change, and then give more details on
+    the context covering where and why a change was made. Do not start this
+    explanation with "summary", just jump right in.
+  * If there are natural next steps the user may want to take, suggest them at
+    the end of your response. Do not make suggestions if there are no natural
+    next steps.
+  * When suggesting multiple options, use numeric lists for the suggestions so
+    the user can quickly respond with a single number.
+- The user does not see command execution outputs. When asked to show the output of
+  a command (e.g. `git show`), relay the important details in your answer or
+  summarize the key lines so the user understands the result.
+
+## Final answer structure and style guidelines
+
+- Plain text; CLI handles styling. Use structure only when it helps
+  scanability.
+- Headers: optional; short Title Case (1-3 words) wrapped in **…**; no blank
+  line before the first bullet; add only if they truly help.
+- Bullets: use - ; merge related points; keep to one line when possible; 4–6
+  per list ordered by importance; keep phrasing consistent.
+- Monospace: backticks for commands/paths/env vars/code ids and inline
+  examples; use for literal keyword bullets; never combine with double asterisk.
+- Code samples or multi-line snippets should be wrapped in fenced code blocks;
+  include an info string as often as possible.
+- Structure: group related bullets; order sections general → specific
+  → supporting; for subsections, start with a bolded keyword bullet, then
+  items; match complexity to the task.
+- Tone: collaborative, concise, factual; present tense, active voice;
+  self‑contained; no "above/below"; parallel wording.
+- Don'ts: no nested bullets/hierarchies; no ANSI codes; don't cram unrelated
+  keywords; keep keyword lists short—wrap/reformat if long; avoid naming
+  formatting styles in answers.
+- Adaptation: code explanations → precise, structured with code refs; simple
+  tasks → lead with outcome; big changes → logical walkthrough + rationale
+  + next actions; casual one-offs → plain sentences, no headers/bullets.
+- File References: When referencing files in your response follow the below
+  rules:
+  * Use inline code to make file paths clickable.
+  * Each reference should have a standalone path, even if it's the same file.
+  * Accepted: absolute, workspace‑relative, a/ or b/ diff prefixes, or bare
+    filename/suffix.
+  * Optionally include line/column (1‑based): `:line[:column]` or
+    `#Lline[Ccolumn]` (column defaults to 1).
+  * Do not use URIs like `file://`, `vscode://`, or `https://`.
+  * Do not provide a range of lines.
+  * Examples: `src/app.ts`, `src/app.ts:42`, `b/server/index.js#L10`,
+    `C:\repo\project\main.rs:12:5`
diff --git a/cuda_bindings/AGENTS.md b/cuda_bindings/AGENTS.md
new file mode 100644
index 0000000000..9688c9f94c
--- /dev/null
+++ b/cuda_bindings/AGENTS.md
@@ -0,0 +1,67 @@
+This file describes `cuda_bindings`, the low-level CUDA host API bindings
+subpackage in the `cuda-python` monorepo.
+
+## Scope and principles
+
+- **Role**: provide low-level, close-to-CUDA interfaces under
+  `cuda.bindings.*` with broad API coverage.
+- **Style**: prioritize correctness and API compatibility over convenience
+  wrappers. High-level ergonomics belong in `cuda_core`, not here.
+- **Cross-platform**: preserve Linux and Windows behavior unless a change is
+  intentionally platform-specific.
+
+## Package architecture
+
+- **Public module layer**: Cython modules under `cuda/bindings/` expose user
+  APIs (`driver`, `runtime`, `nvrtc`, `nvjitlink`, `nvvm`, `cufile`, etc.).
+- **Internal binding layer**: `cuda/bindings/_bindings/` provides lower-level
+  glue and loader helpers used by public modules.
+- **Platform internals**: `cuda/bindings/_internal/` contains
+  platform-specific implementation files and support code.
+- **Build/codegen backend**: `build_hooks.py` drives header parsing, template
+  expansion, extension configuration, and Cythonization.
+
+## Generated-source workflow
+
+- **Do not hand-edit generated binding files**: many files under
+  `cuda/bindings/` (including `*.pyx`, `*.pxd`, `*.pyx.in`, and `*.pxd.in`)
+  are generated artifacts.
+- **Generated files are synchronized from another repository**: changes to these
+  files in this repo are expected to be overwritten by the next sync.
+- **If generated output must change**: make the change at the generation source
+  and sync the updated artifacts back here, rather than patching generated files
+  directly in this repo.
+- **Header-driven generation**: parser behavior and required CUDA headers are
+  defined in `build_hooks.py`; update those rules when introducing new symbols.
+- **Platform split files**: keep `_linux.pyx` and `_windows.pyx` variants
+  aligned when behavior should be equivalent.
+
+## Testing expectations
+
+- **Primary tests**: `pytest tests/`
+- **Cython tests**:
+  - build: `tests/cython/build_tests.sh` (or platform equivalent)
+  - run: `pytest tests/cython/`
+- **Examples**: example coverage is pytest-based under `examples/`.
+- **Benchmarks**: run with `pytest --benchmark-only benchmarks/` when needed.
+- **Orchestrated run**: from repo root, `scripts/run_tests.sh bindings`.
+
+## Build and environment notes
+
+- `CUDA_HOME` or `CUDA_PATH` must point to a valid CUDA Toolkit for source
+  builds that parse headers.
+- `CUDA_PYTHON_PARALLEL_LEVEL` controls build parallelism.
+- `CUDA_PYTHON_PARSER_CACHING` controls parser-cache behavior during generation.
+- Runtime behavior is affected by
+  `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM` and
+  `CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`.
+
+## Editing guidance
+
+- Keep CUDA return/error semantics explicit and avoid broad fallback behavior.
+- Reuse existing helper layers (`_bindings`, `_internal`, `_lib`) before adding
+  new one-off utilities.
+- If you add or change exported APIs, update relevant docs under
+  `docs/source/module/` and tests in `tests/`.
+- Prefer changes that are easy to regenerate/rebuild rather than patching
+  generated output directly.
diff --git a/cuda_core/AGENTS.md b/cuda_core/AGENTS.md
new file mode 100644
index 0000000000..357e228360
--- /dev/null
+++ b/cuda_core/AGENTS.md
@@ -0,0 +1,65 @@
+This file describes `cuda_core`, the high-level Pythonic CUDA subpackage in the
+`cuda-python` monorepo.
+
+## Scope and principles
+
+- **Role**: provide higher-level CUDA abstractions (`Device`, `Stream`,
+  `Program`, `Linker`, memory resources, graphs) on top of `cuda.bindings`.
+- **API intent**: keep interfaces Pythonic while preserving explicit CUDA
+  behavior and error visibility.
+- **Compatibility**: changes should remain compatible with supported
+  `cuda.bindings` major versions (12.x and 13.x).
+
+## Package architecture
+
+- **Main package**: `cuda/core/` contains most Cython modules (`*.pyx`, `*.pxd`)
+  implementing runtime behaviors and public objects.
+- **Subsystems**:
+  - memory/resource stack: `cuda/core/_memory/`
+  - system-level APIs: `cuda/core/system/`
+  - compile/link path: `_program.pyx`, `_linker.pyx`, `_module.pyx`
+  - execution path: `_launcher.pyx`, `_launch_config.pyx`, `_stream.pyx`
+- **C++ helpers**: module-specific C++ implementations live under
+  `cuda/core/_cpp/`.
+- **Build backend**: `build_hooks.py` handles Cython extension setup and build
+  dependency wiring.
+
+## Build and version coupling
+
+- `build_hooks.py` determines CUDA major version from `CUDA_CORE_BUILD_MAJOR`
+  or CUDA headers (`CUDA_HOME`/`CUDA_PATH`) and uses it for build decisions.
+- Source builds require CUDA headers available through `CUDA_HOME` or
+  `CUDA_PATH`.
+- `cuda_core` expects `cuda.bindings` to be present and version-compatible.
+
+## Testing expectations
+
+- **Primary tests**: `pytest tests/`
+- **Cython tests**:
+  - build: `tests/cython/build_tests.sh` (or platform equivalent)
+  - run: `pytest tests/cython/`
+- **Examples**: validate affected examples in `examples/` when changing user
+  workflows or public APIs.
+- **Orchestrated run**: from repo root, `scripts/run_tests.sh core`.
+
+## Runtime/build environment notes
+
+- Runtime env vars commonly relevant:
+  - `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM`
+  - `CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`
+- Build env vars commonly relevant:
+  - `CUDA_HOME` / `CUDA_PATH`
+  - `CUDA_CORE_BUILD_MAJOR`
+  - `CUDA_PYTHON_PARALLEL_LEVEL`
+  - `CUDA_PYTHON_COVERAGE`
+
+## Editing guidance
+
+- Keep user-facing behaviors coherent with docs and examples, especially around
+  stream semantics, memory ownership, and compile/link flows.
+- Reuse existing shared utilities in `cuda/core/_utils/` before adding new
+  helpers.
+- When changing Cython signatures or cimports, verify related `.pxd` and
+  call-site consistency.
+- Prefer explicit error propagation over silent fallback paths.
+- If you change public behavior, update tests and docs under `docs/source/`.
diff --git a/cuda_pathfinder/AGENTS.md b/cuda_pathfinder/AGENTS.md
new file mode 100644
index 0000000000..52159c84fb
--- /dev/null
+++ b/cuda_pathfinder/AGENTS.md
@@ -0,0 +1,72 @@
+This file describes `cuda_pathfinder`, a Python sub-package of
+[cuda-python](https://github.com/NVIDIA/cuda-python). It locates and loads
+NVIDIA dynamic libraries (CTK, third-party, and driver) across Linux and
+Windows.
+
+## Scope and principles
+
+- **Language**: all implementation code in this package is pure Python.
+- **Public API**: keep user-facing imports stable via `cuda.pathfinder`.
+  Internal modules should stay under `cuda.pathfinder._*`.
+- **Behavior**: loader behavior must remain deterministic and explicit. Avoid
+  "best effort" silent fallbacks that mask why discovery/loading failed.
+- **Cross-platform**: preserve Linux and Windows behavior parity unless a change
+  is explicitly platform-scoped.
+
+## Package architecture
+
+- **Descriptor source-of-truth**: `cuda/pathfinder/_dynamic_libs/descriptor_catalog.py`
+  defines canonical metadata for known libraries.
+- **Registry layers**:
+  - `lib_descriptor.py` builds the name-keyed runtime registry from the catalog.
+  - `supported_nvidia_libs.py` keeps legacy exported tables derived from the
+    catalog for compatibility.
+- **Search pipeline**:
+  - `search_steps.py` implements composable find steps (`site-packages`,
+    `CONDA_PREFIX`, `CUDA_HOME`/`CUDA_PATH`, canary-assisted CTK root flow).
+  - `search_platform.py` and `platform_loader.py` isolate OS-specific logic.
+- **Load orchestration**:
+  - `load_nvidia_dynamic_lib.py` coordinates find/load phases, dependency
+    loading, driver-lib fast path, and cache semantics.
+- **Process isolation helper**:
+  - `cuda/pathfinder/_utils/spawned_process_runner.py` is used where
+    process-global dynamic loader state would otherwise leak across tests.
+
+## Editing guidance
+
+- **Edit authored descriptors, not derived tables**: when adding/changing a
+  library, update `descriptor_catalog.py` first; keep derived exports in sync
+  through existing conversion logic and tests.
+- **Respect cache semantics**: `load_nvidia_dynamic_lib` is cached. Never add
+  behavior that closes returned handles or assumes repeated fresh loads.
+- **Keep error contracts intact**:
+  - unknown name -> `DynamicLibUnknownError`
+  - known but unsupported on this OS -> `DynamicLibNotAvailableError`
+  - known/supported but not found/loadable -> `DynamicLibNotFoundError`
+- **Do not hardcode host assumptions**: avoid baking in machine-local paths,
+  shell-specific quoting, or environment assumptions.
+- **Prefer focused abstractions**: if a change is platform-specific, route it
+  through existing platform abstraction points instead of branching in many call
+  sites.
+
+## Testing expectations
+
+- **Primary command**: run `pytest tests/` from `cuda_pathfinder/`.
+- **Real-loading tests**: prefer spawned child-process tests for actual dynamic
+  loading behavior; avoid in-process cross-test interference.
+- **Cache-aware tests**: if a test patches internals used by
+  `load_nvidia_dynamic_lib`, call `load_nvidia_dynamic_lib.cache_clear()`.
+- **Negative-case names**: use obviously fake names (for example
+  `"not_a_real_lib"`) in unknown/invalid-lib tests.
+- **INFO summary in CI logs**: use `info_summary_append` for useful
+  test-context lines visible in terminal summaries.
+- **Strictness toggle**:
+  `CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS` controls whether
+  missing libraries are tolerated (`see_what_works`) or treated as failures
+  (`all_must_work`).
+
+## Useful commands
+
+- Run package tests: `pytest tests/`
+- Run package tests via orchestrator: `../scripts/run_tests.sh pathfinder`
+- Build package docs: `cd docs && ./build_docs.sh`
diff --git a/cuda_python/AGENTS.md b/cuda_python/AGENTS.md
new file mode 100644
index 0000000000..7c4fb9c0b1
--- /dev/null
+++ b/cuda_python/AGENTS.md
@@ -0,0 +1,24 @@
+This file describes `cuda_python`, the metapackage layer in the `cuda-python`
+monorepo.
+
+## Scope
+
+- `cuda_python` is primarily packaging and documentation glue.
+- It does not host substantial runtime APIs like `cuda_core`,
+  `cuda_bindings`, or `cuda_pathfinder`.
+
+## Main files to edit
+
+- `pyproject.toml`: project metadata and dynamic dependency declaration.
+- `setup.py`: dynamic dependency pinning logic for matching `cuda-bindings`
+  versions (release vs pre-release behavior).
+- `docs/`: top-level docs build/aggregation scripts.
+
+## Editing guidance
+
+- Keep this package lightweight; prefer implementing runtime features in the
+  component packages rather than here.
+- Be careful when changing dependency/version logic in `setup.py`; preserve
+  compatibility between metapackage versioning and subpackage constraints.
+- If you update docs structure, ensure `docs/build_all_docs.sh` still collects
+  docs from `cuda_python`, `cuda_bindings`, `cuda_core`, and `cuda_pathfinder`.

From ccb7f372a1dbb5a9189734a4fce5912b3b82fea1 Mon Sep 17 00:00:00 2001
From: Phillip Cloud <417981+cpcloud@users.noreply.github.com>
Date: Fri, 6 Mar 2026 18:04:36 -0500
Subject: [PATCH 2/2] chore: add claude links

---
 CLAUDE.md                 | 1 +
 cuda_bindings/CLAUDE.md   | 1 +
 cuda_core/CLAUDE.md       | 1 +
 cuda_pathfinder/CLAUDE.md | 1 +
 cuda_python/CLAUDE.md     | 1 +
 5 files changed, 5 insertions(+)
 create mode 120000 CLAUDE.md
 create mode 120000 cuda_bindings/CLAUDE.md
 create mode 120000 cuda_core/CLAUDE.md
 create mode 120000 cuda_pathfinder/CLAUDE.md
 create mode 120000 cuda_python/CLAUDE.md

diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/cuda_bindings/CLAUDE.md b/cuda_bindings/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/cuda_bindings/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/cuda_core/CLAUDE.md b/cuda_core/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/cuda_core/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/cuda_pathfinder/CLAUDE.md b/cuda_pathfinder/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/cuda_pathfinder/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file
diff --git a/cuda_python/CLAUDE.md b/cuda_python/CLAUDE.md
new file mode 120000
index 0000000000..47dc3e3d86
--- /dev/null
+++ b/cuda_python/CLAUDE.md
@@ -0,0 +1 @@
+AGENTS.md
\ No newline at end of file