Merged
340 changes: 263 additions & 77 deletions AGENTS.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions CLAUDE.md
67 changes: 67 additions & 0 deletions cuda_bindings/AGENTS.md
@@ -0,0 +1,67 @@
This file describes `cuda_bindings`, the low-level CUDA host API bindings
subpackage in the `cuda-python` monorepo.

## Scope and principles

- **Role**: provide low-level, close-to-CUDA interfaces under
`cuda.bindings.*` with broad API coverage.
- **Style**: prioritize correctness and API compatibility over convenience
wrappers. High-level ergonomics belong in `cuda_core`, not here.
- **Cross-platform**: preserve Linux and Windows behavior unless a change is
intentionally platform-specific.

## Package architecture

- **Public module layer**: Cython modules under `cuda/bindings/` expose user
APIs (`driver`, `runtime`, `nvrtc`, `nvjitlink`, `nvvm`, `cufile`, etc.).
- **Internal binding layer**: `cuda/bindings/_bindings/` provides lower-level
glue and loader helpers used by public modules.
- **Platform internals**: `cuda/bindings/_internal/` contains
platform-specific implementation files and support code.
- **Build/codegen backend**: `build_hooks.py` drives header parsing, template
expansion, extension configuration, and Cythonization.

## Generated-source workflow

- **Do not hand-edit generated binding files**: many files under
`cuda/bindings/` (including `*.pyx`, `*.pxd`, `*.pyx.in`, and `*.pxd.in`)
are generated artifacts.
- **Generated files are synchronized from another repository**: changes to these
files in this repo are expected to be overwritten by the next sync.
- **If generated output must change**: make the change at the generation source
and sync the updated artifacts back here, rather than patching generated files
directly in this repo.
- **Header-driven generation**: parser behavior and required CUDA headers are
defined in `build_hooks.py`; update those rules when introducing new symbols.
- **Platform split files**: keep `_linux.pyx` and `_windows.pyx` variants
aligned when behavior should be equivalent.
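
One way to spot platform variants that have drifted apart is to check that every `_linux.pyx` file has a `_windows.pyx` sibling and vice versa. The sketch below is a minimal illustration under that naming assumption; `find_unpaired_platform_files` is a hypothetical helper, not something shipped in this repo.

```python
from pathlib import Path


def find_unpaired_platform_files(root):
    """Return platform-split .pyx files under root that lack a sibling variant.

    Assumes the `<name>_linux.pyx` / `<name>_windows.pyx` naming convention;
    this helper is hypothetical and only illustrates the check.
    """
    root = Path(root)
    counterparts = {"_linux.pyx": "_windows.pyx", "_windows.pyx": "_linux.pyx"}
    unpaired = []
    for path in sorted(root.rglob("*.pyx")):
        for suffix, other in counterparts.items():
            if path.name.endswith(suffix):
                # Swap the platform suffix and look for the sibling file.
                sibling = path.with_name(path.name[: -len(suffix)] + other)
                if not sibling.exists():
                    unpaired.append(path.relative_to(root).as_posix())
    return unpaired
```

A pairing check only shows that both files exist. Whether their behavior is equivalent still needs review or tests.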

## Testing expectations

- **Primary tests**: `pytest tests/`
- **Cython tests**:
  - build: `tests/cython/build_tests.sh` (or platform equivalent)
  - run: `pytest tests/cython/`
- **Examples**: examples under `examples/` are exercised through pytest.
- **Benchmarks**: run with `pytest --benchmark-only benchmarks/` when needed.
- **Orchestrated run**: from repo root, `scripts/run_tests.sh bindings`.

## Build and environment notes

- `CUDA_HOME` or `CUDA_PATH` must point to a valid CUDA Toolkit for source
builds that parse headers.
- `CUDA_PYTHON_PARALLEL_LEVEL` controls build parallelism.
- `CUDA_PYTHON_PARSER_CACHING` controls parser-cache behavior during generation.
- Runtime behavior is affected by
`CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM` and
`CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`.
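
Before starting a source build, it can help to fail early when neither toolkit variable is usable. The following is a minimal sketch, not the logic in `build_hooks.py`: the function name, the lookup order (`CUDA_HOME` before `CUDA_PATH`), and the `include/` directory check are all assumptions made for illustration.

```python
import os


def resolve_cuda_root(environ=None):
    """Return the toolkit root named by CUDA_HOME or CUDA_PATH, or raise.

    Hypothetical helper: the lookup order and the include/ check are
    assumptions of this sketch, not documented build behavior.
    """
    env = os.environ if environ is None else environ
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(name)
        if value:
            include_dir = os.path.join(value, "include")
            if not os.path.isdir(include_dir):
                # Report which variable was wrong instead of falling through.
                raise RuntimeError(f"{name}={value!r} has no include/ directory")
            return value
    raise RuntimeError("Set CUDA_HOME or CUDA_PATH to a CUDA Toolkit root")
```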

## Editing guidance

- Keep CUDA return/error semantics explicit and avoid broad fallback behavior.
- Reuse existing helper layers (`_bindings`, `_internal`, `_lib`) before adding
new one-off utilities.
- If you add or change exported APIs, update relevant docs under
`docs/source/module/` and tests in `tests/`.
- Prefer changes that are easy to regenerate/rebuild rather than patching
generated output directly.
1 change: 1 addition & 0 deletions cuda_bindings/CLAUDE.md
65 changes: 65 additions & 0 deletions cuda_core/AGENTS.md
@@ -0,0 +1,65 @@
This file describes `cuda_core`, the high-level Pythonic CUDA subpackage in the
`cuda-python` monorepo.

## Scope and principles

- **Role**: provide higher-level CUDA abstractions (`Device`, `Stream`,
`Program`, `Linker`, memory resources, graphs) on top of `cuda.bindings`.
- **API intent**: keep interfaces Pythonic while preserving explicit CUDA
behavior and error visibility.
- **Compatibility**: changes should remain compatible with supported
`cuda.bindings` major versions (12.x and 13.x).

## Package architecture

- **Main package**: `cuda/core/` contains most Cython modules (`*.pyx`, `*.pxd`)
implementing runtime behaviors and public objects.
- **Subsystems**:
  - memory/resource stack: `cuda/core/_memory/`
  - system-level APIs: `cuda/core/system/`
  - compile/link path: `_program.pyx`, `_linker.pyx`, `_module.pyx`
  - execution path: `_launcher.pyx`, `_launch_config.pyx`, `_stream.pyx`
- **C++ helpers**: module-specific C++ implementations live under
`cuda/core/_cpp/`.
- **Build backend**: `build_hooks.py` handles Cython extension setup and build
dependency wiring.

## Build and version coupling

- `build_hooks.py` determines the CUDA major version from
  `CUDA_CORE_BUILD_MAJOR` or, failing that, from the CUDA headers
  (`CUDA_HOME`/`CUDA_PATH`), and uses it for build decisions.
- Source builds require CUDA headers available through `CUDA_HOME` or
`CUDA_PATH`.
- `cuda_core` expects `cuda.bindings` to be present and version-compatible.
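
The override-then-headers order described above can be sketched as follows. This is an illustrative approximation, not the code in `build_hooks.py`: `detect_cuda_major` is a hypothetical name, and reading the `CUDA_VERSION` macro from `include/cuda.h` (encoded as `major * 1000 + minor * 10`) is an assumption about how the headers would be consulted.

```python
import os
import re

_VERSION_RE = re.compile(r"^\s*#\s*define\s+CUDA_VERSION\s+(\d+)", re.MULTILINE)


def detect_cuda_major(environ):
    """Return the CUDA major version for a build (hypothetical sketch)."""
    # An explicit override wins over anything found in the headers.
    override = environ.get("CUDA_CORE_BUILD_MAJOR")
    if override:
        return int(override)
    for name in ("CUDA_HOME", "CUDA_PATH"):
        root = environ.get(name)
        if not root:
            continue
        header = os.path.join(root, "include", "cuda.h")
        if os.path.isfile(header):
            with open(header, encoding="utf-8", errors="replace") as fh:
                match = _VERSION_RE.search(fh.read())
            if match:
                # CUDA_VERSION is major * 1000 + minor * 10, e.g. 12040.
                return int(match.group(1)) // 1000
    raise RuntimeError("Cannot determine CUDA major version; set CUDA_CORE_BUILD_MAJOR")
```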

## Testing expectations

- **Primary tests**: `pytest tests/`
- **Cython tests**:
  - build: `tests/cython/build_tests.sh` (or platform equivalent)
  - run: `pytest tests/cython/`
- **Examples**: validate affected examples in `examples/` when changing user
workflows or public APIs.
- **Orchestrated run**: from repo root, `scripts/run_tests.sh core`.

## Runtime/build environment notes

- Runtime env vars commonly relevant:
  - `CUDA_PYTHON_CUDA_PER_THREAD_DEFAULT_STREAM`
  - `CUDA_PYTHON_DISABLE_MAJOR_VERSION_WARNING`
- Build env vars commonly relevant:
  - `CUDA_HOME` / `CUDA_PATH`
  - `CUDA_CORE_BUILD_MAJOR`
  - `CUDA_PYTHON_PARALLEL_LEVEL`
  - `CUDA_PYTHON_COVERAGE`

## Editing guidance

- Keep user-facing behaviors coherent with docs and examples, especially around
stream semantics, memory ownership, and compile/link flows.
- Reuse existing shared utilities in `cuda/core/_utils/` before adding new
helpers.
- When changing Cython signatures or cimports, verify related `.pxd` and
call-site consistency.
- Prefer explicit error propagation over silent fallback paths.
- If you change public behavior, update tests and docs under `docs/source/`.
1 change: 1 addition & 0 deletions cuda_core/CLAUDE.md
72 changes: 72 additions & 0 deletions cuda_pathfinder/AGENTS.md
@@ -0,0 +1,72 @@
This file describes `cuda_pathfinder`, a Python subpackage of
[cuda-python](https://github.com/NVIDIA/cuda-python). It locates and loads
NVIDIA dynamic libraries (CTK, third-party, and driver) across Linux and
Windows.

## Scope and principles

- **Language**: all implementation code in this package is pure Python.
- **Public API**: keep user-facing imports stable via `cuda.pathfinder`.
Internal modules should stay under `cuda.pathfinder._*`.
- **Behavior**: loader behavior must remain deterministic and explicit. Avoid
"best effort" silent fallbacks that mask why discovery/loading failed.
- **Cross-platform**: preserve Linux and Windows behavior parity unless a change
is explicitly platform-scoped.

## Package architecture

- **Descriptor source-of-truth**: `cuda/pathfinder/_dynamic_libs/descriptor_catalog.py`
defines canonical metadata for known libraries.
- **Registry layers**:
  - `lib_descriptor.py` builds the name-keyed runtime registry from the catalog.
  - `supported_nvidia_libs.py` keeps legacy exported tables derived from the
    catalog for compatibility.
- **Search pipeline**:
  - `search_steps.py` implements composable find steps (`site-packages`,
    `CONDA_PREFIX`, `CUDA_HOME`/`CUDA_PATH`, canary-assisted CTK root flow).
  - `search_platform.py` and `platform_loader.py` isolate OS-specific logic.
- **Load orchestration**:
  - `load_nvidia_dynamic_lib.py` coordinates find/load phases, dependency
    loading, driver-lib fast path, and cache semantics.
- **Process isolation helper**:
  - `cuda/pathfinder/_utils/spawned_process_runner.py` is used where
    process-global dynamic loader state would otherwise leak across tests.
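
The idea of composable find steps can be pictured as an ordered list of small functions, each returning a path or `None`, with every attempt recorded so a failure can explain what was tried. The sketch below is purely illustrative: `SearchResult`, `run_search`, and the two step functions are hypothetical names and do not mirror the real contents of `search_steps.py`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SearchResult:
    path: Optional[str] = None
    # (step name, outcome) pairs, kept so failures can be reported explicitly.
    attempts: list = field(default_factory=list)


def find_in_site_packages(libname):
    """Hypothetical step: nothing is installed as a wheel in this sketch."""
    return None


def find_in_cuda_home(libname):
    """Hypothetical step: pretend only cudart exists under a toolkit root."""
    return "/opt/cuda/lib64/libcudart.so" if libname == "cudart" else None


def run_search(libname, steps):
    """Try each step in order and stop at the first hit."""
    attempts = []
    for step in steps:
        found = step(libname)
        attempts.append((step.__name__, found))
        if found is not None:
            return SearchResult(path=found, attempts=attempts)
    return SearchResult(path=None, attempts=attempts)
```

Keeping the attempt log in the result makes it straightforward to build an error message that names every location that was checked, rather than failing silently.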

## Editing guidance

- **Edit authored descriptors, not derived tables**: when adding/changing a
library, update `descriptor_catalog.py` first; keep derived exports in sync
through existing conversion logic and tests.
- **Respect cache semantics**: `load_nvidia_dynamic_lib` is cached. Never add
behavior that closes returned handles or assumes repeated fresh loads.
- **Keep error contracts intact**:
  - unknown name -> `DynamicLibUnknownError`
  - known but unsupported on this OS -> `DynamicLibNotAvailableError`
  - known/supported but not found/loadable -> `DynamicLibNotFoundError`
- **Do not hardcode host assumptions**: avoid baking in machine-local paths,
shell-specific quoting, or environment assumptions.
- **Prefer focused abstractions**: if a change is platform-specific, route it
through existing platform abstraction points instead of branching in many call
sites.
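
The three-way error contract listed above can be illustrated with a small stand-alone sketch. The exception classes here are local stand-ins that only borrow the names from the contract; `CATALOG`, `check_and_locate`, and the `locate` callback are hypothetical and are not the package's actual implementation.

```python
# Local stand-ins for illustration only; the real classes live in cuda.pathfinder.
class DynamicLibUnknownError(Exception):
    pass


class DynamicLibNotAvailableError(Exception):
    pass


class DynamicLibNotFoundError(Exception):
    pass


# Hypothetical catalog: library name -> platforms on which it is supported.
CATALOG = {"cudart": {"linux", "windows"}, "cufile": {"linux"}}


def check_and_locate(name, platform, locate):
    """Apply the contract in order: unknown, unsupported here, then not found."""
    if name not in CATALOG:
        raise DynamicLibUnknownError(f"unknown library name: {name!r}")
    if platform not in CATALOG[name]:
        raise DynamicLibNotAvailableError(f"{name!r} is not supported on {platform}")
    path = locate(name)
    if path is None:
        raise DynamicLibNotFoundError(f"{name!r} is supported but was not found")
    return path
```

The order of the checks matters: a caller that catches `DynamicLibNotFoundError` can assume the name was valid and supported on the current OS.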

## Testing expectations

- **Primary command**: run `pytest tests/` from `cuda_pathfinder/`.
- **Real-loading tests**: prefer spawned child-process tests for actual dynamic
loading behavior; avoid in-process cross-test interference.
- **Cache-aware tests**: if a test patches internals used by
`load_nvidia_dynamic_lib`, call `load_nvidia_dynamic_lib.cache_clear()`.
- **Negative-case names**: use obviously fake names (for example
`"not_a_real_lib"`) in unknown/invalid-lib tests.
- **INFO summary in CI logs**: use `info_summary_append` for useful
test-context lines visible in terminal summaries.
- **Strictness toggle**:
`CUDA_PATHFINDER_TEST_LOAD_NVIDIA_DYNAMIC_LIB_STRICTNESS` controls whether
missing libraries are tolerated (`see_what_works`) or treated as failures
(`all_must_work`).
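
To see why cache-aware tests need `cache_clear()`, consider a stand-in for a cached loader. This sketch assumes the cache behaves like `functools.cache`; `load_demo_lib`, `_LOCATIONS`, and `lookup_after_patch` are hypothetical names used only to show the stale-result problem.

```python
import functools

# Hypothetical internal state that a test might patch.
_LOCATIONS = {"demo": "/first/libdemo.so"}


@functools.cache
def load_demo_lib(name):
    """Stand-in for a cached loader: the first result per name is remembered."""
    return _LOCATIONS[name]


def lookup_after_patch(clear_cache):
    """Warm the cache, patch the internals, then look the library up again."""
    load_demo_lib.cache_clear()                  # start from a known state
    _LOCATIONS["demo"] = "/first/libdemo.so"
    load_demo_lib("demo")                        # warm the cache
    _LOCATIONS["demo"] = "/patched/libdemo.so"   # simulate a test patching internals
    if clear_cache:
        load_demo_lib.cache_clear()
    return load_demo_lib("demo")
```

Without the clear, the second lookup returns the value cached before the patch, so the test would silently exercise the old state.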

## Useful commands

- Run package tests: `pytest tests/`
- Run package tests via orchestrator: `../scripts/run_tests.sh pathfinder`
- Build package docs: `cd docs && ./build_docs.sh`
1 change: 1 addition & 0 deletions cuda_pathfinder/CLAUDE.md
24 changes: 24 additions & 0 deletions cuda_python/AGENTS.md
@@ -0,0 +1,24 @@
This file describes `cuda_python`, the metapackage layer in the `cuda-python`
monorepo.

## Scope

- `cuda_python` is primarily packaging and documentation glue.
- It does not host substantial runtime APIs like `cuda_core`,
`cuda_bindings`, or `cuda_pathfinder`.

## Main files to edit

- `pyproject.toml`: project metadata and dynamic dependency declaration.
- `setup.py`: dynamic dependency pinning logic for matching `cuda-bindings`
versions (release vs pre-release behavior).
- `docs/`: top-level docs build/aggregation scripts.

## Editing guidance

- Keep this package lightweight; prefer implementing runtime features in the
component packages rather than here.
- Be careful when changing dependency/version logic in `setup.py`; preserve
compatibility between metapackage versioning and subpackage constraints.
- If you update docs structure, ensure `docs/build_all_docs.sh` still collects
docs from `cuda_python`, `cuda_bindings`, `cuda_core`, and `cuda_pathfinder`.
1 change: 1 addition & 0 deletions cuda_python/CLAUDE.md