Add CUDA memory management APIs #1524

Open
alinpahontu2912 wants to merge 2 commits into dotnet:main from alinpahontu2912:add_cuda_memory_apis
Conversation

@alinpahontu2912
Member

Fixes
Add the following torch.cuda APIs:

  • empty_cache() - Release unoccupied cached memory (add empty_cache function #1521)
  • memory_allocated() - Current GPU memory occupied by tensors
  • max_memory_allocated() - Peak GPU memory occupied by tensors
  • reset_peak_memory_stats() - Reset peak memory tracking
  • memory_reserved() - Current GPU memory managed by caching allocator
  • max_memory_reserved() - Peak GPU memory managed by caching allocator
  • mem_get_info() - Free and total memory on device
  • set_device() - Set current CUDA device
  • current_device() - Get current CUDA device index

These APIs are commonly used in PyTorch workflows for memory management and debugging, and are needed by TorchSharpExamples users.

Native implementations use c10::cuda::CUDACachingAllocator with #if defined(USE_CUDA) guards for CPU-only build compatibility.

Includes unit tests for all new APIs.

alinpahontu2912 and others added 2 commits February 13, 2026 11:17
