Update base image to 25.12 #8738
Conversation
Signed-off-by: Yun Liu <yunl@nvidia.com>
📝 Walkthrough

This PR updates the NVIDIA PyTorch base Docker image from 24.10 to 25.12 and removes its associated torch patching step. It introduces torch.compiler guards in RetinaNet's NaN/Inf validation checks to skip validation during graph compilation. A new helper method in TransChex extracts hidden states robustly across transformer versions. The TRT compiler now falls back to an alternative CUDA runtime import path. ONNX export logic adapts to PyTorch 2.6+ by using dynamo-based export instead of scripting. Signal filtering specifies an explicit tensor dtype. Subprocess utilities improve error capture and handling. Various test tolerances are relaxed, and onnxruntime's Python version constraint is removed.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ❌ 5 failed checks (3 warnings, 2 inconclusive)
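For context, the RetinaNet guard mentioned above presumably follows the common pattern of skipping data-dependent validation while the graph is being traced; a minimal sketch, assuming the usual `torch.compiler.is_compiling()` idiom (function name hypothetical, not the PR's exact code):

```python
import warnings

import torch


def check_box_regression(box_regression: torch.Tensor) -> None:
    # torch.compiler.is_compiling() is True while the graph is being traced,
    # so this data-dependent NaN/Inf check only runs in eager mode and does
    # not force a graph break during compilation.
    if not torch.compiler.is_compiling():
        if torch.isnan(box_regression).any() or torch.isinf(box_regression).any():
            warnings.warn("box_regression is NaN or Inf.", stacklevel=2)
```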
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@monai/networks/utils.py`:
- Around lines 722-729: The comment and version check around torch.__version__
and selection of mode_to_export are incorrect: dynamo became the default
exporter in PyTorch 2.9, and PyTorch 2.6–2.8 require explicitly enabling dynamo;
also dynamo doesn't fully support ScriptModule. Update the comment to state that
dynamo became default in 2.9 and note that dynamo may not support ScriptModule,
and change the logic around mode_to_export (symbols: torch.__version__,
_pt_major_minor, mode_to_export, model, torch.jit.script) to only bypass
scripting when running on PyTorch >= 2.9 (or when dynamo is explicitly enabled)
and otherwise use torch.jit.script(model, **kwargs); ensure the comment
documents this behavior and the rationale so the exporter picks the correct
input type.
🧹 Nitpick comments (5)
tests/test_utils.py (1)

60-61: Inconsistent naming: `_has_hf_local` vs `has_hf_hub`.

All other availability flags (`has_req`, `has_gdown`, `has_hf_hub`) are unprefixed; `_has_hf_local` breaks the pattern.

Proposed fix:

```diff
-hf_local_entry_error, _has_hf_local = optional_import("huggingface_hub.errors", name="LocalEntryNotFoundError")
+hf_local_entry_error, has_hf_local = optional_import("huggingface_hub.errors", name="LocalEntryNotFoundError")
```

And update the guard on line 77:

```diff
-if _has_hf_local:
+if has_hf_local:
```
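For readers outside the codebase: MONAI's `optional_import` returns a `(object, available)` pair, which is why the second tuple element is used as a skip flag in tests; a minimal sketch of the convention under discussion:

```python
from monai.utils import optional_import

# optional_import returns the requested object plus a boolean availability
# flag; the flag is what tests check to skip when the dependency is missing.
LocalEntryNotFoundError, has_hf_local = optional_import(
    "huggingface_hub.errors", name="LocalEntryNotFoundError"
)

print(has_hf_local)  # False if huggingface_hub is not installed
```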
tests/networks/test_convert_to_onnx.py (1)

36-36: 100x relaxation on `atol`: verify this is necessary.

`atol` went from `1e-4` to `1e-2`. That's a significant jump. If the dynamo-based ONNX export path introduces this much drift, it's worth confirming these tolerances are tight enough to still catch real regressions.
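To make the concern concrete, here is a generic sketch (not the test's actual assertion) of what the relaxed tolerance lets through:

```python
import numpy as np


def outputs_match(torch_out: np.ndarray, ort_out: np.ndarray) -> bool:
    # With atol=1e-2, any absolute difference below ~0.01 passes, so drift
    # near zero that a 1e-4 tolerance would flag now goes unnoticed.
    return np.allclose(torch_out, ort_out, rtol=1e-3, atol=1e-2)


print(outputs_match(np.array([0.000]), np.array([0.009])))  # True under atol=1e-2
```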
monai/utils/misc.py (1)

882-887: Asymmetric `capture_output` default logic is fine, but the docstring is stale.

Line 872 says `capture_output` "will be overriden with the debug status", but when `check=True` it now defaults to `True` regardless of debug. Consider updating the docstring to match.
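A minimal sketch of the asymmetric defaulting being described, assuming a helper shaped roughly like MONAI's `run_cmd` (not the exact code):

```python
import subprocess


def run_cmd(cmd: list, debug: bool = False, check: bool = False, **kwargs):
    if "capture_output" not in kwargs:
        # With check=True, capture output so a CalledProcessError can carry
        # stdout/stderr; otherwise fall back to the debug flag as before.
        kwargs["capture_output"] = True if check else debug
    return subprocess.run(cmd, check=check, **kwargs)
```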
monai/networks/nets/transchex.py (1)

229-238: Missing `Args` and `Returns` in docstring.

As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

Proposed docstring fix:

```diff
 @staticmethod
 def _get_hidden_states(layer_output):
     """Extract hidden states from BertLayer output.

     Compatible with both older transformers (returns a tuple) and
     newer transformers >=5.0 (may return a tensor directly).
+
+    Args:
+        layer_output: output from a BertLayer, either a tensor or a tuple
+            whose first element contains the hidden states.
+
+    Returns:
+        The hidden states tensor.
     """
     if isinstance(layer_output, torch.Tensor):
         return layer_output
     return layer_output[0]
```
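For illustration, the helper above accepts both return shapes a `BertLayer` may produce across transformers versions; a quick self-contained check:

```python
import torch


def _get_hidden_states(layer_output):
    # Tensor: newer transformers may return the hidden states directly.
    if isinstance(layer_output, torch.Tensor):
        return layer_output
    # Tuple: older transformers return (hidden_states, ...).
    return layer_output[0]


hidden = torch.zeros(1, 4, 8)
assert _get_hidden_states(hidden) is hidden          # tensor input
assert _get_hidden_states((hidden, None)) is hidden  # tuple input
```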
monai/apps/detection/networks/retinanet_network.py (1)

201-206: Same `stacklevel` nit here.

Proposed fix:

```diff
-warnings.warn("box_regression is NaN or Inf.")
+warnings.warn("box_regression is NaN or Inf.", stacklevel=2)
```
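For context on why `stacklevel=2` is preferred: it makes the warning point at the caller rather than at the `warnings.warn` line itself, which is usually the line a user actually needs to inspect. A small illustration:

```python
import warnings


def validate(x):
    # stacklevel=2 attributes the warning to validate()'s caller.
    warnings.warn("value is NaN or Inf.", stacklevel=2)


validate(float("nan"))  # the warning is reported at this line, not inside validate()
```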
The flagged snippet in `monai/networks/utils.py` (lines 722-729):

```python
# In PyTorch 2.6+, torch.onnx.export defaults to the dynamo-based exporter
# which uses torch.export.export internally and does not support ScriptModule.
# Pass the raw nn.Module directly; the new exporter captures all code paths.
_pt_major_minor = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
if _pt_major_minor >= (2, 6):
    mode_to_export = model
else:
    mode_to_export = torch.jit.script(model, **kwargs)
```
🧩 Analysis chain

🏁 Script executed:

```shell
cd /tmp && git ls-files monai/networks/utils.py
```

🏁 Script executed:

```shell
# Get broader context around lines 722-729
sed -n '700,750p' monai/networks/utils.py
```
🌐 Web query:

```
PyTorch 2.6 torch.onnx.export dynamo default behavior ScriptModule support
```

💡 Result:

- PyTorch 2.6 default behavior: `torch.onnx.export(..., dynamo=...)` exists, but it is not the default in 2.6. The ONNX docs note that `dynamo` only became `True` by default in PyTorch 2.9, implying that in 2.6–2.8 you must opt in with `dynamo=True`. [1]
- What `dynamo=True` does (2.6+): it uses the new exporter path based on `torch.export.ExportedProgram` (rather than the legacy TorchScript-based ONNX exporter). It is described as the recommended approach and returns an `ONNXProgram`. [1]
- `torch.jit.ScriptModule` / TorchScript support with `dynamo=True`:
  - While the API surface lists TorchScript types as acceptable inputs, the dynamo/`torch.export`-based exporter does not fully support TorchScript-specific models and behaviors.
  - The PyTorch ONNX team explicitly called out that `torch.onnx.export(..., dynamo=True)` may need to "auto-fallback" to the TorchScript-based exporter when the input is a `torch.jit.ScriptModule`/`torch.jit.ScriptFunction`, because those are TorchScript-oriented and not reliably handled by the dynamo path. [2]
  - Practical takeaway for PyTorch 2.6: if your model is a `ScriptModule`, expect best support with `dynamo=False` (legacy/TorchScript exporter), and treat `dynamo=True` as potentially unsupported unless you confirm it works for your specific model and operators. [2]

Sources: PyTorch ONNX API docs (notes on the dynamo default change in 2.9 and exporter behavior) [1]; PyTorch issue discussing the lack of TorchScript (ScriptModule) support in the `dynamo=True` path and a proposed fallback [2].
The code comment about PyTorch 2.6 `torch.onnx.export` defaults is incorrect.

PyTorch 2.6–2.8 do not default to the dynamo-based exporter; dynamo became the default only in PyTorch 2.9. In 2.6–2.8, you must explicitly set `dynamo=True`. Additionally, dynamo does not fully support `ScriptModule`. Correct the comment to accurately reflect this versioning, and reconsider the version logic: simply passing the raw model for >= 2.6 may not achieve the intended behavior.
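To make the suggested direction concrete, a sketch of the corrected gate (hypothetical helper wrapping the logic, not a verified patch against the file):

```python
import torch


def _select_export_input(model: torch.nn.Module, **script_kwargs):
    # Per the review: dynamo became the default torch.onnx.export path only
    # in PyTorch 2.9, and the dynamo exporter does not reliably accept
    # ScriptModule inputs, so scripting is kept for older versions.
    major_minor = tuple(int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
    if major_minor >= (2, 9):
        return model  # raw nn.Module for the dynamo-based exporter
    return torch.jit.script(model, **script_kwargs)  # legacy TorchScript path
```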
Fixes #8585.
Description
Update base image to nvcr.io/nvidia/pytorch:25.12-py3
Types of changes
- Full tests: `./runtests.sh -f -u --net --coverage`
- Quick tests: `./runtests.sh --quick --unittests --disttests`
- Docs build: `make html` command in the `docs/` folder