✨[Feature] Route FP16 Layer Norm properly #4061

Description

@narendasan

Is your feature request related to a problem? Please describe.

We now see these warnings:

```
WARNING:torch_tensorrt [TensorRT Conversion Context]:Detected layernorm nodes in FP16.
WARNING:torch_tensorrt [TensorRT Conversion Context]:Running layernorm after self-attention with FP16 Reduce or Pow may cause overflow. Forcing Reduce or Pow Layers in FP32 precision, or exporting the model to use INormalizationLayer (available with ONNX opset >= 17) can help preserving accuracy.
```
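
For context on why this overflows: layernorm's variance term squares the activations and reduces over the hidden dimension, and FP16 saturates at 65504. A minimal repro of that hazard in plain PyTorch (the shape and magnitude are illustrative, not taken from a real model):

```python
import torch

# Post-attention activations can be large; squaring and reducing them
# in FP16 overflows because FP16's max finite value is ~65504.
x = torch.full((4096,), 30.0, dtype=torch.float16)

print((x * x).sum())                  # tensor(inf, dtype=torch.float16)
print((x.float() * x.float()).sum())  # tensor(3686400.)
```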

Describe the solution you'd like

Add a lowering pass that detects this case and routes the affected subgraph to the appropriate converters (for example, by forcing the Reduce/Pow portion of the layernorm into FP32, as the TensorRT warning suggests).
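
A hypothetical sketch of such a pass, written against the public torch.fx graph-manipulation API. The pass name `route_fp16_layer_norm`, the restriction to the single-output `aten.layer_norm.default` overload, and the cast-based "routing" are assumptions for illustration, not the actual Torch-TensorRT implementation:

```python
import torch
from torch.fx import GraphModule, Node


def route_fp16_layer_norm(gm: GraphModule) -> GraphModule:
    """Force FP16 layer_norm nodes to run in FP32 by inserting casts.

    Relies on FakeTensor metadata (node.meta["val"]) produced by
    torch.export / Dynamo tracing. aten.native_layer_norm.default
    returns a tuple and would need per-output handling, so only the
    single-output aten.layer_norm.default overload is covered here.
    """
    for node in list(gm.graph.nodes):
        if (
            node.op != "call_function"
            or node.target is not torch.ops.aten.layer_norm.default
        ):
            continue
        inp = node.args[0]
        if not isinstance(inp, Node):
            continue
        val = inp.meta.get("val")
        if val is None or val.dtype != torch.float16:
            continue
        # Upcast the input (and weight/bias, when present as graph nodes)
        # so the Reduce/Pow inside the layernorm happens in FP32.
        with gm.graph.inserting_before(node):
            for arg in [a for a in node.args if isinstance(a, Node)]:
                cast = gm.graph.call_function(
                    torch.ops.aten._to_copy.default,
                    (arg,),
                    {"dtype": torch.float32},
                )
                node.replace_input_with(arg, cast)
        # Downcast the result back to FP16 for all downstream users.
        with gm.graph.inserting_after(node):
            downcast = gm.graph.call_function(
                torch.ops.aten._to_copy.default,
                (node,),
                {"dtype": torch.float16},
            )
        node.replace_all_uses_with(
            downcast, delete_user_cb=lambda user: user is not downcast
        )
    gm.graph.lint()
    gm.recompile()
    return gm
```

In practice this would presumably be hooked into the dynamo aten lowering pass registry (see the "Writing Dynamo aten lowering passes" docs) rather than called standalone, or the subgraph could instead be routed to an INormalizationLayer-based converter, which is the alternative the TensorRT message itself suggests.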

Describe alternatives you've considered

Additional context
