
Arm backend: Add FP16 tests of models (mv3, ic3) #17586

Open
martinlsm wants to merge 1 commit into pytorch:main from martinlsm:marlin-fp16-models-tests

Conversation


@martinlsm (Collaborator) commented Feb 20, 2026

Add testing of the following models executed in FP16:

  • MobileNetV3
  • InceptionV3

This patch verifies that the Arm backend is able to lower full models in FP16 to valid TOSA, and execute them with acceptable numerical accuracy.
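For readers skimming the diff excerpts below, here is a minimal, self-contained sketch in plain PyTorch of the numerical property these tests assert. It mirrors the model construction and the atol value visible in the review comments; the real tests run through the Arm backend's TOSA test pipeline rather than eager mode, and check_mv3_fp16_accuracy is an illustrative name, not code from this PR.

import torch
from torchvision import models

def check_mv3_fp16_accuracy():
    # FP16 model under test (construction mirrors the PR diff, with the
    # explicit .DEFAULT weights member).
    mv3_fp16 = models.mobilenet_v3_small(
        weights=models.MobileNet_V3_Small_Weights.DEFAULT
    ).to(torch.float16).eval()

    # An FP32 copy of the same weights serves as the numerical reference.
    mv3_fp32 = models.mobilenet_v3_small(
        weights=models.MobileNet_V3_Small_Weights.DEFAULT
    ).eval()

    x = torch.randn(1, 3, 224, 224)
    with torch.no_grad():
        out = mv3_fp16(x.to(torch.float16)).to(torch.float32)
        ref = mv3_fp32(x)

    # atol mirrors the 2e-2 this PR uses for MobileNetV3; rtol is assumed.
    assert torch.allclose(out, ref, atol=2e-2, rtol=1e-2)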

cc @digantdesai @SS-JIA @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

Signed-off-by: Martin Lindström <Martin.Lindstroem@arm.com>
Change-Id: Ice3c6913598d540f7c7a52e403260943a7c8c597
Copilot AI review requested due to automatic review settings February 20, 2026 11:23

pytorch-bot bot commented Feb 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17586

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures

As of commit 1376dd0 with merge base bd6a75d:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Feb 20, 2026
@martinlsm

@pytorchbot label ciflow/trunk

@martinlsm

@pytorchbot label "partner: arm"

@pytorch-bot added the partner: arm label Feb 20, 2026
@martinlsm

@pytorchbot label "release notes: none"

@pytorch-bot added the release notes: none label Feb 20, 2026

Copilot AI left a comment


Pull request overview

Adds FP16 end-to-end model tests for the Arm backend to validate FP16 lowering to TOSA and ensure outputs remain within acceptable numeric error.

Changes:

  • Add an FP16 variant of the MobileNetV3 (small) TOSA FP pipeline test.
  • Add an FP16 variant of the InceptionV3 TOSA FP pipeline test.
  • Configure looser absolute tolerances for FP16 output comparisons.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

Files changed:

  • backends/arm/test/models/test_mobilenet_v3_arm.py: Adds a new slow TOSA FP16 model test using an FP16 MobileNetV3 module + FP16 inputs.
  • backends/arm/test/models/test_inception_v3_arm.py: Adds a new slow TOSA FP16 model test using an FP16 InceptionV3 module + FP16 inputs.


Comment on lines +27 to +31
mv3_fp16 = models.mobilenet_v3_small(weights=models.MobileNet_V3_Small_Weights).to(
    torch.float16
)
mv3_fp16 = mv3_fp16.eval()


Copilot AI Feb 20, 2026


mv3_fp16 is instantiated and converted at import time, which forces a second model construction + weight load even when the FP16 test isn’t selected. Consider creating the FP16 model inside test_mv3_tosa_FP_fp16() (or via a cached pytest fixture) to reduce test import time and memory usage.
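One possible shape for that suggestion, sketched under the assumption that plain pytest fixtures are acceptable in this test suite (the fixture name and scope are illustrative, not this PR's code):

import pytest
import torch
from torchvision import models

@pytest.fixture(scope="module")
def mv3_fp16():
    # Built only when an FP16 test requests the fixture, and cached for
    # the rest of the module, so other tests pay no import-time cost.
    model = models.mobilenet_v3_small(
        weights=models.MobileNet_V3_Small_Weights.DEFAULT
    ).to(torch.float16)
    return model.eval()

scope="module" keeps construction lazy while still loading the weights at most once per test module.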

aten_op=[],
exir_op=[],
use_to_edge_transform_and_lower=True,
atol=2e-2,

Copilot AI Feb 20, 2026


This FP16 test relaxes atol but leaves rtol at the default (1e-3). For reduced-precision model tests elsewhere (e.g. bf16), both tolerances are typically relaxed; consider specifying an appropriate rtol here as well to avoid overly strict relative comparisons and potential flakiness.

Suggested change
- atol=2e-2,
+ atol=2e-2,
+ rtol=1e-2,
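As background on why the default rtol is tight here (an editorial note, not part of the review): FP16 carries a 10-bit mantissa, so its machine epsilon is about 9.8e-4, and a relative tolerance of 1e-3 sits at roughly one ULP of the format.

import torch

# FP16 machine epsilon is 2**-10, barely below the default rtol of 1e-3.
print(torch.finfo(torch.float16).eps)  # 0.0009765625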

Comment on lines +27 to +28
ic3_fp16 = models.inception_v3(weights=models.Inception_V3_Weights).to(torch.float16)
ic3_fp16 = ic3_fp16.eval()

Copilot AI Feb 20, 2026


ic3_fp16 is instantiated and converted at import time, which forces a second model construction + weight load even when the FP16 test isn’t selected. Consider creating the FP16 model inside test_ic3_tosa_FP_fp16() (or via a cached pytest fixture) to reduce test import time and memory usage.

aten_op=[],
exir_op=[],
use_to_edge_transform_and_lower=True,
atol=1e-2,

Copilot AI Feb 20, 2026


This FP16 test relaxes atol but leaves rtol at the default (1e-3). For reduced-precision model tests elsewhere (e.g. bf16), both tolerances are typically relaxed; consider specifying an appropriate rtol here as well to avoid overly strict relative comparisons and potential flakiness.

Suggested change
- atol=1e-2,
+ atol=1e-2,
+ rtol=1e-2,

@zingo (Collaborator) left a comment


OK to merge once atol/rtol is bumped so the tests pass.
E.g., this run seems to get:

FAILED backends/arm/test/models/test_inception_v3_arm.py::test_ic3_tosa_FP_fp16 - AssertionError: Output 0 does not match reference output.
	Given atol: 0.01, rtol: 0.001.
	Output tensor shape: torch.Size([1, 1000]), dtype: torch.float16
	Difference: max: 0.03515625, abs: 0.03515625, mean abs error: 0.005782970428466797.
	-- Model vs. Reference --
	 Numel: 1000, 1000
	Median: -0.06884765625, -0.06573486328125
	  Mean: -0.02605916690826416, -0.026028093814849853
	   Max: 2.703125, 2.67578125
	   Min: -2.623046875, -2.62109375
= 1 failed, 91 passed, 3 skipped, 7 xfailed, 952 warnings in 963.15s (0:16:03) =
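To see how far the tolerance must move, assume the comparison follows the torch.allclose criterion |out - ref| <= atol + rtol * |ref| (an assumption about the test harness; the exact check may differ):

# Worked from the numbers in the failure log above.
max_diff = 0.03515625  # reported max absolute difference
rtol = 1e-3            # default rtol the failing run used
ref = 2.67578125       # reference max from the log, as an illustrative magnitude

# atol must cover whatever the rtol term does not: about 0.0325 here,
# so something like atol=4e-2 would give this run some headroom.
print(max_diff - rtol * ref)  # ~0.03248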
