Fix depthwise conv detection for groups==in_channels==1 by JakeStevens · Pull Request #17590 · pytorch/executorch

JakeStevens · 2026-02-20T16:17:51Z

Summary:
GitHub PR 17528 broke test_silero_vad_16k_quantized_opt3 by removing the
len(weight_shape) == 4 guard from the depthwise detection logic. This
caused regular 1D convolutions with in_channels == groups == 1 (e.g.
Silero VAD's learned STFT conv) to be misclassified as depthwise.

This diff fixes the bug and centralizes the depthwise check into a single
is_depthwise_conv(groups, in_channels) utility in utils.py that
enforces groups > 1 and groups == in_channels. All 7 depthwise
detection sites across 4 files now use this shared function:

- utils.py: Added is_depthwise_conv(), the single source of truth.
- ops_registrations.py: 4 meta functions updated (quantized_conv2d_nhwc, per_tensor, asym8s, asym8u).
- ref_implementations.py: Updated quantized_conv2d_nhwc_per_tensor.
- replace_ops.py: Updated ReplaceConvWithChannelLastConvPass.
- type_dispatch.py: Updated CompileTimeTypeDispatchPass (also fixed a pre-existing bug where groups > 1 was missing entirely).
- test_ref_implementations.py: Fixed the test harness to squeeze the IC=1 dim for depthwise weights when converting to channels_last format, matching the actual replace_ops.py pipeline behavior.
- test_replace_ops_passes.py: Added regression test for the in_channels==groups==1 case and a positive test for 1D depthwise.
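
For readers without the diff open, the shared check described above can be sketched roughly as follows. The function name, its two parameters, and the condition `groups > 1 and groups == in_channels` come from the summary; the type hints, docstring, and comments are assumptions, and the real helper in utils.py may differ in detail.

```python
def is_depthwise_conv(groups: int, in_channels: int) -> bool:
    """Sketch of the shared depthwise check (details are assumed)."""
    # groups == in_channels alone is not enough: when both are 1 the op is
    # an ordinary single-channel convolution, so groups > 1 is also required.
    return groups > 1 and groups == in_channels


# The in_channels == groups == 1 case (e.g. a learned STFT conv) is rejected.
print(is_depthwise_conv(groups=1, in_channels=1))  # False
# One group per input channel, with more than one channel, is depthwise.
print(is_depthwise_conv(groups=8, in_channels=8))  # True
```

Keeping the `groups > 1` term inside the helper means no call site can omit it, which is the gap the summary reports for type_dispatch.py.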

Differential Revision: D93869048

pytorch-bot · 2026-02-20T16:17:55Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17590

❌ 5 New Failures, 1 Unrelated Failure

As of commit 7c3bba6 with merge base d6e8ad1:

NEW FAILURES - The following jobs have failed:

pull / test-mediatek-models-linux / linux-job (gh)
RuntimeError: Command docker exec -t 19bbf6933d2ab2cb6871af2fb6699603f5196e67a903d7cd09831b85ae68df16 /exec failed with exit code 2
pull / test-openvino-linux / linux-job (gh)
RuntimeError: Command docker exec -t 23849e1ae6d786cda8d3187d8d779dfee33248695db4c37e8799142fbd1e5f6d /exec failed with exit code 1
pull / test-samsung-models-linux / linux-job (gh)
RuntimeError: Command docker exec -t cf355f1ec6af76f94cdf556da48387d0f13c2d1644ebcbe08905839a453aab52 /exec failed with exit code 1
pull / test-samsung-quantmodels-linux / linux-job (gh)
RuntimeError: Command docker exec -t ea0ce7d08bff3b1aea4e49655839368842da8ffe951f8658d1965c4ab8614a18 /exec failed with exit code 1
pull / unittest-buck / linux / linux-job (gh)
RuntimeError: Command docker exec -t 9aef6be0e3635a0030efc32d8153cd152b2dd65d85b79cfd9d4edb6e40843d2a /exec failed with exit code 3

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest-buck / macos / macos-job (gh) (trunk failure)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 3

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-codesync · 2026-02-20T16:17:58Z

@JakeStevens has exported this pull request. If you are a Meta employee, you can view the originating Diff in D93869048.

github-actions · 2026-02-20T16:18:33Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Summary:
The generic C++ kernel's depthwise NHWC entry points (`quantized_conv2d_nhwc_depthwise_asym8sxsym8s_asym8s_per_tensor_out` and the uint8 variant) were incorrectly delegating to `quantized_conv2d_nhwc`, which assumes a regular weight layout of [OC, KH, KW, IC]. Depthwise NHWC weights use a fundamentally different layout: [*kernel_size, OC] (i.e., [KH, KW, OC] for 2D, [K, OC] for 1D). This caused incorrect memory access patterns and dimension-out-of-range errors for 1D depthwise convolutions.

This diff adds a dedicated `quantized_conv2d_nhwc_depthwise` function that correctly handles the depthwise weight layout, including conv1d support. It also adds a regression test (`test_quantized_conv1d_depthwise_nhwc_out`) that exercises the full NHWC depthwise pipeline at opt_level=4, which triggers both `ReplaceConvWithChannelLastConvPass` and `CompileTimeTypeDispatchPass`.

Reviewed By: mcremon-meta

Differential Revision: D93620973
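
The two weight layouts contrasted in this commit message can be illustrated with a shape-only sketch. It assumes the weights start in the standard PyTorch convolution layout [OC, IC / groups, *kernel_size]; both helper names are hypothetical and are not taken from the ExecuTorch sources, and no tensor data is moved here, only shapes are computed.

```python
from typing import Tuple


def regular_nhwc_weight_shape(weight_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """[OC, IC/groups, *kernel_size] -> [OC, *kernel_size, IC/groups]."""
    out_channels, in_per_group, *kernel_size = weight_shape
    return (out_channels, *kernel_size, in_per_group)


def depthwise_nhwc_weight_shape(weight_shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """[OC, 1, *kernel_size] -> [*kernel_size, OC] (the IC=1 dim is dropped)."""
    out_channels, in_per_group, *kernel_size = weight_shape
    if in_per_group != 1:
        # Depthwise weights always have in_channels / groups == 1.
        raise ValueError("expected a depthwise weight with IC / groups == 1")
    return (*kernel_size, out_channels)


print(regular_nhwc_weight_shape((16, 4, 3, 3)))   # (16, 3, 3, 4)
print(depthwise_nhwc_weight_shape((8, 1, 3, 3)))  # (3, 3, 8)
print(depthwise_nhwc_weight_shape((8, 1, 5)))     # (5, 8)
```

A kernel that reads a [K, OC] depthwise weight as if it were [OC, KH, KW, IC] indexes dimensions that do not exist, which is consistent with the dimension-out-of-range errors reported for the 1D case.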

Summary:
GitHub PR 17528 broke `test_silero_vad_16k_quantized_opt3` by removing the `len(weight_shape) == 4` guard from the depthwise detection logic. This caused regular 1D convolutions with `in_channels == groups == 1` (e.g. Silero VAD's learned STFT conv) to be misclassified as depthwise.

This diff fixes the bug and centralizes the depthwise check into a single `is_depthwise_conv(groups, in_channels)` utility in `utils.py` that enforces `groups > 1 and groups == in_channels`. All 7 depthwise detection sites across 4 files now use this shared function:

- **utils.py**: Added `is_depthwise_conv()`, the single source of truth.
- **ops_registrations.py**: 4 meta functions updated (quantized_conv2d_nhwc, per_tensor, asym8s, asym8u).
- **ref_implementations.py**: Updated `quantized_conv2d_nhwc_per_tensor`.
- **replace_ops.py**: Updated `ReplaceConvWithChannelLastConvPass`.
- **type_dispatch.py**: Updated `CompileTimeTypeDispatchPass` (also fixed a pre-existing bug where `groups > 1` was missing entirely).
- **test_ref_implementations.py**: Fixed the test harness to squeeze the IC=1 dim for depthwise weights when converting to channels_last format, matching the actual `replace_ops.py` pipeline behavior.
- **test_replace_ops_passes.py**: Added regression test for the `in_channels==groups==1` case and a positive test for 1D depthwise.

Reviewed By: mcremon-meta

Differential Revision: D93869048

meta-cla bot added the CLA Signed label Feb 20, 2026

meta-codesync bot added the fb-exported and meta-exported labels Feb 20, 2026

mcremon-meta approved these changes Feb 20, 2026

JakeStevens added 2 commits February 20, 2026 12:10

JakeStevens force-pushed the export-D93869048 branch from b10d817 to 7c3bba6 on February 20, 2026 20:11

JakeStevens wants to merge 2 commits into pytorch:main from JakeStevens:export-D93869048