[AI Subsystem] AI object mask and AI denoising #20322

andriiryzhkov · 2026-02-11T11:08:50Z

This PR introduces an AI subsystem into darktable with two features built on top of it:

AI Object Mask — a new mask tool that lets users select objects in the image by clicking on them. It uses the Light HQ-SAM model to segment objects, then automatically vectorizes the result into path masks (using ras2vect) that integrate with darktable's existing mask system.
AI Denoise — a denoising module powered by the NAFNet model. This was initially developed as a simpler test case for the AI subsystem and is included here as a bonus feature.

Both models are converted to ONNX format for inference. Conversion scripts live in a separate repository: https://github.com/andriiryzhkov/darktable-ai. Models are not bundled with darktable — they are downloaded from GitHub Releases after the app is installed, with SHA256 verification. A new dependency on libarchive is added to handle extracting the downloaded model archives.

AI subsystem design

The AI subsystem is currently built on top of ONNX Runtime, though the backend is abstracted to allow adding other inference engines in the future. ONNX Runtime is used from pre-built packages distributed on GitHub. On Windows, ONNX Runtime is built with MSVC, so using pre-built binaries is the natural approach for us — I initially expected this to be a problem, but discovered this is common practice among other open-source projects and works well.

The system is organized in three layers:

Backend (src/ai/): Wraps ONNX Runtime C API behind opaque handles. Handles session creation, tensor I/O, float16 conversion, and hardware acceleration provider selection (CoreML, CUDA, ROCm, DirectML). Providers are enabled via runtime dynamic symbol lookup rather than compile-time linking, so there are no build dependencies on vendor-specific libraries. A separate segmentation.c implements the SAM two-stage encoder/decoder pipeline with embedding caching and iterative mask refinement.
Model management (src/common/ai_models.c): Registry that tracks available models, their download status, and user preferences. Downloads model packages from GitHub Releases with SHA256 verification, path traversal protection, and version-aware tag matching. Uses libarchive for safe extraction with symlink and dotdot protections. Thread-safe — all public getters return struct copies, not pointers into the registry.
UI and modules: The object mask tool (src/develop/masks/object.c) runs SAM encoding in a background thread to keep the UI responsive. The user sees a "working..." overlay during encoding, then clicks to place foreground/background prompts. Right-click finalizes by vectorizing the raster mask into Bézier path forms. AI denoise module (src/libs/denoise_ai.c) and preferences tab (src/gui/preferences_ai.c) provide the remaining user-facing features.

Fixes: #12295, #19078, #19310

TurboGit · 2026-02-11T12:33:25Z

Models are not bundled with darktable —

Perfect!

they are downloaded from GitHub Releases after the app is installed, with SHA256 verification. A new dependency on libarchive is added to handle extracting the downloaded model archives.

Can this be simplified (no SHA256) for now to allow testers to download the models using current master?

andriiryzhkov · 2026-02-11T12:37:56Z

For testing purposes, you can skip the download mechanism entirely and just manually place model files in ~/.local/share/darktable/models/. Each model is a directory with a config.json and ONNX files — the AI backend scans that path at startup.

If placed manually, there are no SHA256 checks.

victoryforce · 2026-02-11T12:42:45Z

@andriiryzhkov Thank you for such a great contribution!

For macOS CI to complete successfully, libarchive should be added to .ci/Brewfile.

andriiryzhkov · 2026-02-11T12:46:24Z

@victoryforce thank you for the advise! Done.

andriiryzhkov · 2026-02-11T20:05:34Z

Can you give some details about the data model(s) used in this PR and what training data was used for it/them?

I did not train those models by myself. Just use pre-build model weights. If you are interested learning about the training data, I advise you to check origial repositories of the models:

paperdigits · 2026-02-11T20:10:58Z

I'd hazard a semi-educated guess that the training data from facebook does not match our moral or ethical standard.

andriiryzhkov · 2026-02-11T20:14:46Z

From my point of view, darktable as an open source tool for photographers should make sure that it's components are from a valid source with a clear legal origin.

Absolutely agree with you. Both models are from well known authors in the Computer Vision field and supported by academic publications. SAM models are widely used in photography industry these days.
I welcome everybody to review the sources.

paperdigits · 2026-02-11T20:20:42Z

SAM models are widely used in photography industry these days.

This means absolutely nothing in the context of this conversation.

I welcome everybody to review the sources.

We're asking if you've reviewed them, and if yes, do you think they meet the ethical and moral standards of this project?

TurboGit · 2026-02-11T20:21:45Z

This feature is/will not be enabled by default. User need to download the models. If you don't agree with the way the models are trained... don't download them, don't use the AI feature in Darktable. As simple as that.

andriiryzhkov · 2026-02-11T20:34:51Z

We're asking if you've reviewed them, and if yes, do you think they meet the ethical and moral standards of this project?

Yes, I reviewed both models before selecting them.

Light HQ-SAM (AI mask): Based on Meta's Segment Anything, released under Apache-2.0 by SysCV (ETH Zurich). The HQ-SAM extension was published at NeurIPS 2023. Training uses SA-1B dataset (publicly released by Meta under Apache-2.0) plus standard segmentation benchmarks. No non-commercial restrictions.

NAFNet (AI denoise): Released under MIT license by Megvii Research. Published at ECCV 2022. Trained on standard academic datasets (SIDD, GoPro, REDS) that are publicly available for research. No restrictive terms on the weights.

Both are from peer-reviewed research with clearly documented training data and permissive licenses compatible with GPL-3. I see no ethical concerns — no undisclosed web scraping, no personal data in training sets, no non-commercial clauses.

paperdigits · 2026-02-11T20:51:16Z

Its been made clear to me that my comments are not welcome here and that the project is "bigger than me." I'll show myself out.

TurboGit · 2026-02-11T21:15:41Z

I was also thinking about HQ-SAM B model, but speed of image encoding will be significantly reduced.

If by any mean you have such model I'll test it to see if the object are better recognized.

jenshannoschwalm · 2026-02-11T21:40:12Z

This feature is/will not be enabled by default. User need to download the models. If you don't agree with the way the models are trained... don't download them, don't use the AI feature in Darktable. As simple as that.

A very clear position. For me the important points would be:

User has to download the models by some intentional trigger/button or whatever that UI action might be
We should assist doing so, maybe even offering selections
But we should never ship/distribute/keep such models within dt

wpferguson · 2026-02-12T02:20:26Z

On the ethics question I'm going to use Lua as an example and the rules that were given to me when I started maintaining the repo

I wrote a script that exports an image and sends it to GIMP as an external editor and then returns the result.

If someone wrote a similar script that calls photoshop to do the same thing and then wanted it merged into the Lua repository the answer would be no, because photoshop is not FOSS.

However, if someone wrote a script that let you specify whatever external editor you wanted to use (i.e. ext_editors.lua) that would (did) get merged. That way Lua is not promoting the use of commercial software, but it's not prohibiting it either.

So if we build modules (denoise, masking, etc) that use AI models they shouldn't require specific models. If we require specific models then it's like writing a script that requires photoshop. If we have an AI module (denoise) that can use different models (NAFnet, NIND, etc) supplied by the user then we have something similar to the ext_editors script.

andriiryzhkov · 2026-02-12T07:57:51Z

So if we build modules (denoise, masking, etc) that use AI models they shouldn't require specific models. If we require specific models then it's like writing a script that requires photoshop. If we have an AI module (denoise) that can use different models (NAFnet, NIND, etc) supplied by the user then we have something similar to the ext_editors script.

That's a great analogy and I agree with the principle. This is actually how the implementation works:

The AI backend (src/ai/) is a generic ONNX Runtime wrapper — it can run any ONNX model. The model registry is a JSON-based catalog that describes available models with metadata (task type, model files). Note that ONNX models are pure neural networks with no pre/post-processing baked in — all image preparation (colorspace conversion, normalization, tiling) is handled by darktable's code, not by model metadata.

For model portability, the key constraint is the input/output tensor dimensions that darktable expects for a specific task:

Denoise (NAFNet): Input [1, 3, H, W] float32 (sRGB image tile), output [1, 3, H, W] float32 (denoised tile of the same size). Any model that takes a noisy RGB tile and returns a denoised RGB tile of identical dimensions will work as a drop-in replacement.
Segmentation (Light HQ-SAM): Two-stage encoder/decoder architecture. Encoder takes [1, 3, 1024, 1024] and produces image embeddings [1, 256, 64, 64] + intermediate features [1, 1, 64, 64, 160]. Decoder takes 7 inputs (embeddings, point coordinates [1, N, 2], point labels [1, N], previous mask [1, 1, 256, 256], etc.) and outputs a mask [1, 1, H, W] + IoU score + low-res mask. Any SAM-compatible encoder/decoder pair with these tensor shapes will work.

As long as a model follows these tensor dimension requirements for a given task, it will work. Users can already install models manually into the models directory (~/.local/share/darktable/models/<model-id>/) and select which model to use for a specific module via darktable configuration. As a next step, we can reduce the requirement to register models in ai_models.json — models placed in the models folder would be auto-discovered at startup, while the registry would only serve as a catalog of models available for download.

…issing

andriiryzhkov · 2026-02-12T10:04:34Z

Speaking as a package maintainer for both the flatpak and Nix package, this "download at build time" stuff is not good. It won't work for either package that I maintain, and it'll make the build fail in both cases. I'd suggest we not do that.

@paperdigits FindONNXRuntime.cmake searches for ONNX Runtime in this order:

Source tree — src/external/onnxruntime/ (manually pre-installed)
Build tree — build/_deps/onnxruntime/ (populated by prior auto-download)
System package — libonnxruntime-dev via standard find_path/find_library
Auto-download — fallback, fetches pre-built binary from GitHub

So if a system package is available, it will be used and no download happens. The auto-download is only a convenience fallback for developers who don't have ONNX Runtime installed.

If neither the system package nor auto-download provides ONNX Runtime (or if libarchive is missing), AI features are automatically disabled (BUILD_AI=OFF) and the build continues normally without them. You can also explicitly disable auto-download with ONNXRUNTIME_OFFLINE=ON — the build will simply skip AI features instead of failing.

For flatpak/Nix packaging: if you provide libonnxruntime-dev (or equivalent), AI is enabled; if you don't, AI is silently disabled and the rest of darktable builds as usual. The only known issue is Ubuntu 26.04's libonnxruntime-dev 1.21.0 which has broken linkage to libonnx — so I'm removing it from CI for now to pass build checks. Once Ubuntu 26.04 packages are stabilized, we can switch CI back to the system package.

MikoMikarro · 2026-02-12T13:06:47Z

Hi! I love that these kinds of features are being pushed forward!
Totally agree on the use of OnnXRuntime, and, amazingly, the AI Denoise seems to be compatible. I tried to approach the denoise as a RAW denoise more than a ProcessedRAW denoise, but I'll test how @andriiryzhkov approach is working.

When I first attempted the masking, Onnx was discarded due to its "hard" build in Windows, but it seems you have that sorted out. My approach to fix that was to use OpenCV DNN, with some script I remember, I could "fix" the dynamic size of layers, and the models would work "native" with the OpenCV DNN. I suppose that it is easier to compile/bundle with DT.

Definitely love to see ONNX used here, much more flexible to load different models!

andriiryzhkov · 2026-02-12T13:51:07Z

@MikoMikarro I am glad you are in the game!

We can add OpenCV DNN as another backend provider. It has a bit limited support of neural network operators, but still usable for some models.

andriiryzhkov · 2026-02-12T14:48:07Z

If by any mean you have such model I'll test it to see if the object are better recognized.

@TurboGit You can download HQ-SAM B model from here https://github.com/andriiryzhkov/darktable-ai/releases/download/5.5.0.4/mask-hq-sam-b.zip.

Unpack it to ~/.local/share/darktable/models/ and in darktablerc update parameter plugins/darkroom/masks/object/model=mask-hq-sam-b. This will enable HQ-SAM B variant of the model.

TurboGit · 2026-02-12T16:44:01Z

@andriiryzhkov : Ok I've tried this new model and still not impressed :) We can call me a standard user on this as I have never used IA for this in any software. My expectations was maybe a bit high... But seeing some demo on the SAM model I was expecting quite a better object segmentation. Here are some examples, each time I did a single click on something on the image:

unexpected mask on squirrel, head of penguin

impossible to select the color palette

shirt not fully selected, some part of water and something on the object in the hand

bottle on the left partly selected, unexpected parts on the bottle sticker selected

unexpected part under the cup selected

napkin only partly selected

I can continue... In fact I haven't seen a case where it was good. This new model is a bit better than the light version but far from the quality expected for integration.

Let's me ask a question. Do you have some test cases where it works perfect?

wpferguson · 2026-02-12T16:45:27Z

The model registry

Is this provided by darktable or is this an external entity.?

The reason I ask is, taking my example above, I merge the ext_editors script and then provide a drop down listing all the Adobe products that it will run. I can't provide that list otherwise darktable is "endorsing" Adobe's software.

Another reason is darktable assumes liability for recommending the model. Better to provide the user a list of places that have model collections and let them make the choice.

TurboGit · 2026-02-12T16:52:58Z

ref: #20322 (comment)

I don't want to be made fun of, so here is the full sentence of what I said to paperdigits on matrix:

Well the messages on the PR are definitely voiced strongly against the PR. I encourage you to voice your point here or anywhere else (I'm also voicing against AI with my friends) but I try to keep open minded on work done for a whole community. That's my point. We can't agree with everything integrated on Darktable, that's my case too, but I'm no one to trash something because I don't like it.
Said another way, I'm not Darktable, you are not Darktable... The projet is bigger than us.

Or maybe I'm a fool...

wpferguson · 2026-02-12T18:06:02Z

Or maybe I'm a fool...

AI is a POLARIZING subject right now. Some people love it and some people hate or fear it, and there are STRONG opinions on both sides.

My thoughts

If this gets merged I will probably not use it because it's too slow for my workflow.
My original thought about how to get AI into darktable was darktable providing "hooks" to get data in and out and using Lua to interface to external modules. That seems to have been overtaken by events 😄
The ONNX runtime is fine. I don't think anyone has a problem with that except for windows integration. If we can't bundle it I have some ideas.
The models seem to be the controversial part of the issue, so I think we offload that onto the user and let the user choose the models they are comfortable with.
We could ask hanatos his views on AI, since he's the "father" of darktable.

@TurboGit you are doing fine. You make the best choice you can based on the information available to you. If you don't have enough information then open an RFC issue, though with maybe some guidelines like:

you can make 1 comment which is your opinion (so we don't end up in a flame war)
you can't reference another comment
you can upvote or downvote other comments.,

TurboGit · 2026-02-12T18:27:07Z

If this gets merged I will probably not use it because it's too slow for my workflow.

Yes, the latest model is a bit slow the light version was almost instantaneous on my side. But my main concern now is the quality of the segmentation. At this stage it is not helping users at all, maybe the training needs to be tweaked... I don't know and I know next to nothing about IA so I let the experts discuss this point.

My original thought about how to get AI into darktable was darktable providing "hooks" to get data in and out and using Lua to interface to external modules.

That would work good for AI denoise but for masking we need fast UI interaction to display the mask add or remove to it. Would that work with Lua?

The models seem to be the controversial part of the issue, so I think we offload that onto the user and let the user choose the models they are comfortable with.

I can understand that, that's why the models are not and will never be distributed with Darktable. Also the AI feature is not activated by default.

@TurboGit you are doing fine.

Thanks, I've been maintainer for 7 years now, maybe that's too much for a single man :)

I fear that the RFC or poll will be a place of fight :) On such hot topic I think we should discuss with the core developers and find a way forward (or not).

wpferguson · 2026-02-12T19:32:27Z

Would that work with Lua?

Only if the AI script would support it and could create the display and interaction.

I fear that the RFC or poll will be a place of fight :)

That was my thought too, which was why I added all the conditions. I could definitely see that not ending well.

Thanks, I've been maintainer for 7 years now, maybe that's too much for a single man :)

I once interviewed for a job and they asked me what I wanted to do. My answer was "Let me tell you what I don't want to do. I don't want to be the boss, I don't want to be in charge. I just want to work". I feel your pain.

I think you've done an incredible job of "herding cats". Dealing with lots of personalities, language barriers, users, developers, issues, and still keeping everything on track is quite an accomplishment. It's also a LOT of work for one person. Should we look at some way to share the load or delegate some tasks?

elstoc · 2026-02-12T21:12:54Z

This will be virtually impossible to document (and I'm not sure I'm willing to even merge such documentation) without mentioning or seeming to recommend specific models.

As I said in the IRC chat, if we could provide something extremely generic to allow certain operations to be handed off to another "external program" (like lua but probably needs to be more integrated into the pixelpipe) that'd be fine with me (i.e. not explicit AI integration).

If we wanted to source our own open dataset and volunteers to train a model of our own that would also be fine with me (though I'm still slightly uncomfortable about the environmental impacts of AI, at least the licensing and "data sweatshops" concerns would be alleviated).

But it's really really hard to source good reliable and verifiable information about how most of these models have been trained (both from a data and a human point-of-view) and AI is such a divisive issue there's a good chance of a proper split in the community here, and difficult decisions being made by package maintainers.

I for one will have to decide whether I'm comfortable enough with this to continue contributing to the project.

andriiryzhkov and others added 23 commits February 9, 2026 22:49

AI Denoise (#2)

66be64c

AI Mask (#3)

642e2a8

Fix stack overflow risk

f358e03

Fix missing dt_control_busy_leave() on error

2e37e4a

Fix potential OOB write in _preprocess_image

ac6ebf5

Fix UI freeze on cleanup during encoding

d169276

Fix race in download progress callback

197ed21

Fix #ifdef inside enum

2899a58

Fix BUILD_AI for AI mask

a6cbabe

Fix registry->repository accessed without lock

873cfff

Fix no overflow check on tensor element_count

3f7a01c

Fix model_id path traversal

8539cd4

Fix formnb global to be thread-safe

9f37719

Low severity fixes

e1391a0

Update model registry

179403c

Fetch model checksums from GitHub Releases API

a7a3e7c

Fix release tag matching and add download logging

43b1f23

Guards for DEVELOP_MASKS_NB_SHAPES

e5630d0

Remove HAVE_AI

dd2260e

Fix mask mouse events blocked by busy state during AI encoding

3d914c2

Run AI mask encoding in background thread

eae9a0d

Fix code style

df7aeb3

Guard AI object mask when model is not available

bc3a2f7

andriiryzhkov mentioned this pull request Feb 11, 2026

AI Masks #12295

Open

Add CI dep libarchive on Linux and Windows

5dd6d84

Add CI dep libarchive on mscOS

0ddca1c

Support system-installed ONNX Runtime on Linux

3b2bb7c

andriiryzhkov added 3 commits February 12, 2026 10:17

Fix Linux CI: add libonnx-dev for broken Ubuntu libonnxruntime linkage

afab2fb

Remove system libonnxruntime from CI to fix Ubuntu 26.04 broken linkage

7aa44ec

Make AI features optional: gracefully disable when dependencies are m…

3ad1e7d

…issing

andriiryzhkov added 2 commits February 12, 2026 11:25

Fix FindONNXRuntime double-invocation causing missing library install

26b8009

Auto-discover user-installed AI models from models directory

f941d95

Improve AI preferences and make mask model configurable

b809bef

Support multiple SAM model variants with dynamic shapes and CPU fallback

39986e2

[AI Subsystem] AI object mask and AI denoising #20322

Are you sure you want to change the base?

[AI Subsystem] AI object mask and AI denoising #20322

Uh oh!

Conversation

andriiryzhkov commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI subsystem design

Uh oh!

TurboGit commented Feb 11, 2026

Uh oh!

andriiryzhkov commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

victoryforce commented Feb 11, 2026

Uh oh!

andriiryzhkov commented Feb 11, 2026

Uh oh!

andriiryzhkov commented Feb 11, 2026

Uh oh!

paperdigits commented Feb 11, 2026

Uh oh!

andriiryzhkov commented Feb 11, 2026

Uh oh!

paperdigits commented Feb 11, 2026

Uh oh!

TurboGit commented Feb 11, 2026

Uh oh!

andriiryzhkov commented Feb 11, 2026

Uh oh!

paperdigits commented Feb 11, 2026

Uh oh!

TurboGit commented Feb 11, 2026

Uh oh!

jenshannoschwalm commented Feb 11, 2026

Uh oh!

wpferguson commented Feb 12, 2026

Uh oh!

andriiryzhkov commented Feb 12, 2026

Uh oh!

andriiryzhkov commented Feb 12, 2026

Uh oh!

MikoMikarro commented Feb 12, 2026

Uh oh!

andriiryzhkov commented Feb 12, 2026

Uh oh!

andriiryzhkov commented Feb 12, 2026

Uh oh!

TurboGit commented Feb 12, 2026

Uh oh!

wpferguson commented Feb 12, 2026

Uh oh!

TurboGit commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

wpferguson commented Feb 12, 2026

Uh oh!

TurboGit commented Feb 12, 2026

Uh oh!

wpferguson commented Feb 12, 2026

Uh oh!

elstoc commented Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

andriiryzhkov commented Feb 11, 2026 •

edited

Loading

andriiryzhkov commented Feb 11, 2026 •

edited

Loading

TurboGit commented Feb 12, 2026 •

edited

Loading

elstoc commented Feb 12, 2026 •

edited

Loading