Mishandling of missing buscar shuffled scores


<img width="888" height="435" alt="Image" src="https://github.com/user-attachments/assets/57cda78c-3464-4fae-8353-77408e2666fb" />

We found a bug here: https://github.com/WayScience/buscar/pull/64#discussion_r2771546820

The issue is that the shuffled scores can sometimes be None when the process fails, specifically when no clusters or on/off signatures can be identified. This behavior follows directly from how the shuffling is implemented.

The shuffling occurs before the Buscar scores are calculated per treatment, so the first point of failure is the construction of the on/off signatures. In many cases, all features end up in the “off-morphological” signature, which makes sense after shuffling.

Occasionally, by chance, one or two features may appear significant, producing a valid score, but these are effectively random and result in unstable or misleading values.

When examining the scores, we also noticed that the shuffled scores are systematically much lower. Since the process runs for 10 iterations, there is a high chance that 0.0 values (which appear to stand in for the failed iterations) outnumber the shuffled scores that were actually computed. The 0.0s then dominate and drive the average downward, making the mean score much lower than the true underlying value.
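To illustrate the averaging effect described above, here is a minimal sketch. It does not use the buscar API: the function names and the list of scores are hypothetical, and it assumes failed iterations come back as `None`. It compares averaging with failures counted as 0.0 against averaging only the iterations that produced a score.

```python
from statistics import fmean
from typing import Optional, Sequence


def mean_zero_filled(scores: Sequence[Optional[float]]) -> float:
    """Average after replacing failed iterations (None) with 0.0."""
    return fmean([0.0 if s is None else s for s in scores])


def mean_valid_only(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average only the iterations that produced a score; None if all failed."""
    valid = [s for s in scores if s is not None]
    return fmean(valid) if valid else None


# Hypothetical output of 10 shuffle iterations: 7 failures, 3 chance scores.
shuffled = [None, None, 0.4, None, None, 0.5, None, None, None, 0.6]

zero_filled = mean_zero_filled(shuffled)    # failures pull the mean toward 0
valid_only = mean_valid_only(shuffled)      # mean of the 3 computed scores
n_failed = sum(s is None for s in shuffled)  # worth reporting next to the mean

print(f"zero-filled mean: {zero_filled:.2f}")
print(f"valid-only mean:  {valid_only:.2f} ({n_failed}/{len(shuffled)} failed)")
```

With these made-up numbers the zero-filled mean is far below the mean of the computed scores. Excluding failures is only one possible handling; because the few valid shuffled scores are themselves chance results, reporting the failure count alongside the mean may be just as important.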


Mishandling of missing buscar shuffled scores #69
