Information content values not being calculated correctly for conflated cliques #368

@gaurav

Description

Looking at the source code for normalization, it looks to me like the info_content value is only being loaded for the conflation clique leader, not for the individual identifiers within a clique:

info_contents = await get_info_content(app, canonical_nonan)

We currently set things up so that conflated IDs with smaller IC values are sorted to the front. The only way that you could currently get a lower IC value later in the sequence is if there's a CURIE that has an IC and whose prefix comes later in the prefix order than the other values. At the moment, though, CHEBI identifiers for chemicals and NCBIGene identifiers for genes are the only conflated prefixes we have with ICs, so I will need to come up with a test case to test this. I started investigating this in PR #366.
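To make the difference concrete, here is a minimal sketch comparing a leader-only lookup with a per-member lookup. Everything in it is an assumption for illustration: the function names, the in-memory `ic_lookup` dict (standing in for whatever `get_info_content` actually queries), the placeholder CURIEs and IC numbers, and the choice of "take the lowest IC in the clique" as the intended behaviour.

```python
from typing import Dict, List, Optional


def leader_only_ic(clique: List[str], ic_lookup: Dict[str, float]) -> Optional[float]:
    """IC of the first (leader) identifier only; None if it has no IC."""
    if not clique:
        return None
    return ic_lookup.get(clique[0])


def min_member_ic(clique: List[str], ic_lookup: Dict[str, float]) -> Optional[float]:
    """Lowest IC among all identifiers in the clique; None if none have an IC.

    Assumption: the lowest IC across members is the value we want to report.
    """
    values = [ic_lookup[curie] for curie in clique if curie in ic_lookup]
    return min(values) if values else None


# Hypothetical data: the leader has a higher IC than a later member,
# which is the situation described above.
ic_lookup = {"EXAMPLE:A": 87.5, "EXAMPLE:B": 62.0}
clique = ["EXAMPLE:A", "EXAMPLE:B", "EXAMPLE:C"]

leader_value = leader_only_ic(clique, ic_lookup)
member_value = min_member_ic(clique, ic_lookup)
print(leader_value, member_value)
```

A test case along these lines would need a real conflated clique in which a non-leader identifier has a lower IC than the leader; the two functions only disagree in that situation.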

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
