Skip to content

[IR Container] Phase 2.3 Basic shared ptr#5960

Open
mdavis36 wants to merge 1 commit intomd/fusion-stmt-regfrom
md/phase2-shared-ptr
Open

[IR Container] Phase 2.3 Basic shared ptr#5960
mdavis36 wants to merge 1 commit intomd/fusion-stmt-regfrom
md/phase2-shared-ptr

Conversation

@mdavis36
Copy link
Collaborator

@mdavis36 mdavis36 commented Feb 12, 2026

Summary

Replace unique_ptr<IrContainer> with shared_ptr<IrContainer> in Fusion and add the container-side registration API that tracks which Fusions share a given container.

This is the foundational change of Phase 2. The pointer type change alone has no behavioral impact — single-Fusion containers behave identically under shared_ptr. The tracking infrastructure (sharing_fusions_, addFusion/removeFusion) lays the groundwork for all subsequent tasks.

Parallel compilation is disabled in-code via kPhase2DisableParallelCompile = true as a precaution during the transition. This ensures CI runs serial compilation for later PRs without requiring environment variables. Parallel compilation is restored in #5971 .

Relationship to Phase 2

This is the core type change that enables the shared container model:

Phase 1 (unique_ptr — exclusive ownership):
  Fusion A ──→ unique_ptr<IrContainer_A> ──→ {vals_A, exprs_A}
  Fusion B ──→ unique_ptr<IrContainer_B> ──→ {vals_B, exprs_B}

Phase 2 (shared_ptr — shared storage):
  Fusion A ─┐
             ├──→ shared_ptr<IrContainer> ──→ {vals_A, vals_B, exprs_A, exprs_B}
  Fusion B ─┘

  IrContainer tracks: sharing_fusions_ = {A, B}

Without this change, no subsequent Phase 2 work (per-Fusion tracking, shared-container copy/move, thread safety) is possible. The tracking API is the mechanism that makes container sharing safe — it allows the container to know its consumers, enabling correct cleanup when Fusions are destroyed.

CI Risk

Low. For single-Fusion containers (all existing usage), shared_ptr is behaviorally identical to unique_ptr. No accessor paths change. Parallel compilation is serialized by the in-code constant.

@mdavis36
Copy link
Collaborator Author

!test

@github-actions
Copy link

github-actions bot commented Feb 12, 2026

Review updated until commit f8ff364

Description

  • Transition Fusion::ir_container_ from unique_ptr to shared_ptr for container sharing

  • Add Fusion tracking API to IrContainer (addFusion/removeFusion/transferFusion/sharingCount)

  • Remove IrContainer::parent_ since 1:1 relationship no longer holds

  • Disable parallel compilation during shared_ptr transition with kPhase2DisableParallelCompile flag

Changes walkthrough

Relevant files
Enhancement
fusion.cpp
Transition Fusion to shared_ptr IrContainer                           

csrc/fusion.cpp

  • Change ir_container_ from unique_ptr to shared_ptr
  • Remove parent_ pointer updates in swap function
  • Update constructor to use addFusion(this) and make_shared
  • Update destructor to call removeFusion(this)
  • Update IrCloner copy call to pass dest_fusion parameter
  • Update inContainer check to use sharing_fusions_ instead of parent_
  • +8/-10   
    container.cpp
    Add Fusion tracking infrastructure to IrContainer               

    csrc/ir/container.cpp

  • Remove parent_ member variable and swap updates
  • Update copy function to accept dest_fusion parameter
  • Add new Fusion tracking API methods
    (addFusion/removeFusion/transferFusion/sharingCount/hasMultipleFusions/sharingFusions)
  • Update inContainer to check sharing_fusions_ instead of parent_
  • +31/-5   
    fusion.h
    Update Fusion header for shared_ptr transition                     

    csrc/fusion.h

  • Change ir_container_ member from unique_ptr to shared_ptr
  • Add ir_container_ptr() method to access shared_ptr
  • +5/-2     
    container.h
    Update IrContainer header for shared_ptr and tracking       

    csrc/ir/container.h

  • Remove parent_ member variable
  • Update copy function signature to accept dest_fusion parameter
  • Add Fusion tracking API declarations
  • +11/-9   
    Configuration changes
    fusion_kernel_runtime.cpp
    Disable parallel compilation during shared_ptr transition

    csrc/runtime/fusion_kernel_runtime.cpp

  • Add kPhase2DisableParallelCompile constant set to true
  • Use constant to disable parallel compilation in compileFusionParallel
  • Use constant to disable parallel compilation in error handling
  • +7/-2     

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Memory Management Safety

    The transition from unique_ptr to shared_ptr requires careful validation of ownership semantics. The removeFusion call in the destructor needs to be verified to prevent dangling references and ensure proper cleanup when multiple fusions share the same container.

    Fusion::~Fusion() {
      clear();
      if (ir_container_) {
        ir_container_->removeFusion(this);
      }
    }
    Container Sharing Logic

    The new sharing_fusions_ tracking mechanism needs validation to ensure it correctly handles fusion addition, removal, and transfer operations. The hasMultipleFusions() and sharingCount() functions should be tested with various fusion sharing scenarios.

    void IrContainer::addFusion(Fusion* fusion) {
      sharing_fusions_.insert(fusion);
    }
    
    void IrContainer::removeFusion(Fusion* fusion) {
      sharing_fusions_.erase(fusion);
    }
    
    void IrContainer::transferFusion(Fusion* from, Fusion* to) {
      sharing_fusions_.erase(from);
      sharing_fusions_.insert(to);
    }
    
    size_t IrContainer::sharingCount() const {
      return sharing_fusions_.size();
    }
    
    bool IrContainer::hasMultipleFusions() const {
      return sharing_fusions_.size() > 1;
    }
    
    const std::unordered_set<Fusion*>& IrContainer::sharingFusions() const {
      return sharing_fusions_;
    }
    Performance Impact

    The kPhase2DisableParallelCompile flag disables parallel compilation which may significantly impact performance. This should be temporary, but the PR should include performance benchmarks or at least acknowledge the performance regression during this transition phase.

    // TODO: Remove when std::shared_mutex is added to IrContainer.
    constexpr bool kPhase2DisableParallelCompile = true;

    Test failures

    • (Low, 1) Minor numerical mismatch in Thunder vs Torch instance_norm nvFuser CUDA test

      Test Name A100 Source
      thunder.tests.test_ops.test_core_vs_torch_consistency_instance_norm_nvfuser_cuda_thunder.dtypes.float32

    @mdavis36 mdavis36 force-pushed the md/phase2-shared-ptr branch from 3a199c8 to 53e5045 Compare February 12, 2026 22:09
    @mdavis36 mdavis36 changed the title [WIP] phase2 basic shared ptr [IR Container] Phase2 Basic shared ptr Feb 12, 2026
    @mdavis36 mdavis36 changed the title [IR Container] Phase2 Basic shared ptr [IR Container] Phase 2.3 Basic shared ptr Feb 18, 2026
    Change Fusion::ir_container_ from unique_ptr to shared_ptr to enable
    future container sharing between Fusions. Add Fusion tracking API to
    IrContainer (addFusion/removeFusion/transferFusion/sharingCount).
    Remove IrContainer::parent_ since the 1:1 relationship no longer holds.
    Disable parallel compilation during the shared_ptr transition.
    @mdavis36
    Copy link
    Collaborator Author

    !test

    @mdavis36 mdavis36 marked this pull request as ready for review February 18, 2026 06:37
    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Feb 18, 2026

    Greptile Summary

    This PR replaces std::unique_ptr<IrContainer> with std::shared_ptr<IrContainer> in Fusion and introduces container-side tracking infrastructure (sharing_fusions_ set with addFusion/removeFusion/transferFusion) to support the upcoming Phase 2 shared container model. Parallel compilation is temporarily disabled via kPhase2DisableParallelCompile.

    • Ownership transition: Fusion::ir_container_ changes from unique_ptr to shared_ptr. For the current single-Fusion-per-container usage, behavior is identical.
    • Tracking infrastructure: IrContainer now maintains std::unordered_set<Fusion*> sharing_fusions_ instead of a single Fusion* parent_ backpointer. The inContainer validation now checks membership in the set rather than equality with a single parent.
    • Lifecycle correctness: Constructor registers via addFusion, destructor unregisters via removeFusion. Copy/move constructors delegate to the default constructor (which registers) before swap/copy operations.
    • IrContainer::swap correctly omits sharing_fusions_ since it swaps container contents (vals/exprs), not ownership — the Fusions still point to the same shared_ptr<IrContainer> objects after swap.
    • Several new API methods (transferFusion, sharingCount, hasMultipleFusions, sharingFusions, ir_container_ptr) are added as infrastructure for future PRs but are currently unused.

    Confidence Score: 4/5

    • Low risk — for single-Fusion containers (all existing usage), shared_ptr is behaviorally identical to unique_ptr, and the tracking infrastructure is additive.
    • The core type change from unique_ptr to shared_ptr has no behavioral impact for single-Fusion containers. Lifecycle tracking (addFusion/removeFusion) is correctly wired into constructors and destructors. The only concern is a stale comment. Deducted one point because several new API methods are untested dead code.
    • csrc/fusion.cpp has a stale comment referencing the removed parent_ backpointer model.

    Important Files Changed

    Filename Overview
    csrc/fusion.cpp Transitions unique_ptr to shared_ptr, adds addFusion/removeFusion lifecycle calls. Constructor, destructor, copy, move, and swap are all correctly updated. One stale comment references the removed parent_ model.
    csrc/fusion.h Changes ir_container_ from unique_ptr to shared_ptr, adds protected ir_container_ptr() accessor. Clean header-level changes with no issues.
    csrc/ir/container.h Replaces parent_ pointer with sharing_fusions_ set and adds multi-Fusion tracking API. Several methods (transferFusion, sharingCount, hasMultipleFusions, sharingFusions, ir_container_ptr) are added as infrastructure for future use but currently unused.
    csrc/ir/container.cpp Implements the Fusion tracking API and updates inContainer validation and copy to use the new model. swap correctly does not swap sharing_fusions_ since it swaps content, not ownership.
    csrc/runtime/fusion_kernel_runtime.cpp Adds kPhase2DisableParallelCompile constant to serialize compilation during the shared_ptr transition. Simple, low-risk change with clear TODO for removal.

    Class Diagram

    %%{init: {'theme': 'neutral'}}%%
    classDiagram
        class IrContainer {
            -unordered_set~Fusion*~ sharing_fusions_
            -deque~unique_ptr~Val~~ vals_up_
            -deque~unique_ptr~Expr~~ exprs_up_
            +addFusion(Fusion*)
            +removeFusion(Fusion*)
            +transferFusion(Fusion*, Fusion*)
            +sharingCount() size_t
            +hasMultipleFusions() bool
            +sharingFusions() unordered_set~Fusion*~
            #copy(IrContainer*, IrContainer*, Fusion*) IrCloner
            #swap(IrContainer, IrContainer)
        }
    
        class Fusion {
            -shared_ptr~IrContainer~ ir_container_
            #ir_container() IrContainer*
            #ir_container_ptr() shared_ptr~IrContainer~
            +Fusion()
            +~Fusion()
            +swap(Fusion, Fusion)
            +copy(Fusion*, Fusion*) IrCloner
        }
    
        class Statement {
            -Fusion* ir_container_
            +container() Fusion*
            +fusion() Fusion*
        }
    
        Fusion "1..*" --> "1" IrContainer : shared_ptr
        Statement "*" --> "1" Fusion : raw pointer
        IrContainer "1" --> "1..*" Fusion : sharing_fusions_
    
    Loading

    Last reviewed commit: f8ff364

    Copy link
    Contributor

    @greptile-apps greptile-apps bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    5 files reviewed, 1 comment

    Edit Code Review Agent Settings | Greptile

    @greptile-apps
    Copy link
    Contributor

    greptile-apps bot commented Feb 18, 2026

    Additional Comments (1)

    csrc/fusion.cpp
    Stale comment references removed code

    These comments reference "parent backpointers" and say "each Fusion owns a different IrContainer", both of which are artifacts of the old parent_ pointer model. Since parent_ was removed and the comment no longer accurately describes the code, it should be updated. The actual purpose of the loop below is to update Statement::ir_container_ pointers so that statements point to the correct owning Fusion after the content swap.

      // After swapping container contents, update Statement::ir_container_
      // pointers so each Statement points to the Fusion whose container now
      // holds it.
    

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    None yet

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    1 participant

    Comments