
[DO NOT MERGRE] Re-use p2p-client#3065

Draft
alpe wants to merge 4 commits into main from alex/p2p_exec_ai1

Conversation


@alpe alpe commented Feb 12, 2026

Overview


github-actions bot commented Feb 12, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build: ✅ passed · Format: ⏩ skipped · Lint: ✅ passed · Breaking: ✅ passed · Updated (UTC): Feb 13, 2026, 12:58 PM


claude bot commented Feb 12, 2026

Claude finished @alpe's task in 2m 17s.


Comprehensive Code Review

I've completed a thorough review of this P2P client refactoring PR. Below are my findings organized by category.


🏗️ Architecture & Design

Strengths

  1. Centralized P2P Client Lifecycle - The refactoring successfully centralizes P2P client management at the FullNode/LightNode level, eliminating redundant initialization across mode switches. This is a solid architectural improvement.

  2. Interface Abstraction (P2PClient interface in pkg/sync/sync_service.go:40-47) - Good decoupling that makes sync services testable and reduces direct dependencies on concrete P2P implementation.

  3. Concurrent Sync Service Startup (node/failover.go:192-208) - Using errgroup.WithContext to start header and data sync services concurrently is an effective optimization to reduce cumulative startup delays.

⚠️ Issues & Concerns

  1. Missing Reconfigure() Implementation - The PR description and Gemini review mention a Reconfigure() method on p2p.Client to update configuration without restarting. However, I cannot find this method in pkg/p2p/client.go. This is a critical gap:

    • The current code has no mechanism to reconfigure the P2P client during mode switches
    • There's no Reconfigure(conf config.P2PConfig) method as claimed
  2. Misleading Comment in node/failover.go - Gemini's review correctly identified an inaccurate comment. There's no call to Reconfigure() in setupFailoverState. Comments should accurately reflect the code.

  3. P2P Client Start/Stop in LightNode (node/light.go:123-125, 163-165) - The LightNode still calls P2P.Start() and P2P.Close() within its Run() method, unlike FullNode which manages this at a higher level. This inconsistency creates an asymmetry:

    • FullNode: P2P client lifecycle managed in FullNode.Run (lines 285-292 in node/full.go)
    • LightNode: P2P client lifecycle managed in LightNode.Run (lines 123-125, 163-165 in node/light.go)
    • This inconsistency could lead to confusion and makes the refactoring incomplete.
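If the missing Reconfigure() were added, one plausible shape is an in-place config swap under a lock, leaving the libp2p host, PubSub, and DHT running. This is purely a hypothetical sketch of what the reviewers are asking for — the type, field, and method names are assumptions, not code from pkg/p2p/client.go:

```go
package main

import (
	"fmt"
	"sync"
)

// P2PConfig is a minimal stand-in for config.P2PConfig; the field is illustrative.
type P2PConfig struct {
	BootstrapPeers []string
}

// Client sketches the Reconfigure method the review says is missing: the
// configuration is swapped under a mutex without tearing down the host.
type Client struct {
	mu   sync.Mutex
	conf P2PConfig
}

// Reconfigure updates the client's config in place. In a real client this is
// where the DHT would be re-bootstrapped against the new peer list while the
// host keeps its existing connections and subscriptions.
func (c *Client) Reconfigure(conf P2PConfig) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.conf = conf
	return nil
}

func main() {
	c := &Client{}
	if err := c.Reconfigure(P2PConfig{BootstrapPeers: []string{"/dns4/peer-a/tcp/26656"}}); err != nil {
		panic(err)
	}
	fmt.Println("bootstrap peers:", c.conf.BootstrapPeers)
}
```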

🔒 Concurrency & Thread Safety

Strengths

  1. Proper Error Aggregation (pkg/p2p/client.go:177-187) - Using errors.Join in Close() ensures all cleanup errors are captured and reported.

  2. Concurrent Sync Startup with Proper Error Handling (node/failover.go:192-211) - The errgroup pattern correctly propagates errors from either sync service startup failure and provides proper cleanup via defer statements.

⚠️ Issues & Concerns

  1. Race Condition in p2p.Client.Start() (pkg/p2p/client.go:124-139):

    func (c *Client) Start(ctx context.Context) error {
        if c.started {  // Line 125 - READ without lock
            return nil
        }
        // ...
        c.started = true  // Line 172 - WRITE without lock
        return nil
    }
    • The started field is accessed without synchronization
    • Concurrent calls to Start() could lead to race conditions
    • This is a data race that could cause undefined behavior
    • Should be protected with atomic.Bool (from sync/atomic) or a mutex
  2. Unsafe rawHost Field Access (pkg/p2p/client.go) - The new rawHost field (line 58 in the struct) is set in Start() but not accessed anywhere in the code. If it's intended for future use, concurrent access patterns need to be considered.
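A minimal sketch of the atomic.Bool fix for issue 1, using CompareAndSwap so exactly one concurrent caller performs initialization. The Client here is a stand-in with only the relevant field, not the real p2p.Client:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Client is a minimal stand-in for p2p.Client; only the started flag and
// Start are sketched.
type Client struct {
	started atomic.Bool
}

// Start uses CompareAndSwap so the race on a plain bool field disappears:
// the first caller flips false→true and runs setup; every other concurrent
// or later caller sees true and returns immediately.
func (c *Client) Start() error {
	if !c.started.CompareAndSwap(false, true) {
		return nil // already started (or another goroutine is starting)
	}
	// ... expensive libp2p host / DHT / PubSub setup would go here ...
	return nil
}

func main() {
	var c Client
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); _ = c.Start() }()
	}
	wg.Wait()
	fmt.Println("started:", c.started.Load())
}
```

One caveat worth noting: if setup can fail, the flag must be reset on the error path (or a mutex/sync.Once used instead), otherwise a failed Start permanently blocks retries.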


🛡️ Error Handling & Resilience

Strengths

  1. Robust Pending Block Recovery (block/internal/executing/pending.go) - The new file provides crash recovery for blocks that were partially saved. This prevents block production from getting stuck after a crash.

  2. Proper Context Propagation - Throughout the codebase, context.Context is consistently threaded through for cancellation and timeout management.

⚠️ Issues & Concerns

  1. Silent Error in Pending Block Deletion (block/internal/executing/executor.go:529-531):
if err := e.deletePendingBlock(e.ctx, batch); err != nil {
    e.logger.Warn().Err(err).Uint64("height", newHeight).Msg("failed to delete pending block metadata")
}
  • Failure to delete pending metadata is logged but not returned
  • This could lead to accumulation of stale pending blocks
  • Consider making this a hard error or implementing a cleanup mechanism
  2. No Validation of Pending Block Height (block/internal/executing/executor.go:432-436):
pendingHeader, pendingData, err := e.getPendingBlock(ctx)
if err == nil && pendingHeader != nil && pendingHeader.Height() == newHeight {
    // use pending block
}
  • If pending block height doesn't match newHeight, it's silently ignored
  • No warning or cleanup for mismatched pending blocks
  • Could indicate a serious state inconsistency that should be investigated
  3. Missing Height Guard Implementation - While the PR description mentions "height guards in HeaderStoreGetter and DataStoreGetter", I don't see the actual implementation in pkg/store/store_adapter.go:802-805 and 872-875. The HasAt() methods directly call GetHeader() and GetBlockData() without checking against committed height. This could expose uncommitted data via P2P.
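The missing guard amounts to a single height comparison in front of the lookup. This sketch uses an invented store type to show the check in isolation; the field and method names are illustrative, not those of HeaderStoreGetter/DataStoreGetter:

```go
package main

import "fmt"

// store is a stand-in for the committed block store; committedHeight is the
// height up to which blocks are fully signed and committed.
type store struct {
	committedHeight uint64
	blocks          map[uint64]string
}

// hasAt only admits heights at or below the committed height, so the P2P
// layer never advertises an executor's early (uncommitted) save even though
// that data is already present on disk.
func (s *store) hasAt(height uint64) bool {
	if height > s.committedHeight {
		return false // guard: never expose uncommitted/future data
	}
	_, ok := s.blocks[height]
	return ok
}

func main() {
	s := &store{
		committedHeight: 10,
		blocks:          map[uint64]string{10: "committed", 11: "early save"},
	}
	fmt.Println(s.hasAt(10)) // committed: visible
	fmt.Println(s.hasAt(11)) // on disk but uncommitted: hidden
}
```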

🧪 Testing & Coverage

⚠️ Issues & Concerns

  1. 46.23% Patch Coverage - Codecov reports 50 lines missing coverage in changes:
  • block/internal/executing/pending.go: 14 missing, 11 partials (39% coverage)
  • node/failover.go: 3 missing, 3 partials (60% coverage)
  • block/internal/executing/executor.go: 1 missing, 3 partials (33% coverage)
  • pkg/cmd/run_node.go: 4 missing (0% coverage)

Critical gaps:

  • Pending block recovery logic is undertested - only 39% coverage for the new crash recovery mechanism
  • No tests for concurrent sync service startup in failover.go
  • P2P client lifecycle changes lack integration tests
  2. Missing Test for Height Guard - The PR adds TestHeaderStoreGetter_HeightGuard but I don't see this test in the file content I read. Need to verify it's properly testing the committed height check.

  3. No Tests for P2P Client Reuse Across Mode Switches - The core benefit of this PR (P2P client persistence) doesn't appear to have dedicated tests verifying:

  • P2P client survives aggregator→sync mode switch
  • P2P client survives sync→aggregator mode switch
  • DHT and PubSub state is preserved

🔐 Security

⚠️ Issues & Concerns

  1. Potential Data Leakage via P2P - The height guard implementation is critical for security. Without proper guards:
  • Unsigned blocks could be served during executor's early save (before commit)
  • Uncommitted state could leak to P2P peers
  • This violates the consistency model where only committed blocks should be visible
  • The current HasAt() implementations in HeaderStoreGetter and DataStoreGetter don't appear to implement these guards correctly
  2. No Validation of P2P Client Pointer - In multiple places (node/failover.go:48, 82), the p2pClient *p2p.Client is passed but never checked for nil before use. While the caller should ensure it's non-nil, defensive checks would prevent potential nil pointer dereferences.

📊 Performance

Strengths

  1. Reduced Startup Overhead - Concurrent sync service startup (lines 192-208 in node/failover.go) is a measurable improvement that reduces blocking time during initialization.

  2. Eliminated P2P Stack Re-initialization - By reusing the P2P client across mode switches, the expensive libp2p host, DHT, and PubSub setup is avoided.

⚠️ Issues & Concerns

  1. No Cleanup of Old P2P Connections - When switching modes (e.g., from sync to aggregator), there's no mechanism to clean up P2P connections that may no longer be relevant. The P2P client persists with all its existing connections and subscriptions.

📝 Code Quality & Maintainability

⚠️ Issues & Concerns

  1. Inconsistent Error Handling - Multiple patterns:
  • errors.Join in some places (pkg/p2p/client.go:180)
  • fmt.Errorf wrapping in others
  • Silent logging in places (executor.go:530)
  • Consider standardizing on errors.Join for multi-error scenarios
  2. Magic Constants - block/internal/executing/pending.go:14-16 uses string constants "pending_header" and "pending_data" without explaining their purpose or scope. Consider adding documentation about their lifecycle.

  3. Unused rawHost Field - Added to Client struct but never used. Either use it or remove it.
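For the magic-constants point, one possible documented form is shown below. The key names are quoted from the review; the lifecycle comments are a suggested shape, not text from the repository:

```go
package main

import "fmt"

// Keys under which a partially produced block is persisted so block
// production can recover after a crash (block/internal/executing/pending.go).
const (
	// pendingHeaderKey stores the header of a block that was produced but
	// whose commit had not completed when the process stopped. It is deleted
	// once the block at that height is fully committed.
	pendingHeaderKey = "pending_header"
	// pendingDataKey stores the matching block data; same lifecycle as the
	// header key, and both are written and deleted in the same batch.
	pendingDataKey = "pending_data"
)

func main() {
	fmt.Println(pendingHeaderKey, pendingDataKey)
}
```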


🎯 Critical Issues Summary

Must Fix Before Merge

  1. Missing Reconfigure() method - Core feature mentioned in PR description is not implemented
  2. Race condition in Client.Start() - Data race on started field
  3. Height guards not properly implemented - Security risk of serving uncommitted data
  4. FullNode/LightNode P2P lifecycle inconsistency - Incomplete refactoring

Should Fix Before Merge

  1. Pending block recovery is undertested (39% coverage)
  2. No tests for P2P client reuse across mode switches
  3. Mismatched pending block heights not handled with warnings/cleanup
  4. Silent pending block deletion failures could accumulate stale data

Positive Takeaways

  • The architectural direction is sound - centralizing P2P client management is the right approach
  • Error aggregation with errors.Join is properly applied
  • Concurrent sync service startup is a good optimization
  • Pending block recovery mechanism addresses a real crash recovery scenario

📌 Recommendations

  1. Implement the missing Reconfigure() method or update PR description to reflect actual implementation
  2. Fix the race condition in Client.Start() using atomic.Bool
  3. Complete the height guard implementation to prevent serving uncommitted blocks
  4. Add comprehensive integration tests for P2P client lifecycle across mode switches
  5. Unify P2P client lifecycle management between FullNode and LightNode
  6. Increase test coverage for pending block recovery to at least 70%
  7. Document the PR as WIP (already marked with "[DO NOT MERGRE]") until critical issues are resolved

@gemini-code-assist

Summary of Changes

Hello @alpe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the P2P client's lifecycle management within the node architecture. By centralizing the P2P client's creation and making it persistent across node mode switches, it aims to improve startup efficiency and reduce resource overhead. Additionally, it introduces concurrent initialization for synchronization services and strengthens data consistency checks in the store getters, ensuring that only fully committed data is exposed via the P2P layer.

Highlights

  • Centralized P2P Client Management: The P2P client is now instantiated once at the top-level FullNode or LightNode creation and passed down to sub-components. This ensures the P2P client's lifecycle is managed independently of node mode switches (e.g., between aggregator and sync modes), preventing costly teardown and re-initialization of the libp2p stack.
  • Decoupled P2P Client Lifecycle: The p2p.Client now includes a Reconfigure method, allowing its internal configuration to be updated without restarting the underlying libp2p host, PubSub, or DHT. This is crucial for seamless transitions between node operational modes.
  • Concurrent Sync Service Startup: The header and data synchronization services in failover.go now start concurrently using errgroup.WithContext. This optimization reduces cumulative startup delays, especially in scenarios where peers might initially lack blocks.
  • Enhanced Header and Data Store Guards: New height guards have been implemented in HeaderStoreGetter and DataStoreGetter. These guards ensure that only headers and data at or below the committed store height are returned, preventing the P2P layer from serving unsigned or uncommitted data (e.g., from an executor's early save during crash recovery).
  • Refactored Node Constructors: Node constructors (newFullNode, newLightNode, newSyncMode, newAggregatorMode) and related test helpers have been updated to accept an initialized p2p.Client or crypto.PrivKey directly, simplifying their signatures and reinforcing the centralized P2P client management.


Changelog
  • apps/testapp/go.mod
    • Uncommented the local replace directive for github.com/evstack/ev-node.
  • apps/testapp/go.sum
    • Removed specific github.com/evstack/ev-node entries, reflecting dependency changes.
  • block/components_test.go
    • Added clarifying comments for nil arguments in test component creation functions.
  • node/failover.go
    • Removed unused imports for p2p/key and go-datastore.
    • Updated newSyncMode and newAggregatorMode to accept a p2p.Client instead of nodeKey, rootDB, and daStore.
    • Modified setupFailoverState to receive p2p.Client and use rktStore for sync services.
    • Removed p2pClient.Start() and p2pClient.Close() from failoverState.Run as the client is now managed at a higher level.
    • Implemented concurrent startup for header and data sync services in failoverState.Run.
  • node/full.go
    • Updated import from p2p/key to p2p.
    • Added p2pClient field to the FullNode struct.
    • Modified newFullNode to accept p2p.Client instead of nodeKey.
    • Updated leaderFactory and followerFactory to pass the p2pClient to mode constructors.
    • Added logic to FullNode.Run to start and close the p2pClient once for the node's entire lifecycle.
  • node/helpers_test.go
    • Updated imports to use github.com/evstack/ev-node/pkg/p2p and github.com/libp2p/go-libp2p/core/crypto.
    • Modified createTestComponents to return crypto.PrivKey directly and removed key.NodeKey struct creation.
    • Updated newTestNode and related test helper functions to handle crypto.PrivKey and p2p.Client parameters.
    • Added newTestP2PClient helper function for creating P2P clients in tests.
  • node/light.go
    • Removed unused import for p2p/key.
    • Modified newLightNode to accept p2p.Client instead of nodeKey.
    • Removed p2p.NewClient call from newLightNode.
    • Updated error handling for ln.P2P.Close() in LightNode.Run.
  • node/light_test.go
    • Added import for github.com/evstack/ev-node/pkg/p2p.
    • Modified TestLightNodeLifecycle to explicitly create and pass a p2p.Client.
  • node/node.go
    • Updated import from p2p/key to p2p.
    • Modified NewNode to accept p2p.Client instead of nodeKey.
    • Updated calls to newLightNode and newFullNode to pass the p2pClient.
  • node/single_sequencer_integration_test.go
    • Updated calls to createTestComponents and createNodeWithCustomComponents to align with privKey parameter changes.
  • pkg/cmd/run_node.go
    • Added import for github.com/evstack/ev-node/pkg/p2p.
    • Instantiated p2p.NewClient before node.NewNode and passed the client to the node creation function.
  • pkg/cmd/run_node_test.go
    • Added import for github.com/stretchr/testify/require.
    • Modified createTestComponents to generate and return a dummy nodeKey.
  • pkg/p2p/client.go
    • Added rawHost and started fields to Client struct for internal state management.
    • Modified Start method to prevent re-starting and track the rawHost.
    • Updated Close method to use errors.Join for aggregating errors.
    • Added PrivKey() method to expose the node's private key.
    • Added Reconfigure(conf config.P2PConfig) method for in-place configuration updates.
  • pkg/p2p/utils_test.go
    • Removed errors import and modified testNet.Close() to ignore errors for test simplicity.
  • pkg/store/header_store_adapter_test.go
    • Updated context.Background() to t.Context() in TestHeaderStoreAdapter_GetFromPendingByHash.
    • Added TestHeaderStoreGetter_HeightGuard to verify committed height checks.
  • pkg/store/store_adapter.go
    • Implemented height guards in HeaderStoreGetter.GetByHeight and HasAt to ensure data consistency.
    • Implemented height guards in DataStoreGetter.GetByHeight and HasAt for similar data consistency checks.
  • pkg/sync/sync_service.go
    • Removed direct import of github.com/evstack/ev-node/pkg/p2p.
    • Introduced P2PClient interface to abstract P2P client dependencies.
    • Updated SyncService and its constructors (NewDataSyncService, NewHeaderSyncService, newSyncService) to use the P2PClient interface.
  • test/e2e/failover_e2e_test.go
    • Added a Kill() call for old processes in TestHASequencerRollingRestartE2E to ensure clean state before restarting nodes.
Activity
  • The pull request title indicates '[DO NOT MERGRE]', suggesting this is a work-in-progress or experimental branch not yet ready for merging.

@codecov

codecov bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 46.23656% with 50 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.80%. Comparing base (6a7c84c) to head (09f4a76).

Files with missing lines | Patch % | Lines
block/internal/executing/pending.go | 39.02% | 14 Missing and 11 partials ⚠️
node/failover.go | 60.00% | 3 Missing and 3 partials ⚠️
block/internal/executing/executor.go | 33.33% | 1 Missing and 3 partials ⚠️
node/full.go | 50.00% | 2 Missing and 2 partials ⚠️
pkg/cmd/run_node.go | 0.00% | 4 Missing ⚠️
node/light.go | 0.00% | 1 Missing and 1 partial ⚠️
pkg/p2p/client.go | 75.00% | 1 Missing and 1 partial ⚠️
pkg/store/store.go | 66.66% | 1 Missing and 1 partial ⚠️
node/node.go | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3065      +/-   ##
==========================================
- Coverage   61.89%   61.80%   -0.10%     
==========================================
  Files         111      112       +1     
  Lines       11113    11160      +47     
==========================================
+ Hits         6878     6897      +19     
- Misses       3496     3513      +17     
- Partials      739      750      +11     
Flag Coverage Δ
combined 61.80% <46.23%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors the P2P client lifecycle and integration within the node architecture. The primary change involves centralizing the creation and management of the P2P client in the FullNode and LightNode constructors, passing an already initialized p2p.Client instance rather than a nodeKey for internal P2P client creation. This ensures the P2P client is long-lived and persists across mode switches (e.g., between aggregator and sync modes), avoiding costly re-initialization of the libp2p stack. Consequently, imports related to p2p/key and ipfs/go-datastore were removed from several files, and failover.go now runs header and data sync services concurrently to improve startup performance. Test helper functions were updated to reflect these changes, now returning crypto.PrivKey and creating p2p.Client instances. Additionally, pkg/store/store_adapter.go was updated to include a height guard in HeaderStoreGetter and DataStoreGetter to prevent serving uncommitted or future block data, and a new test TestHeaderStoreGetter_HeightGuard was added to verify this behavior. A review comment highlighted an inaccuracy in node/failover.go regarding the Reconfigure() call, which should be addressed for clarity.

node/failover.go Outdated
}
defer f.p2pClient.Close() // nolint: errcheck
// P2P client persists across mode switches (started/closed by FullNode.Run).
// Reconfigure() was already called in setupFailoverState to re-bootstrap DHT.

Severity: medium

This comment states that Reconfigure() is called in setupFailoverState, but this call does not appear to be present in the setupFailoverState function. This is misleading and could cause confusion about how the P2P client is reconfigured during mode switches. Please remove or update this comment to accurately reflect the code's behavior.

