Fix memory leaks, error handling, and type-safety issues in N-API bindings#1422
Fix memory leaks, error handling, and type-safety issues in N-API bindings#1422minggangw wants to merge 2 commits intoRobotWebTools:developfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR hardens rclnodejs initialization and several N-API bindings by addressing memory leaks, improving error propagation, and reducing integer truncation risk when bridging ROS/RCL data into JavaScript.
Changes:
- Fix missing frees on
*_TAKE_FAILEDpaths and reorder cleanup vs. JS exception throwing in native bindings. - Improve error-message reliability by capturing
rcl_get_error_string()beforercl_reset_error(). - Add context rollback on init failure (generator step) and a regression test to ensure re-init is possible afterward.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
test/test-init-shutdown.js |
Adds a regression test for init rollback when the generator step fails. |
src/rcl_subscription_bindings.cpp |
Introduces RCPPUTILS_SCOPE_EXIT for exception-safe cleanup and adjusts subscription creation cleanup/throw ordering. |
src/rcl_service_bindings.cpp |
Frees header on take-failed path to prevent leaks. |
src/rcl_client_bindings.cpp |
Returns sequence numbers as double to reduce truncation vs. prior 32-bit casts. |
src/rcl_action_server_bindings.cpp |
Frees headers on take-failed paths; improves error handling ordering; widens sequence number conversions. |
src/rcl_action_client_bindings.cpp |
Improves error handling ordering; widens sequence number conversions. |
index.js |
Wraps generator post-init steps with rollback-on-failure to avoid half-initialized contexts. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| int64_t sequence_number; | ||
| THROW_ERROR_IF_NOT_EQUAL( | ||
| rcl_action_send_goal_request(action_client, buffer, &sequence_number), | ||
| RCL_RET_OK, rcl_get_error_string().str); | ||
|
|
||
| return Napi::Number::New(env, static_cast<int32_t>(sequence_number)); | ||
| return Napi::Number::New(env, static_cast<double>(sequence_number)); | ||
| } |
There was a problem hiding this comment.
sequence_number is int64_t but is now returned as a JS Number (double). That improves on the prior int32_t cast, but it still isn’t a lossless 64-bit representation past Number.MAX_SAFE_INTEGER, and these sequences are later used as Map keys for correlating goal/cancel/result responses. Consider returning Napi::BigInt (consistent with other int64-valued APIs in this repo) or otherwise documenting/enforcing a safe range to prevent correlation bugs due to precision loss.
| if (ret != RCL_RET_ACTION_CLIENT_TAKE_FAILED) { | ||
| return Napi::Number::New(env, static_cast<int32_t>(sequence_number)); | ||
| return Napi::Number::New(env, static_cast<double>(sequence_number)); | ||
| } |
There was a problem hiding this comment.
sequence_number is int64_t but is returned as a JS Number (double). While this avoids the previous 32-bit narrowing, it still cannot represent all 64-bit values exactly beyond Number.MAX_SAFE_INTEGER, and JS uses these sequence numbers to correlate action responses. Consider using Napi::BigInt for a lossless representation (or documenting/enforcing a safe range).
| try { | ||
| context.tryShutdown(); | ||
| } catch (shutdownError) { | ||
| error.message += ` Failed to roll back context after init failure: ${shutdownError.message}`; | ||
| } | ||
| throw error; |
There was a problem hiding this comment.
In the catch block, the code mutates error.message via +=. In strict mode this can throw if error isn’t an Error object (e.g., a rejected promise with a string/primitive), which would mask the original init failure and potentially skip the intended rollback. Consider normalizing to an Error instance (or wrapping a new error with cause) before appending details, and ensure the rollback failure is reported without risking a secondary throw.
| try { | |
| context.tryShutdown(); | |
| } catch (shutdownError) { | |
| error.message += ` Failed to roll back context after init failure: ${shutdownError.message}`; | |
| } | |
| throw error; | |
| let toThrow = error; | |
| try { | |
| context.tryShutdown(); | |
| } catch (shutdownError) { | |
| const shutdownMessage = | |
| shutdownError instanceof Error | |
| ? shutdownError.message | |
| : String(shutdownError); | |
| if (error instanceof Error) { | |
| error.message += | |
| ` Failed to roll back context after init failure: ${shutdownMessage}`; | |
| } else { | |
| const originalMessage = | |
| typeof error === 'string' ? error : String(error); | |
| toThrow = new Error( | |
| `${originalMessage} Failed to roll back context after init failure: ${shutdownMessage}` | |
| ); | |
| } | |
| } | |
| throw toThrow; |
| RCPPUTILS_SCOPE_EXIT({ rmw_serialized_message_fini(&msg); }); | ||
|
|
||
| Napi::Buffer<char> buffer = Napi::Buffer<char>::Copy( | ||
| env, reinterpret_cast<char*>(msg.buffer), msg.buffer_length); | ||
| THROW_ERROR_IF_NOT_EQUAL(rmw_serialized_message_fini(&msg), RCL_RET_OK, | ||
| "Failed to deallocate message buffer"); | ||
|
|
There was a problem hiding this comment.
In RclTakeRaw, rmw_serialized_message_fini(&msg) is now executed via RCPPUTILS_SCOPE_EXIT but its return value is ignored. If *_fini fails and sets the RMW/RCUTILS error state, this can silently leak an error into subsequent calls (and it also drops the previous explicit error reporting). Consider keeping the scope-exit for exception safety, but still checking the *_fini return on the normal (no-exception) path and clearing the error state if cleanup fails.
| RCPPUTILS_SCOPE_EXIT({ | ||
| rcl_subscription_content_filter_options_fini(subscription, &options); | ||
| }); |
There was a problem hiding this comment.
GetContentFilter now finalizes rcl_subscription_content_filter_options_t via RCPPUTILS_SCOPE_EXIT, but the return code from rcl_subscription_content_filter_options_fini() is no longer checked and any error string is no longer surfaced to JS. This is a behavioral regression vs the prior explicit cleanup path and can also leave the global RCL error state uncleared if *_fini fails. Consider finalizing in a way that preserves exception safety but still checks/clears errors on the normal return path.
| const generateAllStub = sinon | ||
| .stub(generator, 'generateAll') | ||
| .rejects(new Error('generator failed')); | ||
|
|
||
| const indexPath = require.resolve('../index.js'); | ||
| delete require.cache[indexPath]; | ||
| const freshRclnodejs = require('../index.js'); | ||
|
|
||
| const failedContext = new freshRclnodejs.Context(); | ||
|
|
||
| await assert.rejects(async () => { | ||
| await freshRclnodejs.init(failedContext); | ||
| }, /generator failed/); | ||
| assert.ok(freshRclnodejs.isShutdown(failedContext)); | ||
|
|
||
| generateAllStub.restore(); | ||
|
|
||
| const recoveredContext = new freshRclnodejs.Context(); | ||
| await assert.doesNotReject(async () => { | ||
| await freshRclnodejs.init(recoveredContext); | ||
| }); | ||
| freshRclnodejs.shutdown(recoveredContext); | ||
| freshRclnodejs.removeSignalHandlers(); |
There was a problem hiding this comment.
This test stubs generator.generateAll but restores the stub only after several assertions. If any assertion fails (or if require('../index.js') throws), the stub may not be restored and could cascade failures into later tests. Use a try/finally (or a sinon sandbox with sandbox.restore() in afterEach) to guarantee restoration and cleanup (removeSignalHandlers) even on failure.
| const generateAllStub = sinon | |
| .stub(generator, 'generateAll') | |
| .rejects(new Error('generator failed')); | |
| const indexPath = require.resolve('../index.js'); | |
| delete require.cache[indexPath]; | |
| const freshRclnodejs = require('../index.js'); | |
| const failedContext = new freshRclnodejs.Context(); | |
| await assert.rejects(async () => { | |
| await freshRclnodejs.init(failedContext); | |
| }, /generator failed/); | |
| assert.ok(freshRclnodejs.isShutdown(failedContext)); | |
| generateAllStub.restore(); | |
| const recoveredContext = new freshRclnodejs.Context(); | |
| await assert.doesNotReject(async () => { | |
| await freshRclnodejs.init(recoveredContext); | |
| }); | |
| freshRclnodejs.shutdown(recoveredContext); | |
| freshRclnodejs.removeSignalHandlers(); | |
| let generateAllStub; | |
| let freshRclnodejs; | |
| try { | |
| generateAllStub = sinon | |
| .stub(generator, 'generateAll') | |
| .rejects(new Error('generator failed')); | |
| const indexPath = require.resolve('../index.js'); | |
| delete require.cache[indexPath]; | |
| freshRclnodejs = require('../index.js'); | |
| const failedContext = new freshRclnodejs.Context(); | |
| await assert.rejects(async () => { | |
| await freshRclnodejs.init(failedContext); | |
| }, /generator failed/); | |
| assert.ok(freshRclnodejs.isShutdown(failedContext)); | |
| const recoveredContext = new freshRclnodejs.Context(); | |
| await assert.doesNotReject(async () => { | |
| await freshRclnodejs.init(recoveredContext); | |
| }); | |
| freshRclnodejs.shutdown(recoveredContext); | |
| } finally { | |
| if (generateAllStub && typeof generateAllStub.restore === 'function') { | |
| generateAllStub.restore(); | |
| } | |
| if ( | |
| freshRclnodejs && | |
| typeof freshRclnodejs.removeSignalHandlers === 'function' | |
| ) { | |
| freshRclnodejs.removeSignalHandlers(); | |
| } | |
| } |
| int64_t sequence_number; | ||
| THROW_ERROR_IF_NOT_EQUAL(rcl_send_request(client, buffer, &sequence_number), | ||
| RCL_RET_OK, rcl_get_error_string().str); | ||
|
|
||
| return Napi::Number::New(env, static_cast<uint32_t>(sequence_number)); | ||
| return Napi::Number::New(env, static_cast<double>(sequence_number)); | ||
| } |
There was a problem hiding this comment.
These sequence numbers originate as int64_t, but are returned to JS as Napi::Number (IEEE-754 double). This avoids the previous 32-bit truncation, but it still cannot represent all 64-bit integers (precision is only guaranteed up to Number.MAX_SAFE_INTEGER). Since sequence numbers are used as Map keys on the JS side (e.g., for matching responses), precision loss could cause key collisions for large values. Consider returning a Napi::BigInt (with a lossless check) or explicitly documenting/enforcing a safe range.
Fix memory leaks, error handling, and type-safety issues in N-API bindings
Memory leaks (Critical)
Added missing
free(header)onTAKE_FAILEDearly-return paths wheremalloc'drmw_request_id_t*was never freed.src/rcl_action_server_bindings.cpp—ActionTakeResultRequest,ActionTakeGoalRequest,ActionTakeCancelRequestsrc/rcl_service_bindings.cpp—RclTakeRequestError handling misordering (High)
Captured
rcl_get_error_string()before callingrcl_reset_error()to avoid passing empty error messages to JavaScript.src/rcl_action_client_bindings.cpp—ActionTakeFeedback,ActionTakeStatussrc/rcl_action_server_bindings.cpp—ActionTakeGoalResponse,ActionTakeCancelResponse,ActionTakeResultResponseSequence number truncation (High)
Changed
int32_t/uint32_t→doublecasts for 64-bit sequence numbers to avoid silent truncation above 2^31.src/rcl_action_client_bindings.cpp,src/rcl_action_server_bindings.cpp,src/rcl_client_bindings.cppException-safe cleanup (Medium)
Used
RCPPUTILS_SCOPE_EXITinsrc/rcl_subscription_bindings.cppforRclTakeRawandGetContentFilterto ensure resource finalization even if N-API calls throw. Reorderedfree(subscription)beforeThrowAsJavaScriptException()inCreateSubscription.Context rollback on init failure (Medium)
Wrapped post-init steps in
index.jswithtry/catchthat callscontext.tryShutdown()ifgenerator.generateAll()fails, preventing a half-initialized context from blocking re-initialization. Added corresponding test intest/test-init-shutdown.js.Fix: #1420