Skip to content

Fix RuntimeApiClientFuture Visibility and Response Handling #1110

@darklight3it

Description

@darklight3it

Background

The RuntimeApiClientFuture enum in lambda-runtime/src/layers/api_client.rs has several design issues that prevent proper error handling and violate Rust API design best practices. These issues were identified during discussions in PRs #1091 and #1105.

Current Problems

1. Missing #[non_exhaustive] Attribute

The RuntimeApiClientFuture enum is publicly exposed but lacks the #[non_exhaustive] attribute:

pub enum RuntimeApiClientFuture<F> {
    First(#[pin] F, Arc<Client>),
    Second(#[pin] BoxFuture<'static, Result<http::Response<Incoming>, BoxError>>),
}

This is a semver hazard because:

  • Adding new variants becomes a breaking change for downstream users who pattern match on this enum
  • The enum is accessible as an associated type on RuntimeApiClientService, which is publicly visible
  • Any user building custom runtimes on top of our runtime client could be affected

2. Incorrect Visibility

The enum and its variants are fully public, exposing internal implementation details that were likely not intended to be part of the public API. As noted in PR #1105:

3. Cannot Access Response Body

The current implementation discards the response body for non-2xx responses, making it impossible to read error details from the Lambda Runtime API:

Ok(resp) if !resp.status().is_success() => {
    let status = resp.status();
    
    log_or_print!(
        tracing: tracing::error!(status = %status, "Lambda Runtime API returned non-200 response"),
        fallback: eprintln!("Lambda Runtime API returned non-200 response: status={status}")
    );
    
    // We cannot access resp.body() here to log the actual error message!
    
    break Ok(()); // Returns Ok despite error!
}

This is particularly problematic for 410 Gone responses (function timeouts), where we have to add hardcoded messages instead of reading the actual error from the API.

4. Returns Ok(()) for Non-2xx Status Codes

The most counterintuitive behavior is that the client returns Ok(()) even when the Lambda Runtime API returns error status codes:

Ok(resp) if !resp.status().is_success() => {
    // ... logging ...
    
    // Return Ok to maintain existing contract - runtime continues despite API errors
    break Ok(());
}

This makes error handling confusing. A non-2xx HTTP response should be treated as an error, not success.

Impact Assessment

Based on code search analysis mentioned in PR #1105:

The impact is expected to be minimal:

  • Internal machinery doesn't rely on the specific type surface
  • Only users who built custom higher-level runtimes would be affected
  • Forks of the api client would need updates

Proposed Solutions

Since these changes are breaking, we should bundle them together in a single major version bump. Here are the recommended approaches:

Option 1: Add #[non_exhaustive] and Fix Response Handling (Recommended)

Changes:

  1. Add #[non_exhaustive] to RuntimeApiClientFuture
  2. Change the return type to properly represent errors
  3. Read and log response bodies for non-2xx responses
  4. Return Err for non-2xx status codes instead of Ok

Implementation:

#[non_exhaustive]
pub enum RuntimeApiClientFuture<F> {
    First(#[pin] F, Arc<Client>),
    Second(#[pin] BoxFuture<'static, Result<http::Response<Incoming>, BoxError>>),
}

impl<F> Future for RuntimeApiClientFuture<F>
where
    F: Future<Output = Result<http::Request<Body>, BoxError>>,
{
    type Output = Result<(), BoxError>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut task::Context<'_>) -> task::Poll<Self::Output> {
        task::Poll::Ready(loop {
            match self.as_mut().project() {
                // ... First variant handling ...
                RuntimeApiClientFutureProj::Second(fut) => match ready!(fut.poll(cx)) {
                    Ok(resp) if !resp.status().is_success() => {
                        let status = resp.status();
                        
                        // Read the response body to get actual error details
                        let body = read_body_to_string(resp).await;
                        
                        log_or_print!(
                            tracing: tracing::error!(
                                status = %status, 
                                body = %body,
                                "Lambda Runtime API returned non-2xx response"
                            ),
                            fallback: eprintln!(
                                "Lambda Runtime API returned non-2xx response: status={status}, body={body}"
                            )
                        );
                        
                        // Return Err for non-2xx responses
                        break Err(format!("Runtime API error: {} - {}", status, body).into());
                    }
                    Ok(_) => break Ok(()),
                    Err(err) => break Err(err),
                },
            }
        })
    }
}

Pros:

  • Fixes all issues in one go
  • Proper error semantics
  • Better debugging experience with actual error messages
  • Future-proof with #[non_exhaustive]

Cons:

  • Breaking change for anyone matching on the enum
  • Breaking change for anyone relying on Ok(()) for non-2xx responses

Option 2: Deprecate and Replace with New Type

Changes:

Option 2: Deprecate and Replace with New Type

Changes:

  1. Create a new RuntimeApiClientFutureV2 with correct attributes
  2. Create a new RuntimeApiClientServiceV2 that uses the new future type
  3. Deprecate the old RuntimeApiClientFuture and RuntimeApiClientService
  4. Update internal usage to new types
  5. Remove old types in next major version

Rationale:
Since RuntimeApiClientFuture is exposed as an associated type on RuntimeApiClientService, we cannot change the future type without also creating a new service type. The service's Future associated type is part of its public API contract.

Implementation:

// Old types - deprecated
#[deprecated(since = "0.x.0", note = "Use RuntimeApiClientServiceV2 instead")]
pub struct RuntimeApiClientService<S> {
    inner: S,
    client: Arc<Client>,
}

#[deprecated(since = "0.x.0", note = "Use RuntimeApiClientFutureV2 instead")]
pub enum RuntimeApiClientFuture<F> {
    First(#[pin] F, Arc<Client>),
    Second(#[pin] BoxFuture<'static, Result<http::Response<Incoming>, BoxError>>),
}

// New types - correct design
pub struct RuntimeApiClientServiceV2<S> {
    inner: S,
    client: Arc<Client>,
}

impl<S> Service<LambdaInvocation> for RuntimeApiClientServiceV2<S>
where
    S: Service<LambdaInvocation>,
    S::Future: Future<Output = Result<http::Request<Body>, BoxError>>,
    S::Error: Into<BoxError>,
{
    type Response = ();
    type Error = BoxError;
    type Future = RuntimeApiClientFutureV2<S::Future>;
    
    // ... implementation ...
}

#[non_exhaustive]
pub enum RuntimeApiClientFutureV2<F> {
    First(#[pin] F, Arc<Client>),
    Second(#[pin] BoxFuture<'static, Result<http::Response<Incoming>, BoxError>>),
}

Pros:

  • Gentler migration path
  • Gives users time to update
  • Old behavior remains available during deprecation period

Cons:

  • Maintains broken behavior longer
  • Significantly more code to maintain during transition (duplicate service + future implementations)
  • Still requires breaking change eventually
  • Users need to update both service and future type references
  • More complex migration story

Option 3 Changing the overall behaviour of client returning body.

Another solution proposed by @jlizen is to directly change how the client works at in lambda-rust-api-client. Like the following.

self.client
    .request(req)
    .map_err(Into::into)
    .then(|res| async move {
        match res {
            Ok(resp) if !resp.status().is_success() => {
                let status = resp.status();
                match resp.into_body().await {
                    // convert into a BoxError containing the status and optional body
                    // which is nicely loggable in the upper layer, or you could also log it here.
                    // If you need to specifically log ONLY this case rather than other BoxErrors, you could newtype
                    // a marker type that we downcast the stderror to
                    Err(BoxError::new(format!("my message with status code + body")))
                }
            }
            _ => res,
        }
    })
    .boxed()

Pros:

  • In this case we are only breaking the behaviour of client returning OK instead of ERR. It is a behavioural breaking chane.
  • Proper error semantics
  • Better debugging experience with actual error messages

Cons:

  • RuntimeApiClientFuture is not fixed
  • await at Client level.

Error Types

Consider creating a dedicated error type for Runtime API errors:

#[derive(Debug, thiserror::Error)]
pub enum RuntimeApiError {
    #[error("Runtime API returned {status}: {body}")]
    NonSuccessResponse { status: StatusCode, body: String },
    
    #[error("Request build failed: {0}")]
    RequestBuildError(#[source] BoxError),
    
    #[error("Network error: {0}")]
    NetworkError(#[source] BoxError),
}

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions