Local processing job requires role, but doesn't use it #5562

@moose-in-australia

PySDK Version

  • PySDK V2 (2.x)
  • PySDK V3 (3.x)

Describe the bug
Processor.__init__ requires an IAM role even when running in Local Mode (instance_type="local"). The LocalSagemakerClient.create_processing_job method never uses the role (it is absorbed by **kwargs), yet the base Processor class raises a ValueError if no role is passed and none can be resolved from the SageMaker config. This forces users to pass a dummy role string when running processing jobs locally.

To reproduce

from sagemaker.core import FrameworkProcessor
from sagemaker.core.local import LocalSession
from sagemaker.core.helper.session_helper import Session
from sagemaker.core.image_uris import retrieve

region = Session().boto_region_name
local_session = LocalSession()

processor_image_uri = retrieve(
    framework="sklearn",
    version="1.4-2",
    region=region
)

# This raises ValueError: "An AWS IAM role is required to create a Processing job."
processor = FrameworkProcessor(
    image_uri=processor_image_uri,
    instance_type="local",
    instance_count=1,
    sagemaker_session=local_session,
    base_job_name="local-processing-test",
    command=["python3"],
)

Expected behavior
When instance_type is "local" or "local_gpu", the role parameter should be optional since the role is not used in Local Mode.
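
The requested behavior could be sketched roughly as below. This is a minimal standalone illustration, not the SDK's implementation: `validate_role` and `LOCAL_INSTANCE_TYPES` are hypothetical names introduced here, and only the error message is taken from the traceback in this issue.

```python
from typing import Optional

# Hypothetical constant: the instance types this issue treats as Local Mode.
LOCAL_INSTANCE_TYPES = {"local", "local_gpu"}


def validate_role(role: Optional[str], instance_type: str) -> Optional[str]:
    """Return the role, requiring it only for non-local instance types."""
    if instance_type in LOCAL_INSTANCE_TYPES:
        # Local Mode: the role is never used, so a missing role is acceptable.
        return role
    if not role:
        # Same message the SDK raises today, per the traceback in this issue.
        raise ValueError("An AWS IAM role is required to create a Processing job.")
    return role
```

With a check of this shape, `validate_role(None, "local")` would return `None` instead of raising, while a missing role for a remote instance type would still fail early.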

Screenshots or logs

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in <module>:15                                                                                   │
│                                                                                                  │
│   12 │   region=region                                                                           │
│   13 )                                                                                           │
│   14                                                                                             │
│ ❱ 15 processor = FrameworkProcessor(                                                             │
│   16 │   image_uri=processor_image_uri,                                                          │
│   17 │   #role="arn:aws:iam::123456789012:role/DummyRole",                                       │
│   18 │   instance_type="local",                                                                  │
│                                                                                                  │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/core/processing.py:1091 in __init__            │
│                                                                                                  │
│   1088 │   │   if not command:                                                                   │
│   1089 │   │   │   command = ["python"]                                                          │
│   1090 │   │                                                                                     │
│ ❱ 1091 │   │   super().__init__(                                                                 │
│   1092 │   │   │   role=role,                                                                    │
│   1093 │   │   │   image_uri=image_uri,                                                          │
│   1094 │   │   │   command=command,                                                              │
│                                                                                                  │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/core/processing.py:755 in __init__             │
│                                                                                                  │
│    752 │   │                                                                                     │
│    753 │   │   self.command = command                                                            │
│    754 │   │                                                                                     │
│ ❱  755 │   │   super(ScriptProcessor, self).__init__(                                            │
│    756 │   │   │   role=role,                                                                    │
│    757 │   │   │   image_uri=image_uri,                                                          │
│    758 │   │   │   instance_count=instance_count,                                                │
│                                                                                                  │
│ /opt/conda/lib/python3.12/site-packages/sagemaker/core/processing.py:217 in __init__             │
│                                                                                                  │
│    214 │   │   │   # Now we marked that as Optional because we can fetch it from SageMakerConfi  │
│    215 │   │   │   # Because of marking that parameter as optional, we should validate if it is  │
│    216 │   │   │   # after fetching the config.                                                  │
│ ❱  217 │   │   │   raise ValueError("An AWS IAM role is required to create a Processing job.")   │
│    218 │   │                                                                                     │
│    219 │   │   self.env = resolve_value_from_config(                                             │
│    220 │   │   │   env, PROCESSING_JOB_ENVIRONMENT_PATH, sagemaker_session=self.sagemaker_sessi  │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: An AWS IAM role is required to create a Processing job.

System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 3.4.0
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): scikit-learn
  • Framework version: 1.4-2
  • Python version: 3.10
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N

Additional context
The LocalSagemakerClient.create_processing_job method signature confirms that the role is unused in Local Mode: RoleArn is not an explicit parameter and is silently discarded via **kwargs. The current workaround is to pass any non-empty string as the role (e.g., role="arn:aws:iam::123456789012:role/DummyRole"), but this is confusing for users who expect Local Mode to work without AWS IAM configuration.
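
The discard-via-`**kwargs` pattern described above can be illustrated with a small standalone sketch. `create_job_stub` and its `job_name` parameter are hypothetical stand-ins, not the real `LocalSagemakerClient.create_processing_job` signature; the sketch only shows how a keyword such as `RoleArn` can be accepted and never read.

```python
def create_job_stub(job_name, **kwargs):
    """Hypothetical stand-in: only explicit parameters affect the result."""
    # Any extra keyword (for example RoleArn) lands in kwargs and is ignored
    # by the job logic; it is reported here only to make the discard visible.
    return {"job_name": job_name, "ignored_keys": sorted(kwargs)}


with_role = create_job_stub(
    "local-processing-test",
    RoleArn="arn:aws:iam::123456789012:role/DummyRole",
)
without_role = create_job_stub("local-processing-test")
```

Both calls produce the same `job_name`; the only difference is that `RoleArn` shows up among the ignored keys in the first call, which is why any non-empty dummy string works as a role in this pattern.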
