Skip to content

Comments

feat: MAAS deploy workflow and dynamic inventory integration#1338

Merged
michael-balint merged 2 commits intoNVIDIA:masterfrom
dholt:feature/maas-deploy-workflow-v2
Feb 20, 2026
Merged

feat: MAAS deploy workflow and dynamic inventory integration#1338
michael-balint merged 2 commits intoNVIDIA:masterfrom
dholt:feature/maas-deploy-workflow-v2

Conversation

@dholt
Copy link
Contributor

@dholt dholt commented Feb 20, 2026

Summary

  • Add scripts/maas_deploy.sh with --os, --profile, --status, --release, and --tags-only flags for repeatable VM deploy/tag/test/redeploy cycles. Reads config from config/maas-inventory.yml (no hardcoded secrets).
  • Update maas_inventory.py to exit gracefully when MAAS is not configured (returns empty inventory instead of error), enabling dual inventory without breaking non-MAAS users.
  • Wire ansible.cfg for dual inventory (static + dynamic). Add machines field to config.example/maas-inventory.yml.

Test plan

  • maas_deploy.sh --status shows VM state table
  • maas_deploy.sh --os noble --profile k8s deploys and tags correctly
  • maas_deploy.sh --profile slurm --tags-only re-tags (clean slate) without redeploying
  • maas_inventory.py --list returns correct groups after tagging
  • ansible -m ping all reaches all hosts via dynamic inventory
  • ansible-playbook playbooks/k8s-cluster.yml completes successfully (710 ok on control plane)
  • Empty config returns {"_meta":{"hostvars":{}}} (no error)
  • No hardcoded secrets in committed files

🤖 Generated with Claude Code

dholt and others added 2 commits February 20, 2026 09:55
Add scripts/maas_deploy.sh with --os, --profile, --status, --release,
and --tags-only flags for repeatable VM deploy/tag/test/redeploy cycles.
Reads config from config/maas-inventory.yml (no hardcoded secrets).

Update maas_inventory.py to exit gracefully when MAAS is not configured
(returns empty inventory instead of error), enabling dual inventory in
ansible.cfg without breaking non-MAAS users.

Wire ansible.cfg for dual inventory (static + dynamic). Add machines
field to config.example/maas-inventory.yml. Add test-playbooks skill.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Douglas Holt <dholt@nvidia.com>
Recognize placeholder values from config.example (angle brackets,
CONSUMER_KEY:TOKEN_KEY:TOKEN_SECRET) as unconfigured and return
empty inventory instead of attempting to connect and failing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Douglas Holt <dholt@nvidia.com>
@michael-balint michael-balint merged commit 0618771 into NVIDIA:master Feb 20, 2026
18 checks passed
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds MAAS (Metal as a Service) integration to DeepOps, enabling automated VM deployment and dynamic inventory management for testing workflows. It introduces a deployment script for VM lifecycle management and updates the dynamic inventory script to gracefully handle unconfigured MAAS environments, allowing dual static/dynamic inventory support.

Changes:

  • Added maas_deploy.sh script for repeatable VM deploy/tag/release operations with support for multiple Ubuntu versions and deployment profiles
  • Updated maas_inventory.py to return empty inventory when unconfigured instead of failing, enabling coexistence with static inventory
  • Introduced TAG_ALIASES for backward compatibility with old hyphenated K8s group names (kube-master → kube_control_plane)

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
scripts/maas_deploy.sh New deployment script providing VM lifecycle management (deploy, release, status, tagging) with k8s and slurm profiles
scripts/maas_inventory.py Updated to gracefully handle unconfigured MAAS, added TAG_ALIASES for K8s group name migration, updated GROUP_CHILDREN to use underscores
config.example/maas-inventory.yml Added machines field for specifying hostnames managed by maas_deploy.sh
ansible.cfg Added dynamic inventory script to inventory path alongside static inventory
.claude/skills/test-playbooks.md New documentation for testing playbooks across Ubuntu versions using MAAS deployment
Comments suppressed due to low confidence (3)

scripts/maas_deploy.sh:110

  • If --os is provided without a value, the script will fail with a cryptic "unbound variable" error due to set -u. Consider adding explicit validation to provide a clearer error message, for example: [[ -n "${2:-}" ]] || { echo "ERROR: --os requires a value"; exit 1; }
            --os)
                DISTRO_SERIES="$2"; shift 2 ;;

scripts/maas_deploy.sh:112

  • If --profile is provided without a value, the script will fail with a cryptic "unbound variable" error due to set -u. Consider adding explicit validation to provide a clearer error message, for example: [[ -n "${2:-}" ]] || { echo "ERROR: --profile requires a value"; exit 1; }
            --profile)
                PROFILE="$2"; shift 2 ;;

scripts/maas_deploy.sh:173

  • The hostname is inserted directly into the URL without URL encoding. While MAAS hostnames are typically restricted to safe characters, consider URL-encoding the hostname for robustness (e.g., using Python's urllib.parse.quote or curl's --data-urlencode).
    maas_get "/machines/?hostname=${hostname}" | python3 -c "

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

api_url) [[ -z "${MAAS_API_URL:-}" ]] && MAAS_API_URL="$value" ;;
api_key) [[ -z "${MAAS_API_KEY:-}" ]] && MAAS_API_KEY="$value" ;;
ssh_user) [[ -z "${MAAS_SSH_USER:-}" ]] && MAAS_SSH_USER="$value" ;;
ssh_bastion) [[ -z "${MAAS_SSH_PROXY:-}" ]] && MAAS_SSH_PROXY="ssh -W %h:%p -q ${value}" ;;
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable value should be quoted to prevent word splitting and glob expansion. While the actual usage of MAAS_SSH_PROXY is properly quoted later, the assignment here could fail if the ssh_bastion value contains spaces or special characters. Change to: MAAS_SSH_PROXY="ssh -W %h:%p -q \"${value}\""

Suggested change
ssh_bastion) [[ -z "${MAAS_SSH_PROXY:-}" ]] && MAAS_SSH_PROXY="ssh -W %h:%p -q ${value}" ;;
ssh_bastion) [[ -z "${MAAS_SSH_PROXY:-}" ]] && MAAS_SSH_PROXY="ssh -W %h:%p -q \"${value}\"" ;;

Copilot uses AI. Check for mistakes.
Comment on lines +194 to +198
local network_filter="${MAAS_NETWORK:-}"
maas_get "/machines/${system_id}/" | python3 -c "
import json, sys
m = json.load(sys.stdin)
network = '${network_filter}'
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The network_filter variable is interpolated directly into Python code without proper escaping. If the value contains single quotes or backslashes, it could break the Python string or lead to code injection. Consider using a safer approach such as passing the value as a command-line argument to Python or properly escaping the value before interpolation.

Suggested change
local network_filter="${MAAS_NETWORK:-}"
maas_get "/machines/${system_id}/" | python3 -c "
import json, sys
m = json.load(sys.stdin)
network = '${network_filter}'
maas_get "/machines/${system_id}/" | python3 -c "
import json, os, sys
m = json.load(sys.stdin)
network = os.environ.get('MAAS_NETWORK', '')

Copilot uses AI. Check for mistakes.
inventory[group] = {"hosts": [], "vars": {}}
elif "hosts" not in inventory[group]:
inventory[group]["hosts"] = []
inventory[group]["hosts"].append(hostname)
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a machine has both an old tag and its aliased new tag (e.g., both "kube-master" and "kube_control_plane"), the host will be added to the group's host list twice, resulting in duplicate entries. Consider checking if the host is already in the group before appending, or using a set to avoid duplicates.

Suggested change
inventory[group]["hosts"].append(hostname)
if hostname not in inventory[group]["hosts"]:
inventory[group]["hosts"].append(hostname)

Copilot uses AI. Check for mistakes.
Comment on lines +152 to +153
local consumer_key token_key token_secret
IFS=':' read -r consumer_key token_key token_secret <<< "$MAAS_API_KEY"
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MAAS_API_KEY is split without validating that it contains exactly 3 parts (consumer_key:token_key:token_secret). If the API key is malformed, this could lead to unexpected behavior. Consider adding validation similar to the Python script (maas_inventory.py lines 68-72), or at least checking that all three variables are non-empty after the split.

Suggested change
local consumer_key token_key token_secret
IFS=':' read -r consumer_key token_key token_secret <<< "$MAAS_API_KEY"
# Validate MAAS_API_KEY format: consumer_key:token_key:token_secret
if [[ -z "${MAAS_API_KEY-}" ]]; then
echo "ERROR: MAAS_API_KEY is not set. Expected format: consumer_key:token_key:token_secret"
exit 1
fi
local consumer_key token_key token_secret extra
IFS=':' read -r consumer_key token_key token_secret extra <<< "$MAAS_API_KEY"
if [[ -n "${extra:-}" || -z "${consumer_key}" || -z "${token_key}" || -z "${token_secret}" ]]; then
echo "ERROR: MAAS_API_KEY must be in the format consumer_key:token_key:token_secret"
exit 1
fi

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants