[SREP-3811] Add change-volume-type command for EBS volume migration#857
[SREP-3811] Add change-volume-type command for EBS volume migration#857charlesgong wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a new CLI command Changes
Sequence DiagramsequenceDiagram
participant User as User/CLI
participant Cmd as Command Handler
participant OCM as OCM Service
participant AWS as AWS EC2
User->>Cmd: Invoke change-volume-type (flags)
Cmd->>Cmd: Validate inputs (clusterID, targetType, IOPS/throughput)
Cmd->>OCM: Init client & fetch cluster
OCM-->>Cmd: Cluster details (provider, infraID, region)
Cmd->>AWS: DescribeVolumes (volume info)
AWS-->>Cmd: Volume details (type, size, tags, IOPS, throughput, state)
Cmd->>Cmd: Verify ownership, current type, modification status
alt Dry-run
Cmd-->>User: Print prospective change summary
else Execute change
Cmd->>AWS: ModifyVolume (VolumeType [+ IOPS/Throughput])
AWS-->>Cmd: Modification started ack
loop Poll (periodic, up to ~60min)
Cmd->>AWS: DescribeVolumesModifications
AWS-->>Cmd: Modification status (modifying / completed / failed)
end
Cmd->>AWS: DescribeVolumes (final state)
AWS-->>Cmd: Final volume info
Cmd-->>User: Report result or error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: charlesgong The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (3)
cmd/cluster/changevolumetype_test.go (3)
46-61: Default-value test is disconnected from actual runtime defaults.This test checks zero-values on a raw struct, but runtime defaults come from Cobra flag binding in
newCmdChangeVolumeType(). It gives weak signal on real CLI behavior.Please assert defaults from the constructed command flags instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cluster/changevolumetype_test.go` around lines 46 - 61, The test TestChangeVolumeType_OptionsDefaults is asserting zero-values on a raw changeVolumeTypeOptions struct instead of the actual CLI defaults; change it to construct the command via newCmdChangeVolumeType(), then read the bound flag defaults from that command (e.g., using cmd.Flags().GetInt/GetBool or Flags().Lookup("iops"/"throughput"/"dry-run").Value) and assert those values match expected runtime defaults; keep the test name and validate the iops, throughput and dry-run flag defaults from the command rather than the raw struct so it reflects real CLI behavior.
7-44: This test is tautological and doesn’t validate production behavior.
TestChangeVolumeType_ValidateTargetTypeduplicates a local list/loop instead of testing command/runtime validation. It can pass while real validation logic regresses.Refactor to assert behavior through a shared validation helper used by production code.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cluster/changevolumetype_test.go` around lines 7 - 44, The test TestChangeVolumeType_ValidateTargetType is tautological because it reimplements validation logic; update it to call the actual production validation helper (e.g., ValidateVolumeType, validateTargetType, or the command's Validate method) instead of using the local validTypes slice and loop. Replace the local loop with calls to the production validator for each tt.targetType and assert that the returned error presence matches tt.expectError (use t.Run as-is), ensuring the test imports or references the real validation function used by the change-volume-type command.
83-85: Required-flag behavior should be asserted, not left as a note.Line 83-85 acknowledges missing verification for required flags. That leaves a regression gap for a safety-critical command.
Please add an assertion that required-flag enforcement metadata/behavior is present.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cluster/changevolumetype_test.go` around lines 83 - 85, Add an actual assertion that the command enforces required flags by exercising the Cobra command rather than relying on metadata notes: in changevolumetype_test.go (e.g., inside TestChangeVolumeTypeCmd or the existing test block that builds the command via the function that creates the change-volume-type command) execute the command with required flags omitted and assert Execute() returns a non-nil error and the error message indicates a missing/required flag (e.g., contains "required" or "required flag"). This verifies runtime enforcement of MarkFlagRequired behavior instead of leaving it as a comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cluster/changevolumetype.go`:
- Around line 247-304: The monitoring loop that polls
awsClient.DescribeVolumesModifications (the for i := 0; i < 120; i++ loop using
ticker and o.volumeID) can exit due to timeout without ever reaching
VolumeModificationStateCompleted, yet the function always prints "Volume type
change completed successfully"; fix this by introducing an explicit completion
status flag (e.g., completed bool) set to true only when state ==
types.VolumeModificationStateCompleted, and set an error/return when the loop
finishes without completed (or return a “still in progress” error) instead of
falling through to the success message; update the final status block and the
final fmt.Printf + return to reflect failure/unknown when completed is false and
still return nil only when completed is true.
- Around line 147-156: After calling awsClient.DescribeVolumes for o.volumeID,
validate ownership by inspecting describeOutput.Volumes (iterate through
describeOutput.Volumes) and checking the volume Tags for the cluster identifier
(e.g., presence of a tag like "kubernetes.io/cluster/<clusterName>" or a
"Cluster"/"cluster-id" tag matching o.clusterName or o.clusterID); if no
matching tag is found return an error refusing to modify the volume. Perform
this check before calling ModifyVolume to ensure the volume belongs to the
target cluster; reference awsClient.DescribeVolumes, describeOutput.Volumes,
o.volumeID and the code path that calls ModifyVolume when implementing the
guard.
- Around line 78-79: The flags for iops and throughput are currently given
defaults and then unconditionally applied, causing silent overrides; update the
change-volume-type logic to treat --iops and --throughput as opt-in by checking
cmd.Flags().Changed("iops") and cmd.Flags().Changed("throughput") (where the
code currently reads/assigns ops.iops and ops.throughput in the migration/apply
section) and only set those fields on the target volume/request when the
corresponding flag was explicitly provided; leave existing values untouched when
the flag is not changed.
- Around line 64-68: The command currently swallows errors by using Run and
printing them; change it to return errors so Cobra exits non-zero on failure:
replace the Run func with a RunE func that returns ops.run() (e.g., RunE:
func(cmd *cobra.Command, args []string) error { return ops.run() }), ensuring
any existing fmt.Printf("Error: ...") is removed so the error is propagated to
the CLI caller.
- Around line 126-127: The cloud provider check in changevolumetype.go uses a
case-sensitive comparison against "aws" which will reject providers like "AWS";
update the check in the function that currently calls
cluster.CloudProvider().ID() to perform a case-insensitive comparison (e.g., use
strings.EqualFold or compare strings.ToLower(cluster.CloudProvider().ID()) ==
"aws"), adjust the error return to include the actual provider ID as before, and
ensure the strings package is imported if not already.
---
Nitpick comments:
In `@cmd/cluster/changevolumetype_test.go`:
- Around line 46-61: The test TestChangeVolumeType_OptionsDefaults is asserting
zero-values on a raw changeVolumeTypeOptions struct instead of the actual CLI
defaults; change it to construct the command via newCmdChangeVolumeType(), then
read the bound flag defaults from that command (e.g., using
cmd.Flags().GetInt/GetBool or
Flags().Lookup("iops"/"throughput"/"dry-run").Value) and assert those values
match expected runtime defaults; keep the test name and validate the iops,
throughput and dry-run flag defaults from the command rather than the raw struct
so it reflects real CLI behavior.
- Around line 7-44: The test TestChangeVolumeType_ValidateTargetType is
tautological because it reimplements validation logic; update it to call the
actual production validation helper (e.g., ValidateVolumeType,
validateTargetType, or the command's Validate method) instead of using the local
validTypes slice and loop. Replace the local loop with calls to the production
validator for each tt.targetType and assert that the returned error presence
matches tt.expectError (use t.Run as-is), ensuring the test imports or
references the real validation function used by the change-volume-type command.
- Around line 83-85: Add an actual assertion that the command enforces required
flags by exercising the Cobra command rather than relying on metadata notes: in
changevolumetype_test.go (e.g., inside TestChangeVolumeTypeCmd or the existing
test block that builds the command via the function that creates the
change-volume-type command) execute the command with required flags omitted and
assert Execute() returns a non-nil error and the error message indicates a
missing/required flag (e.g., contains "required" or "required flag"). This
verifies runtime enforcement of MarkFlagRequired behavior instead of leaving it
as a comment.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4690c43f-bc48-426c-9e6f-66d48a34e24d
📒 Files selected for processing (3)
cmd/cluster/changevolumetype.gocmd/cluster/changevolumetype_test.gocmd/cluster/cmd.go
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/osdctl_cluster_change-volume-type.md`:
- Line 17: Update the user-facing sentence to improve grammar by replacing
"Requires elevated backplane access (--reason flag)" with "Requires backplane
elevation (--reason flag)"; locate the line in
docs/osdctl_cluster_change-volume-type.md that mentions the --reason flag and
perform this exact text swap so the description reads "Requires backplane
elevation (--reason flag)".
- Line 49: The flag description for -t, --target-type currently lists only (gp3,
gp2, io1, io2) but the implementation and validation allow st1 and sc1 as well;
update the documentation text for the -t/--target-type flag to include st1 and
sc1 and ensure it matches the actual validation set used by the changeVolumeType
validation logic (the --target-type flag and its validation list).
- Around line 19-43: The examples and command/help snippets for the osdctl
cluster change-volume-type documentation lack explicit fence languages; update
the example blocks that begin with the sample osdctl cluster change-volume-type
invocations to use ```bash fences and wrap the plain command/flags blocks (the
triple-backtick blocks showing "osdctl cluster change-volume-type --cluster-id
..." and the "### Options" help block) with ```text fences so markdownlint
passes and rendering is clear; locate the blocks around the example invocations
and the command/help snippets in docs/osdctl_cluster_change-volume-type.md and
replace their opening/closing triple backticks with the appropriate
language-tagged fences.
In `@docs/README.md`:
- Line 1411: The README flag help for "-t, --target-type string" is missing st1
and sc1; update the documented target volume types to include "st1" and "sc1" so
it reads something like "(gp3, gp2, io1, io2, st1, sc1)"; also scan for the same
flag/usage string in the CLI implementation (the code that defines the
-t/--target-type flag) and ensure the help text and any validation logic in that
implementation accept st1 and sc1 to keep docs and behavior consistent.
- Around line 1371-1395: The Examples/Flags markdown is breaking lint because
the example commands are not inside a fenced code block and markdown headings
are being parsed; wrap the example lines (the "Examples:" label and both osdctl
command examples) in a fenced block annotated with bash (use ```bash ... ```),
change the single-line usage snippet to a fenced block annotated with text
(```text ... ```), and ensure the Flags section's code fence is properly
opened/closed so the Flags list remains inside its own fenced block; locate the
Examples/Flags content around the existing osdctl cluster change-volume-type
examples and update those three fences accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fbfc4a59-e7c5-4e83-96f0-5a8c81f798de
📒 Files selected for processing (3)
docs/README.mddocs/osdctl_cluster.mddocs/osdctl_cluster_change-volume-type.md
| Examples: | ||
| # Migrate a master volume from io1 to gp3 | ||
| osdctl cluster change-volume-type \ | ||
| --cluster-id 2abc123def456 \ | ||
| --volume-id vol-1234567890abcdef0 \ | ||
| --target-type gp3 \ | ||
| --iops 3000 \ | ||
| --throughput 125 \ | ||
| --reason "SREP-3811 - Master volume cost optimization" | ||
|
|
||
| # Dry run to preview changes | ||
| osdctl cluster change-volume-type \ | ||
| --cluster-id 2abc123def456 \ | ||
| --volume-id vol-1234567890abcdef0 \ | ||
| --target-type gp3 \ | ||
| --reason "SREP-3811" \ | ||
| --dry-run | ||
|
|
||
| ``` | ||
| osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id <volume-id> --target-type <type> [flags] | ||
| ``` | ||
|
|
||
| ### Options | ||
|
|
||
| ``` |
There was a problem hiding this comment.
Wrap examples and command blocks with explicit markdown fence languages.
Current formatting causes markdownlint failures and reduces render clarity. Use bash for examples and text for flag/help blocks.
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 20-20: Headings must start at the beginning of the line
(MD023, heading-start-left)
[warning] 29-29: Headings must start at the beginning of the line
(MD023, heading-start-left)
[warning] 37-37: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
[warning] 41-41: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
[warning] 43-43: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/osdctl_cluster_change-volume-type.md` around lines 19 - 43, The examples
and command/help snippets for the osdctl cluster change-volume-type
documentation lack explicit fence languages; update the example blocks that
begin with the sample osdctl cluster change-volume-type invocations to use
```bash fences and wrap the plain command/flags blocks (the triple-backtick
blocks showing "osdctl cluster change-volume-type --cluster-id ..." and the "###
Options" help block) with ```text fences so markdownlint passes and rendering is
clear; locate the blocks around the example invocations and the command/help
snippets in docs/osdctl_cluster_change-volume-type.md and replace their
opening/closing triple backticks with the appropriate language-tagged fences.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (2)
docs/README.md (1)
1372-1374:⚠️ Potential issue | 🟡 MinorAdd explicit fence languages for the new usage/options blocks.
The new fenced blocks are missing language identifiers, which triggers markdownlint (MD040).
🛠️ Proposed doc fix
-``` +```text osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id <volume-id> --target-type <type> [flags]@@
-+text
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
-C, --cluster-id string Cluster ID (internal ID or external ID)
@@
-t, --target-type string Target volume type (gp2, gp3, io1, io2, st1, sc1)
--throughput int32 Throughput in MB/s (for gp3 only). If not specified, uses AWS defaults.
-v, --volume-id string EBS volume ID (e.g., vol-1234567890abcdef0)Also applies to: 1378-1397
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/README.md` around lines 1372 - 1374, The fenced code blocks showing the osdctl cluster change-volume-type usage and options are missing language identifiers causing markdownlint MD040; update the opening fences for both the usage block (the block containing "osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id <volume-id> --target-type <type> [flags]") and the options block (the block listing flags like "--as string", "--cluster string", "-C, --cluster-id string", "-t, --target-type string", etc.) to include an explicit language (e.g., use ```text) for each fenced block (also apply the same change to the other occurrence mentioned around the second block).docs/osdctl_cluster_change-volume-type.md (1)
20-22:⚠️ Potential issue | 🟡 MinorUse language-tagged fences for usage/examples/options blocks.
These blocks currently fail markdownlint MD040 and render less clearly without explicit fence languages.
🛠️ Proposed doc fix
-``` +```text osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id <volume-id> --target-type <type> [flags]@@
-+bashMigrate a master volume from io1 to gp3 with custom IOPS
osdctl cluster change-volume-type
@@
osdctl cluster change-volume-type
--cluster-id 2abc123def456
--volume-id vol-1234567890abcdef0
--target-type gp3
--reason "SREP-3811"
--dry-run@@ -``` +```text -C, --cluster-id string Cluster ID (internal ID or external ID) --dry-run Dry run - show what would be changed without executing @@ -t, --target-type string Target volume type (gp2, gp3, io1, io2, st1, sc1) --throughput int32 Throughput in MB/s (for gp3 only). If not specified, uses AWS defaults. -v, --volume-id string EBS volume ID (e.g., vol-1234567890abcdef0)@@
-+text
--as string Username to impersonate for the operation. User could be a regular user or a service account in a namespace.
--cluster string The name of the kubeconfig cluster to use
@@
--skip-aws-proxy-check aws_proxy Don't use the configured aws_proxy value
-S, --skip-version-check skip checking to see if this is the most recent releaseAlso applies to: 26-50, 54-63, 67-78
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/osdctl_cluster_change-volume-type.md` around lines 20 - 22, Update the code fences for the usage, examples and options blocks of the osdctl cluster change-volume-type documentation so they include appropriate language tags (e.g., use ```bash for command examples and ```text for plain output/option lists) — locate the blocks containing the usage line "osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id <volume-id> --target-type <type> [flags]" and the example commands shown with backslashes and the options list (flags like --dry-run, --target-type, --throughput, --volume-id, --as, --cluster, --skip-aws-proxy-check, --skip-version-check) and replace the plain triple-backtick fences with the correct fenced language markers (bash or text) for those blocks across the file.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cluster/changevolumetype.go`:
- Around line 112-123: Add explicit validation that rejects unsupported
IOPS/throughput flag combinations instead of silently omitting them: in the
change volume type command where targetType is validated (references:
o.targetType, validTypes slice) and in the later block handling performance
params (references: o.Iops, o.Throughput), check the chosen targetType’s allowed
parameters (e.g., gp3 supports both iops and throughput, gp2/io1/io2 have
specific rules) and return an error when a user supplies --iops or --throughput
that are not valid for the selected o.targetType; ensure the error is returned
before building the API request so dry-run and real runs both fail loudly with a
descriptive message.
- Around line 183-201: The current ownership check (loop over volume.Tags using
strings.Contains on *tag.Key and *tag.Value) is too permissive; update the logic
in changevolumetype.go (where infraID := cluster.InfraID(), clusterOwned
variable and the loop over volume.Tags) to perform exact matching: require
*tag.Key == "kubernetes.io/cluster/"+infraID for the cluster tag and require
exact equality on the Name tag value (*tag.Key == "Name" && *tag.Value ==
infraID) (or another strict equality rule your project prefers) instead of
strings.Contains, so only resources with precise matching tags are considered
cluster-owned.
---
Duplicate comments:
In `@docs/osdctl_cluster_change-volume-type.md`:
- Around line 20-22: Update the code fences for the usage, examples and options
blocks of the osdctl cluster change-volume-type documentation so they include
appropriate language tags (e.g., use ```bash for command examples and ```text
for plain output/option lists) — locate the blocks containing the usage line
"osdctl cluster change-volume-type --cluster-id <cluster-id> --volume-id
<volume-id> --target-type <type> [flags]" and the example commands shown with
backslashes and the options list (flags like --dry-run, --target-type,
--throughput, --volume-id, --as, --cluster, --skip-aws-proxy-check,
--skip-version-check) and replace the plain triple-backtick fences with the
correct fenced language markers (bash or text) for those blocks across the file.
In `@docs/README.md`:
- Around line 1372-1374: The fenced code blocks showing the osdctl cluster
change-volume-type usage and options are missing language identifiers causing
markdownlint MD040; update the opening fences for both the usage block (the
block containing "osdctl cluster change-volume-type --cluster-id <cluster-id>
--volume-id <volume-id> --target-type <type> [flags]") and the options block
(the block listing flags like "--as string", "--cluster string", "-C,
--cluster-id string", "-t, --target-type string", etc.) to include an explicit
language (e.g., use ```text) for each fenced block (also apply the same change
to the other occurrence mentioned around the second block).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 80cb2fe3-19f2-4dba-a57b-0e4e541f66d0
📒 Files selected for processing (4)
cmd/cluster/changevolumetype.gocmd/cluster/changevolumetype_test.godocs/README.mddocs/osdctl_cluster_change-volume-type.md
🚧 Files skipped from review as they are similar to previous changes (1)
- cmd/cluster/changevolumetype_test.go
cmd/cluster/changevolumetype.go
Outdated
| // Validate volume belongs to this cluster | ||
| clusterOwned := false | ||
| infraID := cluster.InfraID() | ||
| for _, tag := range volume.Tags { | ||
| if tag.Key != nil && tag.Value != nil { | ||
| // Check for kubernetes.io/cluster/<infra-id> tag | ||
| if strings.HasPrefix(*tag.Key, "kubernetes.io/cluster/") && strings.Contains(*tag.Key, infraID) { | ||
| clusterOwned = true | ||
| break | ||
| } | ||
| // Also check for Name tag containing infra ID | ||
| if *tag.Key == "Name" && strings.Contains(*tag.Value, infraID) { | ||
| clusterOwned = true | ||
| break | ||
| } | ||
| } | ||
| } | ||
| if !clusterOwned { | ||
| return fmt.Errorf("volume %s does not belong to cluster %s (infra ID: %s). Refusing to modify volume that may belong to another cluster", o.volumeID, cluster.ID(), infraID) |
There was a problem hiding this comment.
Harden volume ownership validation with exact tag matching.
Using strings.Contains on tag key/value can match unrelated resources that happen to include the infra ID substring. For a mutating operation, this check should be strict to avoid cross-cluster volume modification risk.
🛠️ Proposed ownership-check hardening
// Validate volume belongs to this cluster
clusterOwned := false
infraID := cluster.InfraID()
+ expectedClusterTagKey := "kubernetes.io/cluster/" + infraID
for _, tag := range volume.Tags {
if tag.Key != nil && tag.Value != nil {
- // Check for kubernetes.io/cluster/<infra-id> tag
- if strings.HasPrefix(*tag.Key, "kubernetes.io/cluster/") && strings.Contains(*tag.Key, infraID) {
+ // Prefer exact ownership tag match
+ if *tag.Key == expectedClusterTagKey {
clusterOwned = true
break
}
- // Also check for Name tag containing infra ID
- if *tag.Key == "Name" && strings.Contains(*tag.Value, infraID) {
+ // Optional fallback: tighten Name tag match pattern
+ if *tag.Key == "Name" && strings.HasPrefix(*tag.Value, infraID+"-") {
clusterOwned = true
break
}
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/cluster/changevolumetype.go` around lines 183 - 201, The current
ownership check (loop over volume.Tags using strings.Contains on *tag.Key and
*tag.Value) is too permissive; update the logic in changevolumetype.go (where
infraID := cluster.InfraID(), clusterOwned variable and the loop over
volume.Tags) to perform exact matching: require *tag.Key ==
"kubernetes.io/cluster/"+infraID for the cluster tag and require exact equality
on the Name tag value (*tag.Key == "Name" && *tag.Value == infraID) (or another
strict equality rule your project prefers) instead of strings.Contains, so only
resources with precise matching tags are considered cluster-owned.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cluster/changevolumetype.go`:
- Line 90: The CLI flag ops.reason is required but never passed into the
elevation/auth flow; update the run() path to forward the reason into the cloud
config/elevation call: when invoking osdCloud.CreateAWSV2Config(connection,
cluster) (and any intermediate helper that prepares elevation), add a reason
parameter (e.g., pass ops.reason) and update the CreateAWSV2Config signature and
callers to accept and propagate that reason so the AWS elevation call receives
the audit reason; alternatively, if you prefer not to enforce it yet, remove the
required flag declaration for --reason (cmd.Flags().StringVar(&ops.reason,
"reason", ...)) so it is optional until wired.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a275a1db-9123-45de-8fd1-9abe963bc753
📒 Files selected for processing (1)
cmd/cluster/changevolumetype.go
| cmd.Flags().StringVarP(&ops.clusterID, "cluster-id", "C", "", "Cluster ID (internal ID or external ID)") | ||
| cmd.Flags().StringVarP(&ops.volumeID, "volume-id", "v", "", "EBS volume ID (e.g., vol-1234567890abcdef0)") | ||
| cmd.Flags().StringVarP(&ops.targetType, "target-type", "t", "", "Target volume type (gp2, gp3, io1, io2, st1, sc1)") | ||
| cmd.Flags().StringVar(&ops.reason, "reason", "", "Reason for elevation (OHSS/PD/JIRA ticket)") |
There was a problem hiding this comment.
--reason is required but not used in the execution path.
Line 90/Line 101 require an elevation reason, but o.reason is never consumed in run(). The current AWS setup call at Line 172 (osdCloud.CreateAWSV2Config(connection, cluster)) also has no reason input (see pkg/osdCloud/aws.go:183-197), so audit/elevation attribution is not actually enforced by this command path. Please wire --reason into the elevation/auth flow (or stop requiring it until it is wired).
Also applies to: 101-101, 106-176
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/cluster/changevolumetype.go` at line 90, The CLI flag ops.reason is
required but never passed into the elevation/auth flow; update the run() path to
forward the reason into the cloud config/elevation call: when invoking
osdCloud.CreateAWSV2Config(connection, cluster) (and any intermediate helper
that prepares elevation), add a reason parameter (e.g., pass ops.reason) and
update the CreateAWSV2Config signature and callers to accept and propagate that
reason so the AWS elevation call receives the audit reason; alternatively, if
you prefer not to enforce it yet, remove the required flag declaration for
--reason (cmd.Flags().StringVar(&ops.reason, "reason", ...)) so it is optional
until wired.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
cmd/cluster/changevolumetype.go (2)
156-176:⚠️ Potential issue | 🟠 Major
--reasonremains display-only, not enforced in elevation/auth flow.
o.reasonis printed, but the execution path still callsosdCloud.CreateAWSV2Config(connection, cluster)without carrying reason metadata, so backend audit attribution is not guaranteed.Suggested direction
- cfg, err := osdCloud.CreateAWSV2Config(connection, cluster) + cfg, err := osdCloud.CreateAWSV2Config(connection, cluster, o.reason)And update
pkg/osdCloud/aws.go+ underlying backplane query config plumbing to propagate reason through elevation/auth calls.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cluster/changevolumetype.go` around lines 156 - 176, The CLI only prints o.reason but does not propagate it into the elevation/auth flow; modify the call site and plumbing so the reason is carried into backend auth: change the CreateAWSV2Config call in changevolumetype (where o.reason is printed) to pass the reason (e.g., CreateAWSV2Config(connection, cluster, o.reason)), update the signature of osdCloud.CreateAWSV2Config and its implementations in pkg/osdCloud/aws.go to accept and forward the reason string through any elevation/auth request builders, and ensure the backplane/elevation request metadata includes that reason so audit attribution is preserved.
193-212:⚠️ Potential issue | 🔴 CriticalOwnership fallback is still too permissive for a mutating operation.
Allowing
Nametag prefix (infraID + "-") can still match unrelated resources. For cross-cluster safety, rely on strict ownership tag matching only.Safer ownership check
clusterOwned := false infraID := cluster.InfraID() expectedClusterTagKey := "kubernetes.io/cluster/" + infraID for _, tag := range volume.Tags { if tag.Key != nil && tag.Value != nil { - // Exact match on kubernetes.io/cluster/<infra-id> ownership tag - if *tag.Key == expectedClusterTagKey { + // Exact ownership tag key + expected ownership values + if *tag.Key == expectedClusterTagKey && (*tag.Value == "owned" || *tag.Value == "shared") { clusterOwned = true break } - // Fallback: Name tag with infra ID prefix - if *tag.Key == "Name" && strings.HasPrefix(*tag.Value, infraID+"-") { - clusterOwned = true - break - } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/cluster/changevolumetype.go` around lines 193 - 212, The ownership check is too permissive because it accepts a Name tag prefix match; update the logic in the block that computes clusterOwned (which uses cluster.InfraID(), expectedClusterTagKey, and iterates volume.Tags) to only set clusterOwned when an exact match on the kubernetes.io/cluster/<infra-id> tag key is found and remove the fallback that checks for *tag.Key == "Name" and strings.HasPrefix(*tag.Value, infraID+"-"); keep the existing error return using o.volumeID, cluster.ID(), and infraID when clusterOwned is false.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/cluster/changevolumetype.go`:
- Around line 321-355: The loop currently treats an empty
modOutput.VolumesModifications by breaking out and ultimately triggering a
timeout; instead handle transient empty lists by continuing to poll: in the loop
that reads modOutput.VolumesModifications, replace the break when
len(modOutput.VolumesModifications) == 0 with a continue (optionally logging
that the list was empty and sleeping/polling again), so you don't prematurely
exit before latestMod/ModificationState checks (references:
modOutput.VolumesModifications, latestMod, ModificationState, completed,
VolumeModificationStateCompleted/Failed).
---
Duplicate comments:
In `@cmd/cluster/changevolumetype.go`:
- Around line 156-176: The CLI only prints o.reason but does not propagate it
into the elevation/auth flow; modify the call site and plumbing so the reason is
carried into backend auth: change the CreateAWSV2Config call in changevolumetype
(where o.reason is printed) to pass the reason (e.g.,
CreateAWSV2Config(connection, cluster, o.reason)), update the signature of
osdCloud.CreateAWSV2Config and its implementations in pkg/osdCloud/aws.go to
accept and forward the reason string through any elevation/auth request
builders, and ensure the backplane/elevation request metadata includes that
reason so audit attribution is preserved.
- Around line 193-212: The ownership check is too permissive because it accepts
a Name tag prefix match; update the logic in the block that computes
clusterOwned (which uses cluster.InfraID(), expectedClusterTagKey, and iterates
volume.Tags) to only set clusterOwned when an exact match on the
kubernetes.io/cluster/<infra-id> tag key is found and remove the fallback that
checks for *tag.Key == "Name" and strings.HasPrefix(*tag.Value, infraID+"-");
keep the existing error return using o.volumeID, cluster.ID(), and infraID when
clusterOwned is false.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 998dc627-0235-49e7-ba3b-fbef2232e71e
📒 Files selected for processing (1)
cmd/cluster/changevolumetype.go
Add a new osdctl cluster subcommand to change EBS volume types (e.g., io1 to gp3) for AWS cluster volumes without downtime. Features: - Online volume modification using AWS Elastic Volumes API - Supports all EBS types: gp2, gp3, io1, io2, st1, sc1 - Dry-run mode for previewing changes - Real-time progress monitoring with 60-minute timeout - Strict volume ownership validation via kubernetes.io/cluster tags - Early rejection of invalid --iops/--throughput combinations - Elevation reason tracking for SRE audit trail Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@charlesgong: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| // newCmdChangeVolumeType creates the cobra command for changing an EBS volume type | ||
| // on an AWS cluster. It registers required flags (cluster-id, volume-id, target-type, reason) | ||
| // and optional performance flags (iops, throughput, dry-run). | ||
| func newCmdChangeVolumeType() *cobra.Command { |
There was a problem hiding this comment.
This was not the intent of the work prescribed in SREP-3811. Since we use CPMS, we can just change the root volume type and let CPMS rollout new nodes with that type, vs doing an in-place migration.
This commit adds a new 'cluster change-volume-type' command to osdctl to support migrating EBS volumes from one type to another (e.g., io1 to gp3) for cost optimization and performance tuning.
Features:
This command is particularly useful for SREP-3811 - migrating master node volumes from io1 to gp3 for cost optimization while maintaining performance.
Examples:
osdctl cluster change-volume-type
--cluster-id 2abc123def456
--volume-id vol-1234567890abcdef0
--target-type gp3
--iops 3000
--throughput 125
--reason "SREP-3811 - Master volume cost optimization"
JIRA: SREP-3811
Summary by CodeRabbit