chore: improve logic in 'openshift-CI-kuttl-tests.sh' #1091

Open
jgwest wants to merge 1 commit into redhat-developer:master from jgwest:race-condition-in-kuttl-script-feb-2026

Conversation

@jgwest
Member

@jgwest jgwest commented Feb 26, 2026

What type of PR is this?
/kind failing-test

What does this PR do / why we need it:

  • The logic in 'openshift-CI-kuttl-tests.sh' relies on a hardcoded sleep (60s in the example below), which may not be long enough for the ArgoCD CR to be reconciled and the necessary Argo CD pods to be created.
  • Instead, the script should first wait for all 4 pods to exist, then use oc wait to wait for them to become ready.
  • I also added diagnostic logic that runs when the script fails.
  • Note: the name of the script references kuttl, but this is a misnomer. No kuttl tests are actually invoked.
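
For illustration, the approach described in the bullets above could be sketched roughly as follows. This is a minimal sketch, not the actual change in this PR: the function names, the 300s existence timeout, the 5s poll interval, and the specific diagnostic commands are assumptions. Only the namespace, the label selector, and the oc wait invocation are taken from the failure log.

```shell
# Sketch only: helper names and timeouts below are hypothetical.
NAMESPACE="${NAMESPACE:-test-argocd}"
SELECTOR='app.kubernetes.io/name in (argocd-application-controller,argocd-redis,argocd-repo-server,argocd-server)'

# Print how many pods currently match the selector (0 if none exist yet).
count_pods() {
  oc get pods -n "$NAMESPACE" -l "$SELECTOR" --no-headers 2>/dev/null | grep -c . || true
}

# Poll until at least $1 pods exist; give up after $2 seconds,
# checking every $3 seconds (default 5).
wait_for_pods_to_exist() {
  local expected="$1"
  local timeout="$2"
  local interval="${3:-5}"
  local elapsed=0
  while [ "$(count_pods)" -lt "$expected" ]; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for ${expected} pods" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Dump cluster state to stderr to help debug a failed run (assumed commands).
print_diagnostics() {
  echo "---- diagnostics for namespace ${NAMESPACE} ----" >&2
  oc get argocd -n "$NAMESPACE" -o yaml >&2 || true
  oc get pods -n "$NAMESPACE" -o wide >&2 || true
  oc get events -n "$NAMESPACE" --sort-by=.lastTimestamp >&2 || true
}

# Wait for the 4 pods to exist first, and only then hand off to oc wait,
# which fails immediately if nothing matches the selector.
wait_for_argocd_pods() {
  wait_for_pods_to_exist 4 300 5 || { print_diagnostics; return 1; }
  oc wait --for=condition=Ready -n "$NAMESPACE" pod --timeout=15m -l "$SELECTOR" \
    || { print_diagnostics; return 1; }
}
```

In a script like this, wait_for_argocd_pods would be called right after the oc apply step, in place of the fixed sleep.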

Here is an example of the failure case this PR addresses:

+ oc apply -f -
Warning: ArgoCD v1alpha1 version is deprecated and will be converted to v1beta1 automatically. Moving forward, please use v1beta1 as the ArgoCD API version.
argocd.argoproj.io/argocd created
+ sleep 60s
+ oc get pods -n test-argocd
No resources found in test-argocd namespace.
+ oc wait --for=condition=Ready -n test-argocd pod --timeout=15m -l 'app.kubernetes.io/name in (argocd-application-controller,argocd-redis,argocd-repo-server,argocd-server)'
error: no matching resources found

Notice that after the sleep statement there are still no pods, so oc wait fails immediately (it requires the pods to exist) rather than waiting up to 15 minutes.

Signed-off-by: Jonathan West <jgwest@gmail.com>
@openshift-ci openshift-ci bot added the kind/failing-test Categorizes issue or PR as related to a frequently failing test. label Feb 26, 2026
@openshift-ci

openshift-ci bot commented Feb 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign svghadi for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jgwest jgwest changed the title from "chore: improve racy logic in 'openshift-CI-kuttl-tests.sh'" to "chore: improve logic in 'openshift-CI-kuttl-tests.sh'" on Feb 26, 2026
@openshift-ci

openshift-ci bot commented Feb 26, 2026

@jgwest: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/v4.14-kuttl-sequential | 270fa7f | link | false | /test v4.14-kuttl-sequential |
| ci/prow/v4.14-kuttl-parallel | 270fa7f | link | false | /test v4.14-kuttl-parallel |
| ci/prow/v4.14-e2e | 270fa7f | link | true | /test v4.14-e2e |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
