@rabi rabi commented Feb 10, 2026

For pre-provisioned dataplane nodes, ansibleHost must be a valid IP address. This is required because the controller defaults the ctlplane network fixedIP from ansibleHost during IPAM reservation, ensuring the reserved IP matches the already-configured node interface. Without a valid IP, IPAM could reserve a different address and break connectivity during deployment.

The validating webhook enforces that for pre-provisioned nodes:

  • ansibleHost is not empty
  • ansibleHost is a valid IP address
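As a rough illustration of the two checks listed above, here is a minimal Go sketch. The function name `validateAnsibleHost` and its signature are hypothetical and are not taken from the operator's webhook code; only `net.ParseIP` from the standard library is a real API.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// validateAnsibleHost is a hypothetical helper (not the operator's actual
// webhook function). For pre-provisioned nodes it rejects an empty
// ansibleHost and any value that does not parse as an IPv4 or IPv6 address.
func validateAnsibleHost(preProvisioned bool, ansibleHost string) error {
	if !preProvisioned {
		// The checks described above apply only to pre-provisioned nodes.
		return nil
	}
	if ansibleHost == "" {
		return errors.New("ansibleHost must not be empty for pre-provisioned nodes")
	}
	if net.ParseIP(ansibleHost) == nil {
		return fmt.Errorf("ansibleHost %q is not a valid IP address", ansibleHost)
	}
	return nil
}

func main() {
	// An IP address passes; a hostname and an empty value are rejected.
	for _, host := range []string{"192.168.51.31", "compute01.srv.example.com", ""} {
		fmt.Printf("%q -> %v\n", host, validateAnsibleHost(true, host))
	}
}
```

Under this sketch a hostname such as `compute01.srv.example.com` is rejected, since it cannot be used directly as a fixedIP.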

The controller (ipam.go) defaults the ctlplane fixedIP from ansibleHost when not already set. The ctlplane network is identified using the netServiceNetMap from NetConfig, so it works regardless of the network name.
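The defaulting step described above could be sketched roughly as follows. The `Network` type, the `defaultCtlplaneFixedIP` function, and the shape of `netServiceNetMap` (network name mapped to service network name) are assumptions made for illustration; the actual code in ipam.go may differ.

```go
package main

import "fmt"

// Network is a simplified, hypothetical stand-in for a node network entry.
type Network struct {
	Name    string
	FixedIP string
}

// defaultCtlplaneFixedIP sets FixedIP on the ctlplane network to ansibleHost
// when no fixedIP is already set. The ctlplane network is found through
// netServiceNetMap (assumed here to map network name -> service network),
// so the network does not have to be literally named "ctlplane".
func defaultCtlplaneFixedIP(networks []Network, ansibleHost string, netServiceNetMap map[string]string) {
	for i := range networks {
		if netServiceNetMap[networks[i].Name] != "ctlplane" {
			continue
		}
		if networks[i].FixedIP == "" {
			networks[i].FixedIP = ansibleHost
		}
		return
	}
}

func main() {
	// Hypothetical mapping in which the ctlplane service network is named "mgmt".
	netMap := map[string]string{"mgmt": "ctlplane", "internalapi": "internalapi"}

	unset := []Network{{Name: "mgmt"}, {Name: "internalapi"}}
	defaultCtlplaneFixedIP(unset, "192.168.122.100", netMap)
	fmt.Printf("%+v\n", unset)

	// An explicitly configured fixedIP is left untouched.
	preset := []Network{{Name: "mgmt", FixedIP: "172.22.0.110"}}
	defaultCtlplaneFixedIP(preset, "192.168.122.100", netMap)
	fmt.Printf("%+v\n", preset)
}
```

Note that in this sketch only the ctlplane entry is defaulted; other networks without a fixedIP are not modified.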

@openshift-ci openshift-ci bot requested review from fultonj and slagle February 10, 2026 06:45
openshift-ci bot commented Feb 10, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rabi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

rabi commented Feb 10, 2026

Probably a flaky test?

        echo "=== Checking global ApplicationCredential is enabled ==="
        global_enabled=$(oc get openstackcontrolplane openstack -n "$NS" -o jsonpath='{.spec.applicationCredential.enabled}')
        if [ "$global_enabled" != "true" ]; then
          echo "ERROR: OpenStackControlPlane.spec.applicationCredential.enabled expected 'true', got '$global_enabled'"
          exit 1
        fi
        echo "✓ OpenStackControlPlane.spec.applicationCredential.enabled = true"
        echo

rabi commented Feb 10, 2026

/test openstack-operator-build-deploy-kuttl-4-18

@rabi rabi force-pushed the ctlplane_fixed_ip branch from 0ed8bf0 to 496f239 Compare February 10, 2026 13:38
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/27c287f32bb74b7e8ee5a39af21bb77a

openstack-k8s-operators-content-provider NODE_FAILURE Node request 100-0008160556 failed in 0s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

… nodes

For pre-provisioned dataplane nodes, ansibleHost must be a valid IP
address. This is required because the controller defaults the ctlplane
network fixedIP from ansibleHost during IPAM reservation, ensuring the
reserved IP matches the already-configured node interface. Without a
valid IP, IPAM could reserve a different address and break connectivity
during deployment.

The validating webhook enforces that for pre-provisioned nodes:
- ansibleHost is not empty
- ansibleHost is a valid IP address

The controller (ipam.go) defaults the ctlplane fixedIP from ansibleHost
when not already set. The ctlplane network is identified using the
netServiceNetMap from NetConfig, so it works regardless of the network
name.

Signed-off-by: rabi <ramishra@redhat.com>
@rabi rabi force-pushed the ctlplane_fixed_ip branch from 496f239 to 3f6c604 Compare February 10, 2026 13:41
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/4f8823cf30a84e3fa7b102deeb1cf5c8

openstack-k8s-operators-content-provider NODE_FAILURE Node request 100-0008160605 failed in 0s
⚠️ podified-multinode-edpm-deployment-crc SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ cifmw-crc-podified-edpm-baremetal SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider
⚠️ openstack-operator-tempest-multinode SKIPPED Skipped due to failed job openstack-k8s-operators-content-provider

jdandrea commented Feb 10, 2026

Hi @rabi - thanks for filing this PR!

About this part:

The validating webhook enforces that for pre-provisioned nodes:

  • ansibleHost is not empty
  • ansibleHost is a valid IP address

The controller (ipam.go) defaults the ctlplane fixedIP from ansibleHost when not already set. The ctlplane network is identified using the netServiceNetMap from NetConfig, so it works regardless of the network name.

Will this still catch the problem in our case?

Here's what we used to have:

nodes:
    edpm-compute-1:
      hostName: compute01.srv.example.com
      ansible:
        ansibleHost: 192.168.51.31
        ansibleUser: root
      networks:
      - name: ctlplane
        subnetName: subnet1
        defaultRoute: false
        fixedIP: 172.22.0.110
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: external
        subnetName: subnet1
    edpm-compute-2:
      hostName: compute02.srv.example.com
      ansible:
        ansibleHost: 192.168.51.32
        ansibleUser: root
      networks:
      - name: ctlplane
        subnetName: subnet1
        defaultRoute: false
        fixedIP: 172.22.0.111
      - name: internalapi
        subnetName: subnet1
      - name: storage
        subnetName: subnet1
      - name: tenant
        subnetName: subnet1
      - name: external
        subnetName: subnet1

Now we are using this:

nodes:
    edpm-compute-1:
      hostName: compute01.srv.example.com
      ansible:
        ansibleHost: 192.168.51.31
        ansibleUser: root
      networks:
      - name: ctlplane
        subnetName: subnet1
        defaultRoute: false
        fixedIP: 172.22.0.110
      - name: internalapi
        subnetName: subnet1
        fixedIP: 172.17.0.110
      - name: storage
        subnetName: subnet1
        fixedIP: 172.18.0.110
      - name: tenant
        subnetName: subnet1
        fixedIP: 172.19.0.110
      - name: external
        subnetName: subnet1
        fixedIP: 192.168.51.31
    edpm-compute-2:
      hostName: compute02.srv.example.com
      ansible:
        ansibleHost: 192.168.51.32
        ansibleUser: root
      networks:
      - name: ctlplane
        subnetName: subnet1
        defaultRoute: false
        fixedIP: 172.22.0.111
      - name: internalapi
        subnetName: subnet1
        fixedIP: 172.17.0.111
      - name: storage
        subnetName: subnet1
        fixedIP: 172.18.0.111
      - name: tenant
        subnetName: subnet1
        fixedIP: 172.19.0.111
      - name: external
        subnetName: subnet1
        fixedIP: 192.168.51.32

We also have this (unchanged):

        edpm_network_config_os_net_config_mappings:
          edpm-compute-1:
            nic1: "52:54:00:03:00:6e"
            nic2: "52:54:00:02:33:1f"
          edpm-compute-2:
            nic1: "52:54:00:03:00:6f"
            nic2: "52:54:00:02:33:20"

Using fixedIP in all of those spots seems to do the trick so far. I just want to be sure that this PR will prevent others from running into the same issue we did (with 192.168.51.31 and 192.168.51.32 sometimes appearing to swap across the compute nodes).

I can share more of our osdpns via PM if that helps. Thanks again.
