Heat stuck at bastion #397

@ghost

Description

Hi Everybody,

I am trying to deploy OCP 3.5 (I also tried 3.7) on Red Hat OSP 11.
When I run the Heat script, the stack is created, all the necessary networks are created, and the bastion is created and goes through the usual cloud-init provisioning steps (adding repos, updating, installing basic packages). cloud-init then sends the finished signal and gets an HTTP 200.

After that, it gets stuck at:

$> openstack stack resource list -n 2 ocp2 | grep -i progress
| bastion_host                    | 98bd1fee-87c3-4360-bd4b-549e39d1345e | file:///Users/myself/projects/openshift-on-openstack/bastion.yaml                                              | CREATE_IN_PROGRESS | 2017-12-21T16:00:41Z | ocp2                                                     |
| deployment_write_templates      | c8be1435-3125-4e06-8234-b620dd556fa8 | OS::Heat::SoftwareDeployment                                                                                        | CREATE_IN_PROGRESS | 2017-12-21T16:01:12Z | ocp2-bastion_host-n4vsl5fz4maw                           |
| deployment_update_node_count    | 79327e5c-579d-4a95-a0b4-e93c52385afd | OS::Heat::SoftwareDeployment                                                                                        | CREATE_IN_PROGRESS | 2017-12-21T16:01:12Z | ocp2-bastion_host-n4vsl5fz4maw                           |
| deployment_tune_ansible         | a705f997-3cf0-44aa-90f1-af21e3a23ca1 | OS::Heat::SoftwareDeployment

If I force the signal with openstack heat resource signal ..., it goes on to the next step, but I see that the Ansible template isn't created and the files that are usually pushed aren't present.
The /etc/os-collect-config.conf points to the correct endpoint:

$> cat /etc/os-collect-config.conf
[DEFAULT]
command = os-refresh-config
collectors = ec2
collectors = cfn
collectors = local

[cfn]
metadata_url = https://10.1.3.11:13005/v1/
stack_name = ocp2-bastion_host-n4vsl5fz4maw
secret_access_key = 7e7214750d1a48c9a4cad81010fe2173
access_key_id = 494ab1ed83b441168423aec7d868267c
path = host.Metadata

$> openstack endpoint list | grep heat
| 1b24a4cf65a74e38992c4d8230a6e7da | regionOne | heat-cfn     | cloudformation | True    | internal  | http://172.17.1.16:8000/v1               |
| 2f666c5f3f25445682d8cc6ca51f9488 | regionOne | heat         | orchestration  | True    | admin     | http://172.17.1.16:8004/v1/%(tenant_id)s |
| 557a1fc9ff2549a8bc142bd305ac26bb | regionOne | heat-cfn     | cloudformation | True    | public    | https://10.1.3.11:13005/v1               |
| 622df692e35b424b93cd24f54c577df4 | regionOne | heat         | orchestration  | True    | public    | https://10.1.3.11:13004/v1/%(tenant_id)s |
| da4ed879390b4b6c9d97e114aa011f49 | regionOne | heat         | orchestration  | True    | internal  | http://172.17.1.16:8004/v1/%(tenant_id)s |
| fba19a090ed6437f86513a91e9cdc0ba | regionOne | heat-cfn     | cloudformation | True    | admin     | http://172.17.1.16:8000/v1
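
One thing that might be worth checking from the bastion itself is whether the metadata_url in the [cfn] section is actually reachable over HTTPS, since os-collect-config polls that URL to fetch the software deployments. The following is only a rough sketch: the function names get_metadata_url and probe_metadata_url are made up for illustration, and it assumes awk and curl are installed on the bastion. It runs curl without -k on purpose, so that a certificate problem on the public endpoint would show up as an error instead of being hidden.

```shell
# Hypothetical helper: print the metadata_url value from the [cfn] section
# of an os-collect-config style INI file passed as the first argument.
get_metadata_url() {
  awk -F' *= *' '
    /^\[/ { section = $0 }                                  # remember the current section header
    section == "[cfn]" && $1 == "metadata_url" { print $2 } # print only the [cfn] value
  ' "$1"
}

# Hypothetical helper: send a plain GET to that URL and print the HTTP status.
# The config path defaults to /etc/os-collect-config.conf.
probe_metadata_url() {
  url=$(get_metadata_url "${1:-/etc/os-collect-config.conf}")
  curl --silent --show-error --max-time 10 --output /dev/null \
       --write-out '%{http_code}\n' "$url"
}
```

Running probe_metadata_url on the bastion should print some HTTP status code if the endpoint is reachable. The request is unauthenticated, so an error status would not be surprising and by itself probably does not indicate a problem. A timeout or a certificate verification error from curl, on the other hand, would suggest that the bastion cannot talk to the public heat-cfn endpoint.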

After a few hours, it times out and the stack fails.

Does anyone have a clue why?

Thanks a lot for your support
P.

parameters.yaml

parameters:
  ssh_key_name: myself
  bastion_image: rhel-guest-image-7.2-20160302.0.x86_64
  bastion_flavor: m1.medium
  master_image: rhel-guest-image-7.2-20160302.0.x86_64
  master_flavor: m1.medium
  infra_image: rhel-atomic-cloud-7.2-10.x86_64
  infra_flavor: m1.medium
  node_image: rhel-atomic-cloud-7.2-10.x86_64
  node_flavor: m1.medium
  loadbalancer_image: rhel-atomic-cloud-7.2-10.x86_64
  loadbalancer_flavor: m1.medium
  ocp_version: 3.5
  osp_version: 11

  external_network: internet_access
  container_subnet: 192.168.1.0/24
  loadbalancer_type: neutron

  dns_nameserver: 8.8.4.4,8.8.8.8
  node_count: 2

  rhn_username: ""
  rhn_password: "."
  rhn_pool: ""
  extra_rhn_pools: ""
  deployment_type: openshift-enterprise
  domain_name: "example.com"
  master_hostname: "openshift-master"
  node_hostname: "openshift-node"
  ssh_user: cloud-user
  master_docker_volume_size_gb: 25
  infra_docker_volume_size_gb: 25
  node_docker_volume_size_gb: 25

  system_update: false

resource_registry:
  #OOShift::LoadBalancer: ../openshift-on-openstack/loadbalancer_dedicated.yaml
  OOShift::LoadBalancer: ../openshift-on-openstack/loadbalancer_neutron.yaml
  OOShift::ContainerPort: ../openshift-on-openstack/sdn_openshift_sdn.yaml
  OOShift::IPFailover: ../openshift-on-openstack/ipfailover_keepalived.yaml
  OOShift::DockerVolume: ../openshift-on-openstack/volume_docker.yaml
  OOShift::DockerVolumeAttachment: ../openshift-on-openstack/volume_attachment_docker.yaml
  OOShift::RegistryVolume: ../openshift-on-openstack/registry_ephemeral.yaml
