
KVM cluster with NFS primary storage – VM HA not working when host is powered down #11627

@akoskuczi-bw

Description

Problem

In a KVM cluster with NFS primary storage, VM HA does not work when a host is powered down.

  • The host status transitions to Down and its HA state shows Fenced.
  • VMs from the powered-down host are not restarted on other available hosts in the cluster.
  • Both Host HA and VM HA are enabled.
  • Out-of-band (OOB) management driver: IPMI.

Expected behavior

VMs from the failed host should be restarted on other available hosts in the cluster.

Actual behavior

  • The host goes to Down and its HA state becomes Fenced.
  • VMs are not restarted on other hosts.
  • The management server log shows a NoTransitionException.

Relevant log snippet

WARN [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-4:[ctx-c2bf501d]) (logid:96e12771) Unable to find next HA state for current HA state=[Fenced] for event=[Ineligible] for host Host {"id":4,"name":"csh-1-2.clab.run","type":"Routing","uuid":"f8f86177-f0e3-4994-8609-dd55e0e35a3e"} with id 4. com.cloud.utils.fsm.NoTransitionException: Unable to transition to a new state from Fenced via Ineligible
at com.cloud.utils.fsm.StateMachine2.getTransition(StateMachine2.java:108)
at com.cloud.utils.fsm.StateMachine2.getNextState(StateMachine2.java:94)
at org.apache.cloudstack.ha.HAManagerImpl.transitionHAState(HAManagerImpl.java:153)
at org.apache.cloudstack.ha.HAManagerImpl.validateAndFindHAProvider(HAManagerImpl.java:233)
at org.apache.cloudstack.ha.HAManagerImpl$HAManagerBgPollTask.runInContext(HAManagerImpl.java:665)
at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
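
The warning comes from the HA finite state machine: as the stack trace shows, StateMachine2 resolves the next HA state from a (current state, event) transition table and raises NoTransitionException when no edge is registered for that pair. The sketch below is not the actual HAManagerImpl code; only the Fenced state and the Ineligible event are taken from the log, the other state and event names and the edges are illustrative. It just shows the mechanism: with no transition defined out of Fenced for Ineligible, each background poll hits the same dead end.

import java.util.HashMap;
import java.util.Map;

public class HaFsmSketch {
    // Illustrative subset of HA states and events; only Fenced and Ineligible
    // are taken from the log above.
    enum HaState { Available, Suspect, Fencing, Fenced, Ineligible }
    enum HaEvent { Eligible, Ineligible, HealthCheckFailed, Fenced }

    static final Map<String, HaState> TRANSITIONS = new HashMap<>();
    static {
        // A handful of illustrative edges; the real table is built in the HA framework.
        TRANSITIONS.put(key(HaState.Available, HaEvent.Ineligible), HaState.Ineligible);
        TRANSITIONS.put(key(HaState.Available, HaEvent.HealthCheckFailed), HaState.Suspect);
        TRANSITIONS.put(key(HaState.Fencing, HaEvent.Fenced), HaState.Fenced);
        // Note: no entry for (Fenced, Ineligible) -- the pair seen in the log.
    }

    static String key(HaState s, HaEvent e) {
        return s + "->" + e;
    }

    static HaState nextState(HaState current, HaEvent event) {
        HaState next = TRANSITIONS.get(key(current, event));
        if (next == null) {
            // Mirrors the effect of com.cloud.utils.fsm.NoTransitionException
            throw new IllegalStateException(
                    "Unable to transition to a new state from " + current + " via " + event);
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(nextState(HaState.Available, HaEvent.Ineligible)); // prints Ineligible
        System.out.println(nextState(HaState.Fenced, HaEvent.Ineligible));    // throws, as in the log
    }
}

Running the sketch prints Ineligible for the first call and throws on the second, matching the (Fenced, Ineligible) pair in the warning above.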

Environment

  • CloudStack version: 4.20.1.0
  • Hypervisor: KVM
  • Primary storage: NFS
  • HA settings: Host HA enabled, VM HA enabled, OOB driver = IPMI

Steps to reproduce

1. Enable Host HA and VM HA in a KVM cluster with NFS primary storage.
2. Power off a host that is running VMs.
3. Observe host and VM states in the management server (see the polling sketch after these steps).
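
For step 3, host and VM states can be watched in the UI, with CloudMonkey, or with a direct API call. Below is a minimal Java sketch that polls listHosts once using standard CloudStack request signing; the endpoint, credentials, and the choice of listHosts/type=Routing are assumptions added for illustration, not part of the original report.

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;

public class HostStatePoller {
    // Hypothetical endpoint and credentials -- replace with real values.
    static final String ENDPOINT = "https://mgmt.example.com/client/api";
    static final String API_KEY = "<api-key>";
    static final String SECRET_KEY = "<secret-key>";

    public static void main(String[] args) throws Exception {
        // Parameters for a listHosts call; TreeMap keeps them sorted for signing.
        Map<String, String> params = new TreeMap<>();
        params.put("command", "listHosts");
        params.put("type", "Routing");
        params.put("response", "json");
        params.put("apikey", API_KEY);

        String query = buildQuery(params);
        String url = ENDPOINT + "?" + query + "&signature=" + encode(sign(query, SECRET_KEY));

        HttpResponse<String> resp = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        // The host "state" field (and HA details where the release exposes them)
        // can be read from the JSON body.
        System.out.println(resp.body());
    }

    // Sorted key=value pairs joined by "&", with URL-encoded values.
    static String buildQuery(Map<String, String> sorted) {
        StringBuilder sb = new StringBuilder();
        sorted.forEach((k, v) -> sb.append(sb.length() == 0 ? "" : "&")
                .append(k).append("=").append(encode(v)));
        return sb.toString();
    }

    // Standard CloudStack signing: lowercase the sorted query string,
    // HMAC-SHA1 with the secret key, then Base64.
    static String sign(String query, String secret) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA1");
        mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
        byte[] digest = mac.doFinal(query.toLowerCase().getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(digest);
    }

    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }
}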

What to do about it?

No response
