Clock conflicts and other errors when clustering

Hi there!

We are running Swarm in a 4-node cluster on EKS. The nodes discover each other dynamically using libcluster (Kubernetes strategy), as many people do these days.
I didn't expect to get this many warnings and errors from Swarm. Maybe we are doing something wrong?
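
For context, a libcluster topology for this kind of setup usually looks roughly like the sketch below. This is not our exact config: the topology key, basename, selector, namespace, and polling interval are placeholder values. `mode: :ip` is assumed because the node names in the logs have the form `{app}@x.x.x.x`.

```elixir
# config/runtime.exs (sketch with placeholder values, not the real config)
import Config

config :libcluster,
  topologies: [
    # :k8s is an arbitrary topology name
    k8s: [
      strategy: Cluster.Strategy.Kubernetes,
      config: [
        # :ip mode yields node names like myapp@10.0.0.12
        mode: :ip,
        # placeholder: must match the node basename of the release
        kubernetes_node_basename: "myapp",
        # placeholder: label selector that matches the app's pods
        kubernetes_selector: "app=myapp",
        # placeholder namespace
        kubernetes_namespace: "default",
        # how often the Kubernetes API is polled for pods, in ms
        polling_interval: 10_000
      ]
    ]
  ]
```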

To give you some examples of the warnings we receive:

```
[swarm on {app}@x.x.x.x] [tracker:handle_replica_event] received track event for "{process}", mismatched pids, local clock conflicts with remote clock, event unhandled
```

```
** (exit) exited in: :gen_statem.call(Swarm.Tracker, {:track, "{process}", %{mfa: {Module, :start_link, ["{process}", {state}]}}}, 5000)
    ** (EXIT) time out
```

```
[swarm on {app}@x.x.x.x] [tracker:ensure_swarm_started_on_remote_node] nodeup for {app}@x.x.x.x was ignored because: {:badrpc, {:EXIT, {:timeout, {:gen_server, :call, [:application_controller, :which_applications]}}}}
```

```
[swarm on {app}@x.x.x.x] [tracker:handle_topology_change] handoff failed for "{process}": {:timeout, {GenServer, :call, [#PID<0.11273.0>, {:swarm, :begin_handoff}, 5000]}}
```

and a few others.

Something else that worries me is how Swarm decides where to send the handoff messages. During a rolling restart of a deployment, does it send them to the "new" nodes, or might it send them to nodes that are about to be shut down a second later?
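
For reference, the handoff messages in question are the ones a Swarm-registered GenServer is expected to handle. Below is a minimal sketch modeled on the example in the Swarm README. The module name `HypotheticalWorker` and the state shape are made up for illustration and are not our actual code.

```elixir
defmodule HypotheticalWorker do
  # Hypothetical module and state shape, for illustration only.
  use GenServer

  # Mirrors the {Module, :start_link, [name, state]} MFA seen in the logs.
  def start_link(name, initial \\ %{}) do
    GenServer.start_link(__MODULE__, {name, initial})
  end

  @impl true
  def init({name, initial}) do
    {:ok, %{name: name, data: initial}}
  end

  # Called on the old process when the topology changes.
  # Valid replies: :restart, {:resume, handoff_state}, or :ignore.
  # The call has a timeout (5000 ms in the warning above), so a process
  # that is busy with long-running work when it arrives cannot answer in time.
  @impl true
  def handle_call({:swarm, :begin_handoff}, _from, state) do
    {:reply, {:resume, state.data}, state}
  end

  # Sent to the newly started process, only if begin_handoff
  # replied with {:resume, handoff_state}.
  @impl true
  def handle_cast({:swarm, :end_handoff, data}, state) do
    {:noreply, %{state | data: data}}
  end

  # Sent when a netsplit heals and a duplicate process hands over its state.
  # This sketch simply keeps the local state.
  def handle_cast({:swarm, :resolve_conflict, _other_data}, state) do
    {:noreply, state}
  end

  # Sent when this process should stop because it is being moved.
  @impl true
  def handle_info({:swarm, :die}, state) do
    {:stop, :shutdown, state}
  end
end
```

Such a process would be registered with something like `Swarm.register_name(name, HypotheticalWorker, :start_link, [name, state])`, which is the call that produces the `:track` request shown in the timeout above.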

Thanks in advance!

Clock conflicts and other errors when clustering #149