Hello,
I raised an issue on the wrong repo and would like to move it to the right one. My original post on jjneely/buckytools (jjneely/buckytools#38) is reproduced below so that everyone has the context.
I've found 2 issues which I would love to discuss:
BuckyD and bucky configuration
buckyd accepts the members of the hashring as non-option CLI arguments: buckyd <graphite1:port> <graphite2:port> ...
bucky requests the cluster configuration and gets back graphite1:hashringport instead of graphite1:4242. Because of this mismatch, bucky can't reach the buckyd members:
/usr/sbin/bucky servers -h go-carbon-0.go-carbon.graphite:4242
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-0.go-carbon.graphite:2004: Get "http://go-carbon-0.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:45902->172.16.76.119:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-1.go-carbon.graphite:2004: Get "http://go-carbon-1.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:47772->172.16.27.82:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-2.go-carbon.graphite:2004: Get "http://go-carbon-2.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:40946->172.16.58.32:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-3.go-carbon.graphite:2004: Get "http://go-carbon-3.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:43570->172.16.127.44:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-4.go-carbon.graphite:2004: Get "http://go-carbon-4.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:51674->172.16.27.91:2004: read: connection reset by peer
2021/11/10 01:01:57 Error retrieving URL: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer
2021/11/10 01:01:57 Cluster unhealthy: go-carbon-5.go-carbon.graphite:2004: Get "http://go-carbon-5.go-carbon.graphite:2004/hashring": read tcp 172.16.14.79:56712->172.16.14.79:2004: read: connection reset by peer
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
go-carbon-0.go-carbon.graphite:2004
go-carbon-1.go-carbon.graphite:2004
go-carbon-2.go-carbon.graphite:2004
go-carbon-3.go-carbon.graphite:2004
go-carbon-4.go-carbon.graphite:2004
go-carbon-5.go-carbon.graphite:2004
Is cluster healthy: false
2021/11/10 01:01:57 Cluster is inconsistent.
I've tracked the issue to line https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/cluster.go#L88. The port value for each cluster member is set to the hashring port instead of 4242 (or whichever port the user specifies).
To test this theory, I forked and patched the code to default the port to 4242. The cluster is then reported as healthy, with the correct hashring values, as shown below:
/ # /usr/sbin/bucky servers -h go-carbon-5.go-carbon.graphite:4242
Buckd daemons are using port: 4242
Hashing algorithm: [carbon: 6 nodes, 100 replicas, 600 ring members go-carbon-0.go-carbon.graphite:2004=None go-carbon-1.go-carbon.graphite:2004=None go-carbon-2.go-carbon.graphite:2004=None go-carbon-3.go-carbon.graphite:2004=None go-carbon-4.go-carbon.graphite:2004=None go-carbon-5.go-carbon.graphite:2004=None]
Number of replicas: 100
Found these servers:
go-carbon-0.go-carbon.graphite:4242
go-carbon-1.go-carbon.graphite:4242
go-carbon-2.go-carbon.graphite:4242
go-carbon-3.go-carbon.graphite:4242
go-carbon-4.go-carbon.graphite:4242
go-carbon-5.go-carbon.graphite:4242
Is cluster healthy: true
Is this a real issue or just a misconfiguration on my side?
Inconsistent metric count will almost match active metric count
bucky is reporting metrics as inconsistent on our cluster, and the count is nearly the same as the active metric count, which is very odd. Taking a closer look, this line https://github.com/go-graphite/buckytools/blob/master/cmd/bucky/inconsistent.go#L69 compares the port values, and these don't match because one is 2004 and the other is 4242.
The original code does not take the ports into account, just the hostnames: https://github.com/jjneely/buckytools/blob/master/cmd/bucky/inconsistent.go#L64
Is my assumption correct that the rings won't match because of this?
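To make the difference concrete, here is a small Go sketch contrasting a full `host:port` comparison with a hostname-only one. It is an assumption about what the two linked lines amount to, not code copied from either repository, and `sameHost` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net"
)

// sameHost is a hypothetical helper (not part of buckytools).
// It reports whether two "host:port" addresses refer to the same host,
// ignoring the port. If either address cannot be parsed, it falls back
// to an exact string comparison.
func sameHost(a, b string) bool {
	hostA, _, errA := net.SplitHostPort(a)
	hostB, _, errB := net.SplitHostPort(b)
	if errA != nil || errB != nil {
		return a == b
	}
	return hostA == hostB
}

func main() {
	ringNode := "go-carbon-0.go-carbon.graphite:2004"   // hashring member
	buckydNode := "go-carbon-0.go-carbon.graphite:4242" // buckyd address

	// Comparing the full strings treats the two as different nodes.
	fmt.Println(ringNode == buckydNode) // prints false

	// Comparing hostnames only treats them as the same node.
	fmt.Println(sameHost(ringNode, buckydNode)) // prints true
}
```

If every metric is checked with a port-sensitive comparison while the two sides use different ports, every metric would look misplaced, which would be consistent with the inconsistent count nearly matching the active metric count.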