Description of problem:
The Gluster heal process fills up all free space on the replaced brick of a disperse volume if the volume contains sparse files.
The exact command to reproduce the issue:
- create a new disperse volume (tested with 4+2), e.g. with
gluster volume create vol1 disperse-data 4 redundancy 2 transport tcp node1:/gluster/nvme1/brick node1:/gluster/nvme2/brick node2:/gluster/nvme1/brick node2:/gluster/nvme2/brick node3:/gluster/nvme1/brick node3:/gluster/nvme2/brick
- place some sparse files on the volume (see also the sparse-file sketch after these steps), e.g. with
cp -avp --sparse=always source destination-vol1/
- reset a brick, e.g. with
gluster volume reset-brick vol1 node3:/gluster/nvme1/brick start
and
gluster volume reset-brick vol1 node3:/gluster/nvme1/brick node3:/gluster/nvme1/brick commit force
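If no sparse source files are at hand, a test file can also be created directly on the mounted volume; this is only a sketch, and the mount point /mnt/vol1, file name and size are placeholders:
# create a sparse file (apparent size 100G, almost no blocks allocated)
truncate -s 100G /mnt/vol1/sparse-test.img
# compare apparent size with actually allocated size
du -h --apparent-size /mnt/vol1/sparse-test.img
du -h /mnt/vol1/sparse-test.img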
Actual results:
The volume starts healing, but the sparse files on the new brick are written out fully allocated: their on-disk size matches the apparent (sparse) file size, eventually filling up the whole brick. In addition, such a volume starts reporting a wrong size with df
when mounted. The healing process never finishes, leaving some files unhealed, and the new brick reports No space left on device (see the brick log fragment below).
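One way to confirm this (a rough sketch; the brick paths follow the layout above and the file name is a placeholder) is to compare apparent and allocated sizes of the same fragment on a healthy brick and on the replaced one:
# on a healthy brick, e.g. on node1 - the fragment stays sparse, allocated size stays small
du -h --apparent-size /gluster/nvme1/brick/sparse-test.img
du -h /gluster/nvme1/brick/sparse-test.img
# on the replaced brick on node3 - after heal the allocated size grows towards the apparent size
du -h --apparent-size /gluster/nvme1/brick/sparse-test.img
du -h /gluster/nvme1/brick/sparse-test.img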
Expected results:
A dispersed volume containing sparse files should be healed correctly after a brick reset.
Mandatory info:
- The output of the gluster volume info
command:
Volume Name: vol1
Type: Disperse
Volume ID: ***
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: node1:/gluster/nvme1/brick
Brick2: node1:/gluster/nvme2/brick
Brick3: node2:/gluster/nvme1/brick
Brick4: node2:/gluster/nvme2/brick
Brick5: node3:/gluster/nvme1/brick
Brick6: node3:/gluster/nvme2/brick
Options Reconfigured:
cluster.server-quorum-type: none
storage.health-check-interval: 600
storage.health-check-timeout: 30
auth.allow: ***
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
features.cache-invalidation: on
network.ping-timeout: 5
server.allow-insecure: on
network.remote-dio: disable
client.event-threads: 8
server.event-threads: 8
performance.io-thread-count: 8
cluster.eager-lock: enable
cluster.locking-scheme: granular
performance.low-prio-threads: 32
cluster.data-self-heal-algorithm: full
performance.client-io-threads: off
cluster.lookup-optimize: off
performance.readdir-ahead: off
cluster.readdir-optimize: off
cluster.enable-shared-storage: enable
- The output of the gluster volume status
command:
Status of volume: vol1
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node1:/gluster/nvme1/brick 49168 0 Y 1239337
Brick node1:/gluster/nvme2/brick 49169 0 Y 1239344
Brick node2:/gluster/nvme1/brick 49168 0 Y 1363957
Brick node2:/gluster/nvme2/brick 49169 0 Y 1363964
Brick node3:/gluster/nvme1/brick 49157 0 Y 848916
Brick node3:/gluster/nvme2/brick 49158 0 Y 848923
Self-heal Daemon on localhost N/A N/A Y 848936
Self-heal Daemon on node1 N/A N/A Y 1239357
Self-heal Daemon on node2 N/A N/A Y 1363977
Task Status of Volume vol1
------------------------------------------------------------------------------
There are no active volume tasks
- The output of the gluster volume heal
command:
Launching heal operation to perform index self heal on volume vol1 has been successful
Use heal info commands to check status.
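To watch the heal getting stuck and the replaced brick filling up, standard commands should be sufficient (just a sketch, run on node3):
# list entries still pending heal
gluster volume heal vol1 info
# watch free space on the replaced brick
df -h /gluster/nvme1/brick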
- Provide logs present on following locations of client and server nodes:
/var/log/glusterfs/bricks/gluster-nvme1-brick.log
[2022-05-19 17:23:17.588485 +0000] W [dict.c:1532:dict_get_with_ref] (-->/usr/lib64/glusterfs/9.1/xlator/features/index.so(+0x3bdc) [0x7f0915443bdc] -->/lib64/libglusterfs.so.0(dict_get_str+0x3c) [0x7f0924c5318c] -->/lib64/libglusterfs.so.0(dict_get_with_ref+0x85) [0x7f0924c519b5] ) 0-dict: dict OR key (link-count) is NULL [Invalid argument]
[2022-05-19 17:23:17.601320 +0000] E [MSGID: 113072] [posix-inode-fd-ops.c:2068:posix_writev] 0-vol1-posix: write failed: offset 0, [No space left on device]
[2022-05-19 17:23:17.601396 +0000] E [MSGID: 115067] [server-rpc-fops_v2.c:1324:server4_writev_cbk] 0-vol1-server: WRITE info [{frame=12201148}, {WRITEV_fd_no=3}, {uuid_utoa=***-dc3d-4041-8e11-835327df299c}, {client=CTX_ID:***-GRAPH_ID:4-PID:1027276-HOST:my-host-name.cz-PC_NAME:vol1-client-4-RECON_NO:-0}, {error-xlator=vol1-posix}, {errno=28}, {error=No space left on device}]
- Is there any crash? Provide the backtrace and coredump:
No
Additional info:
I am also concerned about the very high PIDs, even shortly after a node restart, but that may not be related.
- The operating system / glusterfs version:
CentOS 8 / GlusterFS 9.1, also reproduced on GlusterFS 9.4