We are using a VolumeSnapshotClass as below for Block volume snapshotting:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: oci-bv-snapshot-incremental
driver: blockvolume.csi.oraclecloud.com
parameters:
  backupType: incremental # No functional restore difference between full and incremental
deletionPolicy: Delete
This is integrated with CNPG for lower environment database volume snapshots.
Occasionally (every few weeks), we find these backups failing with the error:
DeadlineExceeded desc = Timed out waiting for backup to become available
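It looks like this is being thrown by the oci-bv CSI driver in oci-cloud-controller-manager/pkg/csi/driver/bv_controller.go (line 1099 in 411bfeb), which uses a timeout of 45 seconds.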
However, in practice a 45-second timeout is too short. Looking in the logs, we see the following times for snapshot creation in uk-london-1, measured from the com.oraclecloud.BlockVolumes.CreateVolumeBackup.begin state to the com.oraclecloud.BlockVolumes.CreateVolumeBackup.end state. Over 9 samples: average 37.4 seconds, min 34 seconds, max 41 seconds.
I believe the solution for this would be to increase the available timeout to 60 seconds to align better with the expected response times from the API.
Same problem here, also using CNPG. However, I believe that 60 seconds won't be enough for us. We have an 8TB and a 20TB disk that take longer than that to become available. None of our snapshot attempts on this database cluster have been successful. The smaller ones are working fine.
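Since the comment above suggests that no single fixed value fits every volume size, here is a minimal sketch of a wait loop where the timeout is a parameter rather than a constant. This is not the driver's code (the driver is written in Go). The function name `wait_until_available`, the `"AVAILABLE"` / `"CREATING"` state strings, and the rule that no poll is issued at or after the deadline are all assumptions made for illustration.

```python
import time
from typing import Callable


def wait_until_available(
    get_state: Callable[[], str],
    timeout: float,
    poll_interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it reports "AVAILABLE" or the deadline passes.

    Hypothetical helper: the state strings and the strict-deadline rule
    are assumptions, not taken from the oci-bv CSI driver.
    """
    deadline = clock() + timeout
    while True:
        if get_state() == "AVAILABLE":
            return True
        # Give up if the next poll would land on or after the deadline.
        if clock() + poll_interval >= deadline:
            return False
        sleep(poll_interval)
```

Under these assumptions, a backup that becomes available after 41 seconds is missed with `timeout=45, poll_interval=5` but is picked up with `timeout=60`. A caller with very large volumes could pass a larger timeout without touching the loop.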
Hi,
We are using a VolumeSnapshotClass as below for Block volume snapshotting:
This is integrated with CNPG for lower environment database volume snapshots.
Occasionally (every few weeks), we find these backups failing with the error:
DeadlineExceeded desc = Timed out waiting for backup to become available
It looks like this is being thrown by the oci-bv CSI driver here:
oci-cloud-controller-manager/pkg/csi/driver/bv_controller.go, line 1099 in 411bfeb
This uses a timeout of 45 seconds, as defined here:
oci-cloud-controller-manager/pkg/csi/driver/bv_controller.go, line 1068 in 411bfeb
However, in practice a 45-second timeout is too short. Looking in the logs, we see the following times for snapshot creation in uk-london-1, measured from the com.oraclecloud.BlockVolumes.CreateVolumeBackup.begin state to the com.oraclecloud.BlockVolumes.CreateVolumeBackup.end state.
Over 9 samples:
average: 37.4 seconds | min: 34 seconds | max: 41 seconds
With a backupPollInterval of 5 seconds, the CSI driver steps just outside the permissible timeout of 45 seconds.
https://github.com/oracle/oci-cloud-controller-manager/blob/master/pkg/oci/client/block_storage.go#L150C36-L150C60
https://github.com/oracle/oci-cloud-controller-manager/blob/master/pkg/oci/client/block_storage.go#L42
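To illustrate how a 41-second backup can miss a 45-second timeout, here is a rough sketch of the arithmetic. It assumes polls are issued at whole multiples of the poll interval and that a poll landing exactly on the deadline does not run. That is a simplified model, not a description of the driver's actual loop, and the function names are made up for this example.

```python
import math


def last_useful_poll(poll_interval: int, timeout: int) -> int:
    """Time (seconds) of the last poll issued strictly before the deadline.

    Assumed model: polls happen at poll_interval, 2 * poll_interval, ...
    and a poll that would land exactly on the deadline never runs.
    """
    ticks = math.ceil(timeout / poll_interval) - 1
    return ticks * poll_interval


def snapshot_succeeds(ready_at: int, poll_interval: int, timeout: int) -> bool:
    """True if some poll before the deadline sees the backup as available."""
    return ready_at <= last_useful_poll(poll_interval, timeout)


# Using the min and max from the samples above (34 s and 41 s):
print(last_useful_poll(5, 45))        # 40
print(snapshot_succeeds(34, 5, 45))   # True
print(snapshot_succeeds(41, 5, 45))   # False
print(snapshot_succeeds(41, 5, 60))   # True
```

Under this model, the effective window with a 45-second timeout and a 5-second interval is 40 seconds, so the slowest observed backup (41 seconds) falls just outside it, while a 60-second timeout leaves an effective window of 55 seconds.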
I believe the solution for this would be to increase the available timeout to 60 seconds to align better with the expected response times from the API.
oci-cloud-controller-manager/pkg/csi/driver/bv_controller.go, line 1068 in 411bfeb
Thanks!