Description
Is there an existing issue for this?
- There is no existing issue for this bug
Is this happening on an up to date version of Incus?
- This is happening on a supported version of Incus
Incus system details
config: {}
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
- auth_tls_jwt
- oidc_claim
- device_usb_serial
- numa_cpu_balanced
- image_restriction_nesting
- network_integrations
- instance_memory_swap_bytes
- network_bridge_external_create
- network_zones_all_projects
- storage_zfs_vdev
- container_migration_stateful
- profiles_all_projects
- instances_scriptlet_get_instances
- instances_scriptlet_get_cluster_members
- instances_scriptlet_get_project
- network_acl_stateless
- instance_state_started_at
- networks_all_projects
- network_acls_all_projects
- storage_buckets_all_projects
- resources_load
- instance_access
- project_access
- projects_force_delete
- resources_cpu_flags
- disk_io_bus_cache_filesystem
- instance_oci
- clustering_groups_config
- instances_lxcfs_per_instance
- clustering_groups_vm_cpu_definition
- disk_volume_subpath
- projects_limits_disk_pool
- network_ovn_isolated
- qemu_raw_qmp
- network_load_balancer_health_check
- oidc_scopes
- network_integrations_peer_name
- qemu_scriptlet
- instance_auto_restart
- storage_lvm_metadatasize
- ovn_nic_promiscuous
- ovn_nic_ip_address_none
- instances_state_os_info
- network_load_balancer_state
- instance_nic_macvlan_mode
- storage_lvm_cluster_create
- network_ovn_external_interfaces
- instances_scriptlet_get_instances_count
- cluster_rebalance
- custom_volume_refresh_exclude_older_snapshots
- storage_initial_owner
- storage_live_migration
- instance_console_screenshot
- image_import_alias
- authorization_scriptlet
- console_force
- network_ovn_state_addresses
- network_bridge_acl_devices
- instance_debug_memory
- init_preseed_storage_volumes
- init_preseed_profile_project
- instance_nic_routed_host_address
- instance_smbios11
- api_filtering_extended
- acme_dns01
- security_iommu
- network_ipv4_dhcp_routes
- network_state_ovn_ls
- network_dns_nameservers
- acme_http01_port
- network_ovn_ipv4_dhcp_expiry
- instance_state_cpu_time
- network_io_bus
- disk_io_bus_usb
- storage_driver_linstor
- instance_oci_entrypoint
- network_address_set
- server_logging
- network_forward_snat
- memory_hotplug
- instance_nic_routed_host_tables
- instance_publish_split
- init_preseed_certificates
- custom_volume_sftp
- network_ovn_external_nic_address
- network_physical_gateway_hwaddr
- backup_s3_upload
- snapshot_manual_expiry
- resources_cpu_address_sizes
- disk_attached
- limits_memory_hotplug
- disk_wwn
- server_logging_webhook
- storage_driver_truenas
- container_disk_tmpfs
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: root
auth_user_method: unix
environment:
addresses: []
architectures:
- x86_64
- i686
certificate: |
-----BEGIN CERTIFICATE-----
-----END CERTIFICATE-----
certificate_fingerprint: dcebab58f13abf6101b6c425f4266a90b6a9d2f6911886b3c6069f2561788504
driver: lxc | qemu
driver_version: 6.0.5 | 9.0.4
firewall: nftables
kernel: Linux
kernel_architecture: x86_64
kernel_features:
idmapped_mounts: "true"
netnsid_getifaddrs: "true"
seccomp_listener: "true"
seccomp_listener_continue: "true"
uevent_injection: "true"
unpriv_binfmt: "true"
unpriv_fscaps: "true"
kernel_version: 6.8.0-79-generic
lxc_features:
cgroup2: "true"
core_scheduling: "true"
devpts_fd: "true"
idmapped_mounts_v2: "true"
mount_injection_file: "true"
network_gateway_device_route: "true"
network_ipvlan: "true"
network_l2proxy: "true"
network_phys_macvlan_mtu: "true"
network_veth_router: "true"
pidfd: "true"
seccomp_allow_deny_syntax: "true"
seccomp_notify: "true"
seccomp_proxy_send_notify_fd: "true"
os_name: Ubuntu
os_version: "24.04"
project: default
server: incus
server_clustered: false
server_event_mode: full-mesh
server_name: poc-services
server_pid: 25033
server_version: "6.16"
storage: zfs
storage_version: 2.2.2-0ubuntu9.4
storage_supported_drivers:
- name: dir
version: "1"
remote: false
- name: truenas
version: 0.7.3
remote: true
- name: zfs
version: 2.2.2-0ubuntu9.4
remote: false
Instance details
No response
Instance log
No response
Current behavior
There appears to be a race condition when pulling multiple OCI images.
I have repeatedly hit this issue while running Terraform code that launches 5 instances at the same time.
incusd crashes with the following stack trace:
Sep 12 22:40:08 poc-services incusd[24666]: panic: runtime error: invalid memory address or nil pointer dereference
Sep 12 22:40:08 poc-services incusd[24666]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1090ce9]
Sep 12 22:40:08 poc-services incusd[24666]: goroutine 733 [running]:
Sep 12 22:40:08 poc-services incusd[24666]: github.com/apex/log.(*Logger).log(0x37016c0, 0x10?, 0xc000882160?, {0xc002938000?, 0x1?})
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/apex/log@v1.9.0/logger.go:153 +0x49
Sep 12 22:40:08 poc-services incusd[24666]: github.com/apex/log.(*Entry).Info(...)
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/apex/log@v1.9.0/entry.go:96
Sep 12 22:40:08 poc-services incusd[24666]: github.com/apex/log.(*Entry).Infof(0xc000fc3b58, {0x231c3fa?, 0xc000fc3b70?}, {0xc000882160?, 0x75565097edc8?, 0x75569b41fa78?})
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/apex/log@v1.9.0/entry.go:122 +0x4c
Sep 12 22:40:08 poc-services incusd[24666]: github.com/apex/log.(*Logger).Infof(0xc000e48550?, {0x231c3fa?, 0x276f560?}, {0xc000882160?, 0x2402ec7?, 0x0?})
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/apex/log@v1.9.0/logger.go:121 +0x65
Sep 12 22:40:08 poc-services incusd[24666]: github.com/apex/log.Infof(...)
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/apex/log@v1.9.0/pkg.go:86
Sep 12 22:40:08 poc-services incusd[24666]: github.com/opencontainers/umoci/oci/layer.UnpackRootfs({0x2769cb8, 0x3d6dc60}, {0x277df20, 0xc000e745f0}, {0xc00032a640, _}, {{0x2}, {0xc000740210, 0x2a}, {0x0, ...}, ...}, ...)
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/opencontainers/umoci@v0.5.0/oci/layer/unpack.go:260 +0xf42
Sep 12 22:40:08 poc-services incusd[24666]: github.com/opencontainers/umoci/oci/layer.UnpackManifest({0x2769cb8, 0x3d6dc60}, {0x277df20, 0xc000e745f0}, {0xc000740120, _}, {{0x2}, {0xc000740210, 0x2a}, {0x0, ...}, ...}, ...)
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/opencontainers/umoci@v0.5.0/oci/layer/unpack.go:155 +0x6f0
Sep 12 22:40:08 poc-services incusd[24666]: github.com/opencontainers/umoci.Unpack({{0x277df80, 0xc000688420}}, {0x22f22d3, 0x6}, {0xc000740120, 0x30}, {{0x2761c88, 0x3d72d20}, 0x1, 0x0, ...})
Sep 12 22:40:08 poc-services incusd[24666]: /root/go/pkg/mod/github.com/opencontainers/umoci@v0.5.0/unpack.go:87 +0x874
Sep 12 22:40:08 poc-services incusd[24666]: github.com/lxc/incus/v6/client.unpackOCIImage({0xc0007400f0, 0x2e}, {0x22f22d3, 0x6}, {0xc000740120, 0x30})
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/client/oci_util_linux.go:59 +0x238
Sep 12 22:40:08 poc-services incusd[24666]: github.com/lxc/incus/v6/client.(*ProtocolOCI).GetImageFile(0xc000c6bae0, {0xc000ba2900?, 0x9?}, {{0x275f090, 0xc000692970}, {0x275f090, 0xc000692978}, 0xc000f40da0, 0xc000f40db0, 0x247c508})
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/client/oci_images.go:185 +0x9ce
Sep 12 22:40:08 poc-services incusd[24666]: main.ImageDownload({0x2769cf0, 0x3d6dc60}, 0xc000e3e140, 0xc0004fed00, 0xc000e3e8c0, 0xc000fc5a88)
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/cmd/incusd/daemon_images.go:419 +0x25cc
Sep 12 22:40:08 poc-services incusd[24666]: main.imgPostRemoteInfo({0x2769cf0, 0x3d6dc60}, 0xc0004fed00, 0xc000e3e140, {{0x0, 0x0, 0x0, {0x0, 0x0, 0x0}, ...}, ...}, ...)
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/cmd/incusd/images.go:536 +0x1d0
Sep 12 22:40:08 poc-services incusd[24666]: main.imagesPost.func3(0xc000e3e8c0)
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/cmd/incusd/images.go:1257 +0x16c
Sep 12 22:40:08 poc-services incusd[24666]: github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start.func1(0xc000e3e8c0)
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/internal/server/operations/operations.go:306 +0x26
Sep 12 22:40:08 poc-services incusd[24666]: created by github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start in goroutine 614
Sep 12 22:40:08 poc-services incusd[24666]: /build/incus/internal/server/operations/operations.go:305 +0x106
Sep 12 22:40:08 poc-services systemd[1]: incus.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
The nil pointer dereference happens inside a log.Infof call, and from debugging I believe the issue is due to this line:
If another goroutine tries to use the logger after the handler has been set to nil, the panic happens.
If I remove that line, I can no longer reproduce the issue.
I would submit a PR, but I'm not sure of the best way to ensure logging continues to work as expected without it.
Expected behavior
I should be able to create multiple instances at the same time that all pull OCI container images.
Steps to reproduce
I've been able to reproduce by calling incus image copy concurrently:
incus image copy oci-docker:openfga/openfga local: &
incus image copy oci-docker:library/haproxy local: &
incus image copy oci-docker:grafana/loki local: &
incus image copy oci-docker:grafana/grafana-enterprise local: &
incus image copy oci-docker:prom/prometheus local: &
Result:
incus image copy oci-docker-local:openfga/openfga local: &
incus image copy oci-docker-local:library/haproxy local: &
incus image copy oci-docker-local:grafana/loki local: &
incus image copy oci-docker-local:grafana/grafana-enterprise local: &
incus image copy oci-docker-local:prom/prometheus local: &
[1] 24779
[2] 24780
[3] 24781
[4] 24782
[5] 24783
Error: Failed remote image download: websocket: close 1006 (abnormal closure): unexpected EOF
Error: Failed remote image download: websocket: close 1006 (abnormal closure): unexpected EOF
Error: Failed remote image download: websocket: close 1006 (abnormal closure): unexpected EOF
Error: Failed remote image download: websocket: close 1006 (abnormal closure): unexpected EOF
Error: Failed remote image download: websocket: close 1006 (abnormal closure): unexpected EOF
[1] Exit 1 incus image copy oci-docker-local:openfga/openfga local:
[2] Exit 1 incus image copy oci-docker-local:library/haproxy local:
[3] Exit 1 incus image copy oci-docker-local:grafana/loki local:
[4]- Exit 1 incus image copy oci-docker-local:grafana/grafana-enterprise local:
[5]+ Exit 1 incus image copy oci-docker-local:prom/prometheus local: