
Disk exhausted after upgrade 1.5.6 → 1.9.5 #24914


Description

@monwolf

Hi,
After upgrading our nodes from 1.5.6 to 1.9.5, we observed a difference in how storage resources are allocated.
We have two partitions on our hosts:

  • / for the OS
  • /var/ for the tasks

This is the free storage:

[screenshot: free storage on / and /var]

Our data_dir points to /var/nomad. Our config looks like this:

```hcl
region = "gine"
name = "ec2devusfarm02"
log_level = "DEBUG"
leave_on_interrupt = true
leave_on_terminate = true
data_dir = "/var/nomad/data"
bind_addr = "0.0.0.0"
disable_update_check = true
limits {
        https_handshake_timeout   = "10s"
        http_max_conns_per_client = 400
        rpc_handshake_timeout     = "10s"
        rpc_max_conns_per_client  = 400
}
advertise {
    http = "10.121.200.13:4646"
    rpc = "10.121.200.13:4647"
    serf = "10.121.200.13:4648"
}
tls {
  http = true
  rpc  = true
  cert_file = "/opt/nomad/ssl/server.pem"
  key_file = "/opt/nomad/ssl/server-key.pem"
  ca_file = "/opt/nomad/ssl/nomad-ca.pem"
  verify_server_hostname = true
  verify_https_client    = true

}
log_file = "/var/log/nomad/"
log_json = true
log_rotate_max_files = 7
consul {
    address = "127.0.0.1:8500"
    server_service_name = "nomad-server"
    client_service_name = "nomad-client"
    auto_advertise = true
    server_auto_join = true
    client_auto_join = true

    ssl = true
    ca_file = "/opt/consul/ssl/consul-ca.pem"
    cert_file = "/opt/consul/ssl/server.pem"
    key_file = "/opt/consul/ssl/server-key.pem"
    token = "xxxxx"


}
acl {
  enabled = true
}

vault {
    enabled = true
    address = "https://vault.legacy-dev.com:8200/"
    ca_file = "/opt/vault/ssl/vault-ca.pem"
    cert_file = "/opt/vault/ssl/client-vault.pem"
    key_file = "/opt/vault/ssl/client-vault-key.pem"
}

telemetry {
  publish_allocation_metrics = true
  publish_node_metrics       = true
  datadog_address = "localhost:8125"
  disable_hostname = true
  collection_interval = "10s"
}
datacenter = "farm"

client {
    enabled = true
    network_interface = "ens5"
    cni_path = "/opt/cni/bin"
    cni_config_dir = "/etc/cni/net.d/"
}

plugin "docker" {
  config {
    auth {
      config = "/etc/docker/config.json"
    }
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}
```

After the upgrade, we started to see "exhausted disk" errors when we tried to schedule a job:

[screenshot: scheduler error reporting exhausted disk on the node]

But the node has plenty of free storage. If we look at nomad node status:

[screenshot: nomad node status showing allocated disk resources]

As you can see, Nomad now uses / instead of /var to calculate allocatable space, whereas /var was used in 1.5.6. Yet the unique attributes show that it is fingerprinting the right filesystem:

[screenshot: unique storage attributes pointing at the /var filesystem]

How can I solve this? I didn't see anything related to it in the release notes.
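In the meantime, a workaround we are considering, assuming the documented client-stanza overrides disk_total_mb / disk_free_mb still take precedence over the fingerprinted values, is to pin the disk figures to the size of our /var partition by hand. The numbers below are placeholders, not our real partition sizes:

```hcl
client {
    enabled           = true
    network_interface = "ens5"
    cni_path          = "/opt/cni/bin"
    cni_config_dir    = "/etc/cni/net.d/"

    # Hypothetical override: make the scheduler use the capacity of /var
    # instead of whatever filesystem the fingerprinter picked.
    # Placeholder values; set these to the actual /var size.
    disk_total_mb = 102400
    disk_free_mb  = 81920
}
```

This only papers over the problem, though: alloc_dir already defaults to a directory under data_dir (here /var/nomad/data/alloc), so I'd still like to understand why the fingerprinter measures / instead.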
