
[wip] fix: DiskDiffOperation #2364


Draft · wants to merge 3 commits into main

Conversation

@mbokman (Contributor) commented Mar 19, 2025

Description

Fixes #2363: Creating a new VM using datastores created as part of the same apply would fail with an error like the following:

╷
│ Error: Provider produced inconsistent final plan
│ 
│ When expanding the plan for module.solution["solution"].module.product["product"].vsphere_virtual_machine.db_vm["vm_uuid"] to
│ include new values learned so far during apply, provider "registry.terraform.io/hashicorp/vsphere"
│ produced an invalid new value for .disk[4].datastore_id: was cty.StringVal("<computed>"), but now
│ cty.StringVal("datastore-176385").
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵

Apologies for not creating an acceptance test, but I am a n00b to Go and this provider. I am also pretty sure that my "fix" might not be the ultimate solution.

  • Have you added an acceptance test for the functionality being added?
  • Have you run the acceptance tests on this branch?

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccXXX'

...

Release Note

Release note for CHANGELOG:

...

References

@mbokman mbokman requested a review from a team as a code owner March 19, 2025 18:00
@github-actions github-actions bot added provider Provider needs-review Needs Review labels Mar 19, 2025
@mbokman (Contributor Author) commented Mar 19, 2025

For additional clarification:

In my environment, I am creating 4 VMs: 3 of them reference existing datastores, and the 4th references two newly created datastores (provisioned by the same apply), for a total of 6 disk (sub-)resources.

The VMs that reference existing datastores provision successfully, but the fourth, which references the new datastores, fails with the error shown above.

However, all 4 VMs go through the same referenced section of code, and only the single VM mentioned ends up in the error. I surmise that this piece of code is not necessary, and that especially when existing datastores are referenced, the "diskDatastoreComputedName" value is later updated with the correct datastore ID.

While this might not be the appropriate fix, in my environment this skirts around the issue and I can provision all 4 VMs without any issues.

I hope this helps.

@mbokman (Contributor Author) commented Mar 19, 2025

And to be clearer: in the scenario above, even for the 3 VMs that reference an already existing datastore, dsID == "".

@mbokman (Contributor Author) commented Mar 19, 2025

Lastly, (redacted) debug output has been provided on issue #2363.

@mbokman (Contributor Author) commented Mar 19, 2025

I have updated the description to emphasize that the datastore ID reports as "<computed>".

@tenthirtyam (Contributor) commented:

Do you have a depends_on, or do you use the hashicorp/time provider to allow for a delay until the resources are realized? If not, this is a common issue: there can be a race condition while waiting for a resource to be realized by vSphere before it can be used, especially when everything is in the same config plan, which is generally suboptimal; you might consider breaking the config into dependent modules. (A sketch of the delay approach follows.)
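
A minimal sketch of the delay approach, assuming the hashicorp/time provider; the resource name wait_for_datastore and the 30s duration are illustrative, and vsphere_vmfs_datastore.datastore mirrors the config shown further below:

resource "time_sleep" "wait_for_datastore" {
  # Wait until after the datastores have been created.
  depends_on      = [vsphere_vmfs_datastore.datastore]
  create_duration = "30s"
}

resource "vsphere_virtual_machine" "db_vm" {
  # ...
  # Gate the VM on the delay rather than directly on the datastore.
  depends_on = [time_sleep.wait_for_datastore]
}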

@mbokman (Contributor Author) commented Mar 19, 2025

I do have the depends_on. The VM depends on the datastore, which is created successfully.

@mbokman (Contributor Author) commented Mar 19, 2025

resource "vsphere_virtual_machine" "db_vm" {
  for_each = local.db_servers_to_provision

[...snip...]
  dynamic "disk" {
    for_each = each.value.volumes
    content {
      datastore_id     = disk.key == 0 ? vsphere_vmfs_datastore.datastore["${each.value.dataStoreNamePrefix}${each.value.hostname}-sys"].id : vsphere_vmfs_datastore.datastore["${each.value.dataStoreNamePrefix}${each.value.hostname}-db"].id
      label            = "Hard disk ${(disk.key + 1)}"
      size             = max(disk.value["size"], data.vsphere_virtual_machine.db_template[each.key].disks[disk.key].size)
      thin_provisioned = data.vsphere_virtual_machine.db_template[each.key].disks[disk.key].thin_provisioned
      unit_number      = disk.value["unit_number"]
    }
  }

[...snip...]
  depends_on = [
    vsphere_vmfs_datastore.datastore,
  ]
}

resource "vsphere_vmfs_datastore" "datastore" {
  for_each = local.datastores_to_provision

  name           = each.key
  host_system_id = data.vsphere_vmfs_disks.rescan[each.key].host_system_id
  folder         = var.ds_folder_path
  # custom_attributes = var.custom_attributes

  disks = data.vsphere_vmfs_disks.rescan[each.key].disks

  depends_on = [
    restful_operation.mkvolume,
    restful_operation.mkvolumehostclustermap,
    data.vsphere_vmfs_disks.rescan,
  ]
}

@mbokman (Contributor Author) commented Mar 19, 2025

I am definitely not suggesting I fixed the root cause of the issue. But to reiterate: 3 of the 4 VMs provision successfully through the same code path that tests dsID == "" and therefore sets nm["datastore_id"] to "<computed>". The fourth VM, which uses newly provisioned datastores, goes through the same code path, which equally sets nm["datastore_id"] to "<computed>", but then fails. Just commenting out that piece of code leads to success.

@tenthirtyam (Contributor) commented:

@mbokman - CI is failing for the build.

go: downloading github.com/davecgh/go-spew v1.1.1
# github.com/hashicorp/terraform-provider-vsphere/vsphere/internal/virtualdevice
Error: vsphere/internal/virtualdevice/virtual_machine_disk_subresource.go:685:33: invalid operation: operator ! not defined on ok (variable of type interface{})
FAIL	github.com/hashicorp/terraform-provider-vsphere [build failed]
FAIL	github.com/hashicorp/terraform-provider-vsphere/vsphere [build failed]

Marking as DRAFT until the CI passes and is ready for review.
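
For context, that compile error occurs because a single-value map index yields the element itself (here of type interface{}), which cannot be negated with !; only the comma-ok form yields a bool. A minimal illustration, using a stand-in map named nm:

package main

import "fmt"

func main() {
	nm := map[string]interface{}{"datastore_id": ""}

	// Single-value index: v has static type interface{}, so `!v` fails to
	// compile with "operator ! not defined on v (variable of type interface{})".
	// v := nm["datastore_id"]
	// _ = !v

	// Comma-ok form: ok is a bool and can be negated.
	if dsID, ok := nm["datastore_id"]; !ok || dsID == "" {
		fmt.Println("datastore_id missing or empty:", dsID)
	}
}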

@tenthirtyam tenthirtyam marked this pull request as draft March 28, 2025 20:03
@tenthirtyam tenthirtyam self-requested a review March 28, 2025 20:03
@mbokman (Contributor Author) commented Apr 1, 2025

@tenthirtyam Apologies, I only now see your update. Understood, I will look into what is failing the CI build, thanks!

@mbokman (Contributor Author) commented Apr 1, 2025

@tenthirtyam Checks are passing now; the problem was my unfamiliarity with Go. Which of the two of us should take the PR out of DRAFT? Sorry, still a n00b, but only trying to help.

@mbokman mbokman marked this pull request as ready for review April 1, 2025 20:38
@tenthirtyam tenthirtyam requested a review from spacegospod April 1, 2025 20:39
@mbokman (Contributor Author) commented Apr 1, 2025

If this is approved, fingers crossed, please move it to the v2.12.0 milestone. That release would then solve all my current issues 😄

@tenthirtyam (Contributor) left a comment


Original: Sets nm["datastore_id"] to diskDatastoreComputedName if the key is missing or if the value is an empty string.

Modified: Sets nm["datastore_id"] to diskDatastoreComputedName only if the key is missing, ignoring empty string values.

Implications

We'll need to think about the potential implications of this change.

Original: Ensures that nm["datastore_id"] is never an empty string.

Modified: Allows nm["datastore_id"] to remain an empty string if it is already set to an empty string.
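
A runnable sketch of the two variants as described above, using a plain map and the "<computed>" sentinel (the value reported in the original error) as stand-ins for the provider's internals:

package main

import "fmt"

const diskDatastoreComputedName = "<computed>"

// original fills in the sentinel when the key is missing or the value is "".
func original(nm map[string]interface{}) {
	if dsID, ok := nm["datastore_id"]; !ok || dsID == "" {
		nm["datastore_id"] = diskDatastoreComputedName
	}
}

// modified fills in the sentinel only when the key is missing entirely.
func modified(nm map[string]interface{}) {
	if _, ok := nm["datastore_id"]; !ok {
		nm["datastore_id"] = diskDatastoreComputedName
	}
}

func main() {
	a := map[string]interface{}{"datastore_id": ""}
	original(a)
	fmt.Println(a["datastore_id"]) // "<computed>"

	b := map[string]interface{}{"datastore_id": ""}
	modified(b)
	fmt.Println(b["datastore_id"]) // "" (the empty string is preserved)
}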

@mbokman (Contributor Author) commented Apr 1, 2025

@tenthirtyam I am limited in my capacity (provider/Go knowledge) to help, but is there anything I can focus on to progress this further? I'm not asking for detailed instructions; I'm hoping to be helpful given general hints, but no promises.

@spacegospod (Contributor) commented Apr 16, 2025

Original: Sets nm["datastore_id"] to diskDatastoreComputedName if the key is missing or if the value is an empty string.

Modified: Sets nm["datastore_id"] to diskDatastoreComputedName only if the key is missing, ignoring empty string values.

Implications

We'll need to think about the potential implications of this change.

Original: Ensures that nm["datastore_id"] is never an empty string.

Modified: Allows nm["datastore_id"] to remain an empty string if it is already set to an empty string.

I agree with Ryan: this PR changes the semantics behind datastore_id, and we need to understand what that may cause.

I will try to elaborate. Let's look at the code in its current form:

		if dsID, ok := nm["datastore_id"]; !ok || dsID == "" {
			nm["datastore_id"] = diskDatastoreComputedName
		}

The conditions for this block (either one suffices) are:

  1. The nm map does not contain the key "datastore_id"
  2. The value for this key is an empty string

If we look at what nm actually is, we see that it's obtained from the diff between the previous and current disk blocks
(some transformations excluded, but basically nm is a plain map obtained from n)

o, n := d.GetChange(subresourceTypeDisk)

Now here's where it gets interesting: even if datastore_id is not present on the disk block, Terraform will still put it in the diff map, and it will still be an empty string.

Go has the concept of zero values ("" for string, 0 for int, etc.), and Terraform follows this pattern: everything in a resource's config has a default value and is always present. This is why the check for the empty string is there.
In fact, the !ok part is effectively redundant, since the key is always present in that map. (A minimal demonstration follows.)
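
A minimal sketch of that point, with a hypothetical map standing in for the diff that d.GetChange produces:

package main

import "fmt"

func main() {
	// Stand-in for the diff map: Terraform populates an unset string
	// attribute with its zero value "".
	nm := map[string]interface{}{"datastore_id": ""}

	dsID, ok := nm["datastore_id"]
	fmt.Println(ok)         // true: the key is present even though it was never set
	fmt.Println(dsID == "") // true: the zero value, which the == "" check catches
}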

@tenthirtyam tenthirtyam marked this pull request as draft April 16, 2025 13:54
@tenthirtyam tenthirtyam added do-not-merge Do Not Merge and removed needs-review Needs Review provider Provider labels Apr 16, 2025
@mbokman (Contributor Author) commented Apr 18, 2025

@spacegospod I only now see your comment, thanks for your review.

One thing that is not clearly mentioned in the PR, though it is mentioned in the Expected Behavior of the related issue #2362: without my change, the terraform apply fails with the mentioned error, but when another terraform apply is run without any changes, there is no error and it succeeds. I will try to see how nm differs between the two applies.

@tenthirtyam tenthirtyam force-pushed the main branch 2 times, most recently from 2d0d0cd to 3ebb159 on May 6, 2025 00:35
@github-actions github-actions bot added dependencies Dependencies provider Provider chore Chore needs-review Needs Review labels May 6, 2025
@tenthirtyam tenthirtyam removed the needs-review Needs Review label May 6, 2025
@tenthirtyam tenthirtyam changed the title fix: DiskDiffOperation #2363 [wip] fix: DiskDiffOperation May 7, 2025
@github-actions github-actions bot added the needs-review Needs Review label May 7, 2025
@tenthirtyam tenthirtyam removed the needs-review Needs Review label May 7, 2025
@github-actions github-actions bot added the needs-review Needs Review label May 8, 2025
@tenthirtyam tenthirtyam removed the do-not-merge Do Not Merge label May 8, 2025
@tenthirtyam tenthirtyam removed needs-review Needs Review provider Provider labels May 30, 2025
@github-actions github-actions bot added provider Provider needs-review Needs Review size/xs Relative Sizing: Extra-Small labels Jun 4, 2025
Development

Successfully merging this pull request may close these issues.

r/virtual_machine: provisioning alongside r/vmfs_datastore fails