Add support for storage controllers #84


Closed
tintoy opened this issue Apr 17, 2017 · 19 comments
tintoy commented Apr 17, 2017

The Q release (v2.5) of the CloudControl API has introduced support for explicitly controlling the storage controllers in a deployed server.

Because the behaviour of disk management when using custom storage controllers diverges from the existing behaviour, storage controllers and their disks will be managed via a separate ddcloud_storage_controller resource type:

resource "ddcloud_server" "my_server" {
  name = "My server"

  // Disks can either be specified here (order now matters)
  disk {
    // SCSI unit Id can no longer be specified here
    size_gb = 10
    speed   = "STANDARD"
  }
}

resource "ddcloud_storage_controller" "my_server_controller" {
  server = "${ddcloud_server.my_server.id}"
  adapter_type = "LSI_LOGIC_PARALLEL"
  bus_number = 1

  // Or disks can be specified here (order does not matter)
  disk {
    scsi_unit_id = 0 // SCSI unit Id can now be specified here
    size_gb      = 10
    speed        = "STANDARD"
  }
}

For a given ddcloud_server, you can either specify disks the old way (inline, but without a SCSI unit Id) or via an explicit storage controller (but not both). Since disks specified the old way no longer have an explicit SCSI unit Id, their order now matters (i.e. you cannot have 3 disks and then remove disk 2). Disks specified the new way (via an explicit storage controller) have an explicit SCSI unit Id, so their order does not matter.

These breaking API changes are expected to also require refactoring of the existing ddcloud_server resource implementation due to changes in the format of information returned by v2.5 of the CloudControl API.

Note that it's theoretically possible to support specifying disks and controllers in the same ddcloud_server resource, but this would greatly complicate the provider implementation for this resource type (disks with a value for controller_id would have to be processed after any controller entries, and Terraform might no longer be able to automatically warn the user about changes that force a new resource or are otherwise invalid).

See below for an alternative design proposal.

@tintoy tintoy added the design label Apr 17, 2017
tintoy commented Apr 17, 2017

CC: @kumarappanc @mingsheng36 @wninobla @jeffstoner @johnamurray @alexbacchin @manishkp @BenObermayer

Your input would be appreciated, both regarding this design and the trade-offs it involves.

@tintoy tintoy added this to the v1.4 milestone Apr 17, 2017
tintoy commented Apr 17, 2017

A possible alternative for the design might be to rely on the fact that disks which don't explicitly specify a controller Id are always created against controller 0 (the default controller).

resource "ddcloud_server" "my_server" {
	name = "My Server"

	// There's always an implicit storage_controller with bus = 0

	storage_controller {
		bus          = 1
		adapter_type = "BUS_LOGIC"
	}

	// Old-style disk (implicitly targets controller 0)
	disk {
		scsi_unit_id = 0
		size_gb      = 10
		speed        = "STANDARD"
	}

	// New-style disk (explicitly targets controller 1)
	disk {
		bus          = 1
		scsi_unit_id = 0
		size_gb      = 10
		speed        = "STANDARD"
	}
}

In this model, the provider always uses the CloudControl APIs for disks to target a specific controller (rather than the old-style APIs), but disk entries that don't specify a controller are assumed to target controller 0 (i.e. the bus property has a default value of 0).
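As a rough illustration of the defaulting rule, disk entries could be grouped by their target controller like this (a hypothetical Go sketch, since the provider is written in Go; the type and function names are illustrative, not the provider's actual schema-handling code):

```go
package main

import "fmt"

// diskEntry is a hypothetical, simplified view of a disk block in the
// configuration; Bus is a pointer so an omitted bus can be distinguished
// from an explicit bus 0.
type diskEntry struct {
	Bus        *int
	SCSIUnitID int
	SizeGB     int
}

// groupByBus assigns each disk entry to its target controller, applying
// the default of bus 0 when no bus is specified.
func groupByBus(disks []diskEntry) map[int][]diskEntry {
	grouped := map[int][]diskEntry{}
	for _, d := range disks {
		bus := 0 // implicit default: the server's default controller
		if d.Bus != nil {
			bus = *d.Bus
		}
		grouped[bus] = append(grouped[bus], d)
	}
	return grouped
}

func main() {
	one := 1
	disks := []diskEntry{
		{nil, 0, 10},  // old-style: no bus, implicitly targets controller 0
		{&one, 0, 10}, // new-style: explicitly targets controller 1
	}
	grouped := groupByBus(disks)
	fmt.Println(len(grouped[0]), len(grouped[1])) // 1 1
}
```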

Note that this approach may add some complexity to the implementation because disk management logic for updating a ddcloud_server resource would then have to be done in several passes:

  1. Examine storage_controller entries and add controllers if required.
  2. Examine disk entries, and add / remove / modify if required.
  3. Examine storage controller entries and remove controllers if required.

In keeping with our design ethos, the provider will not attempt to detect an attempt to remove a controller with disks still attached (instead, responsibility for rejecting this change will be delegated to the CloudControl API).
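The multi-pass update described above could be sketched roughly as follows (in Go, the provider's implementation language; the types and helper names here are hypothetical simplifications, not the actual resource implementation, and API calls are elided):

```go
package main

import "fmt"

// Hypothetical simplified models; the real provider's types differ.
type Disk struct {
	SCSIUnitID int
	SizeGB     int
}

type Controller struct {
	Bus   int
	Disks map[int]Disk // keyed by SCSI unit ID
}

// updateServerStorage reconciles actual state with the desired configuration
// in three passes, mirroring the update order described above.
func updateServerStorage(actual, desired map[int]Controller) map[int]Controller {
	// Pass 1: add controllers that are configured but not yet present.
	for bus := range desired {
		if _, ok := actual[bus]; !ok {
			actual[bus] = Controller{Bus: bus, Disks: map[int]Disk{}}
		}
	}

	// Pass 2: add / modify / remove disks on each configured controller.
	for bus, want := range desired {
		have := actual[bus]
		for id, disk := range want.Disks {
			have.Disks[id] = disk // add or resize
		}
		for id := range have.Disks {
			if _, ok := want.Disks[id]; !ok {
				delete(have.Disks, id) // disk no longer configured
			}
		}
	}

	// Pass 3: remove controllers that are no longer configured (rejecting
	// removal of a controller with attached disks is left to the API).
	for bus := range actual {
		if _, ok := desired[bus]; !ok {
			delete(actual, bus)
		}
	}

	return actual
}

func main() {
	actual := map[int]Controller{
		0: {Bus: 0, Disks: map[int]Disk{0: {0, 10}}},
		2: {Bus: 2, Disks: map[int]Disk{}},
	}
	desired := map[int]Controller{
		0: {Bus: 0, Disks: map[int]Disk{0: {0, 10}, 1: {1, 20}}},
		1: {Bus: 1, Disks: map[int]Disk{0: {0, 30}}},
	}
	result := updateServerStorage(actual, desired)
	fmt.Println(len(result), len(result[0].Disks), len(result[1].Disks)) // 2 2 1
}
```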

jeffstoner commented:

Given what's coming in the R release (support for IDE/SATA), there may be another jumbling of the storage-subsystem metadata. The design phase for this is currently ongoing, so I can't comment on exactly what changes are happening (they're very much in flux), but we should be careful with the modeling - we might want to consider using "disk type" structures that cater to the bus they're attached to (IDE disks won't have a scsi_unit_id, etc.)

jeffstoner commented:

There are currently four storage adapters supported by the platform (vSphere):

  • BusLogic Parallel - This was one of the first two emulated vSCSI controllers made available in the VMware platform, and remains commonly used on older versions of Windows because the driver is available by default.
  • LSI Logic Parallel - This was the other emulated vSCSI controller made available in the VMware platform, and remains commonly used on UNIX operating systems.
  • LSI Logic SAS - This is a newer evolution of the parallel driver, supported (and in some cases required) by newer versions of Microsoft Windows. It was designed to replace the BusLogic Parallel adapter and provides better performance.
  • VMware Paravirtual - This is a VMware-specific driver that is virtualization-aware and designed to support very high throughput with minimal processing cost, making it the most efficient driver. However, it requires that VM Tools be installed and running in order to function properly. There are some other restrictions, particularly on older operating systems - review the appropriate VMware documentation for more details.

(source: https://docs.mcp-services.net/pages/viewpage.action;jsessionid=E61080748C1D3F34342D36D225289A95?pageId=3015251)

jeffstoner commented:

Your statement

disks which don't explicitly specify a controller Id are always created against controller 0 (the default controller)

conflicts with the defined MCP behavior

If you do not manually select the SCSI Controller and position of the disk, it will be added sequentially to the next available SCSI controller and position.

(source: https://docs.mcp-services.net/display/CCD/How+to+Add+Additional+Local+Storage+%28Disk%29+to+a+Cloud+Server - Prerequisite 5.)

tintoy commented Apr 17, 2017

Thanks for the feedback @jeffstoner - this is all useful information :)

Unfortunately it sounds like these changes may make the provider quite painful to use, so I may hold off on building anything until we can find a design compromise that fits. I'll keep thinking about it.

tintoy commented Apr 17, 2017

conflicts with the defined MCP behavior

On the other hand, this is Terraform behaviour we're talking about, not CloudControl. As long as the behaviour is well-documented, I think I'd still be comfortable with it.

As for SATA controllers, we could add a sata_disk type:

resource "ddcloud_server" "my_server" {
  name = "My Server"

  // SCSI disk, bus 0
  disk {
    scsi_unit_id = 0
    size_gb      = 10
  }

  // SCSI disk, bus 1
  disk {
    bus          = 1
    scsi_unit_id = 0
    size_gb      = 100
  }

  // SATA disk
  sata_disk {
    size_gb = 50
  }
}

wninobla commented Apr 17, 2017 via email

tintoy commented Apr 17, 2017

(I'm trying to be mindful of backward-compatibility here)

tintoy commented Apr 17, 2017

Thanks @wninobla - that does make sense. I'm favouring option 2 at the moment as it's a little more user-friendly, but I'm open to suggestions :)

wninobla commented Apr 17, 2017 via email

wninobla commented Apr 17, 2017 via email

alexbacchin commented:

Same here - backwards compatibility is a must. These storage adapter changes make practically no difference to 90% of users, as the underlying physical storage infrastructure is still the same.

tintoy commented Apr 17, 2017

Breaking changes to Terraform provider schema are risky for users - not so much for new deployments, but for existing deployments where the configuration suddenly means something different than it did before. Something I'd like to minimise if possible :)

tintoy commented Apr 17, 2017

On the other hand, this is Terraform behaviour we're talking about, not CloudControl. As long as the behaviour is well-documented, I think I'd still be comfortable with it.

Just to clarify, @jeffstoner, the design ethos for our Terraform provider is to use CloudControl terminology and semantics wherever it's practical to do so unless it conflicts with Terraform's resource / schema / life-cycle model.

We currently use a disk's SCSI unit Id to uniquely identify each disk in the configuration, because a disk's CloudControl Id is only known after deployment (so we match up CloudControl Ids to SCSI unit Ids). The proposed design attempts to retain those semantics, because otherwise we'd have no way to tell which disk is which except by ordering - which is undesirable (if you had 3 disks and removed the middle one, we'd have no way of knowing that).
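To illustrate, here's a minimal Go sketch of matching configured disks to deployed disks by SCSI unit Id; the types and function names are hypothetical simplifications, not the provider's actual code:

```go
package main

import "fmt"

// ConfiguredDisk is a hypothetical simplified view of a disk block in the
// Terraform configuration, identified by SCSI unit ID.
type ConfiguredDisk struct {
	SCSIUnitID int
	SizeGB     int
}

// DeployedDisk additionally carries the CloudControl disk ID, which is only
// known after deployment.
type DeployedDisk struct {
	ID         string
	SCSIUnitID int
	SizeGB     int
}

// diskActions matches configuration to deployed state by SCSI unit ID and
// reports which CloudControl disk IDs to keep and which to remove.
func diskActions(configured []ConfiguredDisk, deployed []DeployedDisk) (keep, remove []string) {
	wanted := map[int]bool{}
	for _, d := range configured {
		wanted[d.SCSIUnitID] = true
	}
	for _, d := range deployed {
		if wanted[d.SCSIUnitID] {
			keep = append(keep, d.ID)
		} else {
			remove = append(remove, d.ID)
		}
	}
	return keep, remove
}

func main() {
	deployed := []DeployedDisk{
		{"disk-a", 0, 10}, {"disk-b", 1, 20}, {"disk-c", 2, 30},
	}
	// Removing the middle disk (unit 1) is unambiguous because each entry
	// carries its own SCSI unit ID; ordering alone could not express this.
	configured := []ConfiguredDisk{{0, 10}, {2, 30}}
	keep, remove := diskActions(configured, deployed)
	fmt.Println(keep, remove) // [disk-a disk-c] [disk-b]
}
```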

tintoy added a commit that referenced this issue Apr 17, 2017
#84.

Also, clean up error messages to prevent maddening code-analysis warnings.
tintoy commented Apr 17, 2017

The commit's a bit messy (the latest version of Go and its associated tooling is more pedantic than usual, so I had to fix a lot of code-analysis warnings), but I've built a (totally untested) first attempt at implementing option 2. Thanks to some earlier refactoring, I'm pleased to say it wasn't actually that much work.

I'll try to have a go at actually testing it sometime today or tomorrow to see what breaks (and, if broken, whether it can be fixed).

@tintoy tintoy self-assigned this Apr 18, 2017
tintoy commented Apr 18, 2017

Working on a prototype for option 1 in feature/ddcloud-storage-controller.

tintoy added a commit that referenced this issue Apr 19, 2017
tintoy commented Apr 19, 2017

Ok, so the following configuration works correctly now (branch for option 1):

provider "ddcloud" {
	region = "AU"
}

data "ddcloud_networkdomain" "ansible_test" {
	name		= "AnsibleTest"
	datacenter	= "AU9"
}

data "ddcloud_vlan" "ansible_vlan" {
	name			= "AnsibleVLAN"
	networkdomain	= "${data.ddcloud_networkdomain.ansible_test.id}"
}

resource "ddcloud_server" "test_server" {
	name			= "TerraformTestServer1"

	image			= "Ubuntu 16 64-bit 2 CPU"

	networkdomain	= "${data.ddcloud_networkdomain.ansible_test.id}"

	primary_network_adapter {
		vlan = "${data.ddcloud_vlan.ansible_vlan.id}"
	}
	
	admin_password	= "snausages!"
}

resource "ddcloud_storage_controller" "test_server_controller_0" {
	server			= "${ddcloud_server.test_server.id}"
	scsi_bus_number	= 0

	disk {
		scsi_unit_id	= 0
		size_gb			= 20
	}

	disk {
		scsi_unit_id	= 1
		size_gb			= 40
	}
}

resource "ddcloud_storage_controller" "test_server_controller_1" {
	server			= "${ddcloud_server.test_server.id}"
	scsi_bus_number	= 1

	disk {
		scsi_unit_id	= 0
		size_gb			= 30
	}

	disk {
		scsi_unit_id	= 1
		size_gb			= 25
	}
}

If I remove disk 2 (scsi_unit_id = 1) on test_server_controller_1, Terraform produces incorrect diff output when running terraform plan but does behave correctly when running terraform apply. Pretty sure the diff output is due to a bug in Terraform (not really related, but along similar lines to hashicorp/terraform#10520).

@tintoy tintoy modified the milestones: v1.3, v1.4 Apr 21, 2017
tintoy added a commit that referenced this issue Apr 26, 2017
* Expose SCSI bus number for inline disks on ddcloud_server.
* If both IPv4 and VLAN Id are specified for a network adapter, ignore VLAN Id.
* Handle the fact that ExpandDisk is part of the v2.x API and therefore returns ResponseCodeInProgress rather than ResultSuccess.
* Only build master and development/vXXX.
* Initial sketch of implementation for ddcloud_storage_controller (this is a prototype of option 1 for #84).
* Mark all storage controller properties as ForceNew.
* Start implementing update / delete for ddcloud_storage_controller resource.
* Fix bugs and improve diagnostic logging for deployment of ddcloud_storage_controller.
* Fix incorrectly-propagated scsi_bus_number on ddcloud_storage_controller.disk.
* Log server remaining disk count before removing a disk.
* Refactor access to target storage controller.
* Create initial acceptance test for ddcloud_storage_controller (default controller, single image disk).
* Fix broken cleanup checks for ddcloud_storage_controller acceptance tests.
* Add acceptance test for ddcloud_storage_controller (default controller, 1 additional disk).
* Remove SCSI controller when ddcloud_storage_controller is destroyed (unless it's the last storage controller in the target server).
* Fix buggy handling for removal of server's last disk.
* Add refresh step before verifying server disk configuration in acceptance tests for ddcloud_storage_controller.
* Fix broken ddcloud_storage_controller acceptance test (default controller with 1 image disk and 1 additional disk).
* Fix another broken acceptance test for ddcloud_storage_controller, and improve output from testCheckXXX functions.
* Add docs for ddcloud_storage_controller.
* Update version info and change log (v1.3.0-alpha2).
* Fix failing unit tests due to CloudControl schema change.
* Add ddcloud_storage_controller to provider documentation
tintoy commented Apr 26, 2017

Implemented in 1.3.0-alpha2.
