Skip to content

Container: resize and copy support #87

@stotko

Description

@stotko

Up to now, the container classes have a fixed capacity and are created using the non-standard createDeviceObject factory function. Furthermore, since ease of use in GPU kernels is considered a key feature, the copy constructors are currently restricted to perform only shallow copies rather than deep copies. This behavior makes the container still feel non-standard and unintuitive to some degree, especially for new users.

In order to fix both issues, the design of the copy operations needs to be revised to match the STL more closely. At first glance, this seems to be an easy task:

  1. Define the copy constructors and copy assignment operators to perform deep copies.
  2. Provide a reference_wrapper<T> class which can be used on the GPU.

However, objects (or at least their states) need to be copied from CPU to GPU memory in order to allow for the proper execution of an operation. Since we want to make the containers work for as many backends and use cases as possible, we cannot make any assumptions how this transfer will be performed or whether this really requires calling the copy constructor or not. reference_wrapper<T> does not solve this problem since it points to the original object which lives in CPU memory.

Therefore, the current proposal would be:

  1. Provide a shallow_copy_wrapper<T> class (suggestions for a better name are welcome) which wraps the object state. This class is copyable such that the object state can be easily passed to the GPU similar to reference_wrapper<T>. However, if the state of the original object is changed, e.g. due to a resize operation, this change will not be visible or propagated to the wrapper invalidating it. Thus, we trade object consistency with GPU support.
  2. Define the copy constructors and copy assignment operators to perform deep copies, but restrict them to be callable from the host only.
  3. Clearly document that shallow_copy_wrapper<T> is only intended to allow crossing memory boundaries and to enable container usage on the GPU. For CPU usage, std::reference_wrapper<T> should be used instead if required.
  4. Deprecate/remove the createDeviceObject and destroyDeviceObject factory functions.

This change will break existing usage within kernels and thrust algorithms (functors). A reasonable transition strategy would be to introduce shallow_copy_wrapper<T> in the last minor release of version 1 (which might be 1.3.0) and provide an option to disable the copy constructor and copy assignment operators. This way, users could start porting to the new copy model and will only need to move away from the factory functions in version 2.0.0.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions