Backend decoupling and typed memory interface #63

@alexandermorozov

I've implemented a feature related to the decoupling discussed in #37. The main commit can be viewed here. Below is the commit message for convenience:

Change SharedTensor::read() signature from
fn read(&self, device: &DeviceType) -> Result<&MemoryType, ...>
into
fn read<D: IDevice>(&self, device: &D) -> Result<&D::M, ...>
The new signature provides a type-level guarantee that if a Cuda device is
passed into read(), it returns Cuda memory (and not Native or OpenCL).
The additional unwraps that were previously needed (.as_native().unwrap())
are no longer required, so the code is clearer and more concise.
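
To illustrate the type-level guarantee, here is a minimal sketch, not the actual implementation. ToyTensor, select() and NotAllocated are hypothetical names made up for this example, and the memory types are simplified stand-ins:

```rust
// Hypothetical stand-ins for the real memory types.
#[derive(Debug, PartialEq)]
pub struct NativeMemory(pub Vec<f32>);
#[derive(Debug, PartialEq)]
pub struct CudaMemory(pub u64); // pretend device pointer

/// Toy tensor with one optional copy per backend (not the real layout).
pub struct ToyTensor {
    pub native: Option<NativeMemory>,
    pub cuda: Option<CudaMemory>,
}

/// Each device names the memory type it owns via an associated type.
pub trait IDevice {
    type M;
    /// Hypothetical helper: pick this device's copy out of the tensor.
    fn select(tensor: &ToyTensor) -> Option<&Self::M>;
}

pub struct NativeDevice;
pub struct CudaDevice;

impl IDevice for NativeDevice {
    type M = NativeMemory;
    fn select(tensor: &ToyTensor) -> Option<&NativeMemory> {
        tensor.native.as_ref()
    }
}

impl IDevice for CudaDevice {
    type M = CudaMemory;
    fn select(tensor: &ToyTensor) -> Option<&CudaMemory> {
        tensor.cuda.as_ref()
    }
}

/// Hypothetical error: the tensor holds no copy for the requested device.
#[derive(Debug, PartialEq)]
pub struct NotAllocated;

impl ToyTensor {
    /// The return type is tied to the device type at compile time,
    /// so callers never need `.as_native().unwrap()`.
    pub fn read<D: IDevice>(&self, _device: &D) -> Result<&D::M, NotAllocated> {
        D::select(self).ok_or(NotAllocated)
    }
}
```

With this shape, `tensor.read(&NativeDevice)` has type `Result<&NativeMemory, _>`, and trying to bind the result of `tensor.read(&CudaDevice)` to a `&NativeMemory` fails to compile instead of panicking at runtime.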

Internally, SharedTensor uses the Any type to store objects of different
types uniformly. Synchronization between memories is also done through a
type-erased interface. This makes it possible to define a new Framework in an
external crate, or to extract the Cuda and OpenCL frameworks into their own
crates, though the error types would require some additional work.

Use of "dynamic typing" has drawbacks, mainly a slightly larger runtime
overhead. Before this patch, benchmarks showed that SharedTensor::read() took
19-22ns; now it takes 23-26ns. For comparison, a minimal synchronized CUDA
operation takes about 10-40us. Small NN layers on CPU are much faster,
e.g. a 10-input softmax layer takes about 500ns. Still, in typical NNs the
overhead looks negligible, and I think it's a fair tradeoff for code clarity
and better decoupling.
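
To put the numbers above in proportion, here is a small helper for the arithmetic. The function name is made up for this example, and the per-read overhead of 4-5ns is simply the difference between the before and after figures quoted above:

```rust
/// Relative cost of a fixed per-call overhead, as a percentage of the
/// duration of the operation it is attached to.
pub fn overhead_percent(overhead_ns: f64, op_ns: f64) -> f64 {
    100.0 * overhead_ns / op_ns
}
```

For example, overhead_percent(4.0, 500.0) gives 0.8, i.e. one extra read costs under 1% of the 500ns softmax layer, and overhead_percent(5.0, 10_000.0) gives 0.05 for the fastest (10us) synchronized CUDA operation. A layer that calls read() several times would pay that cost once per call.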

Here are the actual benchmark results, before:

test bench_shared_tensor_access_time_first                            ... bench:          19 ns/iter (+/- 2)
test bench_shared_tensor_access_time_second                           ... bench:          21 ns/iter (+/- 0)

after:

test bench_shared_tensor_access_time_first                        ... bench:          23 ns/iter (+/- 0)
test bench_shared_tensor_access_time_second                       ... bench:          26 ns/iter (+/- 3)

What's your opinion on it?
