Add API to fill tensors #58

It would be nice to have an API to fill a `SharedTensor` with a constant value. Currently the closest thing is `leaf::weight::FillerType::Constant { value: 0.0 }.fill(&mut tensor)`. There are two problems with it: usability and performance.

On the usability side, this interface is available only from the `leaf` crate, at first glance looks like it has something to do with weights, and is quite verbose.

On the performance side, it's implemented by adding a native device, filling CPU memory, and synchronizing with the original device. If the original device belongs to the `Cuda` framework, I think this operation can be done without allocating host memory, filling it on the CPU, and doing a PCI transfer. At least for `SharedTensor<f32>` there is `cuMemsetD32()`.
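One detail worth noting about the `cuMemsetD32()` route: it takes a 32-bit unsigned integer pattern, not a float, so an `f32` fill value would have to be passed as its raw bits. Here is a minimal host-side sketch of that idea. `memset_d32` is only a stand-in I made up to mimic the device call on a plain slice, and the helper names are hypothetical; nothing here touches CUDA or collenchyma.

```rust
/// Host-side stand-in (assumption) for a 32-bit device memset such as
/// `cuMemsetD32`: writes the same 32-bit pattern into every word.
fn memset_d32(dst: &mut [u32], pattern: u32) {
    for word in dst.iter_mut() {
        *word = pattern;
    }
}

/// Fill a buffer of raw 32-bit words so that each word, read back as an
/// `f32`, equals `value`. The float is passed as its bit pattern.
fn fill_f32_words(words: &mut [u32], value: f32) {
    memset_d32(words, value.to_bits());
}

/// Reinterpret the raw words as `f32` values for inspection.
fn words_as_f32(words: &[u32]) -> Vec<f32> {
    words.iter().map(|w| f32::from_bits(*w)).collect()
}
```

With this, calling `fill_f32_words(&mut words, 1.5)` on a four-word buffer and reading it back through `words_as_f32` gives four copies of `1.5`. On a real device the loop would be replaced by the single memset call, with no host allocation or transfer.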

I don't completely understand the whole architecture, but it seems that because the operation depends on the backend, it should be implemented as a collenchyma plugin. Creating a separate repo for this looks like too much, so maybe it should be done inside collenchyma, somewhere in `src/plugins/`?
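To make the plugin idea a bit more concrete, here is a rough sketch of what a backend-dependent fill operation could look like. Everything in it is hypothetical: `Fill`, `Tensor`, and `NativeBackend` are simplified stand-ins I invented for illustration, not collenchyma's actual types or plugin layout.

```rust
/// Simplified stand-in (assumption) for `SharedTensor<T>`; it only holds
/// host memory, whereas the real type tracks copies on several devices.
pub struct Tensor<T> {
    pub data: Vec<T>,
}

/// Hypothetical plugin trait: each backend decides how to fill memory
/// it owns, so no round trip through another device is forced.
pub trait Fill<T> {
    fn fill(&self, tensor: &mut Tensor<T>, value: T);
}

/// Stand-in for a native (CPU) backend.
pub struct NativeBackend;

impl<T: Copy> Fill<T> for NativeBackend {
    fn fill(&self, tensor: &mut Tensor<T>, value: T) {
        // On the host a plain loop is enough.
        for x in tensor.data.iter_mut() {
            *x = value;
        }
    }
}

// A CUDA backend would implement `Fill<f32>` separately and could issue
// a device-side memset there instead of filling on the host.
```

Usage would then be something like `backend.fill(&mut tensor, 0.0)`, which also addresses the verbosity complaint, since the call no longer goes through a weight filler.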

That said, it's not clear whether it's worth doing now. In my opinion this mostly depends on how much it affects performance, and I haven't seen any perf issues yet except one that was probably fixed in autumnai/leaf#90.

