
Add API to fill tensors #58

@alexandermorozov


It would be nice to have an API to fill a SharedTensor with a constant value. Currently the closest thing is leaf::weight::FillerType::Constant { value: 0.0 }.fill(&mut tensor). This has two problems: usability and performance.

On the usability side, this interface is only available from the leaf crate, at first glance looks like it has something to do with weights, and is quite verbose.

On the performance side, it is implemented by adding a native device, filling CPU memory, and synchronizing with the original device. If the original device belongs to the Cuda framework, I think this operation can be done without allocating host memory, filling it using the CPU, and doing a PCI transfer. At least for SharedTensor<f32> there is cuMemsetD32().
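One detail worth keeping in mind for the f32 case: cuMemsetD32() takes its fill value as a 32-bit unsigned integer, so a float constant would have to be passed as its bit pattern. Below is a minimal host-only sketch of that idea. FakeDeviceBuffer, memset_d32, read_f32 and fill_f32 are hypothetical names made up for illustration; they are not part of collenchyma or CUDA, and the "device" here is just a Vec<u32>.

```rust
/// Hypothetical stand-in for a device buffer of 32-bit words.
/// In a real CUDA backend this would be device memory, not a host Vec.
struct FakeDeviceBuffer {
    words: Vec<u32>,
}

impl FakeDeviceBuffer {
    /// Allocates `len` zeroed 32-bit words.
    fn new(len: usize) -> Self {
        FakeDeviceBuffer { words: vec![0; len] }
    }

    /// Mimics the shape of a 32-bit memset: writes `pattern` to every word.
    fn memset_d32(&mut self, pattern: u32) {
        for w in self.words.iter_mut() {
            *w = pattern;
        }
    }

    /// Reinterprets the words as f32 values
    /// (stands in for a device-to-host copy).
    fn read_f32(&self) -> Vec<f32> {
        self.words.iter().map(|&w| f32::from_bits(w)).collect()
    }
}

/// Fills the buffer with a constant f32 by handing the value's
/// bit pattern to the 32-bit memset.
fn fill_f32(buf: &mut FakeDeviceBuffer, value: f32) {
    buf.memset_d32(value.to_bits());
}
```

With a real device pointer the loop would be replaced by a single driver call, so no host allocation or PCI transfer would be needed for the fill itself.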

I don't completely understand the whole architecture, but it seems that because the operation depends on the backend, it should be implemented as a collenchyma plugin. Creating a separate repo for this looks like too much, so maybe it should be done inside collenchyma, somewhere in src/plugins/?
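To make the plugin idea a bit more concrete, here is a rough sketch of what a backend-dependent fill operation could look like. Everything in it (the Fill trait, Tensor, NativeBackend) is a hypothetical, simplified stand-in and not the actual collenchyma plugin interface; it only illustrates that each backend would supply its own implementation.

```rust
/// Hypothetical, simplified tensor that only holds host data.
struct Tensor<T> {
    data: Vec<T>,
}

/// Hypothetical plugin trait: each backend decides how to fill
/// the memory it owns.
trait Fill<T> {
    fn fill(&self, tensor: &mut Tensor<T>, value: T);
}

/// Hypothetical CPU backend.
struct NativeBackend;

impl<T: Copy> Fill<T> for NativeBackend {
    /// On the host, filling is a plain loop over the elements.
    fn fill(&self, tensor: &mut Tensor<T>, value: T) {
        for x in tensor.data.iter_mut() {
            *x = value;
        }
    }
}

// A Cuda backend would implement the same trait, but could issue a
// device-side memset instead of touching host memory at all.
```

Usage would then be along the lines of NativeBackend.fill(&mut tensor, 0.0), with the caller not needing to know how the backend performs the fill.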

Well, that said, it's not clear whether it's worth doing now... In my opinion this mostly depends on how it affects performance. And I haven't seen any perf issues yet, except one probably fixed in autumnai/leaf#90.
