Further optimize gemm

Thanks for your great work!

I plan to work on BNN optimization as well, for various applications (generative models and classifiers) on a powerful CPU.
I spent a few hours on preliminary work changing the "micro_kernel" to use AVX-512, and it showed a 4x speedup from a simple one-loop optimization (note that -O3 alone did not auto-vectorize this loop). Do you plan to work on this further and boost the performance more?
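
To make the idea concrete, here is a minimal sketch of the kind of single-loop change I mean. It is not the actual "micro_kernel" from this repo: the function name `xnor_popcount_dot`, the bit-packed `uint64_t` layout, and the use of the AVX512VPOPCNTDQ extension are all assumptions for illustration. It also relies on the GCC/Clang builtin `__builtin_popcountll`. When the AVX-512 features are not enabled at compile time, only the scalar loop is compiled, so the function returns the same result either way.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
#include <immintrin.h>
#endif

/* Hypothetical inner loop of a binary GEMM: counts the bit positions where
 * two bit-packed rows agree (XNOR followed by popcount).
 * Assumes each row is packed into n_words 64-bit words. */
static uint64_t xnor_popcount_dot(const uint64_t *a, const uint64_t *b,
                                  size_t n_words)
{
    uint64_t matches = 0;
    size_t i = 0;

#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
    /* Vector path: 8 words (512 bits) per iteration. */
    __m512i acc = _mm512_setzero_si512();
    const __m512i ones = _mm512_set1_epi64(-1);
    for (; i + 8 <= n_words; i += 8) {
        __m512i va = _mm512_loadu_si512((const void *)(a + i));
        __m512i vb = _mm512_loadu_si512((const void *)(b + i));
        /* XNOR = NOT(a XOR b), written as (a XOR b) XOR all-ones. */
        __m512i x = _mm512_xor_si512(_mm512_xor_si512(va, vb), ones);
        acc = _mm512_add_epi64(acc, _mm512_popcnt_epi64(x));
    }
    matches += (uint64_t)_mm512_reduce_add_epi64(acc);
#endif

    /* Scalar path: handles the tail, or everything without AVX-512. */
    for (; i < n_words; ++i)
        matches += (uint64_t)__builtin_popcountll(~(a[i] ^ b[i]));

    return matches;
}
```

For ±1 vectors encoded as bits, the dot product is `2 * matches - 64 * n_words`. To exercise the vector path with GCC or Clang, something like `-O3 -mavx512f -mavx512vpopcntdq` should be needed on a CPU that supports both extensions.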
Further optimize gemm #13