I was wondering whether DeepGEMM could support bias addition. Bias terms are common in linear layers.
Without bias support, we need to write a dedicated kernel to add the bias to the GEMM output. Compared with a fused operator, this costs extra time, because the output matrix has to be read from and written back to global memory a second time.
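
To put a rough number on that overhead, here is a back-of-the-envelope sketch. It is not DeepGEMM code. The function name, the example shape, and the 2-byte (BF16) output and bias types are my own assumptions, and it ignores cache effects.

```python
def bias_add_traffic(m: int, n: int, out_bytes: int = 2, bias_bytes: int = 2) -> dict:
    """Estimate global-memory bytes moved just to apply a bias to an M x N GEMM output.

    Hypothetical helper for illustration; assumes BF16 (2-byte) output and bias.
    """
    # Unfused: a separate kernel reads the M x N output, reads the N-element bias,
    # and writes the M x N result back.
    unfused = 2 * m * n * out_bytes + n * bias_bytes
    # Fused epilogue: the accumulator is still on chip when the bias is added,
    # so the only additional traffic is reading the bias itself.
    fused = n * bias_bytes
    return {"unfused": unfused, "fused": fused, "saved": unfused - fused}


# Example with an assumed shape of M = 4096, N = 7168.
traffic = bias_add_traffic(4096, 7168)
print(traffic["saved"] / 2**20, "MiB of extra traffic avoided by fusing")
```

For this assumed shape, the separate kernel moves about 112 MiB more than a fused epilogue would, on every forward pass of the layer.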