Avoid identical computation in self tanimoto similarity #117

@JochenSiegWork

The `self_tanimoto_similarity` function calls `tanimoto_similarity_sparse` with `matrix_a` as both arguments. In that case the `norm_2` calculation repeats work already done for the first matrix, which is unnecessarily costly for large arrays. See:

norm_2 = np.array(matrix_b.multiply(matrix_b).sum(axis=1))

We can add a simple check for identity of the two matrices to avoid redundant computation.
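
A minimal sketch of what such a check could look like, assuming SciPy CSR matrices as input. Only the `norm_2` line is taken from the current code; the function name `tanimoto_similarity_sparse_sketch`, the `norm_1` variable, and the intersection/union steps are assumptions for illustration and may not match the actual implementation.

```python
import numpy as np
from scipy import sparse


def tanimoto_similarity_sparse_sketch(matrix_a, matrix_b):
    """Hypothetical stand-in for tanimoto_similarity_sparse with an identity check."""
    # Pairwise dot products between rows of matrix_a and rows of matrix_b.
    intersection = matrix_a.dot(matrix_b.transpose()).toarray()
    # Squared row norms of the first matrix, shape (n_a, 1).
    norm_1 = np.array(matrix_a.multiply(matrix_a).sum(axis=1))
    if matrix_a is matrix_b:
        # Same object passed twice: reuse the norms instead of recomputing them.
        norm_2 = norm_1
    else:
        norm_2 = np.array(matrix_b.multiply(matrix_b).sum(axis=1))
    # Broadcast (n_a, 1) + (1, n_b) to get the pairwise union term.
    # Note: rows that are all zero would lead to a division by zero here.
    union = norm_1 + norm_2.T - intersection
    return intersection / union


fingerprints = sparse.csr_matrix(np.array([[1, 1, 0], [0, 1, 1]]))
# Self similarity: the identity check takes the shortcut branch.
self_similarity = tanimoto_similarity_sparse_sketch(fingerprints, fingerprints)
# A copy is a different object, so the norms are computed separately.
pair_similarity = tanimoto_similarity_sparse_sketch(fingerprints, fingerprints.copy())
```

The `is` comparison only detects the case where the very same object is passed twice, which is what `self_tanimoto_similarity` does; equal but distinct matrices still take the regular path, so results are unchanged.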

Thanks to Afnan for bringing this to our attention!

Metadata

Labels

type: maintenance (Improvement of code or keeping the code up to date)
