Provide a way to map from WBOFragmenter input molecule to parent molecule  #129

@j-wags

Summary

Many users will probably want to follow specific atoms/bonds through the fragmentation process. However, because input molecules are canonicalized, there is currently no way to map an atom in the input molecule of a Fragmenter job to the corresponding atom in the result's parent_molecule or in any of the output fragments.

Reproducing example

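from openff.toolkit.topology import Molecule
from openff.fragmenter.fragment import WBOFragmenter

# Fragment a small test molecule with the default WBOFragmenter settings
mol1 = Molecule.from_smiles('ClCCCF')
frag_engine = WBOFragmenter()
result = frag_engine.fragment(mol1)

def draw_mol_and_label_atom_index(mol):
    """Return an RDKit molecule whose atom map numbers are set to the atom indices,
    so that a drawing of it (e.g. in a notebook) shows each atom's index."""
    rdmol = mol.to_rdkit()
    for atom in rdmol.GetAtoms():
        atom.SetAtomMapNum(atom.GetIdx())
    return rdmol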

mol1 has the following atom map, and the drawing below shows the atom indices (NOT map indices):

print(mol1.properties)
draw_mol_and_label_atom_index(mol1)

{'atom_map': {0: 1,
1: 2,
2: 3,
3: 4,
4: 5,
5: 6,
6: 7,
7: 8,
8: 9,
9: 10,
10: 11}}

[image: mol1 drawn with atom indices labeled]

The result's parent_molecule has the following atom map, and the drawing below shows the atom indices (NOT map indices):

print(result.parent_molecule.properties)
draw_mol_and_label_atom_index(result.parent_molecule)

{'atom_map': {0: 1,
1: 3,
2: 5,
3: 4,
4: 2,
5: 8,
6: 9,
7: 10,
8: 11,
9: 6,
10: 7}}

[image: result.parent_molecule drawn with atom indices labeled]

We see that the Cl moved from atom index 0 in the input to atom index 4 in the parent. There's no direct way to map from 0 to 4 given the information in the result object.

Solutions

I think the root cause of this issue is that the OpenFF Toolkit canonicalizes the atom ordering but has no way to return the atom mapping that it applied. I'll open an issue on the Toolkit repo to have that mapping returned; once it's available, we can provide it to the user here.
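
To make the request concrete, here is a minimal sketch of how a user might consume such a mapping if it were returned. Everything in it is an assumption for illustration: no such mapping is exposed today, the helper names invert_index_map and follow_atoms are hypothetical, and the permutation is made up except that it sends index 0 to index 4, as reported for the Cl above.

```python
def invert_index_map(old_to_new):
    """Return the inverse of a one-to-one index mapping (hypothetical helper)."""
    new_to_old = {new: old for old, new in old_to_new.items()}
    if len(new_to_old) != len(old_to_new):
        raise ValueError("mapping is not one-to-one")
    return new_to_old


def follow_atoms(indices, index_map):
    """Translate a sequence of atom indices through an index mapping."""
    return [index_map[i] for i in indices]


# ASSUMPTION: an illustrative input-index -> parent-index mapping for five atoms.
# Only the 0 -> 4 entry comes from the issue; the rest is invented.
input_to_parent = {0: 4, 1: 3, 2: 2, 3: 1, 4: 0}
parent_to_input = invert_index_map(input_to_parent)

# Follow the Cl atom (input index 0) into the parent molecule.
print(follow_atoms([0], input_to_parent))

# Follow a bond, given as a pair of atom indices, into the parent molecule.
print(tuple(follow_atoms((0, 1), input_to_parent)))
```

With a mapping like this in the result object, tracking a bond is just a matter of translating both of its atom indices, and the inverse mapping lets users go from parent or fragment atoms back to the input.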
