
Residue ID key error if _ in the chain #298

@yuyuan871111

Test PDB: receptor.txt (GitHub does not allow uploading .pdb files, so I renamed it to .txt.)

The water molecules are assigned to a chain named "_".

TER    3277      PHE A 456                                                      
HETATM 3278  O   HOH _   1      26.096  36.294  30.269  1.00 32.18           O  
HETATM 3279  O   HOH _   2      23.309  35.382  24.389  1.00 23.49           O  
HETATM 3280  O   HOH _   3      30.682  28.763  30.563  1.00 22.28           O  
HETATM 3281  O   HOH _   4      23.773  31.116  21.198  1.00 16.56           O  
HETATM 3282  O   HOH _   5      27.333  37.316  26.228  1.00 32.51           O  
HETATM 3283  O   HOH _   6      28.153  35.475  19.899  1.00 39.46           O  
HETATM 3284  O   HOH _   7      26.977  32.208  19.549  1.00 33.98           O  
END
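For reference, here is a minimal sketch of how the chain ID can be read from one of the HETATM records above. It assumes the standard fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26); `parse_residue_fields` is a hypothetical helper written for this illustration, not part of ProLIF or RDKit.

```python
def parse_residue_fields(line):
    """Extract (resname, chain, resnum) from a fixed-column ATOM/HETATM record.

    Hypothetical helper; assumes the standard PDB column layout.
    """
    resname = line[17:20].strip()  # columns 18-20
    chain = line[21]               # column 22
    resnum = int(line[22:26])      # columns 23-26
    return resname, chain, resnum


# First water record from the excerpt above
record = "HETATM 3278  O   HOH _   1      26.096  36.294  30.269  1.00 32.18           O  "
fields = parse_residue_fields(record)
print(fields)
```

The chain column holds a literal underscore here, so any reader that follows the column layout will report "_" as the chain.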

When reading a file, the resid is shown as ResidueId(HOH, 7, _).

from rdkit import Chem
from prolif import Molecule

# test_file is the path to the attached PDB file
protein_mol = Molecule.from_rdkit(Chem.MolFromPDBFile(test_file))
protein_mol.residues[-1].resid

>>> ResidueId(HOH, 7, _)

However, when I tried to find this residue using a string, there was a key error.

protein_mol["HOH7._"]

>>> KeyError: ResidueId(HOH, 7, None)

When I checked the ResidueId class, I found that this comes from a limitation of the regex used in its from_string method: the chain group only accepts uppercase letters and digits, so "_" is never captured.

r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
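The behaviour of this pattern can be checked with the standard `re` module alone, without ProLIF. The sketch below copies the pattern from this issue; `split_resid` is a hypothetical helper for illustration and may differ from how `from_string` applies the pattern internally.

```python
import re

# Pattern copied verbatim from the issue text
ORIGINAL = re.compile(
    r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
)


def split_resid(pattern, resid_str):
    """Return the (name, number, chain) groups captured by the pattern."""
    return pattern.search(resid_str).groups()


with_underscore = split_resid(ORIGINAL, "HOH7._")
with_letter = split_resid(ORIGINAL, "HOH7.A")
print(with_underscore)  # chain group is None: "_" is not in [A-Z\d]
print(with_letter)      # chain group is "A"
```

The third group comes back as None for "HOH7._", which is consistent with the ResidueId(HOH, 7, None) key shown in the error.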

Here is my test:

from prolif.residue import ResidueId

ResidueId.from_string("HOH7._")

>>> KeyError: ResidueId(HOH, 7, None)

Original regex:
r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"

I suggest allowing "_" in group 3 (the chain).
Modified regex:
r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z_\d]{1,2})?"
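A quick way to compare the two patterns is to run both over a few residue strings using only the `re` module. This is a sketch under the assumption that the pattern is applied with `search(...).groups()`; the sample strings other than "HOH7._" are made up for illustration.

```python
import re

# Both patterns copied from the issue text
ORIGINAL = re.compile(
    r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
)
MODIFIED = re.compile(
    r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z_\d]{1,2})?"
)

# Illustrative inputs; only "HOH7._" comes from the issue itself
samples = ["HOH7._", "HOH7.A", "ALA10.B", "TIP3", "PHE456"]

results = {}
for resid_str in samples:
    old = ORIGINAL.search(resid_str).groups()
    new = MODIFIED.search(resid_str).groups()
    results[resid_str] = (old, new)
    print(f"{resid_str!r}: original={old} modified={new}")

# Inputs whose parse changes between the two patterns
changed = [s for s, (old, new) in results.items() if old != new]
print(changed)
```

In this small sample only the underscore chain parses differently, so the change looks safe for ordinary residue strings, though the project's own test suite would be the real check.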
