Test PDB: receptor.txt (GitHub does not allow uploading .pdb files, so I renamed it to .txt.)
The water molecules are assigned the chain ID "_".
TER 3277 PHE A 456
HETATM 3278 O HOH _ 1 26.096 36.294 30.269 1.00 32.18 O
HETATM 3279 O HOH _ 2 23.309 35.382 24.389 1.00 23.49 O
HETATM 3280 O HOH _ 3 30.682 28.763 30.563 1.00 22.28 O
HETATM 3281 O HOH _ 4 23.773 31.116 21.198 1.00 16.56 O
HETATM 3282 O HOH _ 5 27.333 37.316 26.228 1.00 32.51 O
HETATM 3283 O HOH _ 6 28.153 35.475 19.899 1.00 39.46 O
HETATM 3284 O HOH _ 7 26.977 32.208 19.549 1.00 33.98 O
END
When the file is read, the resid of the last residue is shown as ResidueId(HOH, 7, _).
from rdkit import Chem
from prolif import Molecule

test_file = "receptor.txt"  # the attached test file (assumed path)
protein_mol = Molecule.from_rdkit(Chem.MolFromPDBFile(test_file))
protein_mol.residues[-1].resid
>>> ResidueId(HOH, 7, _)
However, when I tried to look up this residue using a string, a KeyError was raised.
protein_mol["HOH7._"]
>>> KeyError: ResidueId(HOH, 7, None)
Looking at the ResidueId class, I found that this is caused by a limitation of the regex used in the from_string function.
Line 22 in 185460c
r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
Here is my test:
from prolif.residue import ResidueId
ResidueId.from_string("HOH7._")
>>> KeyError: ResidueId(HOH, 7, None)
Original regex:
r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
I suggest allowing "_" in group 3.
Modified regex:
r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z_\d]{1,2})?"
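To illustrate the difference, here is a minimal sketch that applies both patterns with the standard `re` module only. It assumes the pattern is applied with `re.search` and that the three groups map to name, number and chain; `split_resid` is a hypothetical helper written for this example, not part of prolif.

```python
import re

# Pattern quoted above (current behavior).
ORIGINAL = r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z\d]{1,2})?"
# Proposed pattern: "_" added to the chain character class (group 3).
MODIFIED = r"(TIP[234]|T[234]P|H2O|[0-9][A-Z]{2}|[A-Z ]+)?(\d*)\.?([A-Z_\d]{1,2})?"


def split_resid(pattern, resid_str):
    """Hypothetical helper: return the (name, number, chain) groups matched by pattern."""
    return re.search(pattern, resid_str).groups()


print(split_resid(ORIGINAL, "HOH7._"))   # ('HOH', '7', None): the "_" chain is dropped
print(split_resid(MODIFIED, "HOH7._"))   # ('HOH', '7', '_')
print(split_resid(MODIFIED, "ALA10.A"))  # ('ALA', '10', 'A'): ordinary chain IDs still parse
```

With the original pattern the match stops after the ".", so group 3 is None, which is consistent with the ResidueId(HOH, 7, None) key in the error above.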