Replies: 2 comments
I liked this suggestion because it hews closely to how smaller synonymous variation is already represented. But I also share Larry's concerns about sharing, since this description in particular could be arbitrarily long. I'm not sure what form this would take, but it would be nice to have some way of saying 'the state of this allele is the same as the reference' without also having to provide the sequence. I'm not sure whether this goes against existing conventions for describing variation, though.
I also wanted to note that it would be useful to think about potential solutions in the context of translation between HGVS and VRS. The existing vrs-python translator doesn't really handle this well, I think because the HGVS string is a little ambiguous (no position information is provided explicitly). I haven't personally looked at the implementation of that translator, but it would be nice if we could also support this syntax in vrs-python.
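To make the ambiguity concrete, here is a rough sketch of how a translator might recognize the whole-protein form before deciding on a VRS representation. This is not how vrs-python works (I have not checked); the function names and the regular expression are hypothetical, and the pattern only covers RefSeq-style accessions such as `NP_000213.1`.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a RefSeq-style accession followed by "p.=" (or the
# predicted form "p.(=)"), with no residue or position in between.
WHOLE_PROTEIN_NO_CHANGE = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+\.\d+):p\.(?:=|\(=\))$"
)


def parse_whole_protein_no_change(hgvs: str) -> Optional[str]:
    """Return the accession if the string is a whole-protein 'no change'
    expression, otherwise None."""
    match = WHOLE_PROTEIN_NO_CHANGE.match(hgvs.strip())
    return match.group("accession") if match else None


def implied_interval(sequence_length: int) -> Tuple[int, int]:
    """The HGVS string carries no position, so the interval has to be
    inferred from the reference sequence length (interbase coordinates)."""
    return (0, sequence_length)


if __name__ == "__main__":
    print(parse_whole_protein_no_change("NP_000213.1:p.="))
    print(parse_whole_protein_no_change("NP_000213.1:p.Gly12="))
```

The point of `implied_interval` is that a translator would need to look up the sequence length from somewhere else, since the string itself gives no start or end.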
In certain circumstances, it is necessary to report synonymous variation across the entirety of a given sequence location. Using `NP_000213.1` as an example, this may be expressed as `NP_000213.1:p.=` using HGVS. Such notation does not seem to have a well-documented VRS equivalent, and guidance on best practices for its representation would be useful.

After some previous discussion, @korikuzma suggested representing this with a `LiteralSequenceExpression` containing the whole sequence. Another option might be using a `ReferenceLengthExpression` object with a repeat subunit length equal to the length of the sequence, denoting that the variation is just the sequence itself. Examples of both are shown here: https://gist.github.com/bencap/28166bb0a719545f12a190a72021a019. Both methods seem to me to represent the variation unambiguously.

@larrybabb has pointed out that adding the entirety of the sequence to the LSE object might cause issues with sharing these representations when the sequence string is very large. He also noted that the `SequenceReference` object is really equivalent to this sort of `p.=` syntax, so it may be worthwhile to think about using those to represent this sort of 'same as reference' variation.

If the potential for sharing issues with large sequences is problematic, we should discuss whether it makes sense to use a `SequenceReference` object or some other more compact way of describing this sort of variation.
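For a rough sense of the size trade-off between the two options, here is a minimal sketch that builds both as plain Python dicts and compares their serialized sizes. It does not use vrs-python; the field names follow my reading of the VRS 2.x schema and should be checked against the gist and the spec. The refget accession and the toy sequence are placeholders, not the real values for `NP_000213.1`.

```python
import json

# Placeholder values for illustration only (not the real NP_000213.1 data).
PLACEHOLDER_REFGET_ACCESSION = "SQ.placeholder"
TOY_SEQUENCE = "MKTAYIAKQR" * 100  # stands in for a long protein sequence


def whole_sequence_location(refget_accession: str, sequence_length: int) -> dict:
    """A location spanning the entire reference sequence (interbase coordinates)."""
    return {
        "type": "SequenceLocation",
        "sequenceReference": {
            "type": "SequenceReference",
            "refgetAccession": refget_accession,
        },
        "start": 0,
        "end": sequence_length,
    }


def allele_with_lse(refget_accession: str, sequence: str) -> dict:
    """Option 1: the state carries the whole sequence literally."""
    return {
        "type": "Allele",
        "location": whole_sequence_location(refget_accession, len(sequence)),
        "state": {"type": "LiteralSequenceExpression", "sequence": sequence},
    }


def allele_with_rle(refget_accession: str, sequence_length: int) -> dict:
    """Option 2: the state only records lengths; the sequence is omitted."""
    return {
        "type": "Allele",
        "location": whole_sequence_location(refget_accession, sequence_length),
        "state": {
            "type": "ReferenceLengthExpression",
            "length": sequence_length,
            "repeatSubunitLength": sequence_length,
        },
    }


if __name__ == "__main__":
    lse = allele_with_lse(PLACEHOLDER_REFGET_ACCESSION, TOY_SEQUENCE)
    rle = allele_with_rle(PLACEHOLDER_REFGET_ACCESSION, len(TOY_SEQUENCE))
    print("LSE payload bytes:", len(json.dumps(lse)))
    print("RLE payload bytes:", len(json.dumps(rle)))
```

The LSE payload grows linearly with the sequence, while the RLE payload stays roughly constant, which is the sharing concern in a nutshell. Whether an RLE without a `sequence` field is an acceptable way to say 'same as reference' is exactly the open question here.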