@@ -108,6 +109,31 @@ If integer counts are preferred, then they can instead be returned.
    >>> tb.bases("chr1", 24, 74, False)
    {'A': 6, 'C': 6, 'T': 6, 'G': 6}

## Fetch masked blocks

There are two kinds of masked blocks that can be present in 2bit files: hard-masked and soft-masked. Hard-masked blocks are runs of N, as are commonly found near telomeres and centromeres. Soft-masked blocks are runs of lowercase a/c/t/g, typically indicating repeat elements or low-complexity stretches. It can sometimes be useful to query this information from 2bit files:

    >>> tb.hardMaskedBlocks("chr1")
    [(0, 50), (100, 150)]

In this (small) example, there are two stretches of hard-masked sequence, from 0 to 50 and again from 100 to 150 (see the note below about coordinates). If you would instead like to query only the blocks overlapping a specific region, you can specify the region bounds:

    >>> tb.hardMaskedBlocks("chr1", 75, 101)
    [(100, 150)]

If no blocks overlap the region, then an empty list is returned:

    >>> tb.hardMaskedBlocks("chr1", 75, 100)
    []
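To make the block and overlap semantics above concrete, here is a minimal pure-Python sketch. It is not how the library is implemented; `masked_blocks` and `overlapping` are hypothetical helper names, and the sketch assumes 0-based, half-open `(start, end)` coordinates, which is consistent with the outputs shown above.

```python
import re


def masked_blocks(sequence):
    """Return (hard, soft) lists of half-open (start, end) blocks.

    Assumption for illustration: hard-masked blocks are runs of "N" and
    soft-masked blocks are runs of lowercase a/c/g/t.
    """
    hard = [m.span() for m in re.finditer(r"N+", sequence)]
    soft = [m.span() for m in re.finditer(r"[acgt]+", sequence)]
    return hard, soft


def overlapping(blocks, start, end):
    """Return the blocks sharing at least one base with [start, end)."""
    return [(s, e) for (s, e) in blocks if s < end and e > start]


if __name__ == "__main__":
    hard, soft = masked_blocks("NNNNacgtACGTNN")
    print(hard, soft)

    # Same blocks and queries as in the examples above.
    blocks = [(0, 50), (100, 150)]
    print(overlapping(blocks, 75, 101))
    print(overlapping(blocks, 75, 100))
```

Under the half-open assumption, the region 75 to 101 includes position 100 and therefore touches the block starting at 100, while the region 75 to 100 ends just before it.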
Instead of `hardMaskedBlocks()`, one can use `softMaskedBlocks()` in an identical manner: