Description
Hi @micknudsen,
thanks for providing your code for the transformation of GATK outputs as inputs for ASCAT.
I am trying to adapt this to our current pipeline, in which we analyze WXS/WGS data from trios (a normal sample plus two tumor samples taken at different time points).
I encountered some errors when running your script. The first is related to the frequency property of the BAF class:
class BAF(NamedTuple):
    chromosome: str
    position: int
    ref_count: int
    alt_count: int
    ref_nucleotide: str
    alt_nucleotide: str

    @property
    def frequency(self):
        return self.alt_count / (self.ref_count + self.alt_count)
Since the allelic counts files for tumor and normal include positions where both alt_count and ref_count are zero, this raised a ZeroDivisionError, which I attempted to fix with:
def frequency(self):
    if (self.alt_count + self.ref_count) == 0:
        return 0
    else:
        return self.alt_count / (self.ref_count + self.alt_count)
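One caveat with returning 0 is that a position with no reads at all then looks the same as a covered position with no alt reads. A possible alternative, shown here only as a minimal sketch, is to return None for zero depth and filter such positions out before they are used. The Optional return type, the example records, and the filtering step are assumptions for illustration and not part of the original script.

```python
from typing import NamedTuple, Optional


class BAF(NamedTuple):
    chromosome: str
    position: int
    ref_count: int
    alt_count: int
    ref_nucleotide: str
    alt_nucleotide: str

    @property
    def frequency(self) -> Optional[float]:
        # Assumption: None marks "no coverage" instead of 0, so that an
        # uncovered site is not mistaken for a site without alt reads.
        depth = self.ref_count + self.alt_count
        if depth == 0:
            return None
        return self.alt_count / depth


# Hypothetical example records, not taken from real data.
bafs = [
    BAF("1", 100, 10, 10, "A", "G"),
    BAF("1", 200, 0, 0, "C", "T"),
]

# Drop uncovered positions before they reach any downstream step.
covered = [baf for baf in bafs if baf.frequency is not None]
```

Whether dropping these positions is acceptable depends on how the output is consumed afterwards, so this is a suggestion to discuss rather than a confirmed fix.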
Another point is related to the following part:
class Segmentation:
    def __init__(self, segments: Iterable[Segment]) -> None:
        self._segments: DefaultDict[str, List[Segment]] = defaultdict(list)
        for segment in segments:
            self._segments[segment.chromosome].append(segment)

    def logr(self, chromosome: str, position: int) -> float:
        for segment in self._segments[chromosome]:
            if segment.start <= position <= segment.end:
                return segment.logr
        raise UncoveredPositionError
Not all positions in the allelic counts files are covered by a segment in the denoisedCR file, so this method raises UncoveredPositionError.
I attempted to fix this as follows:
def logr(self, chromosome: str, position: int) -> float:
    for segment in self._segments[chromosome]:
        if segment.start <= position <= segment.end:
            return segment.logr
        else:
            print("Skipping non-matching position...")
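Note that with this change the method implicitly returns None whenever no segment covers the position, so the calling code has to handle that value. A minimal sketch of making this explicit is given below. The Segment fields are inferred from how segments are used in the snippet above, and the example segment, the positions, and the skipping loop are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Iterable, List, NamedTuple, Optional


class Segment(NamedTuple):
    # Assumed fields, inferred from how segments are used above.
    chromosome: str
    start: int
    end: int
    logr: float


class Segmentation:
    def __init__(self, segments: Iterable[Segment]) -> None:
        self._segments: DefaultDict[str, List[Segment]] = defaultdict(list)
        for segment in segments:
            self._segments[segment.chromosome].append(segment)

    def logr(self, chromosome: str, position: int) -> Optional[float]:
        # .get avoids creating empty entries for unseen chromosomes.
        for segment in self._segments.get(chromosome, []):
            if segment.start <= position <= segment.end:
                return segment.logr
        # Explicit None: the position lies outside every segment.
        return None


# Hypothetical segment and positions for illustration.
segmentation = Segmentation([Segment("1", 1, 1000, -0.25)])
positions = [("1", 500), ("1", 2000), ("2", 10)]

kept = []
for chromosome, position in positions:
    value = segmentation.logr(chromosome, position)
    if value is None:
        continue  # skip: no segment covers this position
    kept.append((chromosome, position, value))
```

Skipping in the caller keeps the logR and BAF outputs restricted to the same set of positions, provided the same filter is applied to both.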
Please let me know your comments. I am not sure what downstream effects these modifications might have, and I would appreciate your feedback.
Best,
Emre