IGHV gene usage based on number of total clonotypes or number of total reads by all total clonotypes for a specific gene #1957

singhd6 · 2025-06-13T19:48:22Z

singhd6
Jun 13, 2025

Hi all,

I analyzed my BCR repertoire sequencing data with MiXCR 4.7.0 and generated the clonotype tables. When I compared V gene usage by the total number of clonotypes and by the total number of reads, I found that in certain subjects the top genes differ between the two measures. Which makes more sense for a specific gene: the total number of reads across all of its clonotypes, or the total number of clonotypes? As I understand it, the read total makes sense because it shows how many times the clonotypes of a given gene were read. However, most of the literature I have read reports the total number of clonotypes, and only a few papers report the total number of reads per clonotype. Did previous authors not consider the reads associated with each clonotype, or is V gene usage conventionally analyzed only by the number of clonotypes? If so, why do the gene rankings differ between the clonotype count and the read count within the same subject?
For example, in the screenshot below:
The total number of clonotypes for IGHV9-3 is 66, which is greater than the 40 for IGHV11-2, yet the total reads for IGHV11-2 (25078) exceed those for IGHV9-3 (14330).
If we want to select CDR3 amino acid sequences for antibody purification, should we go by total reads or by total clonotypes? I would go by total reads, because that shows the reads associated with the clonotypes.

Could you please share your suggestions on this?

Thank you
Divya

singhd6 · 2025-06-14T03:37:26Z

singhd6
Jun 14, 2025
Author

0 replies

mizraelson · 2025-06-16T18:43:06Z

mizraelson
Jun 16, 2025
Collaborator

These are two different ways to capture usage:
• Weighted (e.g., by reads or UMIs)
• Unweighted (by the number of unique clones)

The choice depends on what you’re looking for. In your case, the weighted approach (based on reads) makes more sense, as you are interested in abundant clonotypes.
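To make the two measures concrete, here is a minimal Python sketch that computes both from a list of clonotypes. The `(v_gene, read_count)` tuples and the function name `v_gene_usage` are hypothetical and the numbers are toy values, not data from this thread; in practice the pairs would come from the exported clonotype table.

```python
from collections import defaultdict

def v_gene_usage(clonotypes):
    """Return {gene: (unweighted_fraction, read_weighted_fraction)}.

    `clonotypes` is an iterable of (v_gene, read_count) pairs,
    one pair per clonotype.
    """
    clone_counts = defaultdict(int)  # unweighted: one vote per clonotype
    read_counts = defaultdict(int)   # weighted: one vote per read
    for gene, reads in clonotypes:
        clone_counts[gene] += 1
        read_counts[gene] += reads
    total_clones = sum(clone_counts.values())
    total_reads = sum(read_counts.values())
    return {
        gene: (clone_counts[gene] / total_clones, read_counts[gene] / total_reads)
        for gene in clone_counts
    }

# Toy data (made up): IGHV9-3 has more clonotypes, IGHV11-2 has more reads.
toy = [
    ("IGHV9-3", 10), ("IGHV9-3", 20), ("IGHV9-3", 30),
    ("IGHV11-2", 400), ("IGHV11-2", 40),
]
usage = v_gene_usage(toy)
for gene, (by_clones, by_reads) in usage.items():
    print(f"{gene}: {by_clones:.2f} by clonotypes, {by_reads:.2f} by reads")
```

The same effect explains the numbers in the question: IGHV9-3 averages about 217 reads per clonotype (14330/66), while IGHV11-2 averages about 627 (25078/40), so a gene with fewer but more expanded clonotypes can rank higher by reads.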

0 replies

singhd6 · 2025-06-16T19:01:32Z

singhd6
Jun 16, 2025
Author

Hi Mark, Thank you for your reply. Appreciate it. Divya


0 replies

singhd6 · 2025-06-16T19:03:35Z

singhd6
Jun 16, 2025
Author

[like] Singh, Divya Jyoti reacted to your message:


0 replies

singhd6 · 2025-06-20T13:53:53Z

singhd6
Jun 20, 2025
Author

Hi Mark,

Hope you are doing well. I generated the SHM trees with the following commands for mouse BCR sequencing:

# Find alleles
mixcr findAlleles \
  --report "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2.findAlleles.report.txt" \
  --export-alleles-mutations "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2/PANPAN2_Alleles.tsv" \
  --export-library "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2/PANPAN2_Alleles.json" \
  --output-template "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/result2/Pan-pan2.reassigned.clns" \
  "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/result2/Pan-pan2S12.clns"

# Export the reassigned .clns in .shmt format
mixcr findShmTrees \
  --report "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/PAN-PAN2/PANPAN2_trees.log" \
  "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/result2/Pan-pan2.reassigned.clns" \
  "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2.shmt"

# Export SHM trees with nodes
mixcr exportShmTreesWithNodes \
  "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2.shmt" \
  "/mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/Pan-pan2S12/SHMT_PAN-PAN2/PANPAN2_Trees.tsv"

I am trying to understand the somatic hypermutation trees. Here are the header and one row of the exported table:

treeId nodeId isObserved chains parentId DistanceFromGermline nMutationsRate cloneId fileName readCount readFraction targetSequences targetQualities bestVHit bestDHit bestJHit bestCHit allVAlignments allDAlignments allJAlignments allCAlignments nSeqFR1 nSeqCDR1 nSeqFR2 nSeqCDR2 nSeqFR3 nSeqCDR3 nSeqFR4 aaSeqFR1 aaSeqCDR1 aaSeqFR2 aaSeqCDR2 aaSeqFR3 aaSeqCDR3 aaSeqFR4 isotype refPoints

74 14 TRUE IGH 4 132 0.390532544 3233 /mnt/beegfs/singhd6/Divya_data/vhseq/BCR_Seq_Mouse_GZ/JA10S1/JA10SHMT/JA10.reassigned.clns 124 2.88E-05 CTAGAATGGATTGGAGAAATTAATCCAGATAGCAGTACGATAAACTATACGCCATCTCTAAAGGATAAATTCATCATCTCCAGAGACAACGCCAAAAATACGCTGTACCTGCAAATGAGCAAAGTGAGATCTGAGGACACAGCCCTTTATTACTGTGCAAGTATCTATGATGGTTACTTCCACTTTGACTACTGGGGCCAAGGCACCACTCTCACAGTCTCCTCAG NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN IGHV4-100-M8-5OIS4G6YYQRS2JEG37HKSY5G IGHD2-300 IGHJ200 IGHM00 74|367|388|0|161|DG74DA75DG76DG77DT78DG79DA80DA81DG82DC83DT84DT85DC86DT87DC88DG89DA90DG91DT92DC93DT94DG95DG96DA97DG98DG99DT100DG101DG102DC103DC104DT105DG106DG107DT108DG109DC110DA111DG112DC113DC114DT115DG116DG117DA118DG119DG120DA121DT122DC123DC124DC125DT126DG127DA128DA129DA130DC131DT132DC133DT134DC135DC136DT137DG138DT139DG140DC141DA142DG143DC144DC145DT146DC147DA148DG149DG150DA151DT152DT153DC154DG155DA156DT157DT158DT159DT160DA161DG162DT163DA164DG165DA166DT167DA168DC169DT170DG171DG172DA173DT174DG175DA176DG177DT178DT179DG180DG181DG182DT183DC184DC185DG186DG187DC188DA189DG190DG191DC192DT193DC194DC195DA196DG197DG198DG199DA200DA201DA202DG203DG204DG205|107.0 16|32|51|162|178||80.0 23|68|68|181|226||450.0 CTAGAATGGATTGGAGAA ATTAATCCAGATAGCAGTACGATA AACTATACGCCATCTCTAAAGGATAAATTCATCATCTCCAGAGACAACGCCAAAAATACGCTGTACCTGCAAATGAGCAAAGTGAGATCTGAGGACACAGCCCTTTATTAC TGTGCAAGTATCTATGATGGTTACTTCCACTTTGACTACTGG GGCCAAGGCACCACTCTCACAGTCTCCTCAG LEWIGE INPDSSTI NYTPSLKDKFIISRDNAKNTLYLQMSKVRSEDTALYY CASIYDGYFHFDYW GQGTTLTVSS IgM ::::0:0:0:18:42:153:-1:161:162:1:-2:178:181:-3:195:226::

In the row above, nMutationsRate for treeId 74 is 0.390532544 and DistanceFromGermline is 132. According to the MiXCR documentation, nMutationsRate is the "Number of nucleotide mutations from germline divided by target sequence size". The target sequence (the targetSequences column) is 226 nucleotides long, and 132/226 = 0.584070796460177, so why is nMutationsRate 0.390532544? Unless the target sequence size means something else, I do not see how MiXCR arrives at this value. Could you please explain how nMutationsRate is calculated for somatic hypermutation trees?

Thank you
Divya
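The arithmetic in this question can be checked with a short Python sketch. It recomputes 132/226, then works backwards from the reported rate to the denominator it implies. It also tests one possible reading: that the denominator is the germline length covered by the V and J alignments, assuming the first two `|`-separated fields of each alignment string are the start and end positions on the germline gene. That reading is an inference from this single row, not a confirmed description of how MiXCR computes nMutationsRate; the actual answer is in #1967. The helper name `germline_span` is hypothetical.

```python
def germline_span(alignment):
    """Length of the germline range covered by one alignment string.

    Assumption: the first two '|'-separated fields are the start and end
    positions on the germline (reference) gene.
    """
    fields = alignment.split("|")
    return int(fields[1]) - int(fields[0])

distance = 132               # DistanceFromGermline from the row
reported_rate = 0.390532544  # nMutationsRate from the row
target_len = 226             # length of targetSequences

# Dividing by the target sequence length does not give the reported value.
naive_rate = distance / target_len

# Denominator implied by the reported rate.
implied_denominator = distance / reported_rate

# Germline ranges from allVAlignments and allJAlignments
# (the long mutation string is left out here for brevity).
v_span = germline_span("74|367|388|0|161||107.0")
j_span = germline_span("23|68|68|181|226||450.0")
candidate_rate = distance / (v_span + j_span)

print(f"132 / 226             = {naive_rate:.6f}")
print(f"implied denominator   = {implied_denominator:.1f}")
print(f"V span + J span       = {v_span + j_span}")
print(f"132 / (V + J spans)   = {candidate_rate:.9f}")
```

Under that assumption the V and J spans sum to the implied denominator, which would also fit the fact that all 132 mutations listed in allVAlignments are deletions (germline positions 74 to 205) and therefore do not appear in the 226-nucleotide target sequence.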


1 reply

mizraelson Jun 24, 2025
Collaborator

Answered in #1967
