Sunday 6 October 2013

Understanding SNPs and INDELs in microbial genomes

Introduction

Variants are differences between two genomes. Here I describe two important types of nucleotide-level variants (SNPs and INDELs) and how they affect microbial genomes.

SNPs

A SNP is a single nucleotide polymorphism (pronounced "snip"). This is when a single base differs between two genomes, and the DNA around that base is otherwise unchanged.

Genome 1 | DNA | ATGCTATAGTAAATCTGCGCTAGCT
Genome 2 | DNA | ATGCTATAGTAAATGTGCGCTAGCT
                               |
                           SNP(C=>G)  
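
As a minimal sketch of the definition above, SNPs between two sequences of equal length can be found by comparing them base by base. The function name `find_snps` is made up for illustration, and the code assumes the two sequences are already aligned with no gaps.

```python
def find_snps(ref, alt):
    """Return (position, ref_base, alt_base) for every mismatch, 1-based."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be the same length (no INDELs)")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, alt), start=1)
        if a != b
    ]


# The two genomes from the diagram above.
genome1 = "ATGCTATAGTAAATCTGCGCTAGCT"
genome2 = "ATGCTATAGTAAATGTGCGCTAGCT"

snps = find_snps(genome1, genome2)
for pos, ref_base, alt_base in snps:
    print(f"position {pos}: SNP({ref_base}=>{alt_base})")
```

Real genomes are rarely the same length, so in practice the comparison is made on an alignment rather than on raw sequences.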

In coding-dense genomes like those of microbes, most SNPs will fall within protein-coding regions. Such a SNP changes a codon, and potentially the amino acid it codes for. If the amino acid coded for does not change, it is called a synonymous SNP (as the new codon is a 'synonym' for the same amino acid). If it does change, it is called a non-synonymous SNP.

Genome 1 | DNA | ATG AAA GTT GAT GAC CAG CAT TCC CCA TGA
Genome 2 | DNA | ATG AAA GTC GAT GAC CAG CAT TAC CCA TGA
                         ..|                 .|.  
                       SNP(T=>C)          SNP(C=>A)
                         ..|                 .|.
Genome 1 |  AA |  M   K   V   D   D   Q   H   S   P   *
Genome 2 |  AA |  M   K   V   D   D   Q   H   Y   P   *
                          |                   |
                         SYN               NON-SYN
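
The classification in the diagram above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_snps` is hypothetical, the codon table is built from the standard genetic code (whose amino acid assignments are shared by the bacterial table), and both sequences are assumed to be in-frame coding sequences of equal length.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snps(ref_cds, alt_cds):
    """Compare two in-frame coding sequences codon by codon."""
    if len(ref_cds) != len(alt_cds):
        raise ValueError("coding sequences must be the same length")
    results = []
    for start in range(0, len(ref_cds) - 2, 3):
        ref_codon = ref_cds[start:start + 3]
        alt_codon = alt_cds[start:start + 3]
        if ref_codon == alt_codon:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
        results.append((start // 3 + 1, ref_codon, alt_codon, ref_aa, alt_aa, kind))
    return results


# The two genomes from the diagram above, with the spaces removed.
cds1 = "ATG AAA GTT GAT GAC CAG CAT TCC CCA TGA".replace(" ", "")
cds2 = "ATG AAA GTC GAT GAC CAG CAT TAC CCA TGA".replace(" ", "")

classified = classify_snps(cds1, cds2)
for codon_no, ref_codon, alt_codon, ref_aa, alt_aa, kind in classified:
    print(f"codon {codon_no}: {ref_codon}=>{alt_codon} ({ref_aa}=>{alt_aa}) {kind}")
```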

A non-synonymous SNP can drastically alter the function of a protein, because sometimes a single amino acid difference can modify the structure/shape of the protein. A SNP could even affect the RNA transcript itself, causing it to be translated at lower efficiency or not at all. SNPs in promoter regions (-35, -10) and the ribosome binding site (RBS) can have similar effects.

A good rule of thumb is that SNPs in the 3rd position of a codon are often synonymous, due to the particular pattern of degeneracy in the genetic code. If two or more SNPs occur right next to each other, the variant is sometimes called a multiple nucleotide polymorphism (MNP).
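
The rule of thumb about the 3rd codon position can be checked directly against the genetic code. The sketch below tries every possible single-base change at each codon position and reports the fraction that leaves the amino acid unchanged. It assumes the standard genetic code, and for simplicity it includes stop codons, counting a stop-to-stop change as synonymous. The function name is made up for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_fraction_by_position():
    """Fraction of single-base changes that are synonymous, per codon position."""
    fractions = []
    for pos in range(3):
        same = total = 0
        for codon, aa in CODON_TABLE.items():
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a change
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                same += CODON_TABLE[mutant] == aa
        fractions.append(same / total)
    return fractions


fractions = synonymous_fraction_by_position()
for pos, frac in enumerate(fractions, start=1):
    print(f"codon position {pos}: {frac:.1%} of changes are synonymous")
```

Under these assumptions the 3rd position comes out far ahead of the other two, which is the pattern the rule of thumb relies on.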

INDELs

An INDEL (INsertion/DELetion) is where a single base has been deleted from, or inserted into, one genome relative to another. It is a symmetrical relationship, as a deletion in one genome corresponds to an insertion in the other. I reckon it should be called a deletion/insertion polymorphism (DIP) too, so we can all snack on SNPs and DIPs :-)

                           DEL(A)
                             |
Genome 1 | DNA | ATGCTATAGTAA-TCTGCGCTAGCT
Genome 2 | DNA | ATGCTATAGTAAATGTGCGCTAGCT
                             |
                           INS(A)  
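
Given a gapped pairwise alignment like the one above, each differing column can be labelled as a SNP, an insertion, or a deletion. The sketch below is only an illustration: `call_variants` is a hypothetical name, Genome 1 is arbitrarily treated as the reference, and `-` is assumed to be the gap character. Note that the alignment in the diagram also still carries the C=>G SNP from the earlier section, so two variants are reported.

```python
def call_variants(ref_aln, alt_aln, gap="-"):
    """Label each differing alignment column, describing alt relative to ref."""
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must be the same length")
    variants = []
    for col, (a, b) in enumerate(zip(ref_aln, alt_aln), start=1):
        if a == b:
            continue
        if a == gap:
            # The reference has a gap, so alt carries an extra base.
            variants.append((col, "INS", b))
        elif b == gap:
            # Alt has a gap, so it is missing a base found in the reference.
            variants.append((col, "DEL", a))
        else:
            variants.append((col, "SNP", f"{a}=>{b}"))
    return variants


genome1 = "ATGCTATAGTAA-TCTGCGCTAGCT"
genome2 = "ATGCTATAGTAAATGTGCGCTAGCT"

variants = call_variants(genome1, genome2)
print(variants)
# Swapping the arguments turns the insertion into a deletion.
swapped = call_variants(genome2, genome1)
print(swapped)
```

The swap at the end shows the symmetry: the same column is an insertion from one genome's point of view and a deletion from the other's.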

While a SNP usually changes a protein only slightly or not at all, an INDEL within a coding region will nearly always have a drastic effect on the protein. Because codons are groups of 3 nucleotides, removing/adding 1 nucleotide shifts every codon downstream of it; this is called a frame-shift mutation. This usually results in the protein being either extended or truncated.

Genome 1 | DNA | ATG AAA GTT GAT GAC CAG CAT TCC CCA TGA
Genome 1 |  AA |  M   K   V   D   D   Q   H   S   P   *

Genome 2 | DNA | ATG AAA GTC -AT GAC CAG CAT TAC CCA TGA
                             |                          
                           DEL(G)          
                             |
Genome 2 | DNA | ATG AAA GTC ATG ACC AGC ATT ACC CAT GA? ??? ??? ???
Genome 2 |  AA |  M   K   V   M   T   S   I   T   H   X   X   X   X
                                                      |
                                            STOP Loss & read-through

In the previous case, the protein was extended into a new frame, causing it to have a different C-terminal end than normal. Translation will eventually hit another stop codon just by chance. In the case below, the frame shift instead introduces a premature STOP codon, so we end up with a shorter reading frame.

Genome 3 | DNA | ATG AAA GTC GAAT GAC CAG CAT TAC CCA TGA
                               |
                             INS(A)
                               |
Genome 3 | DNA | ATG AAA GTC GAA  TGA CCA GCA TTA CCC ATG          
Genome 3 |  AA |  M   K   V   E    *   P   A   L   P   M
                                   |
                         STOP Gain & truncation
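
Both frame-shift outcomes above can be reproduced with a small translation function. This is a sketch under a few assumptions: the codon table is the standard genetic code, `translate` is a hypothetical helper that starts at the first base and stops at the first stop codon, and any trailing partial codon is ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping after the first stop codon."""
    protein = []
    for start in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[start:start + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Sequences from the diagrams above, with spaces and gap characters removed.
genome1 = "ATG AAA GTT GAT GAC CAG CAT TCC CCA TGA".replace(" ", "")
genome2 = "ATG AAA GTC AT GAC CAG CAT TAC CCA TGA".replace(" ", "")    # DEL(G)
genome3 = "ATG AAA GTC GAAT GAC CAG CAT TAC CCA TGA".replace(" ", "")  # INS(A)

proteins = {name: translate(seq) for name, seq in
            [("Genome 1", genome1), ("Genome 2", genome2), ("Genome 3", genome3)]}
for name, protein in proteins.items():
    print(f"{name}: {protein}")
```

Genome 2 never reaches a stop codon within the sequence shown (stop loss), while Genome 3 stops after only four amino acids (stop gain).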

Because the stop codon is no longer where it needs to be, these genes may never be properly transcribed or translated. In that case they are called pseudo-genes.

If multiple adjacent bases are deleted (or inserted) together, it is sometimes called a micro-deletion (or micro-insertion), or collectively a micro-INDEL. A micro-INDEL whose length is a multiple of 3 occasionally occurs in bacterial evolution, as it keeps the protein translation in frame.
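
Whether an INDEL shifts the reading frame depends only on its length: a net change that is a multiple of 3 leaves the downstream codons in frame. A minimal sketch, where `is_frameshift` is a hypothetical helper that takes the reference and alternate alleles as plain strings:

```python
def is_frameshift(ref_allele, alt_allele):
    """True if the length change between the alleles is not a multiple of 3."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0


print(is_frameshift("G", ""))       # 1-base deletion
print(is_frameshift("", "A"))       # 1-base insertion
print(is_frameshift("GAT", ""))     # 3-base deletion
print(is_frameshift("", "GATGAC"))  # 6-base insertion
```

The check only applies to INDELs inside a coding region, and an in-frame INDEL still adds or removes whole amino acids, so it is not necessarily harmless.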

Structural Variation

SNPs and INDELs are about low-level genomic variation. It is also possible to look at structural variants which affect the genome at larger scales. Events like gene duplications, tandem repeats, transposon insertions, inversions, and other chromosomal rearrangements are all important to consider, but this post will leave those issues for another day.

Conclusion

SNPs and INDELs are small differences between genomes. They are important drivers of bacterial evolution, by modifying how or whether genes are transcribed and translated. In my next post I will introduce my new tool Snippy for discovering these differences efficiently.

8 comments:

  1. Hello Dr Torsten,
    I am working on SNP and INDEL detection for a non-model plant organism. After producing the VCF file I think I have hit a road block! The format is very complex and I am wondering where and how to proceed further with the .vcf files I possess.
    I am using these .vcf files in IGV, but IGV says that it fails to detect an index file and I get a blank screen. Any suggestions?
    Thanks in advance

    1. This web page explains how to index your VCF files for IGV:
      http://www.broadinstitute.org/igv/VCF

  2. Thanks! This was a helpful review. Loved the SNP DIP joke :0)

  3. My daughter has multiple CBS insertions and a COMT insertion. Trying to make sense of what this means....
