SNP detection by RNA-seq
XUMERLE, Luciano; MIJATOVIC, Vladan; MORI, Antonio; MARTINELLI, Giovanni; PIGNATTI, Pierfranco; MALERBA, Giovanni
2013-01-01
Abstract
L. Xumerle¹, I. Iacobucci², V. Mijatovic¹, A. Mori¹, G. Martinelli², P. F. Pignatti¹, G. Malerba¹
¹ Biology and Genetics, Verona, Italy; ² Department of Hematology and Oncological Sciences "L. e A. Seràgnoli", Bologna, Italy.

Massively parallel sequencing of the transcriptome (RNA-seq) produces several million sequences (reads), which are usually used to quantify gene expression digitally across the transcriptome. Because RNA-seq is based on sequencing, it is also possible to identify loci that are likely to be polymorphic from alignment mismatches between the tested sample and the reference genome sequence. The goal of this study is to compare the most common alignment programs (bowtie2, tophat2, bwa, gsnap) in order to verify the reliability of SNP detection in RNA-seq samples. The analysis used data from 5 RNA-seq samples from individuals (2 with leukemia and 3 with chronic myeloproliferative syndrome) previously genotyped with the Affymetrix GenomeWideSNP 6 chip. After alignment, genotypes at polymorphic loci were called directly from the pileup computed by samtools. More than 10,000 polymorphic loci were detected by RNA-seq and were shared with the corresponding loci on the chip. In the first analysis, an allele was considered a true variant if it was called by at least five reads. With this threshold, the average genotype error rate between RNA-seq and the chip is ~10%. The same computation performed with no threshold (1 read was enough to call the alternative allele) gives an error rate of ~0.03%. The results suggest that SNP detection is possible using RNA-seq and that variants called by few reads should be interpreted as true variants rather than background noise. A more detailed study of differential allele counts is needed to ascertain possible biases of RNA-seq.

File | Size | Format |
---|---|---|
ESHG2013AbstractsWebsite.pdf (open access; Type: Abstract; License: Public domain) | 9.46 MB | Adobe PDF |
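
The procedure summarised in the abstract (allele counts read off a samtools pileup, a minimum number of supporting reads per allele, and a comparison against chip genotypes) can be illustrated with a short Python sketch. This is not the authors' pipeline. The function names, the choice to apply the read threshold to every allele, the diploid biallelic model, and the toy loci and counts are all assumptions made for illustration; only the thresholds of five reads and one read come from the abstract.

```python
import re
from collections import Counter


def count_pileup_bases(ref_base, read_bases):
    """Count A/C/G/T observations in the read-bases column of one pileup line."""
    counts = Counter()
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":  # start of a read; the next character is its mapping quality
            i += 2
            continue
        if ch in "+-":  # indel written as +<length><sequence> or -<length><sequence>
            m = re.match(r"\d+", read_bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        if ch in ".,":  # match to the reference on the forward/reverse strand
            counts[ref_base.upper()] += 1
        elif ch.upper() in "ACGT":  # mismatch: an alternative allele
            counts[ch.upper()] += 1
        # '$' (end of read), '*' (deleted base), 'N' and skips are ignored
        i += 1
    return counts


def call_genotype(base_counts, min_reads=1):
    """Return a diploid genotype such as 'AG', or None when no allele qualifies.

    Assumption: the threshold applies to every allele, and at most the two
    best-supported alleles are kept.
    """
    supported = [(b, n) for b, n in base_counts.items() if n >= min_reads]
    if not supported:
        return None
    supported.sort(key=lambda item: (-item[1], item[0]))
    alleles = [b for b, _ in supported[:2]]
    if len(alleles) == 1:
        return alleles[0] * 2
    return "".join(sorted(alleles))


def genotype_error_rate(rnaseq_calls, chip_genotypes):
    """Fraction of loci called by RNA-seq whose genotype differs from the chip."""
    shared = [l for l, g in rnaseq_calls.items() if g is not None and l in chip_genotypes]
    if not shared:
        return None
    wrong = sum(rnaseq_calls[l] != "".join(sorted(chip_genotypes[l])) for l in shared)
    return wrong / len(shared)


# Hypothetical loci and counts, invented for this example only.
rnaseq_counts = {
    "locus_a": {"A": 12, "G": 3},
    "locus_b": {"C": 20},
    "locus_c": {"T": 8, "C": 7},
    "locus_d": {"G": 2},
}
chip = {"locus_a": "AG", "locus_b": "CC", "locus_c": "CT", "locus_d": "GG"}

for threshold in (5, 1):
    calls = {l: call_genotype(c, threshold) for l, c in rnaseq_counts.items()}
    print(threshold, calls, genotype_error_rate(calls, chip))
```

In this toy data, `locus_a` is heterozygous on the chip but its minor allele is supported by only three reads, so a five-read threshold turns it into a homozygous call. This is one way a strict threshold could raise discordance, though the sketch makes no claim about the cause of the error rates reported in the abstract.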
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.