Development of novel bioinformatic pipelines for MinION-based DNA barcoding
Maestri, Simone
2021-01-01
Abstract
DNA barcoding is the process of taxonomic identification based on the sequence of a marker gene. When complex samples are analysed, the technique is referred to as meta-barcoding. Barcoding has traditionally been performed on the Sanger sequencing platform. The emergence of second-generation sequencing platforms, mainly represented by Illumina, enabled the high-throughput sequencing of hundreds of samples and allowed the characterization of complex samples through meta-barcoding experiments. However, fragments sequenced with the Illumina platform are shorter than 600 bp, which greatly limits the taxonomic resolution of closely related species. Moreover, both platforms suffer from long turnaround times, since they require shipping the samples to a sequencing facility, and complex regulations may hamper the export of material out of the country of origin. More recently, Oxford Nanopore Technologies released the MinION, a portable and cheap third-generation sequencer that has the potential to overcome the limitations of currently available platforms thanks to the production of long sequencing reads. However, MinION reads suffer from a high error rate, so suitable analysis pipelines are needed to overcome this issue. In this thesis I describe the development of bioinformatic pipelines for MinION-based DNA barcoding. Starting from the analysis of single samples, I show how improvements in both sequencing chemistry and software now allow consensus sequences to be obtained directly in the field, with accuracy comparable to that of Sanger sequencing. Conversely, when analysing complex samples, sequencing reads cannot be collapsed into a consensus to reduce the error rate. However, bioinformatic approaches that exploit the increased read length largely compensate for the higher error rate, resulting in a high correlation between MinION and Illumina up to the genus level, and a higher sensitivity of the MinION platform in detecting spiked-in indicator species.
In conclusion, the results presented in this thesis show that bioinformatic pipelines for the analysis of MinION reads can largely mitigate platform issues, paving the way for this platform to become the gold standard for barcoding in the near future.

File | Size | Format |
---|---|---|
PhD_thesis_Simone_Maestri.pdf (open access; description: PhD thesis of Simone Maestri; type: doctoral thesis; license: Creative Commons) | 3.87 MB | Adobe PDF |