Genome Assembly

Genome assembly is a method for reconstructing a genome from a large number of short or long DNA fragments (reads). This lesson covers de novo assembly, which is used when no reference genome is available.

During this lesson you will use sequencing reads from different technologies:
1) short reads from Illumina
2) long reads from Oxford Nanopore

Prerequisites

This lesson assumes a working understanding of the bash shell. If you haven’t already completed the Shell Genomics lesson, and aren’t familiar with the bash shell, please review those materials before starting this lesson.

This lesson also assumes some familiarity with biological concepts, including the structure of DNA, nucleotide abbreviations, and the concept of genomic variation within a population.

This lesson uses data hosted on the TU Delft cloud environment. Workshop participants will be given information on how to log in to the cloud during the workshop.

Schedule

	Setup	Download files required for the lesson
00:00	1. Quality Control and Trimming Recap	How to start a genome assembly? How can I describe the quality of my data? How can I get rid of sequence data that doesn’t meet my quality standards?
00:10	2. de novo Short Read Paired-End Assembly	How to do a de novo short read paired-end genome assembly?
00:20	3. de novo Short Read Paired-End and Mate-Paired Assembly	How to do a de novo short read paired-end genome assembly using mate-pairs?
00:30	4. de novo Long Read Assembly	How to do a de novo long-read genome assembly?
00:40	5. Genome Annotation	How to find the genes present in a genome assembly?
00:50	6. Genome Browsing	How to explore an annotated genome assembly?
01:00	Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.