de novo Short Read Paired-End and Mate-Paired Assembly


  • How to do a de novo short read paired-end genome assembly using mate-pairs?

  • Assembly by using multiple libraries with different insert sizes.

  • Being able to explain the terms scaffolding and ‘scaffolds’.

De Novo assembly (Paired End and Mate-pair libraries)

In most assembly projects multiple libraries with different insert sizes are used.

Here we will add a 2.5 kb mate-pair library.

Because of the way the libraries are prepared, the read orientation differs between them: PE: → ← , MP: ← → OR → ←

SPAdes PE and MP assembly


Assemble the trimmed paired-end library together with a mate-pair library, located at ~/asm_workshop/data/mp/MP_2.5kb_25x_?.fastq.gz, with SPAdes. Use ecoli_pe_mp as the output directory.

In SPAdes we have to specify the paired-end (PE) and the mate-pair (MP) library with the --pex-x and --mpx-x flags, and also provide the read orientation of the mate-pair library.

hint: run -h for the command options.


$ -h

With --pe1-1 for read 1 and --pe1-2 for read 2 we specify the paired-end library. With --mp1-1 for read 1 and --mp1-2 for read 2 we specify the mate-pair library.

With --mp1-rf we tell SPAdes the read orientation of the mate-pair library: <- (reverse), -> (forward).

The full command for the two libraries:

$ \
     --pe1-1 ~/asm_workshop/data/trimmed_fastq/PE_600bp_50x_1.trim.fastq.gz \
     --pe1-2 ~/asm_workshop/data/trimmed_fastq/PE_600bp_50x_2.trim.fastq.gz \
     --mp1-1 ~/asm_workshop/data/mp/MP_2.5kb_25x_1.fastq.gz \
     --mp1-2 ~/asm_workshop/data/mp/MP_2.5kb_25x_2.fastq.gz \
     --mp1-rf \
     -o ~/asm_workshop/results/ecoli_pe_mp
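
The command name is missing from the listing above. As a hedged sketch, assuming the SPAdes executable is called spades.py and is on your PATH, the same call can be wrapped in a small dry-run helper that prints the command so you can check the flags before starting a long assembly. The names run_or_echo, spades_pe_mp, SPADES_BIN and DRY_RUN are illustrative and not part of SPAdes.

```shell
# Assumption: the assembler executable is named spades.py and is on PATH.
SPADES_BIN="${SPADES_BIN:-spades.py}"

# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run_or_echo() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# PE + MP assembly with the same files and flags as in the text above.
spades_pe_mp() {
    ws="$HOME/asm_workshop"
    run_or_echo "$SPADES_BIN" \
        --pe1-1 "$ws/data/trimmed_fastq/PE_600bp_50x_1.trim.fastq.gz" \
        --pe1-2 "$ws/data/trimmed_fastq/PE_600bp_50x_2.trim.fastq.gz" \
        --mp1-1 "$ws/data/mp/MP_2.5kb_25x_1.fastq.gz" \
        --mp1-2 "$ws/data/mp/MP_2.5kb_25x_2.fastq.gz" \
        --mp1-rf \
        -o "$ws/results/ecoli_pe_mp"
}

# Dry run: only prints the command. Use DRY_RUN=0 spades_pe_mp to execute it.
spades_pe_mp
```

Keeping the dry run as the default makes it harder to overwrite an existing output directory by accident.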

QUAST: compare PE and PE-MP assemblies


Compare the PE-MP assembly with the assembly where we used only the paired-end library, using QUAST.

What are the major differences between the two assemblies?

Use quast_pe_mp as output folder.


$ \
     ~/asm_workshop/results/ecoli_pe/contigs.fasta \
     ~/asm_workshop/results/ecoli_pe_mp/contigs.fasta \
     -o ~/asm_workshop/results/quast_pe_mp

In a new tab (local computer) in your terminal do:

$ scp ~/Desktop/quast/report_pe_mp.html

Visualise the assembly graphs with Bandage (Optional)


Download the assembly graph of the PE-MP assembly and compare it with the assembly graph of the PE assembly.


In a new tab (local computer) in your terminal do:

scp \

Start Bandage and load the file assembly_graph_pe_mp.fastg.

Click on “Draw graph” and save as image (current view).

Compare with the PE assembly graph.

Filter PE-MP assembly


Filter out the smaller fragments from the PE-MP assembly, as we did with the PE assembly. This time we have to use the file scaffolds.fasta, since the mate-pair library was used for scaffolding. Use scaffolds_500bp.fasta as the output filename.

After filtering, compute the assembly statistics on the filtered scaffold file.

Have we assembled the complete genome of E. coli K12 substr. MG1655? And how many scaffolds do we have?
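
Scaffolds are contigs that have been ordered and oriented using long-range links such as mate-pairs. The unknown sequence between two joined contigs is typically written as a run of N characters. As a rough illustration, the following sketch counts such N-runs per sequence in a FASTA file. The function name count_scaffold_gaps is hypothetical, and the sketch assumes that every run of N marks one gap.

```shell
# Count runs of N (assumed to be scaffold gaps) per FASTA record.
# Reads FASTA on stdin, prints "<sequence name><TAB><number of N-runs>".
count_scaffold_gaps() {
    awk '
        function report() {
            if (name != "") {
                # gsub returns the number of replaced matches, i.e. N-runs
                gaps = gsub(/[Nn]+/, "", seq)
                print name "\t" gaps
            }
        }
        /^>/ { report(); name = substr($0, 2); seq = ""; next }
        { seq = seq $0 }   # join wrapped lines so a run is not split
        END { report() }
    '
}

# Example use (path taken from the exercise above):
# count_scaffold_gaps < ~/asm_workshop/results/ecoli_pe_mp/scaffolds.fasta
```

A scaffold with a count of 0 is a single contig. Higher counts mean several contigs were joined.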


$ -i ~/asm_workshop/results/ecoli_pe_mp/scaffolds.fasta \
                       -o ~/asm_workshop/results/ecoli_pe_mp/scaffolds_500bp.fasta
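
The name of the filtering tool is missing from the listing above. To make the step concrete, here is a minimal awk sketch of a length filter. It is not the workshop tool, only an illustration of the idea. The function name filter_fasta_min_len is hypothetical, and the 500 bp cut-off in the usage comment is inferred from the output filename.

```shell
# Keep only FASTA records whose sequence length is >= the given minimum.
# Usage: filter_fasta_min_len MIN_LENGTH < input.fasta > output.fasta
# Note: kept sequences are written on a single line (no line wrapping).
filter_fasta_min_len() {
    awk -v min="$1" '
        function emit() {
            if (header != "" && length(seq) >= min) {
                print header
                print seq
            }
        }
        /^>/ { emit(); header = $0; seq = ""; next }
        { seq = seq $0 }   # concatenate wrapped sequence lines
        END { emit() }
    '
}

# Example use (paths taken from the exercise above):
# filter_fasta_min_len 500 \
#     < ~/asm_workshop/results/ecoli_pe_mp/scaffolds.fasta \
#     > ~/asm_workshop/results/ecoli_pe_mp/scaffolds_500bp.fasta
```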

Inspect the assembly statistics:

$ ~/asm_workshop/results/ecoli_pe_mp/scaffolds_500bp.fasta
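
The statistics tool is not named in the listing above. As a rough cross-check, the sketch below computes a few basic numbers (sequence count, total length, longest sequence, N50) with standard shell tools. The function names fasta_lengths and fasta_stats are hypothetical, and N50 is taken here as the length of the sequence at which the running sum of lengths, sorted from longest to shortest, first reaches half of the total.

```shell
# Print the length of every sequence in a FASTA file read from stdin.
fasta_lengths() {
    awk '
        /^>/ { if (seen) print len; seen = 1; len = 0; next }
        { len += length($0) }
        END { if (seen) print len }
    '
}

# Summarise a FASTA file read from stdin as tab-separated "name value" lines.
fasta_stats() {
    fasta_lengths | sort -rn | awk '
        { len[NR] = $1; total += $1 }
        END {
            if (NR == 0) { print "no sequences"; exit }
            half = total / 2
            for (i = 1; i <= NR; i++) {
                run += len[i]
                if (run >= half) { n50 = len[i]; break }
            }
            printf "sequences\t%d\ntotal_bp\t%d\nlongest_bp\t%d\nN50\t%d\n",
                   NR, total, len[1], n50
        }
    '
}

# Example use (path taken from the exercise above):
# fasta_stats < ~/asm_workshop/results/ecoli_pe_mp/scaffolds_500bp.fasta
```

The sequence count answers the question of how many scaffolds remain after filtering. The total length can be compared with the size of the reference genome.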

Scaffold alignment


Align the filtered scaffolds to the reference (~/asm_workshop/reference/Ecoli_K12_reference.fasta). Use ecoli_pe_mp as the prefix.

Make sure you are working in the mummer folder: ~/asm_workshop/results/mummer

Inspect the resulting ecoli_pe_mp.png plot. Has the mate-pair library improved the assembly?


Use as working directory: ~/asm_workshop/results/mummer

$ cd ~/asm_workshop/results/mummer

Run nucmer

$ nucmer --prefix ecoli_pe_mp \
         ~/asm_workshop/reference/Ecoli_K12_reference.fasta \

nucmer has aligned all scaffolds to the reference.

Use mummerplot to plot the alignments:

$ mummerplot --png --layout --filter --prefix ecoli_pe_mp \
         ~/asm_workshop/results/mummer/ \
         -R ~/asm_workshop/reference/Ecoli_K12_reference.fasta \
         -Q ~/asm_workshop/results/ecoli_pe_mp/scaffolds_500bp.fasta
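
Some file arguments in the two listings above are cut off. The sketch below shows how the two steps fit together under two stated assumptions: the query given to nucmer is the filtered scaffold file (the same file passed to mummerplot with -Q), and mummerplot reads the ecoli_pe_mp.delta file that nucmer writes for the chosen prefix. The helper names run_or_echo and align_and_plot and the DRY_RUN variable are illustrative.

```shell
# Print the command when DRY_RUN=1 (the default here); execute it otherwise.
run_or_echo() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Align the filtered scaffolds to the reference and plot the alignment.
# Run this from ~/asm_workshop/results/mummer so the output files land there.
align_and_plot() {
    ws="$HOME/asm_workshop"
    ref="$ws/reference/Ecoli_K12_reference.fasta"
    # Assumption: the query is the filtered scaffold file.
    qry="$ws/results/ecoli_pe_mp/scaffolds_500bp.fasta"
    prefix="ecoli_pe_mp"

    # nucmer writes its alignments to <prefix>.delta
    run_or_echo nucmer --prefix "$prefix" "$ref" "$qry" &&
    # Assumption: mummerplot is given that delta file as its input.
    run_or_echo mummerplot --png --layout --filter --prefix "$prefix" \
        "$prefix.delta" -R "$ref" -Q "$qry"
}

# Dry run: only prints both commands. Use DRY_RUN=0 align_and_plot to run them.
align_and_plot
```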

A plot file ‘ecoli_pe_mp.png’ has been created. Download the file to your local computer and inspect the file.

In a new tab (local computer) in your terminal do:

$ scp ~/Desktop/mummer/

