Comparing assemblies to the reference
The Quast program can be used to generate metrics similar to those from the assemblathon_stat.pl script, plus some additional metrics and visualisations.
| Program | Options | Explanation |
|---|---|---|
| Quast | | Evaluating genome assemblies |
| | -o | name of output folder |
| | -R | reference genome |
| | -G | file with positions of genes in the reference (see manual) |
| | -T | number of threads (CPUs) to use |
| | sequences.fasta | one or more files with assembled sequences |
| | -l | comma-separated list of names for the assemblies, e.g. “assembly 1”, “assembly 2” (in the same order as the sequence files) |
| | --scaffolds | input sequences are scaffolds, not contigs. They will be split at runs of 10 or more Ns to analyse contigs (‘broken’ assembly) |
| | --est-ref-size | estimated reference genome size (when no reference is provided) |
| | --gene-finding | apply GeneMarkS for gene finding |
See the manual for information on the output of Quast: http://quast.bioinf.spbau.ru/manual.html#sec3
Running Quast
TIP: log in to the cod3 server using the -Y flag with ssh:
ssh -Y username@cod3.hpc.uio.no
This enables X11 forwarding, which becomes useful at the end for viewing the report.
Set up quast:
module load quast/3.0
On the server, make a folder called quast and move into it. Then run:
quast.py -T 2 \
-o out_folder_name \
-R /data/assembly/NC_000913_K12_MG1655.fasta \
-G /data/assembly/e.coli_genes.gff \
../path/to/assembly1.fasta \
../path/to/assembly2.fasta \
-l "Assembly 1, Assembly 2"
Note that the --scaffolds option is not used here, for simplicity.
Also, make sure you name the assemblies (-l) in the same order as you give them to quast!
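One way to keep the labels and the sequence files in the same order is to derive the labels from the file names. The following is only a sketch: the two assembly paths are the placeholders used above, the label format (file name without the .fasta extension) is an assumption, and the -R and -G options are left out for brevity. The final line only prints the command so that it can be checked before running it.

```shell
# Placeholder assembly paths, in the order they will be given to quast.
set -- ../path/to/assembly1.fasta ../path/to/assembly2.fasta

# Build the label list from the file names, preserving that order.
labels=""
for f in "$@"; do
    name=$(basename "$f" .fasta)          # e.g. assembly1
    labels="${labels:+$labels, }$name"    # append, comma-separated
done

# Print the resulting command for review; nothing is executed here.
echo quast.py -T 2 -o out_folder_name "$@" -l "\"$labels\""
```

If the printed command looks right, it can be copied and run as-is, with -R and -G added as in the command above.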
Quast output
Quast will produce an HTML report file, report.html. If you have logged in to the cod3 server using ssh -Y, you can now type:
cd out_folder_name
firefox report.html
Otherwise, download report.html and the report_html_aux folder to your PC and open the HTML file in your browser.
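Downloading could, for example, be done with scp from a terminal on your own PC. This is a sketch under assumptions: it assumes the quast folder was created in your home directory on the server (scp resolves relative remote paths from there), and quast_report is a hypothetical name for the local destination folder. The commands are prefixed with echo so that they are only printed; remove the echo to actually transfer the files.

```shell
# Run this on your own PC, not on the server.
remote="username@cod3.hpc.uio.no"

# Assumed location of the Quast output on the server, relative to your home directory.
remote_dir="quast/out_folder_name"

# Hypothetical local folder to hold the report.
mkdir -p quast_report

# Dry run: print the copy commands. Remove "echo" to perform the download.
echo scp "$remote:$remote_dir/report.html" quast_report/
echo scp -r "$remote:$remote_dir/report_html_aux" quast_report/
```

Keeping report.html and report_html_aux side by side in the same local folder matters, since the report is meant to be opened together with that folder.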
Hover over the row names to get a description. Also have a look at the ‘Extended report’.
Alternatively, have a look at the report.pdf file (it has a few more plots).