Installing and running
To run a specific script from the pipeline, say Checkout, execute:
java -jar MIGEC-$VERSION.jar Checkout [arguments]
Here $VERSION stands for the pipeline version (e.g. 1.2.1); this notation is
omitted in the documentation of MIGEC routines.
To view the list of available scripts execute:
java -jar MIGEC-$VERSION.jar -h
Alternatively, you can download the repository and compile it from source using Maven (requires Maven version 3.0):
git clone https://github.com/mikessh/MIGEC.git
cd MIGEC/
mvn clean install
java -jar target/MIGEC-$VERSION.jar
This should show you the list of available MIGEC routines.
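The invocations above all share the same `java -jar MIGEC-$VERSION.jar` prefix, so a small shell wrapper can save some typing. The following is only a sketch: the function name `migec` and the `MIGEC_DRY_RUN` switch are illustrative and not part of MIGEC, and the jar is assumed to sit in the current directory.

```shell
# Hypothetical convenience wrapper; `migec` and MIGEC_DRY_RUN are
# illustrative names and are not provided by MIGEC itself.
VERSION="1.2.1"                      # assumed pipeline version, adjust as needed
MIGEC_JAR="MIGEC-${VERSION}.jar"     # jar assumed to be in the current directory

migec() {
  if [ "${MIGEC_DRY_RUN:-0}" = "1" ]; then
    # Dry run: print the command instead of executing it.
    printf '%s\n' "java -jar ${MIGEC_JAR} $*"
  else
    java -jar "${MIGEC_JAR}" "$@"
  fi
}

# Usage (not executed here):
#   migec -h
#   migec Checkout [arguments]
```

Setting `MIGEC_DRY_RUN=1` is a quick way to check which command would be run before launching a long job.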
Data from the 454 platform should be used with caution, as it contains
homopolymer errors which (in the present framework) result in reads being
dropped during consensus assembly. The 454 platform has a relatively low read
yield, so additional read dropping could push the over-sequencing level
below the required threshold. If you still wish to give it a try, we
recommend filtering out all short reads and repairing indels with
Coral, which should be run with the options
-mr 2 -mm 1000 -g 3.
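The short-read filtering step mentioned above could be sketched with `awk` as follows. This is not a MIGEC routine: the function name `filter_short_reads` is hypothetical, the length cutoff is left to the user because the text does not specify one, and the input is assumed to be plain four-line FASTQ records (no wrapped sequence lines).

```shell
# Hypothetical helper, not part of MIGEC or Coral.
# Reads four-line FASTQ records on stdin and writes to stdout only those
# records whose sequence is at least MIN_LEN bases long.
filter_short_reads() {
  min_len="$1"
  awk -v min="$min_len" '
    NR % 4 == 1 { header = $0 }   # @read_id line
    NR % 4 == 2 { seq = $0 }      # sequence line
    NR % 4 == 3 { plus = $0 }     # separator line
    NR % 4 == 0 {                 # quality line closes the record
      if (length(seq) >= min) {
        print header
        print seq
        print plus
        print $0
      }
    }
  '
}

# Usage (file names and the cutoff of 100 are placeholders):
#   filter_short_reads 100 < reads_454.fastq > reads_454.filtered.fastq
```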
The NCBI-BLAST+ package is required. It can be installed on Linux with a command like sudo apt-get install ncbi-blast+, or downloaded and installed manually from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
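Before running the pipeline it may be worth confirming that BLAST+ is actually on the PATH. A minimal sketch, assuming `blastn` is the BLAST+ executable of interest; `require_tool` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the named executable is on the PATH,
# otherwise prints a message to stderr and returns a non-zero status.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "Missing required tool: $1" >&2
  return 1
}

# Usage (not executed here); blastn -version prints the installed version:
#   require_tool blastn && blastn -version
```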
Consider providing sufficient memory for the pipeline, i.e. 8 GB for a
MiSeq or 36 GB for a HiSeq sample, depending on sample sequence diversity
and the script being run (CdrBlast has the highest memory
requirements). To do so, execute the script with
java -Xmx8G -jar MIGEC-$VERSION.jar CdrBlast [arguments]
If an insufficient amount of memory is allocated, the Java Virtual Machine
may terminate with a Java Heap Space Out of Memory error.
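The memory guidance above can be captured in a small lookup, sketched below. The function name `heap_for_platform` and the platform labels are illustrative; the values are simply the 8 GB and 36 GB figures quoted above and may need adjusting for samples with high sequence diversity.

```shell
# Hypothetical helper: maps a sequencing platform label to a suggested
# -Xmx value, using the figures quoted in the text above.
heap_for_platform() {
  case "$1" in
    miseq) echo "8G" ;;
    hiseq) echo "36G" ;;
    *)
      echo "Unknown platform: $1 (expected miseq or hiseq)" >&2
      return 1
      ;;
  esac
}

# Usage (not executed here):
#   java -Xmx"$(heap_for_platform hiseq)" -jar MIGEC-$VERSION.jar CdrBlast [arguments]
```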