NOTE: both the software and the manual are under development. Expect things to change.
A MOA template is made up of a .moa file and a .jinja2 (or .mk) file.
The .moa file mainly contains input-output file sets and parameter options used for the bash command(s). Some of these options have default values which the user can change while constructing the job.
The .jinja2 file includes information to structure the command(s). It is written in jinja, which is a templating language for python and is simple to write and easy to understand.
These files are used by the backend, currently ruffus, that manages file set and parameter dependencies to make pipelines and render commands to the bash prompt. Initially, GNU make was the backend used. It is very powerful but some of its limitations and its complexity led to including ruffus as an option for the backend as well.
The easiest way to write a moa template is to edit an existing template to suit your requirements. This involves understanding the parts of an existing template.
The bwa_aln template is used as an example below. Just as a background, the bwa aln command takes a FASTQ file as input and aligns it to a reference genome that was previously indexed. The output is a .sai file with the alignments.
The bwa_aln.moa file has some main components:
Backend
backend: ruff
This is ‘ruff’ which means that ruffus is used in the python script at a lower level to read the template .moa and .jinja2 file, and render the corresponding commands to the bash prompt.
Commands
commands:
run:
mode: map
help: run bwa aln
clean:
mode: simple
help: Remove all job data, not the Moa job itself, note that this must be implemented by the template.
This indicates the function names that you will later define. In the example above, there are 2 commands- run and clean, so moa run or moa clean on the command prompt in the job directory would execute these functions.
Filesets
filesets:
input:
category: input
extension: fq
help: Fastq input files
glob: '*'
optional: false
type: set
output:
category: output
dir: .
extension: sai
glob: '{{ input_glob }}'
source: input
type: map
Like the name, each filesets refer to a set of files in a single directory. The bwa_aln template shows 2 filesets: input and output.
Category: is essentially used to separate input from output.
Extension: refers to the type of file(s) required or generated.
Glob: searches for files with a specified pattern. Moa, by default (glob= *) automatically processes all files of the specified input extension in the directory specified. By specifying a glob, Moa will only process those files whose name pattern matches what is in the glob.
Type: refers to the data type of the fileset or parameter.
A fileset can either be of set or map type. The type set refers to a simple set of files in a directory. The type map refers to a set of files that are linked to what their source value is. In the above code, the output fileset is mapped to the input fileset.
Dir: the directory of the output fileset is ‘.’, which means that the output files will be placed in the current working directory.
Parameter category order
parameter_category_order:
- ''
- input
- system
- advanced
Parameters
mismatch_penalty:
category: ''
default: 3
help: mismatch penalty
optional: true
type: integer
They are the variables/options that specify a command.
Moa_id
moa_id: bwa_aln
is supposed to be the same as the filename. Ideally something descriptive (eg. bwa_aln). This is used to later link to the other template file.
The other template file is ‘’bwa_aln.jinja2’’ which is written in jinja, a templating language for python. Note that the jinja2 file name is the same as the moa file name.
Important features of the bwa_aln.jinja2 file are:
The three hash’s (###) specify the start of a function and are followed by the function name. In our bwa_aln example, we have defined 2 funtions: run and clean.
### run
This defination is followed by a set of commands which you would want to be executed when you type moa run or moa_clean in the bwa_aln job directory. The commands in our example file look the same as what you would put in the command prompt but the values of the parameters are bought from the .moa file and hence it’s value is replaced by the parameter name.
bwa aln {{db}} \
-n {{edit_dist_missing_prob}} \
. \
. \
. \
{{ input }} \
-f {{ output}}
It is also possible to add if-else statements or other computing blocks in accordance with the design language.
{% if color_space %} -c {% endif %}