OTB is short for Only The Best (genome assemblies) and is a Hi-C / HiFi pipeline specifically designed for phasing. Listed below is some general guidance for using OTB. The pipeline is fairly complex, so please reach out (bisonnet@bucknell.edu) if you have specific questions.
If you don’t currently have a scratch directory on BisonNet (/scratch/username), please request one by emailing bisonnet@bucknell.edu.
To get started, download a copy of OTB into your home directory. Open a Terminal window, change to the directory where you’d like to store OTB, and run:
git clone https://github.com/molikd/otb.git
Before running the pipeline, it will likely be helpful to review the general tutorial information provided in the project’s wiki pages.
Listed below is an example OTB Slurm script. Copy and paste the contents below into a file called otb.slurm and place it in your otb directory. Review each line and customize it for your account (e.g. replace USERNAME with your Bucknell username) and for your data. Note that you will need to create the directories you specify for your container paths (e.g. mkdir -p /scratch/USERNAME/containers/tmp). Once you’ve made the required modifications, you can submit the job with sbatch otb.slurm.
#!/bin/bash
#SBATCH -p short # partition (queue)
#SBATCH -N 1
#SBATCH -n 2 # number of cores
#SBATCH --mem-per-cpu=8192 # memory per core
#SBATCH --job-name="otb" # job name
#SBATCH --mail-user=USERNAME@bucknell.edu # address to email
#SBATCH --mail-type=ALL # mail events (NONE, BEGIN, END, FAIL, ALL)
# load the nextflow module
module load nextflow/22.10.6
# set container paths
export NXF_SINGULARITY_CACHEDIR=/scratch/USERNAME/containers
export NXF_SINGULARITY_LIBRARYDIR=/scratch/USERNAME/containers
export SINGULARITY_LOCALCACHEDIR=/scratch/USERNAME/containers
export SINGULARITY_CACHEDIR=/scratch/USERNAME/containers
export SINGULARITY_TMPDIR=/scratch/USERNAME/containers/tmp
export APPTAINERENV_TMPDIR=/scratch/USERNAME/containers/tmp
export NXF_WORK=/scratch/USERNAME/nextflow-work
# Assembly name; generally this is just the name of the organism
Assembly_Name="Bombus_huntii"
# Forward and reverse reads
Forward="/home/USERNAME/path/to/RawData/*_R1.fastq.gz"
Reverse="/home/USERNAME/path/to/RawData/*_R2.fastq.gz"
CCS='/home/USERNAME/path/to/RawData/*.fastq'
#Comment/Uncomment for busco
#Busco="--busco" #Busco will be run
Busco="" #Busco will not be run
#Comment/Uncomment for Yahs
Yahs="-y" #Yahs will be run
#Yahs="" #Yahs will not be run
#Comment/Uncomment for Polishing (only select one of)
#Polish_Type="" #No polishing
#Polish_Type="simple" #Simple Polishing
Polish_Type="dv" #Deep Variant Polishing
#Polish_Type="merfin" #merfin Polishing
#Comment/Uncomment for Type (only select one of)
#HiFi_Type="phasing"
HiFi_Type="default"
#HiFi_Type="trio"
#Comment/Uncomment for Runner (only select one of)
Runner="slurm_usda" # this matches BisonNet slurm queues
#Runner="slurm"
Threads="10"
#Busco_Location="-l /path/to/busco"
#Busco_DB="-p /home/USERNAME/path/to/busco/db"
if [[ -z "$Busco" ]]; then
./otb.sh -n ${Assembly_Name} -f "$( echo ${Forward})" -r "$(echo ${Reverse})" -in "$(echo ${CCS})" -m ${HiFi_Type} -t ${Threads} ${Yahs} ${Busco} --polish-type ${Polish_Type} --runner ${Runner} -c -s
else
./otb.sh -n ${Assembly_Name} -f "$( echo ${Forward})" -r "$(echo ${Reverse})" -in "$(echo ${CCS})" -m ${HiFi_Type} -t ${Threads} ${Yahs} ${Busco} ${Busco_Location} ${Busco_DB} --polish-type ${Polish_Type} --runner ${Runner} -c -s
fi
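Before submitting, it can help to confirm that the container directories exist and that the read patterns in the script actually match files. The sketch below is one way to do that; the function names make_otb_dirs and count_matches are hypothetical helpers for illustration and are not part of OTB, and the glob check assumes your paths contain no spaces.

```shell
#!/bin/bash
# Pre-submission helpers for otb.slurm (hypothetical names, not part of OTB).

# Create the container cache and tmp directories under a scratch base path.
make_otb_dirs() {
    local base="$1"
    mkdir -p "${base}/containers/tmp"
}

# Count the files matched by a glob pattern such as the Forward/Reverse/CCS values.
# The pattern is left unquoted on purpose so the shell expands it;
# this assumes the paths contain no spaces.
count_matches() {
    local pattern="$1" n=0 f
    for f in $pattern; do
        if [[ -e "$f" ]]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example usage (replace USERNAME with your Bucknell username):
#   make_otb_dirs /scratch/USERNAME
#   count_matches "/home/USERNAME/path/to/RawData/*_R1.fastq.gz"
```

A count of 0 for any of the three patterns suggests the path or file suffix in otb.slurm needs adjusting before you run sbatch.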
If you need to modify Nextflow or OTB parameters after submitting your job (e.g. a particular step requires more memory), you can resume the job from where it last stopped. To do so, comment out (#) the last 5 lines of the otb.slurm script (the if/else block that calls otb.sh), replace those lines with the nextflow command saved in reports/*.nextflow.command.txt, and add the -resume flag at the end. Note that you must leave the work directory intact for the resume to work properly. More information can be found in the OTB wiki.
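As a rough illustration of the resume step, the sketch below prints the saved command with -resume appended so you can review it before pasting it into otb.slurm. The function name build_resume_cmd is a hypothetical helper, not part of OTB, and it assumes the first matching file in the reports directory holds the command you want to resume.

```shell
#!/bin/bash
# Print the saved nextflow command with -resume appended.
# Hypothetical helper, not part of OTB. Assumes the first file matching
# *.nextflow.command.txt in the given directory holds the command to resume.
build_resume_cmd() {
    local reports_dir="${1:-reports}"
    local f cmd_file=""
    for f in "${reports_dir}"/*.nextflow.command.txt; do
        if [[ -e "$f" ]]; then
            cmd_file="$f"
            break
        fi
    done
    if [[ -z "$cmd_file" ]]; then
        echo "no *.nextflow.command.txt found in ${reports_dir}" >&2
        return 1
    fi
    # Command substitution drops the trailing newline before -resume is added.
    echo "$(cat "$cmd_file") -resume"
}

# Example usage, run from the otb directory:
#   build_resume_cmd reports
```

Printing the command instead of executing it keeps the edit to otb.slurm explicit, so the resumed run still goes through sbatch with the same module and container settings.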
If a particular step in the pipeline needs more resources, you can edit the run.nf file in the OTB directory. For example, to request more memory for the HiFiASM step, add a memory entry to its process block in run.nf:
process HiFiASM {
label 'longq'
container = 'dmolik/hifiasm'
cpus = params.threads
memory = '128 GB' // added entry: memory requested for this step