Dorado
------
Dorado is a high-performance, easy-to-use, open-source basecaller for Oxford Nanopore reads.
It needs to be run on a partition or node with **GPU compute/CUDA** support, and is heavily optimised for NVIDIA A100 and H100 GPUs. 

On the cluster, use either the :code:`aoraki_gpu_A100` or :code:`aoraki_gpu_H100` partition to ensure access to suitable GPUs.

Dorado is made available on the cluster as a shared :doc:`Apptainer <apptainer>` container image.

You can use the :code:`apptainer/dorado` module to add a convenient alias for running :code:`dorado` within the container:

.. code-block:: bash

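    # List the Dorado module versions available on the cluster
    module avail dorado
    # Load a specific version; this defines the dorado alias
    module load apptainer/dorado/0.7.1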
    # The following is required to use aliases in a non-interactive/SLURM batch script:
    shopt -s expand_aliases
    dorado ....
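
The effect of :code:`shopt -s expand_aliases` can be checked without a GPU or the module. The following is a minimal sketch that uses a stand-in alias, :code:`demo_tool` (a hypothetical name; it is not what the module defines), and runs it in two non-interactive Bash shells, one with alias expansion enabled and one without:

.. code-block:: bash

    #!/bin/bash
    # demo_tool is a hypothetical stand-in for the alias defined by the module.

    # Non-interactive shell with alias expansion enabled: the alias runs.
    with_expansion() {
        printf '%s\n' \
            'shopt -s expand_aliases' \
            "alias demo_tool='echo alias expanded'" \
            'demo_tool' | bash -s
    }

    # Non-interactive shell without alias expansion: the alias is defined but
    # never expanded, so Bash searches for a command named demo_tool and fails.
    without_expansion() {
        printf '%s\n' \
            "alias demo_tool='echo alias expanded'" \
            'demo_tool || echo "alias not expanded"' | bash -s 2>/dev/null
    }

    with_expansion      # prints: alias expanded
    without_expansion   # prints: alias not expanded

A SLURM batch script runs in a non-interactive shell, so it behaves like the second case unless the :code:`shopt` line is present.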

Alternatively, you can run Dorado directly with Apptainer. To run a binary inside the container, prefix the command with :code:`apptainer -s run --nv <$APPTAINER_IMG/apptainer_image.sif>`.

Make sure to specify :code:`--nv` to enable NVIDIA GPU support. For example:

.. code-block:: bash

    apptainer -s run --nv $APPTAINER_IMG/dorado-<version>.sif dorado basecaller /models/dna_r10.4.1_e8.2_400bps_hac@v4.1.0 ~/pod5s/

Note that :code:`-s` (:code:`--silent`) is required here to suppress Apptainer's verbose output, which may otherwise contaminate standard output, where Dorado writes its basecalls.

To list the available basecaller models, run:

.. code-block:: bash

    $APPTAINER_IMG/dorado-<version>.sif ls -1 /models

As with all Apptainer containers, take care to give file paths as they appear inside the container image. Models are located within the container in :code:`/models/`. Any files and data stored outside the container must be in a folder that Apptainer binds into the container, either by default (such as your :code:`$HOME`, :code:`/scratch`, or :code:`/projects`) or through an explicit bind mount specified with Apptainer's :code:`--bind` option.
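
As a rough sketch of that rule, the helper below decides whether a path needs an explicit :code:`--bind`. It assumes the default bind paths are exactly the three named above; the function name :code:`bind_args_for` and the example paths are hypothetical, and the defaults configured on the cluster may differ:

.. code-block:: bash

    #!/bin/bash
    # Hypothetical helper: print the --bind arguments needed to make a path
    # visible inside the container. Assumes $HOME, /scratch and /projects are
    # bound by default.
    bind_args_for() {
        local path="$1"
        # Fall back to a placeholder so an unset HOME cannot match every path.
        local home="${HOME:-/nonexistent-home}"
        case "$path" in
            "$home"|"$home"/*|/scratch|/scratch/*|/projects|/projects/*)
                # Already visible inside the container: nothing to add.
                ;;
            *)
                # Bind the path to the same location inside the container.
                printf '%s %s:%s\n' --bind "$path" "$path"
                ;;
        esac
    }

    bind_args_for /scratch/myuser/pod5s   # prints nothing
    bind_args_for /data/run42             # prints: --bind /data/run42:/data/run42

The printed arguments would go between :code:`--nv` and the image path in the :code:`apptainer -s run` command. Binding a path to the same location inside the container keeps host and container paths identical, so the same path can be used in both contexts.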

Please refer to the Dorado GitHub page for more information regarding running Dorado:
https://github.com/nanoporetech/dorado

Example SLURM batch script
---------------------------
Below is an example SLURM batch script for running Dorado on either the :code:`aoraki_gpu_H100` or :code:`aoraki_gpu_A100` partition of the cluster. This example uses the Apptainer module with the dorado alias:

.. code-block:: bash

    #!/bin/bash
    #SBATCH --job-name=dorado_basecall
    #SBATCH --partition=aoraki_gpu_A100
    # Alternatively, use --partition=aoraki_gpu_H100
    #SBATCH --gres=gpu:1
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=32G
    #SBATCH --time=02:00:00
    # The logs/ directory must exist before submission; SLURM will not create it
    #SBATCH --output=logs/dorado_%j.out
    #SBATCH --error=logs/dorado_%j.err

    # Load the Apptainer/Dorado module
    module load apptainer/dorado/0.7.1

    # Enable aliases in non-interactive SLURM shell
    shopt -s expand_aliases

    # Record which GPU was allocated to the job
    echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
    nvidia-smi

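    # Run Dorado with the module-provided alias.
    # Dorado writes its basecalls (BAM) to standard output, so redirect them
    # to a file to keep them out of the SLURM log (calls.bam is an example name).
    dorado basecaller \
      /models/dna_r10.4.1_e8.2_400bps_hac@v4.1.0 \
      ~/pod5s/ \
      > calls.bam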
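
SLURM does not create the directory named in :code:`--output` and :code:`--error`, so :code:`logs/` has to exist before the job is submitted. A minimal pre-submission sketch follows. The function name :code:`preflight` is hypothetical, and the check assumes the input is a flat directory of :code:`.pod5` files:

.. code-block:: bash

    #!/bin/bash
    # Hypothetical pre-submission check; adjust the paths to your own layout.
    preflight() {
        local pod5_dir="$1"
        local log_dir="$2"
        local f found=0

        # Fail early if the input directory holds no .pod5 files.
        for f in "$pod5_dir"/*.pod5; do
            if [ -e "$f" ]; then
                found=1
                break
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "no .pod5 files found in $pod5_dir" >&2
            return 1
        fi

        # Create the log directory that the #SBATCH --output/--error lines expect.
        mkdir -p "$log_dir"
    }

If the batch script were saved as :code:`dorado_basecall.sh` (an assumed file name), it could be submitted with :code:`preflight ~/pod5s logs && sbatch dorado_basecall.sh`, so that nothing is queued when the input directory is empty.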