Conda
-----

Installation
^^^^^^^^^^^^

We recommend using `Miniforge <https://github.com/conda-forge/miniforge>`_ to manage conda environments and packages. Miniforge is a community-led, minimal conda/mamba installer that uses conda-forge as the default channel.

To install Miniforge under your user account, you can use the following commands:

.. code-block:: bash

   # download the latest Miniforge installer for 64-bit Linux
   wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
   # -b runs the installer in batch mode; -u updates an existing installation
   bash Miniforge3-Linux-x86_64.sh -b -u

Conda environments
^^^^^^^^^^^^^^^^^^

**Activating the base environment**

To activate the base environment, type:

.. code-block:: bash

   source ~/miniforge3/bin/activate

Your command prompt will then change to include "(base) " at the start, as a reminder that this environment is activated.

You can deactivate the environment by typing:

.. code-block:: bash

   conda deactivate

**Creating and activating a sub-environment**

Once you have activated the base conda environment, you can in principle start installing packages immediately. However, your use of conda will generally be better organised if you do not install packages directly into the base environment, but instead use a named sub-environment. You can have multiple sub-environments under a single base environment, and activate whichever one is required at any one time. Unless you install packages directly into the base environment, your sub-environments will work independently.

To create a named environment (for example, called "myenv"), ensure that the base environment is activated (the command prompt should start with "(base) "), and type:

.. code-block:: bash

   # create a named environment in conda's default envs directory
   conda create -n myenv

   # or create an environment in a directory of your choice
   # (see the note below about unsupported locations)
   conda create -p /path/to/put/your/environment

Conda will show the proposed installation location and, once you answer the prompt to proceed, will carry out the installation. If you have followed these instructions, this location should be ``/home/users/<username>/miniforge3/envs/myenv``.
You can alternatively give it a different location using the ``-p`` option instead of ``-n``.

**Note:** do not create conda environments in subdirectories of ``/mnt/auto-hcs/``. Conda will either fail or behave unreliably there.

Once you have created your sub-environment, you can activate it using ``conda activate``, for example:

.. code-block:: bash

   conda activate myenv

The command prompt will then change (e.g. to start with "(myenv) ") to reflect this. Typing ``conda deactivate`` once will return you to the base environment; typing it a second time will deactivate conda completely (as above).

To list your conda environments, type:

.. code-block:: bash

   conda env list

Installing conda packages
^^^^^^^^^^^^^^^^^^^^^^^^^

Once you have activated a named environment, you can install packages with the ``conda install`` command, for example:

.. code-block:: bash

   conda install gcc

You can also request a particular version of a package. See the conda cheat sheet for details.

To list the packages installed in the currently activated environment, type ``conda list``.

Running packages from your conda environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To run packages from a conda environment that you installed previously, you first need to activate the environment in the session that you are using. This means repeating some of the commands above. You do not need to repeat the steps that create the environment or install the software, but the following may be needed again:

.. code-block:: bash

   # activate the base environment, then the named environment
   source ~/miniforge3/bin/activate
   conda activate myenv

Installing pip packages
^^^^^^^^^^^^^^^^^^^^^^^

Many Python packages that are available via PyPI are also available as conda packages in conda-forge, and it is generally best to install these with ``conda install`` as above. Nonetheless, you can also install pip packages (as opposed to conda packages) into your conda environment. Before doing so, first type:

.. code-block:: bash

   conda install pip

and then run the desired commands, such as:

.. code-block:: bash

   pip install numpy

If you do not install pip into your sub-environment, then either:

* your shell will fail to find the pip executable, or
* your shell will find pip in your base environment, so pip packages will be installed into the base environment, which can cause interference between your conda environments.

Explicitly installing pip into your sub-environment guards against this.

Using conda with SLURM
^^^^^^^^^^^^^^^^^^^^^^

To use conda environments within your Slurm script, you need to source the conda profile script so that the conda paths are set.

.. code-block:: bash

   source ~/miniforge3/etc/profile.d/conda.sh
   export PYTHONNOUSERSITE=1  # don't add python user site library to path
   conda activate myenv

Adding custom conda environments to Jupyter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. include:: /common/jupyter_kernels.rst
   :start-line: 2

General Bioinformatics Tools
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following categories of bioinformatics tools are available through conda:

* Read aligners (e.g., bwa, bowtie2)
* Variant callers (e.g., freebayes, gatk, bcftools)
* File format tools (e.g., samtools, vcftools)
* GWAS tools (e.g., plink, gemma)
* Visualization (e.g., igv, multiqc)
* RNA-seq / transcriptomics (e.g., kallisto, salmon)
* Assemblers (e.g., spades, megahit)

Finding Bioinformatics Tools
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are several ways to find bioinformatics tools in conda:

1. Search online (recommended for discovery)

   * Use the Anaconda package search or browse specific channels:

     * Bioconda: https://anaconda.org/bioconda
     * Conda-Forge: https://anaconda.org/conda-forge

   * You can search for tools such as:

     * plink
     * bcftools
     * samtools

2. Command-line search

   From your terminal:

   .. code-block:: bash

      # Search all configured channels
      conda search <package_name>

      # Example:
      conda search plink

      # If using Mamba (faster alternative to conda)
      mamba search plink

   To restrict the search to a specific channel:

   .. code-block:: bash

      conda search -c bioconda plink

3. Get full list (advanced)

   You can list everything in a channel, but the output is very large:

   .. code-block:: bash

      # List all bioconda packages
      conda search --channel bioconda "*" | less

   Tip: pipe the output through ``grep`` to find specific tools:

   .. code-block:: bash

      conda search -c bioconda "*" | grep vcftools

Managing Conda Environments to Conserve Home Directory Storage
----------------------------------------------------------------

To save home directory storage space, we recommend creating Conda environments in a shared project directory. This lets you manage your Conda environments within your project directory and, if needed, share them with collaborators.

If you do not yet have a shared project directory, please contact RTIS Solutions to request one.

To create Conda environments directly within your project directory (using the ``--prefix`` option), follow the guidelines below:

Creating a Conda Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run the following command to create a new environment inside your project's shared directory:

.. code-block:: bash

   conda create --prefix /path/to/project_directory/env python

This command does not pin a Python version, so conda chooses the version to install. To specify a particular Python version explicitly, use ``python=x.y`` in place of ``python``.

Migrating an Existing Conda Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To move an existing Conda environment to a new location:

1. Export your current environment to a YAML file:

   .. code-block:: bash

      conda env export --name existing_env > environment.yml

2. Create a new environment from the exported YAML file at your chosen location:

   .. code-block:: bash

      conda env create --prefix /path/to/project_directory/env/conda_envs/myenv --file environment.yml

3. Activate the newly created environment:

   .. code-block:: bash

      conda activate /path/to/project_directory/env/conda_envs/myenv

Creating an Alias for Easy Activation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To simplify environment activation, consider adding an alias to your shell configuration file (e.g., ``.bashrc`` or ``.bash_profile``):

.. code-block:: bash

   alias activate_myenv="conda activate /path/to/project_directory/env"

Activate your environment using the alias:

.. code-block:: bash

   activate_myenv

This approach works regardless of the Python version in the environment and provides a convenient way to manage Conda environments in shared or collaborative project directories.
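
As an alternative to a fixed alias, a small shell function can check that a directory really is a conda environment before activating it. The following is only a minimal sketch: the function name ``activate_project_env`` and its messages are hypothetical, and it assumes Miniforge was installed in the default ``~/miniforge3`` location used earlier on this page. It relies on the fact that conda environments contain a ``conda-meta`` subdirectory.

.. code-block:: bash

   # Hypothetical helper (not part of conda): activate an environment by path,
   # after checking that the path looks like a conda environment.
   activate_project_env() {
       ape_prefix="$1"

       if [ -z "$ape_prefix" ]; then
           echo "usage: activate_project_env /path/to/env" >&2
           return 2
       fi

       # Every conda environment has a conda-meta subdirectory.
       if [ ! -d "$ape_prefix/conda-meta" ]; then
           echo "error: $ape_prefix does not look like a conda environment" >&2
           return 1
       fi

       # Load conda's shell functions (assumes the default Miniforge location).
       ape_conda_sh="$HOME/miniforge3/etc/profile.d/conda.sh"
       if [ -f "$ape_conda_sh" ]; then
           . "$ape_conda_sh"
       fi

       conda activate "$ape_prefix"
   }

If you add such a function to your ``.bashrc``, you could then call it as ``activate_project_env /path/to/project_directory/env``. Unlike an alias, the same function can be reused for any number of environments.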