Introduction

UVA Research Computing provides shared high-performance computing resources for compute- and data-intensive research across the University of Virginia (UVA). In the standard security zone, its main HPC systems are Rivanna and Afton. Rivanna is the established general-purpose research cluster, while Afton is the newer system that expands CPU, memory, and GPU capacity. Together, Rivanna and Afton provide login nodes, heterogeneous compute nodes, shared storage, preinstalled software, GPU resources, and Slurm-managed queues for batch and interactive workloads.

UVA Computer Science also maintains a separate Slurm cluster for CS-managed servers. It follows the same scheduler model: users request CPUs, memory, GPUs, time, and partitions, then submit work with commands such as sbatch, salloc, and srun. Because this cluster is separate from UVA Research Computing’s Rivanna/Afton systems, CS users should follow the CS-specific login, partition, reservation, GPU, and job-submission guidance when using department nodes.

This guide focuses on Rivanna and Afton.

Access and Allocations

Learn how to request an allocation and add collaborators. https://www.rc.virginia.edu/userinfo/hpc/access/

Logging In

Log in through a web browser or a command-line tool. https://www.rc.virginia.edu/userinfo/hpc/login/

Web-based Access

Open OnDemand is a graphical user interface that provides access to the HPC systems through a web browser. The Open OnDemand access point is https://ood.hpc.virginia.edu/. Within the Open OnDemand environment, users have access to a file explorer; interactive applications such as Jupyter, RStudio Server, and FastX Web; a command-line interface; and a job composer and job monitor for submitting jobs to the Rivanna and Afton clusters. Detailed instructions can be found on the Open OnDemand documentation page.

Secure Shell Access (SSH)

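Connect from a terminal, replacing rhe9cf with your own UVA computing ID. The -Y option enables trusted X11 forwarding so that graphical applications can display locally.

ssh -Y rhe9cf@login.hpc.virginia.edu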
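
To avoid retyping the options in the ssh command above, the same settings can be kept in an OpenSSH client configuration entry. The following is a minimal sketch, not an official UVA configuration: the alias uva-hpc and the file name uva-hpc.sshconfig are hypothetical, and the computing ID is the one from the example and should be replaced with your own.

```shell
#!/bin/sh
# Write an OpenSSH client config entry to a standalone file so that
# nothing in ~/.ssh/config is modified.
# Assumptions: "uva-hpc" is a hypothetical alias and "uva-hpc.sshconfig"
# is a hypothetical file name; rhe9cf is the example computing ID.
cat > uva-hpc.sshconfig <<'EOF'
Host uva-hpc
    HostName login.hpc.virginia.edu
    User rhe9cf
    # These two options together correspond to "ssh -Y"
    ForwardX11 yes
    ForwardX11Trusted yes
EOF

echo "Wrote uva-hpc.sshconfig"
```

With this file in place, ssh -F uva-hpc.sshconfig uva-hpc should behave like the full command. Copying the entry into ~/.ssh/config removes the need for -F.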

File Transfer

Move files between Rivanna/Afton and other systems. https://www.rc.virginia.edu/userinfo/data-transfer/
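
For small transfers from a terminal, one common approach is rsync over SSH. The sketch below assumes that the login host from the SSH section accepts rsync connections and that a scratch directory exists at /scratch/rhe9cf. Both are assumptions to verify against the data-transfer page linked above. The helper names hpc_target and push_to_hpc are hypothetical.

```shell
#!/bin/sh
# Assumed values: replace with your own computing ID.
HPC_USER="rhe9cf"                  # example computing ID from the SSH section
HPC_HOST="login.hpc.virginia.edu"  # login host from the SSH section

# Build the user@host:path destination string for a remote path.
hpc_target() {
    printf '%s@%s:%s\n' "$HPC_USER" "$HPC_HOST" "$1"
}

# Copy a local path to a remote path.
# -a preserves permissions and timestamps, -v is verbose,
# -z compresses data in transit.
push_to_hpc() {
    rsync -avz "$1" "$(hpc_target "$2")"
}

# Show where a transfer would go without starting one.
# /scratch/rhe9cf/results/ is an assumed remote path.
hpc_target "/scratch/rhe9cf/results/"

# Example call (left commented out because it needs network access):
# push_to_hpc ./results/ /scratch/rhe9cf/results/
```

Printing the destination first makes it easy to check the computing ID and remote path before any data moves.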

Software

See a listing of available software. https://www.rc.virginia.edu/userinfo/hpc/software/

Storage

Options for free short-term and leased long-term storage. https://www.rc.virginia.edu/userinfo/hpc/storage/

Running Jobs in Slurm

Submit jobs to Rivanna/Afton through the Slurm resource manager. https://www.rc.virginia.edu/userinfo/hpc/slurm/
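
As an illustration of the request-then-submit model, the sketch below writes a small batch script that asks Slurm for one CPU core, 4 GB of memory, and ten minutes of wall-clock time. The partition name standard and the allocation name my_allocation are placeholders assumed for this example, and the file name hello.slurm is hypothetical. Check the Slurm and job-queue pages for values valid for your account.

```shell
#!/bin/sh
# Write a minimal Slurm batch script to hello.slurm.
# Assumptions: "standard" (partition) and "my_allocation" (account)
# are placeholders, not verified values.
cat > hello.slurm <<'EOF'
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=hello
# Partition (queue) to run in; assumed name
#SBATCH --partition=standard
# Allocation to charge; hypothetical name
#SBATCH --account=my_allocation
# One task using one CPU core
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Memory for the job
#SBATCH --mem=4G
# Wall-clock limit (HH:MM:SS)
#SBATCH --time=00:10:00
# Output file; %j expands to the job ID
#SBATCH --output=hello-%j.out

echo "Running on $(hostname)"
EOF

echo "Wrote hello.slurm; submit it from a login node with: sbatch hello.slurm"
```

After submission, squeue -u followed by your computing ID lists the job while it is pending or running. A GPU job would add a request such as --gres=gpu:1 and a GPU-capable partition.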

UVA CS Slurm

Use the UVA CS Slurm guide for department-managed compute nodes, partitions, GPU constraints, reservations, interactive jobs, and sbatch examples. https://www.cs.virginia.edu/computing/doku.php?id=compute_slurm

Job Queues

Determine the best queue (or “partition”) for running your jobs. https://www.rc.virginia.edu/userinfo/hpc/#job-queues

Usage Policies

Understand the terms and conditions for using Rivanna/Afton. https://www.rc.virginia.edu/userinfo/hpc/#usage-policies

FAQs

Frequently Asked Questions. https://www.rc.virginia.edu/userinfo/faq/rivanna-faq

UVARC GenAI

User guide for UVA Research Computing’s GenAI service. https://www.rc.virginia.edu/userinfo/rcgenai-userguide/

Multi-GPU LLM Inference

Workshop and learning notes for running large language model inference across multiple GPUs. https://cal.lib.virginia.edu/event/14584586 https://staging.learning.rc.virginia.edu/notes/multigpu-inference/

Topics: Hugging Face Accelerate; DeepSpeed; Deploying vLLM on UVA High-Performance Computing Systems.

Reference List

  1. https://www.rc.virginia.edu/userinfo/hpc/
  2. https://www.rc.virginia.edu/userinfo/computing-environments/
  3. https://www.cs.virginia.edu/computing/doku.php?id=compute_slurm