Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers, and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow, from data loading and preprocessing to training and model evaluation. Throughout the sessions, students write and execute simple deep learning programs using PyTorch, a popular Python library for developing, training, and deploying deep learning models.
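As a flavor of that workflow, here is a minimal sketch of the kind of program the series builds up to: load data, define a model, train, and evaluate. The synthetic data, model shape, and hyperparameters are illustrative assumptions, not taken from the workshop materials.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic dataset (assumption for illustration): 1000 samples,
# 20 features, with a binary label derived from the feature sum.
X = torch.randn(1000, 20)
y = (X.sum(dim=1) > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# A small fully connected classifier.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training loop: forward pass, loss, backward pass, parameter update.
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# Evaluation: accuracy on the training data (a real workflow would
# score a held-out test set instead).
with torch.no_grad():
    acc = (model(X).argmax(dim=1) == y).float().mean()
print(f"accuracy: {acc:.3f}")
```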
Monthly workshops sponsored by ACCESS on a variety of HPC topics, organized by the Pittsburgh Supercomputing Center (PSC). Each workshop is telecast to multiple satellite sites, and workshop materials are archived.
GPU training series for scientists, software engineers, and students, with emphasis on Earth science applications.
The content of this course is coordinated with the six-month series of GPU training sessions starting in February 2022. The NVIDIA High Performance Computing Software Development Kit (NVHPC SDK) and CUDA Toolkit are the primary software requirements for this training; both are already available on NCAR's HPC clusters as modules you may load. The software is also free to download from NVIDIA via the NVHPC SDK Current Release Downloads page and the CUDA Toolkit downloads page. Any provided code is written specifically to build and run on NCAR's Casper HPC system but may be adapted to other systems or personal machines. Material will be updated as appropriate for the future deployment of NCAR's Derecho cluster and as technology progresses.
This tutorial provides a comprehensive introduction to CUDA programming, focusing on essential concepts such as CUDA thread hierarchy, data parallel programming, host-device heterogeneous programming model, CUDA kernel syntax, GPU memory hierarchy, and memory optimization techniques like global memory coalescing and shared memory bank conflicts. Aimed at researchers, students, and practitioners, the tutorial equips participants with the skills needed to leverage GPU acceleration for scalable computation, particularly in the context of AI.
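To make the thread-hierarchy and kernel-syntax concepts concrete, here is a small sketch using CuPy's RawKernel to compile and launch a standard CUDA C kernel from Python (requires a CUDA-capable GPU; the kernel, sizes, and names are illustrative, not from the tutorial itself).

```python
import cupy as cp

# A CUDA C kernel: each thread computes one element of c = a + b.
add_kernel = cp.RawKernel(r'''
extern "C" __global__
void vec_add(const float* a, const float* b, float* c, int n) {
    // Global thread index built from the block/thread hierarchy.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                  // guard threads past the end of the array
        c[i] = a[i] + b[i];     // coalesced: adjacent threads touch
                                // adjacent global-memory elements
}
''', 'vec_add')

n = 1 << 20
a = cp.random.rand(n, dtype=cp.float32)
b = cp.random.rand(n, dtype=cp.float32)
c = cp.empty_like(a)

threads = 256                           # threads per block
blocks = (n + threads - 1) // threads   # enough blocks to cover n
add_kernel((blocks,), (threads,), (a, b, c, cp.int32(n)))

assert cp.allclose(c, a + b)
```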
This tutorial introduces containers using the Charliecloud software suite, providing participants with background and hands-on experience using basic Charliecloud containers for HPC applications. We discuss what containers are, why they matter for HPC, and how they work, and give an overview of Charliecloud, the unprivileged container solution from Los Alamos National Laboratory's HPC Division. Students will learn how to build toy containers, containerize real HPC applications, and run them on a cluster. Exercises are demonstrated on the ACES cluster, a composable accelerator testbed at Texas A&M University; students with an allocation on the ACES cluster can follow along with the ACES-specific exercises.
Learning resources for using PyTorch as a DNA analysis platform are scarce and scattered. I have compiled some resources that may help beginners get started on their journey into this versatile and still largely unexplored field of genomics with PyTorch. The resources listed are intended to give the beginner different perspectives and opportunities to pick up the subject matter with minimal effort spent locating resources.
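As a taste of the starting point for this field, here is a common first step: one-hot encoding a DNA sequence into a PyTorch tensor a model can consume. The encoding scheme and sequence are illustrative assumptions, not drawn from any specific listed resource.

```python
import torch

BASES = "ACGT"  # assumed nucleotide alphabet for this sketch

def one_hot(seq: str) -> torch.Tensor:
    """Encode a DNA string as a (length, 4) one-hot float tensor."""
    idx = torch.tensor([BASES.index(base) for base in seq])
    return torch.nn.functional.one_hot(idx, num_classes=4).float()

x = one_hot("ACGTTGCA")
print(x.shape)  # torch.Size([8, 4]) -- ready for a Linear or Conv1d layer
```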
Expanse at SDSC is a cluster designed by Dell and SDSC that delivers 5.16 peak petaflops and offers Composable Systems and Cloud Bursting. This documentation describes how to use the Expanse cluster, with some information specific to users with ACCESS accounts.
Thrust is a C++ template library for CUDA that provides high-level parallel algorithms and handles GPU parallelization for you. The Thrust tutorial is great for beginners, and the documentation is helpful for anyone using Thrust.
Horovod is a distributed deep learning training framework. Using Horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning frameworks such as TensorFlow, Keras, PyTorch, and Apache MXNet.
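Here is a minimal sketch of the pattern Horovod uses to scale a single-GPU PyTorch script; the model and learning rate are placeholders. A script like this would typically be launched with something like `horovodrun -np 4 python train.py`.

```python
import torch
from torch import nn
import horovod.torch as hvd

hvd.init()                                   # one process per GPU
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())  # pin each process to a GPU

# Placeholder model for illustration.
model = nn.Linear(20, 2)
if torch.cuda.is_available():
    model.cuda()

# A common convention: scale the learning rate with the worker count.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across all processes.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# Start every worker from identical weights and optimizer state.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

# ...the usual training loop follows, unchanged from the single-GPU script.
```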
4/5/25 - 4/6/25
The Duke IEEE Student Chapter is working with ACCESS to host an introductory workshop on supercomputing.
All workshop resources are available at https://workshop.dukeieee.org/
Topics include:
Day 1: Saturday, April 5th
Opening Remarks & ACCESS Overview (including how to request compute usage with Jetstream 2)
Tutorial 1: Introduction to Supercomputing Architecture, Linux, and Job Scheduling (SLURM)
Tutorial 2: Containerized Large Language Model Inference and Finetuning
Tutorial 3: Portable Code - Local Containers to HPC Scale
Tutorial 4: ACCESS Pegasus - Serverless Data Processing Workflow in Jupyter Notebooks
Networking & Hors d'oeuvres
Day 2: Sunday, April 6th
Tutorial 2: Deep Dive in AI Agents - "Building Superintelligence in 90 Minutes" by Harry Fazzone
Tutorial 3: DASK - Python-based Distributed Computing Framework for HPC
Tutorial 4: Basic Parallelism & MPI by Rebecca Hartman-Baker, PhD (NERSC)
Closing Talk: Capt. Grace Hopper on Future Possibilities: Data, Hardware, Software, and People (Part One, 1982)
This tutorial explains how to use Python for GPU acceleration with libraries like CuPy, PyOpenCL, and PyCUDA. It shows how these libraries can speed up tasks like array operations and matrix multiplication by using the GPU. Examples include replacing NumPy with CuPy for large datasets and using PyOpenCL or PyCUDA for more control with custom GPU kernels. It focuses on practical steps to integrate GPU acceleration into Python programs.
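As a quick illustration of the drop-in pattern the tutorial describes, CuPy mirrors much of NumPy's API, so moving an array computation to the GPU can be as simple as swapping the module (requires a CUDA-capable GPU; the array sizes are arbitrary).

```python
import numpy as np
import cupy as cp

a_cpu = np.random.rand(4096, 4096).astype(np.float32)
b_cpu = np.random.rand(4096, 4096).astype(np.float32)
c_cpu = a_cpu @ b_cpu                 # matrix multiply on the CPU

a_gpu = cp.asarray(a_cpu)             # copy host arrays to the device
b_gpu = cp.asarray(b_cpu)
c_gpu = a_gpu @ b_gpu                 # same expression, runs on the GPU

# Copy the result back to the host and check it matches the CPU run.
assert np.allclose(cp.asnumpy(c_gpu), c_cpu, atol=1e-3)
```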
This technology lab contains a set of sessions to help a new user start an AI project on the ACES cluster, a composable accelerator testbed at Texas A&M University. You will learn how to create and activate a virtual environment, manipulate and visualize data with Pandas and Matplotlib, use Scikit-learn for linear regression and classification applications, and use PyTorch to create and train a simple image classification model with deep neural networks (DNNs).
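For a sense of the Scikit-learn portion, here is a small sketch that fits a linear regression on synthetic data and scores it on a held-out split; the data is generated for illustration and is not the lab's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: a known linear relationship plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Hold out a test set, fit on the rest, and report R^2 on unseen data.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LinearRegression().fit(X_train, y_train)
print("test R^2:", model.score(X_test, y_test))
```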
Some examples of Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely compatible with other versions. Before compiling, change the file extension from thrust_ex.txt to thrust_ex.cu. Any device (GPU) code that is run through a Thrust transform is automatically parallelized on the GPU; host (CPU) code is not. Thrust code can also be compiled to run on a CPU for practice.