These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
A comprehensive list of training resources from the HPC University. HPCU is a virtual organization whose primary goal is to provide a cohesive, persistent, and sustainable on-line environment to share educational and training materials for a continuum of high performance computing environments that span desktop computing capabilities to the highest-end of computing facilities offered by HPC centers.
Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced Computing (CAC) is a leader in the development and deployment of Web-based training programs. Our Cornell Virtual Workshop learning platform is designed to enhance the computational science skills of researchers, accelerate the adoption of new and emerging technologies, and broaden the participation of underrepresented groups in science and engineering. Over 350,000 unique visitors have accessed Cornell Virtual Workshop training on programming languages, parallel computing, code improvement, and data analysis. The platform supports learning communities around the world, with code examples from national systems such as Frontera, Stampede2, and Jetstream2.
This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how LLMs work and use them in your own projects.
Learn how to use Linux commands in a python script. Specifically, learn how to use the subprocess and os modules in python to run shell commands (which run Linux commands) in a python script that is run on a cluster.
This is a great mentoring resource and has many articles related to mentoring. It is a one-stop shop for mentoring, and at the bottom, there are tags based on topics, and interested users can pick and choose articles and resources on different types of mentorship.
This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow from data loading and preprocessing to training and model evaluation. Throughout the sessions, students participate in writing and executing simple deep learning programs using Pytorch – a popular Python library for developing, training, and deploying deep learning models.
DeapSECURE is a training program to infuse high-performance computational techniques into cybersecurity research and education. It is an NSF-funded project of the ODU School of Cybersecurity along with the Department of Electrical and Computer Engineering and the Information Technology Services at ODU. The DeapSECURE team has developed six non-degree training modules to expose cybersecurity students to advanced CI platforms and techniques rooted in big data, machine learning, neural networks, and high-performance programming. Techniques taught in DeapSECURE workshops are rather general and transferable to other areas including science, engineering, finance, linguistics, etc. All lesson materials are made available as open-source educational resources.
This course from MIT OpenCourseWare (OCW) covers very basic information on how to get started with programming using Python. Lectures are available, along with practice assignments, to users at no cost. Python has many applications in tech today, from web frameworks to machine learning. This course will also instruct users on how to get set up with an IDE, which will allow for way more efficient debugging.
Understand the benefits of an automated version control system and the basics of how automated version control systems work. Configure git the first time it is used on a computer and understand the meaning of the --global configuration flag. Create a local Git repository and describe the purpose of the .git directory. Go through the modify-add-commit cycle for one or more files, explain where information is stored at each stage of that cycle, and distinguish between descriptive and non-descriptive commit messages.
Open OnDemand is an easy-to-use web portal that lets students, researchers, and industry professionals use supercomputers from anywhere. It is installed on supercomputing resources at hundreds of sites. By eliminating the need for client software or command-line interface, Open OnDemand empowers users of all skill levels and significantly speeds up the time to their first computing.
A comprehensive collection of NERSC developed training and tutorial events, offered on regular schedules. All sessions are archived, including slide decks, video recordings, and software examples as are available. Some examples of past training and tutorial topics are listed below
Deep Learning for Sciences Webinar Series
BerkeleyGW Tutorial Workshop
VASP Trainings
Timemory Software Monitoring Tutorial, April 2021
HPCToolkit to Measure and Analyzing GPU Applications Performance Tutorial
Totalview Tutorial
NVidia HPCSDK - OpenMP Target Offload Training
Parallelware Training Series
ARM Debugging and Profiling Tools Tutorial
Roofline on NVIDIA GPUs
GPUs for Science events
3-part OpenACC Training Series
9-part CUDA Training Series
Some examples for writing Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely compatible with other versions. Before compiling change extension from thrust_ex.txt to thrust_ex.cu. Any code on the device (GPU) that is run through a Thrust transform is automatically parallelized on the GPU. Host (CPU) code will not be. Thrust code can also be compiled to run on a CPU for practice.
Fastai offers many tools to people working with machine learning and artifical intelligence including tutorials on PyTorch in addition to their own library built on PyTorch, news articles, and other resources to dive into this realm.
R for Data Science is a comprehensive resource for individuals looking to harness the power of the R programming language for data analysis, visualization, and statistical modeling. Whether you're a beginner or an experienced data scientist, this guide will help you unlock the full potential of R in the realm of data science.
The mission of Trusted CI is to lead in the development of an NSF Cybersecurity Ecosystem with the workforce, knowledge, processes, and cyberinfrastructure that enables trustworthy science and NSF’s vision of a nation that is a global leader in research and innovation.
In GIS, projections are helpful to take something plotted on a globe and convert it to a flat map that we can print or show on a screen. Unfortunately it also introduces distortions to the objects and features on the map. This not only distorts the objects visually, but the results for any spatial attribute calculations will also reflect this distortion (such as distance and area ). Below is a link to a quick primer on projections, types of distortions that can occur, and suggestions on how to choose a correct projection for your work.
This package lets you easily scrape websites and extract information based on html tags and various other metadata found in the page. It can be useful for large-scale web analysis and other tasks requiring automated data gathering.
Rocky Linux is an open-source enterprise operating system. It is compatible with Red Hat Enterprise Linux (RHEL). It is a community-driven project that provides a stable and reliable platform for production workloads. It is one of the best alternatives to Opensource CentOS, since Centos will be on end of life (EoL) soon in 2024 by shifting to CentOS Stream.
This tutorial introduces the use of Containers using the Charliecloud software suite. This tutorial will provide participants with background and hands-on experience to use basic Charliecloud containers for HPC applications. We discuss what containers are, why they matter for HPC, and how they work. We'll give an overview of Charliecloud, the unprivileged container solution from Los Alamos National Laboratory's HPC Division. Students will learn how to build toy containers and containerize real HPC applications, and then run them on a cluster. Exercises are demonstrated using the ACES cluster, a composable accelerator testbed at Texas A&M University. Students with an allocation on the ACES cluster can follow along with the ACES-specific exercises.