Knowledge Base Resources

Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!

The Carpentries

Carpentries.org

We teach foundational coding and data science skills to researchers worldwide.

administering-hpc training

4 Likes

Type

website

Level

HPC University

HPC University Resources

A comprehensive list of training resources from the HPC University. HPCU is a virtual organization whose primary goal is to provide a cohesive, persistent, and sustainable on-line environment to share educational and training materials for a continuum of high performance computing environments that span desktop computing capabilities to the highest-end of computing facilities offered by HPC centers.

debugging hpc-operations professional-development training workforce-development compiling matlab python r mpi

3 Likes

Type

learning

Level

Relion - Cryo-EM structure determination software

Relion Website

RELION (REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone software package developed by Sjors Scheres' group at the MRC Laboratory of Molecular Biology. It employs an empirical Bayesian approach for electron cryo-microscopy (cryo-EM) structure determination, specifically for refining multiple 3D reconstructions or 2D class averages.

machine-learning data-analysis image-processing computer-science data-science

2 Likes

Type

website

Level

Open OnDemand

Open OnDemand Home Page

Open OnDemand is an easy-to-use web portal that lets students, researchers, and industry professionals use supercomputers from anywhere. It is installed on supercomputing resources at hundreds of sites. By eliminating the need for client software or a command-line interface, Open OnDemand empowers users of all skill levels and significantly shortens the time to their first computation.

open-ondemand administering-hpc cluster-management cluster-support hpc-operations batch-jobs kubernetes

2 Likes

Type

website

Level

An Introduction to Cryptography with Python

Workshop Tutorial

This comprehensive workshop is designed to guide participants through the world of cryptography, from foundational concepts to advanced implementations. Starting with the basics of encryption, decryption, and hashing, the workshop discusses real-world applications like SSL, blockchain, and digital signatures. Interactive Python-based coding examples, such as symmetric and asymmetric encryption, will provide hands-on experience. Participants will also learn to identify cryptographic vulnerabilities and perform attacks like length extension. Finally, the workshop also explores future trends such as quantum cryptography and zero-knowledge proofs, providing participants with the knowledge to apply cryptography in securing modern digital systems. Ideal for beginners and intermediate learners alike, this workshop is a step-by-step journey into mastering cryptographic principles and practices.
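As a rough illustration of two topics the description names (hashing and length-extension attacks), the sketch below uses only Python's standard `hashlib`, `hmac`, and `secrets` modules. It is not taken from the workshop materials; the helper names `sha256_hex`, `sign`, and `verify` and the sample message are hypothetical. It shows a plain SHA-256 digest next to an HMAC tag, the usual defence against length extension on naive `hash(key + message)` constructions.

```python
import hashlib
import hmac
import secrets


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def sign(key: bytes, message: bytes) -> str:
    """Authenticate message with HMAC-SHA256 (hypothetical helper)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


# Illustrative values only: a random 256-bit key and a made-up message.
key = secrets.token_bytes(32)
message = b"transfer 10 credits to alice"
tag = sign(key, message)

print(sha256_hex(message))                      # unkeyed digest, anyone can compute it
print(verify(key, message, tag))                # the original message verifies
print(verify(key, message + b" (edited)", tag)) # a modified message does not
```

A plain hash detects accidental changes but proves nothing about who produced the data; the keyed HMAC tag can only be recomputed by someone holding the key.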

python data-security cybersecurity encryption secure-data-architecture

2 Likes

Type

website

Level

ACCESS HPC Workshop Series

Monthly workshops sponsored by ACCESS on a variety of HPC topics organized by Pittsburgh Supercomputing Center (PSC). Each workshop will be telecast to multiple satellite sites and workshop materials are archived.

deep-learning machine-learning neural-networks big-data tensorflow gpu training openmpi c c++ fortran openmp programming mpi spark

1 Like

Type

learning

Level

Containerized Jupyter Notebooks for HPCs

Containerized Jupyter Notebooks for HPCs

This tutorial demonstrates how to create, manage, and deploy containerized Jupyter simulations for High-Performance Computing (HPC) environments, specifically using SLAC's S3DF infrastructure. By utilizing Apptainer (formerly Singularity) containers, users can package complex simulations with all necessary dependencies, input files, and configurations, ensuring reproducibility and ease of use for new users. The automated workflows, powered by GitHub Actions, handle building and updating the containers, while Open OnDemand provides an accessible interface for running Jupyter notebooks directly from the HPC environment. This approach eliminates setup errors, saves time, and ensures consistent simulation environments, enabling researchers to focus on their work instead of system configuration.

1 Like

Type

learning

Level

HPC Carpentry

HPC Carpentry

An HPC-focused Carpentry community. Trainings include HPC fundamentals, Python, Chapel, LAMMPS, parallelization with Python, and scaling studies.

software-carpentry training

1 Like

Type

website

Level

Enhanced Sampling for MD simulations

Tools and plugins to enhance molecular dynamics sampling

data-analysis computational-chemistry c++ conda cuda python

1 Like

Type

tool

Level

GIS: Geocoding Services

Geocoding is the process of converting a street address into coordinates that can be plotted on a map. The conversion typically requires an API call to a remote server hosted by an organization or institution. The server compares the address attributes you provide against its own data and returns a best estimate of the coordinates for that location. Many geocoding services are available, and they differ in world coverage, result quality, and rate limits. For R, the "tidygeocoder" package provides an easy way to connect to these services, and its documentation gives a good summary of the available geocoding services with links to their documentation. The link to the documentation for the geocoding services accessible through "tidygeocoder" is provided below. For Python, the geopy library provides connections to various geocoding services; the link to its documentation is also included below.
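The request-and-rate-limit pattern described above can be sketched in Python as follows. This is a minimal sketch, not code from either package's documentation: `geocode_all`, `fake_geocode`, and the sample address are hypothetical, and the stand-in geocoder avoids any network call. The commented lines show how geopy's `Nominatim` geocoder could be plugged in, assuming geopy is installed and the service's usage policy is respected.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def geocode_all(
    addresses: Iterable[str],
    geocode: Callable[[str], Optional[Coordinates]],
    min_delay_seconds: float = 1.0,
) -> Dict[str, Optional[Coordinates]]:
    """Geocode each address, pausing between calls to respect rate limits."""
    results: Dict[str, Optional[Coordinates]] = {}
    for index, address in enumerate(addresses):
        if index > 0:
            time.sleep(min_delay_seconds)  # simple client-side rate limiting
        results[address] = geocode(address)
    return results


def fake_geocode(address: str) -> Optional[Coordinates]:
    """Hypothetical offline stand-in for a remote geocoding service."""
    known = {"1 Example Street, Sampletown": (40.0, -75.0)}
    return known.get(address)  # None means "no match found"


# With geopy installed, a real service could be used instead (assumption):
#   from geopy.geocoders import Nominatim
#   locator = Nominatim(user_agent="my-app-name")
#   def geopy_geocode(address):
#       location = locator.geocode(address)
#       return (location.latitude, location.longitude) if location else None

print(geocode_all(["1 Example Street, Sampletown", "Unknown Place"],
                  fake_geocode, min_delay_seconds=0))
```

Passing the geocoder in as a function keeps the batching and delay logic independent of whichever service is chosen.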

gis

1 Like

Type

documentation

Level

NCSA HPC Training Moodle

NCSA HPC Training Moodle Site

Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics include 'Cybersecurity for End Users' and 'Developing Webinar Training.' Some of the tutorials also offer digital badges. Many of these tutorials were previously offered on CI-Tutor. The following open-access training courses are available: Parallel Computing on High-Performance Systems; Profiling Python Applications; Using an HPC Cluster for Scientific Applications; Debugging Serial and Parallel Codes; Introduction to MPI; Introduction to OpenMP; Introduction to Visualization; Introduction to Performance Tools; Multilevel Parallel Programming; Introduction to Multi-core Performance; Using the Lustre File System.

performance-tuning profiling parallelization lustre training workforce-development openmp python mpi cybersecurity

1 Like

Type

learning

Level

ACCESS Pegasus Documentation

ACCESS Pegasus Documentation

The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC) workloads, covering logging in, workflow creation, resource configuration, and monitoring options.

pegasus

1 Like

Type

documentation

Level

Attention, Transformers, and LLMs: a hands-on introduction in PyTorch

This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how LLMs work and use them in your own projects.

ai deep-learning machine-learning neural-networks pytorch

1 Like

Type

learning

Level

Using Linux commands in a Python script (and the difference between the subprocess and os Python modules)

Using Linux Commands in a Python Script

Learn how to use Linux commands in a Python script. Specifically, learn how to use Python's subprocess and os modules to run shell commands (which in turn run Linux commands) from a Python script that runs on a cluster.
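A minimal sketch of the difference between the two modules, assuming a Linux system where `echo` is available. The helper names `run_command` and `shell_status` are hypothetical and not taken from the linked tutorial. `subprocess.run` captures a command's output and raises an error on failure, while `os.system` passes a string to the shell and returns only a status code.

```python
import os
import subprocess
from typing import List


def run_command(args: List[str]) -> str:
    """Run a command with subprocess and return its stdout as text."""
    result = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()


def shell_status(command: str) -> int:
    """Run a shell command string with os.system; only a status is returned."""
    return os.system(command)


# subprocess: arguments are passed as a list, so no shell quoting is needed.
print(run_command(["echo", "hello from the cluster"]))

# os.system: output goes straight to the terminal and cannot be captured here.
print(shell_status("echo hello > /dev/null"))

# The os module also offers direct equivalents of some commands,
# for example os.getcwd() for pwd and os.listdir() for ls.
print(os.getcwd())
```

For scripts that need a command's output or reliable error handling, `subprocess` is generally the better fit; `os.system` is adequate only when the exit status is all that matters.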

cluster-management programming python

1 Like

Type

learning

Level

DARWIN Documentation Pages

DARWIN Documentation

DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education.

darwin big-data

1 Like

Type

documentation

Level

Managing Python Packages on an HPC Cluster

Python Packages on HPC

This workshop covers the different ways Python packages can be managed in a cluster environment using conda and Python virtual environments, both in batch mode from the command line and with Jupyter Notebooks and JupyterLab on the cluster. The examples are run on the GMU HOPPER Cluster.
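When the same code runs in batch jobs and in Jupyter, it helps to confirm which environment is actually active. The sketch below is not from the workshop; `describe_environment` is a hypothetical helper, and it assumes conda sets the `CONDA_DEFAULT_ENV` variable on activation, as it normally does. It reports the interpreter path, whether a virtual environment is active, and the conda environment name if any.

```python
import os
import sys
from typing import Dict, Optional, Union


def describe_environment() -> Dict[str, Union[str, bool, Optional[str]]]:
    """Summarize which Python environment the current process is using."""
    return {
        # Full path of the interpreter running this script or notebook kernel.
        "executable": sys.executable,
        # In a venv, sys.prefix points at the venv and differs from base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


for name, value in describe_environment().items():
    print(f"{name}: {value}")
```

Running this at the top of a batch script and in a notebook cell makes a mismatch between the two environments easy to spot.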

documentation pytorch data-science open-ondemand batch-jobs job-submission slurm environment-modules anaconda jupyterhub python library-paths dependencies pip version-control

1 Like

Type

documentation

Level

Cornell Virtual Workshop

Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced Computing (CAC) is a leader in the development and deployment of Web-based training programs. Our Cornell Virtual Workshop learning platform is designed to enhance the computational science skills of researchers, accelerate the adoption of new and emerging technologies, and broaden the participation of underrepresented groups in science and engineering. Over 350,000 unique visitors have accessed Cornell Virtual Workshop training on programming languages, parallel computing, code improvement, and data analysis. The platform supports learning communities around the world, with code examples from national systems such as Frontera, Stampede2, and Jetstream2.

jetstream matlab cloud-computing data-analysis performance-tuning parallelization file-transfer globus slurm training cuda python r mpi

1 Like

Type

learning

Level

Data Visualization tools for Python

Matplotlib Docs

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It simplifies analyzing and presenting data, and it builds on Python, a language many researchers already know.

documentation python

1 Like

Type

documentation

Level

Open OnDemand Documentation Repository

Open OnDemand Documentation repo

This is the main documentation repo for the Open OnDemand Portal which enables researchers to access HPC resources from a familiar web interface.

documentation open-ondemand

1 Like

Type

documentation

Level

Introduction to Deep Learning in PyTorch

This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow, from data loading and preprocessing to training and model evaluation. Throughout the sessions, students write and execute simple deep learning programs using PyTorch, a popular Python library for developing, training, and deploying deep learning models.

ai deep-learning image-processing machine-learning neural-networks pytorch gpu

1 Like

Type

learning

Level

Useful R Packages for Data Science and Statistics

https://www.udacity.com/blog/2021/01/best-r-packages-for-data-science.html

This Udacity article lists the most frequently used R packages for data science and statistics. For each package, the article provides a link to its official documentation. It is a great starting point if you want to begin your data science journey in R.

plotting visualization data-analysis machine-learning data-science r

1 Like

Type

documentation

Level

Tutorial: Localized RAG Chatbot with ACCESS HPC

Tutorial: Localized RAG Chatbot with ACCESS HPC

This tutorial shows how to set up an open-source, customizable RAG chatbot that answers questions about documents of your choice. It uses Indiana University's Jetstream2, but should work on any major ACCESS HPC system.

1 Like

Type

tool

Level

PyTorch Documentation

PyTorch Documentation

PyTorch is an optimized tensor computation library that supports automatic differentiation and is designed to accelerate deep learning research and production on both GPUs and CPUs. Built with flexibility and performance in mind, PyTorch provides a dynamic computational graph and a rich ecosystem of tools for building and deploying deep learning models.

1 Like

Type

documentation

Level

Master’s in Cybersecurity Degree Essentials

Offers comprehensive information on various master's degree options in cybersecurity, including program details, admission requirements, and career opportunities, helping students make informed decisions about pursuing an advanced degree in cybersecurity.

resources professional-development cybersecurity

0 Likes

Type

website

Level

SciPy Lecture Notes

https://lectures.scientific-python.org/

Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and SciPy.

visualization data-analysis machine-learning python

0 Likes

Type

learning

Level
