Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant ([rtdatavis.github.io](http://rtdatavis.github.io/)). The following links are specific to the Paraview program and the workflows that have been used my researchers at the U of Arizona. These links are distinct from the others posted in the beginner paraview access ci links from the University of Arizona in that they are for more complex workflows. The links included explain how to use the terminal with paraview (pvpython), and the steps to leverage HPC resources for headless batch rendering. The batch rendering tutorial is significantly more complex than the others so if you find yourself stuck please post on the https://ask.cyberinfrastructure.org/ and I will try to troubleshoot with you.
A VPN, or Virtual Private Network, is a technology that creates a secure tunnel between your device and a VPN server. This tunnel encrypts all of your traffic, making it unreadable to anyone who tries to intercept it.
GPU training series for scientists, software engineers, and students, with emphasis on Earth science applications.
The content of this course is coordinated with the 6 month series of GPU Training sessions starting in Februrary 2022. The NVIDIA High Performance Computing Software Development Kit (NVHPC SDK) and CUDA Toolkit will be the primary software requirements for this training which will be already available on NCAR's HPC clusters as modules you may load. This software is free to download from NVIDIA by navigating to the NVHPC SDK Current Release Downloads page and the CUDA Toolkit downloads page. Any provided code is written specifically to build and run on NCAR's Casper HPC system but may be adapted to other systems or personal machines. Material will be updated as appropriate for the future deployment of NCAR's Derecho cluster and as technology progresses.
MediaPipe is Google's open-source framework for building multimodal (e.g., video, audio, etc.) machine learning pipelines. It is highly efficient and versatile, making it perfect for tasks like gesture recognition.
This is a tutorial on how to make a custom model for gesture recognition tasks based on the Google MediaPipe API. This tutorial is specifically for video-playback, though could be generalized to image and live-video feed recognition.
This tutorial is essentially the "hello world" of image recognition and feed-forward neural network (using PyTorch). Using the MNIST database (filled within images of handwritten digits), the tutorial will instruct how to build a feed-forward neural network that can recognize handwritten digits. A solid understanding of feed-forward and back-propagation is recommended.
pip stands for "pip installs packages". It's the go-to package manager for Python, allowing developers to install, update, and manage software libraries and dependencies used in Python projects. With just a few commands in your terminal or command prompt, pip makes it effortless to fetch libraries from the Python Package Index (PyPI) and integrate them into your projects. This guide will walk you through the basics of pip, from installation to advanced package management.
A book for researchers who contribute code to R projects: This booklet is the result of my work with the Social Cognition for Social Justice lab. It was developed in response to questions I was getting from students; both grad students that were making software design decisions, and undergraduates who were using things like version control for the first time. Although many tutorials and resources exist for these topics, there was not a single source that I thought covered just enough material to build up to the workflow used by the lab without extraneous detail.
CMake is an open-source tool used to manage the build process in operating systems. This tutorial takes you through how to use CMake from the very basics with example projects.
Active inference is an emerging study field in machine learning and computational neuroscience. This website in particular introduces "active inference institute", which has established a couple of years ago, and contains a wide variety of resources for understanding the theory of active inference and for participating a worldwide active inference community.
A compilation of the slides from this year's RMACC Sys Admin Workshop.
RMACC Sys Admin Workhop Schedule:
Tuesday
12:00 PM Sign-in
1:00 PM Introductions
1:30 PM Lightning Talk - HPC Survival guide
2:00 PM Node Management - Scott Serr
2:30 PM Lightning Talk - Warewulf
3:00 PM Urgent HPC - Coltran Hophan-Nichols and Alexander Salois
Wednesday
9:00 AM Breakfast
10:00 AM Round table Sites - BYU, INL, UMT, ASU, MSU
11:00 AM Open OnDemand setup - Dean Anderson
11:30 AM Lightning talk - Long term hardware support
12:00 PM Lunch
1:00 PM HPC Security - Matt Bidwell
2:00 PM Lightning talk- Security
2:30 PM ACCESS resources - Couso
3:00 PM Easybuild tutorial - Alexander Salois
3:30 PM General Q & A
Thursday
9:00 AM Breakfast
10:00 AM Lightning Talk- Containers and Virtual Machines
11:00 AM University of Montana - Hellgate Site Tour
11:30 AM Closing Remarks
MDAnalysis is a python based library of tools for the analysis of molecular dynamics simulations. It is able to read and write many popular simulation formats including CHARMM, LAMMPS, GROMACS, and AMBER and more. This link contains the documentation pages of all MDAnalysis functions and has links to tutorials using Jupyter Notebooks.
Docker allows for containerization of any task - basically a smaller, scalable version of a virtual machine. This is very useful when transferring work across computing environments, as it ensures reproducibility.
A lecture and notes with the goal of teaching introductory python. Starting by understanding how to download and start using python, then expanding to basic syntax for lists, arrays, loops, and methods.
This article discusses the importance of fairness in machine learning and provides insights into how Google approaches fairness in their ML models.
The article covers several key topics:
Introduction to fairness in ML: It provides an overview of why fairness is essential in machine learning systems, the potential biases that can arise, and the impact of biased models on different communities.
Defining fairness: The article discusses various definitions of fairness, including individual fairness, group fairness, and disparate impact. It explains the challenges in achieving fairness due to trade-offs and the need for thoughtful considerations.
Addressing bias in training data: It explores how biases can be present in training data and offers strategies to identify and mitigate these biases. Techniques like data preprocessing, data augmentation, and synthetic data generation are discussed.
Fairness in ML algorithms: The article examines the potential biases that can arise from different machine learning algorithms, such as classification and recommendation systems. It highlights the importance of evaluating and monitoring models for fairness throughout their lifecycle.
Fairness tools and resources: It showcases various tools and resources available to practitioners and developers to help measure, understand, and mitigate bias in machine learning models. Google's TensorFlow Extended (TFX) and What-If Tool are mentioned as examples.
Google's approach to fairness: The article highlights Google's commitment to fairness and the steps they take to address fairness challenges in their ML models. It mentions the use of fairness indicators, ongoing research, and partnerships to advance fairness in AI.
Overall, the article provides a comprehensive overview of fairness in machine learning and offers insights into Google's approach to building fair ML models.
EasyBuild is a software installation framework that allows administrators to easily build and install software on high-performance computing (HPC) systems. It supports a wide range of software packages, toolchains, and compilers.
Supported software are found in the EasyConfigs repository, one of several resositories in EasyBuild project.
Plots.jl is the most widely used plotting library for the Julia programming language. It's known for being especially powerful in its versatility and intuitiveness. It's limited set of dependencies and wide applicability across different graphics packages make it especially helpful in visualizing the results of your latest Julia implementation.
However, there are still multiple options available for Julia programmers to visualize their datasets. The second link details a comparison against a variety of Julia packages.
The Neuroimaging Tools and Resources Collaboratory (NITRC) is a neuroimaging informatics knowledge environment for MR, PET/SPECT, CT, EEG/MEG, optical imaging, clinical neuroinformatics, imaging genomics, and computational neuroscience tools and resources.
Some examples for writing Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely compatible with other versions. Before compiling change extension from thrust_ex.txt to thrust_ex.cu. Any code on the device (GPU) that is run through a Thrust transform is automatically parallelized on the GPU. Host (CPU) code will not be. Thrust code can also be compiled to run on a CPU for practice.
Proxmox Virtual Environment is a hyper-converged infrastructure open-source software. It is a hosted hypervisor that can run operating systems including Linux and Windows on x64 hardware.