Knowledge Base Resources

Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!

Add a Resource

Campus Research Computing Consortium (CaRCC)

CaRCC

CaRCC – the Campus Research Computing Consortium – is an organization of dedicated professionals developing, advocating for, and advancing campus research computing and data and associated professions. Vision: CaRCC advances the frontiers of research by improving the effectiveness of research computing and data (RCD) professionals, including their career development and visibility, and their ability to deliver services and resources for researchers. CaRCC connects RCD professionals and organizations around common objectives to increase knowledge sharing and enable continuous innovation in research computing and data capabilities.

0 Likes

Type

website

Level

Regulated Research Community of Practice

Regulated Research Community of Practice

The daily news clearly shows the increasing threat to safety and privacy of data, personal as well as intellectual property. While the requirements such as DFARS 7012, HIPAA, and Cybersecurity Maturity Model Certification (CMMC) improve the consistency of data handling between agencies and contractors and grantees, it leaves academic institutions to figure out how to meet such requirements in a cost-effective way that fits the research and education mission of the institution. Most institutions, agencies, and companies act in isolation with one-off contract language to address data security and safeguarding concerns. Even though cybersecurity has a clear and uniform goal of protecting data, a onesize solution does not fit all academic institutions. By supporting this community with development of a community strategic roadmap, regular discussions and workshops, and a repository of generalized and specific resources for handling regulated research programs RRCoP lowers the barrier to entry for institutions handling new regulations.

community-outreach cybersecurity

0 Likes

Type

website

Level

Fundamentals of R Programming

This course is an introduction to the R programming language and covers the fundamental concepts needed to operate in the R environment. This course was taught for the ACCESS community on September 26, 2023, but the materials for the course are still available on the ACES cluster and can be completed independently. All materials are presented as learnR notebooks and cover several topics, including data types, variables, built-in functions, data structures, and plotting.

ACES TAMU plotting data-analysis r

0 Likes

Type

learning

Level

GPU Computing Workshop Series for the Earth Science Community

GPU training series for scientists, software engineers, and students, with emphasis on Earth science applications. The content of this course is coordinated with the 6 month series of GPU Training sessions starting in Februrary 2022. The NVIDIA High Performance Computing Software Development Kit (NVHPC SDK) and CUDA Toolkit will be the primary software requirements for this training which will be already available on NCAR's HPC clusters as modules you may load. This software is free to download from NVIDIA by navigating to the NVHPC SDK Current Release Downloads page and the CUDA Toolkit downloads page. Any provided code is written specifically to build and run on NCAR's Casper HPC system but may be adapted to other systems or personal machines. Material will be updated as appropriate for the future deployment of NCAR's Derecho cluster and as technology progresses.

optimization performance-tuning profiling parallelization github pytorch tensorflow oceanography gpu hpc-arch-and-perf training c c++fortran cuda jupyterhub programming programming-best-practices python

0 Likes

Type

learning

Level

Neurodesk

Neurodesk

Neurodesk provides a containerised data analysis environment to facilitate reproducible analysis of neuroimaging data. Analysis pipelines for neuroimaging data typically rely on specific versions of packages and software, and are dependent on their native operating system. These dependencies mean that a working analysis pipeline may fail or produce different results on a new computer, or even on the same computer after a software update. Neurodesk provides a platform in which anyone, anywhere, using any computer can reproduce your original research findings given the original data and analysis code.

psychology containers software-installation version-control

0 Likes

Type

website

Level

An Introduction to the Julia Programming Language

The Julia Programming Language is one of the fastest growing software languages for AI/ML development. It writes in manner that's similar to Python while being nearly as fast as C++, while being open source, and reproducible across platforms and environments. The following link provide an introduction to using Julia including the basic syntax, data structures, key functions, and a few key packages.

ai data-analysis machine-learning julia

0 Likes

Type

learning

Level

GIS: Projections and their distortions

Map Projections

In GIS, projections are helpful to take something plotted on a globe and convert it to a flat map that we can print or show on a screen. Unfortunately it also introduces distortions to the objects and features on the map. This not only distorts the objects visually, but the results for any spatial attribute calculations will also reflect this distortion (such as distance and area ). Below is a link to a quick primer on projections, types of distortions that can occur, and suggestions on how to choose a correct projection for your work.

gis

0 Likes

Type

learning

Level

Introduction to Probabilistic Graphical Models

https://ermongroup.github.io/cs228-notes/

This website summarizes the notes of Stanford's introductory course on probabilistic graphical models. It starts from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning.

ai machine-learning

0 Likes

Type

learning

Level

Long Tales of Science: A podcast about women in HPC

Long Tales of Science

A series of interviews with women in the HPC community

science-gateway community-outreach professional-development project-management proposal-development training workforce-development xsede

0 Likes

Type

website

Level

R for Data Science

https://r4ds.had.co.nz/index.html

R for Data Science is a comprehensive resource for individuals looking to harness the power of the R programming language for data analysis, visualization, and statistical modeling. Whether you're a beginner or an experienced data scientist, this guide will help you unlock the full potential of R in the realm of data science.

visualization data-analysis data-science r

0 Likes

Type

learning

Level

fast.ai

fast.ai Homepage

Fastai offers many tools to people working with machine learning and artifical intelligence including tutorials on PyTorch in addition to their own library built on PyTorch, news articles, and other resources to dive into this realm.

ai machine-learning pytorch training

0 Likes

Type

website

Level

ACCESS Support Portal

ACCESS Support Portal

affinity-group pegasus ACCESS-website open-ondemand

0 Likes

Type

website

Level

Educause HEISC-800-171 Community Group

Educause HEISC-800-171 Community Group

The purpose of this group is to provide a forum to discuss NIST 800-171 compliance. Participants are encouraged to collaborate and share effective practices and resources that help higher education institutions prepare for and comply with the NIST 800-171 standard as it relates to Federal Student Aid (FSA), CMMC, DFARS, NIH, and NSF activities.

cybersecurity

0 Likes

Type

website

Level

Understanding LLM Fine-tuning

The Ultimate Guide to LLM Fine Tuning: Best Practices & Tools

With the recent uprising of LLM's many business are looking at way to adopt these LLMs and fine-tuning these models on specfic data sets to ensure accuracy. These models when fine-tuned can be optimal for fulfilling the specific needs of a company. This site explains explicitly when, how, and why models should be trained. It goes over various strategies for LLM fine -tuning.

big-data training

0 Likes

Type

learning

Level

Knowledge Base Resources

Topics

Programming Language

Science Domain

Skill Level

Content Type