These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators) that are participating in programs such as this one, that are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!
This Udacity article listed the most frequently used R packages for data science and statistics. For each package, the article provided the link to its official documentation. It will be a great start point if you want to start your data science journey in R.
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The following links are specific to the Paraview program and the workflows that have been used my researchers at the U of Arizona. Some of the pages linked are very beginner friendly: getting started, working with cameras and keyframes for rendering, visualizing external files (netcdf climate data), graphs and data exporting.
Many of the workflows involve using remote desktops via the Open On Demand interface, but if this isn't set up at your university you can use paraview locally on a desktop. Feel free to post on access ci https://ask.cyberinfrastructure.org/ if you need assistance getting a paraview gui open for your work on HPC.
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures, and data analysis tools for the Python programming language. The official documentation serves as an in-depth guide to using this powerful tool including explanations and examples.