These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers, and HPC system administrators) who participate in programs such as this one, which are supported by the ConnectCI community management platform. Additional Knowledge Base resources are always welcome!
This package lets you scrape websites and extract information based on HTML tags and other metadata found in the page. It is useful for large-scale web analysis and other tasks that require automated data gathering.
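As a rough illustration of tag-based extraction, the sketch below pulls link targets and link text out of an HTML string. It deliberately uses only Python's standard-library `html.parser` module rather than the package described above, and the `LinkCollector` class, the `extract_links` helper, and the sample page are hypothetical names invented for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href and visible text of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []            # list of (href, text) pairs
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._text_parts = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def extract_links(html):
    """Return (href, text) pairs for every link found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page, used only for illustration.
SAMPLE_HTML = """
<html><body>
  <h1>Resources</h1>
  <a href="https://example.org/numpy">NumPy guide</a>
  <p>No link here.</p>
  <a href="https://example.org/weka">Weka <b>tutorial</b></a>
</body></html>
"""

links = extract_links(SAMPLE_HTML)
```

A dedicated scraping package typically adds conveniences such as tolerant parsing of malformed markup and richer tag queries, so treat this as a picture of the idea rather than a replacement.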
Scikit-learn is a free software machine learning library for Python. It offers a variety of models you can apply to data, from linear regression and classifiers to gradient boosting and random forests. It is very useful when you want to analyze small subsets of data quickly.
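A minimal sketch of what that looks like in practice, assuming scikit-learn and NumPy are installed. The toy data is made up for illustration; the example fits a linear regression and a random forest classifier using the same `fit` / `predict` pattern.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

# Toy data (invented for this example): one feature, ten samples.
X = np.arange(10, dtype=float).reshape(-1, 1)

# Regression target that follows y = 2x + 1 exactly.
y = 2.0 * X.ravel() + 1.0
reg = LinearRegression().fit(X, y)
slope = float(reg.coef_[0])
intercept = float(reg.intercept_)

# Classification target: label is 1 when the feature is 5 or larger.
labels = (X.ravel() >= 5).astype(int)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, labels)
predictions = clf.predict([[1.0], [8.0]])
```

Most estimators in the library share this interface, which is a large part of why quick experiments on a small dataset are easy to set up.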
NumPy is a Python package that uses typed arrays and compiled C code to make many math operations in Python efficient. It is especially useful for matrix manipulation and other array operations.
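A short sketch of typed arrays and matrix operations, assuming NumPy is installed. The matrices are arbitrary values chosen for illustration.

```python
import numpy as np

# Arrays carry a single element type (here 64-bit floats), which lets
# NumPy run the arithmetic in compiled code instead of a Python loop.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])

product = a @ b             # matrix multiplication
scaled = a * 2              # elementwise operation, no explicit loop
col_means = a.mean(axis=0)  # reduce along rows to get one mean per column

print(a.dtype)     # float64
print(product)     # [[2. 1.]
                   #  [4. 3.]]
print(col_means)   # [2. 3.]
```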
This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web applications that can be run on HPC systems and science gateways. Easy-to-build web applications are not only useful for scientists; software engineers and system admins can also use them to quickly create tools for file management and more!
Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rules mining, and visualization.