    Repositories list

    • A Spark-based data comparison tool that helps software development engineers compare many pairings of possible data sources at scale. Multiple execution modes in multiple environments enable the user to generate a diff report as a Java/Scala-friendly DataFrame or as a file for future use. Comes with out of the box Spa…
      Scala
      2752173 Updated Jun 17, 2025
    • herd-mdl

      Public
      Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
      Java
      1416913 Updated Jul 17, 2024
    • Gatekeeper

      Public archive
      Gatekeeper is a self-service web application that lets users request temporary access to EC2 and RDS instances running in AWS and gain access instantly.
      Java
      17291019 Updated Dec 16, 2023
    • Model Validation Toolkit is a collection of tools to assist with validating machine learning models prior to deploying them to production and monitoring them after deployment to production.
      Python
      62910 Updated Dec 1, 2023
    • FINRA open source projects landing page.
      HTML
      138170 Updated Oct 27, 2023
    • Fidelius

      Public archive
      Fidelius provides an easy-to-use, secure, and organized way to create, view, and modify collections of encrypted secrets in AWS and to manage user/application access to those secrets.
      Java
      141418 Updated Oct 19, 2023
    • maskopy

      Public
      Automated solution to copy and obfuscate production data to target environments in AWS
      Python
      92401 Updated May 22, 2023
    • MLiy

      Public archive
      MLiy (pronounced “Emily”) is a machine-learning platform that allows data scientists to provision and manage processing power in the cloud. It provides an easy-to-use website to install customizable sets of machine learning software for use in data analysis and exploration. This allows data scientists to focus on data analysis rather than how to…
      Shell
      11132 Updated May 22, 2023
    • herd-ui

      Public
      Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and understand the contents of your Herd managed data lake.
      TypeScript
      91620 Updated Oct 1, 2022
    • herd

      Public
      Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabytes of data and make it accessible for data processing and analytical purposes by any cloud compute platform.
      Java
      411381242 Updated Oct 1, 2022
    • Sample Code
      Scala
      4205 Updated Sep 8, 2022
    • DataGenerator is a Java library for systematically producing large volumes of data. DataGenerator frames data production as a modeling problem, with a user providing a model of dependencies among variables and the library traversing the model to produce relevant data sets.
      Java
      1681653514 Updated Jul 7, 2022
    • Java
      1001 Updated May 20, 2022
    • MSL

      Public
      MSL (pronounced 'Missile') stands for Mock Service Layer. Our tools enable quick local deployment of your UI code on Node and mocking of your service layer for fast, targeted testing.
      JavaScript
      2432205 Updated Feb 11, 2022
    • aphelion

      Public archive
      Aphelion is a web application that captures and visualizes your AWS services usage limits. It continuously collects data in the background and you can visualize the data in easy-to-see graphs and charts.
      Java
      1034112 Updated Mar 31, 2021
    • yum-nginx-api

      Public archive
      yum-nginx-api is a Go API for uploading RPMs to yum repositories, along with configurations for running NGINX to serve them. It can be deployed with Docker or as a single 8MB statically linked Linux binary. yum-nginx-api enables CI tools to be used for uploading RPMs and managing yum repositories.
      Go
      225100 Updated Jan 14, 2021
    • JTAF-ExtWebDriver

      Public archive
      Extensions for WebDriver is an enhancement to the powerful WebDriver API, with robust features that keep your browser automation running smoothly. It includes a widget library, improved session management and extended functions over the existing WebDriver API.
      Java
      4726625 Updated Oct 29, 2020
    • HiveQLUnit

      Public archive
      Test your Hive scripts inside your favorite IDE with HiveQLUnit! Increase your developers' productivity by testing on all operating systems, including Windows, Linux, and Mac OS X. Build continuous integration and delivery tests to control the releases of your big data products.
      Java
      133972 Updated Oct 13, 2020
    • JTAF-XCore

      Public archive
      XCore is a framework to define and execute automated tests. It enables automation code development in Java, test script development in XML via domain specific language, and execution & reporting via JUnit.
      Java
      1610593 Updated Oct 13, 2020
    • CTGrazer

      Public archive
      CTGrazer is code you can use to create an AWS Lambda Function that will collect all of your AWS CloudTrail logs and efficiently send them to your Splunk HEC (HTTP Event Collector) server.
      Python
      2800 Updated Jun 15, 2018
    • karma-msl

      Public
      Plugin for Karma Test Runner to integrate MSL (Mock Service Layer)
      JavaScript
      5520 Updated May 24, 2016
    • Elasticd

      Public archive
      Elastic Discovery - Help applications that don't quite work in the cloud better handle autoscaling and other cloud events.
      Python
      121010 Updated Oct 26, 2015
    • UMD-Bitcamp-2015

      Public archive
      UMD bitcamp challenge solutions.
      6100 Updated Apr 13, 2015