wsxrdv/sedpack

 
 

Sedpack - Scalable and efficient data packing

Documentation

Mainly refactored from the SCAAML project.

Available components

See the documentation website: https://google.github.io/sedpack/.

Install

Dependencies

To use this library you need to have a working version of TensorFlow 2.x.

Development dependencies:

  • python-dev and gcc for xxhash
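A quick way to confirm the TensorFlow requirement is to query the installed package metadata without importing the library. The sketch below is not part of sedpack: the helper name `installed_major_version` is hypothetical, and it assumes TensorFlow was installed under the distribution name `tensorflow`.

```python
from importlib import metadata
from typing import Optional


def installed_major_version(distribution: str) -> Optional[int]:
    """Return the major version of an installed distribution, or None if absent."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
    # Versions look like "2.16.1"; keep only the leading integer component.
    major = version.split(".")[0]
    return int(major) if major.isdigit() else None


if __name__ == "__main__":
    # Assumption: the distribution is named "tensorflow" in this environment.
    major = installed_major_version("tensorflow")
    if major is None:
        print("TensorFlow is not installed")
    elif major < 2:
        print(f"TensorFlow {major}.x found, but 2.x is required")
    else:
        print(f"TensorFlow {major}.x found")
```

Reading the metadata avoids the cost of importing TensorFlow just to check its version.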

Dataset install

Development install

  1. Clone the repository: git clone https://github.com/google/sedpack
  2. Install dependencies: python3 -m pip install --require-hashes -r requirements.txt
  3. Install the package in development mode: python3 -m pip install --editable . (short form: pip install -e .; legacy alternative: python setup.py develop)

Rust install

  • Activate your Python virtual environment
  • Install Rust
  • Run maturin develop --release
  • Run python -m pytest from the project root directory -- no tests should be skipped
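Before running the steps above, it can help to check that the prerequisites are in place. The following is a minimal sketch, not part of sedpack: the helper names `missing_tools` and `in_virtualenv` are hypothetical, and it assumes that installing Rust puts `cargo` on the PATH and that `maturin` is available as a command-line executable.

```python
import shutil
import sys
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("cargo", "maturin")) -> List[str]:
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    if not in_virtualenv():
        print("No active virtual environment detected")
    for tool in missing_tools():
        print(f"Missing tool: {tool}")
```

If both checks pass silently, the environment should be ready for maturin develop --release.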

Update dependencies

Make sure the prerequisites are installed (sudo apt install python3 python3-pip python3-venv) and that the virtual environment is activated.

Install requirements: pip install --require-hashes -r base-tooling-requirements.txt

Update: run pip-compile pyproject.toml --generate-hashes --upgrade, then commit the regenerated requirements.txt.
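Because the install steps use --require-hashes, every pinned requirement in requirements.txt needs at least one --hash option. The sketch below is a hypothetical helper, not part of sedpack or pip, that lists pinned entries lacking a hash. It assumes the layout pip-compile emits with --generate-hashes, where a requirement continues across lines with trailing backslashes.

```python
from typing import List


def requirements_without_hashes(text: str) -> List[str]:
    """Return pinned requirements (name==version) that carry no --hash option."""
    missing = []
    logical = ""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments such as "# via tensorflow".
        if not line or line.startswith("#"):
            continue
        # A trailing backslash continues the requirement on the next line.
        if line.endswith("\\"):
            logical += line[:-1] + " "
            continue
        logical += line
        if "==" in logical and "--hash=" not in logical:
            missing.append(logical.split()[0])
        logical = ""
    return missing
```

Passing the contents of requirements.txt to this function before committing gives an early warning; an empty list means every pinned entry has a hash.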

Package install

pip install sedpack

Tutorial

A tutorial and documentation are available at https://google.github.io/sedpack/.

Code for the tutorials is available in the docs/tutorials directory. For a "hello world" see https://google.github.io/sedpack/tutorials/mnist/.

Disclaimer

This is not an official Google product.
