|
| 1 | +# Compressor |
| 2 | +A fast, lossless compression/decompression CLI program written in C++ with a high compression ratio |
| 3 | + |
| 4 | +# Algorithms Implemented |
| 5 | +- Huffman |
| 6 | +- LZW *(Lempel-Ziv-Welch)* |
| 7 | +- BWT *(Burrows-Wheeler Transform)* and MTF *(Move-To-Front)* |
| 8 | + |
| 9 | +The pipeline that produces the best compression ratio is **`BWT -> MTF -> LZW`** |
| 10 | + |
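To illustrate how the three stages chain together, here is a minimal sketch of the encoding direction only. It is not this project's implementation: the function names `bwt`, `mtf`, and `lzw` are hypothetical, the BWT sorts rotations naively instead of using a suffix array, and the LZW dictionary grows without bound and is never bit-packed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <string>
#include <unordered_map>
#include <vector>

// Naive BWT (assumption: rotation sort, O(n^2 log n)). Returns the last column
// of the sorted rotation matrix; `primary` receives the row of the original text.
std::string bwt(const std::string& s, std::size_t& primary) {
    const std::size_t n = s.size();
    primary = 0;
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), 0);
    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            unsigned char ca = static_cast<unsigned char>(s[(a + k) % n]);
            unsigned char cb = static_cast<unsigned char>(s[(b + k) % n]);
            if (ca != cb) return ca < cb;
        }
        return false;  // identical rotations
    });
    std::string out(n, '\0');
    for (std::size_t i = 0; i < n; ++i) {
        if (rot[i] == 0) primary = i;
        out[i] = s[(rot[i] + n - 1) % n];
    }
    return out;
}

// Move-To-Front: emit each byte's current position in a 256-entry table,
// then move that byte to the front so repeated bytes become small numbers.
std::vector<std::uint8_t> mtf(const std::string& s) {
    std::vector<std::uint8_t> table(256);
    std::iota(table.begin(), table.end(), 0);
    std::vector<std::uint8_t> out;
    out.reserve(s.size());
    for (unsigned char c : s) {
        auto it = std::find(table.begin(), table.end(), c);
        assert(it != table.end());  // the table always holds every byte value
        out.push_back(static_cast<std::uint8_t>(it - table.begin()));
        std::rotate(table.begin(), it, it + 1);
    }
    return out;
}

// LZW: replace the longest already-seen byte sequence with its dictionary code.
std::vector<std::uint32_t> lzw(const std::vector<std::uint8_t>& in) {
    std::unordered_map<std::string, std::uint32_t> dict;
    for (int i = 0; i < 256; ++i)
        dict[std::string(1, static_cast<char>(i))] = static_cast<std::uint32_t>(i);
    std::vector<std::uint32_t> codes;
    std::string cur;
    for (std::uint8_t b : in) {
        std::string next = cur + static_cast<char>(b);
        if (dict.count(next)) {
            cur = next;
        } else {
            codes.push_back(dict[cur]);
            const std::uint32_t id = static_cast<std::uint32_t>(dict.size());
            dict[next] = id;
            cur.assign(1, static_cast<char>(b));
        }
    }
    if (!cur.empty()) codes.push_back(dict[cur]);
    return codes;
}
```

Chaining the stages looks like `std::size_t p; auto codes = lzw(mtf(bwt(text, p)));`. The BWT groups similar contexts together, MTF turns the resulting runs into small values, and LZW then finds long repeated sequences in that output.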
| 11 | +# Compatibility |
| 12 | +Tested on Linux (Ubuntu) with GNU GCC, and on Windows with Microsoft Visual Studio and MinGW-w64 |
| 13 | + |
| 14 | +Requires CMake and a compiler that supports C++11 or later |
| 15 | + |
| 16 | +# How to Build |
| 17 | +- Clone this repo with its submodules: **`git clone --recurse-submodules https://github.com/3omar-mostafa/Compressor.git`** |
| 18 | +- Use CMake to build: |
| 19 | +```bash |
| 20 | +mkdir cmake-build |
| 21 | +cd ./cmake-build |
| 22 | +cmake .. && cmake --build . |
| 23 | +``` |
| 24 | + |
| 25 | +On Windows, you can generate a Visual Studio project instead: |
| 26 | +```bash |
| 27 | +mkdir cmake-build |
| 28 | +cd ./cmake-build |
| 29 | +cmake .. -G "Visual Studio 15 2017" |
| 30 | +``` |
| 31 | + |
| 32 | +# How To Run |
| 33 | +``` |
| 34 | +./Compressor OPTION input_file output_file |
| 35 | +OPTION: |
| 36 | + -c --compress Compress the file |
| 37 | + -d --decompress Decompress the file |
| 38 | +``` |
| 39 | + |
| 40 | +# Tests and Results |
| 41 | +Tested on [enwik8](http://mattmahoney.net/dc/enwik8.zip) (100 MB) |
| 42 | + |
| 43 | +Tested on GitHub Actions hosted runners with a 2-core 64-bit CPU, 7 GB of RAM, and SSD storage. [Learn more about the environment](https://docs.github.com/en/actions/reference/specifications-for-github-hosted-runners#supported-runners-and-hardware-resources) |
| 44 | + |
| 45 | +Executables are built in release mode with maximum compiler optimization |
| 46 | + |
| 47 | +### Compression ratio on enwik8: 3.858 (lossless integrity verified with SHA-512) |
| 48 | + |
| 49 | +| Platform | Compress Time (Avg) | Decompress Time (Avg) | |
| 50 | +|:-----------------------:|:-------------------:|:---------------------:| |
| 51 | +| Linux (Ubuntu) - GNU GCC| 1m 25s | 20s | |
| 52 | +| Windows - Visual Studio | 4m 30s | 46s | |
| 53 | +| Windows - MinGW-w64     | 4m 3s               | 36s                   | |
| 54 | + |
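As a rough worked example, assuming the ratio is defined as original size divided by compressed size (the README does not state the definition), the reported figure implies a compressed enwik8 of roughly:

$$
\text{compressed size} \approx \frac{10^{8}\ \text{bytes}}{3.858} \approx 2.59 \times 10^{7}\ \text{bytes} \approx 25.9\ \text{MB}
$$

This is an estimate derived from the ratio, not a measured file size.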
| 55 | +# TODO |
| 56 | + - [ ] More Memory Optimization |
| 57 | + |
| 58 | +# Credits |
| 59 | +The suffix array construction is based on the paper **`Linear Work Suffix Array Construction (2006)`** by **`Juha Kärkkäinen, Peter Sanders, Stefan Burkhardt`**, |
| 60 | +which builds the suffix array in **`O(n)`** time instead of **`O(n log n)`** |
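To show why a suffix array matters for the BWT stage, the sketch below derives the transform directly from a suffix array. It is only an illustration: `suffix_array` is a hypothetical helper that sorts suffixes naively (it is not the linear-time algorithm from the paper), and the input is assumed to end with a unique sentinel byte smaller than every other byte.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array (assumption: comparison sort over suffixes, O(n^2 log n)
// in the worst case). A linear-time construction would replace this function.
std::vector<std::size_t> suffix_array(const std::string& s) {
    std::vector<std::size_t> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](std::size_t a, std::size_t b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}

// With a unique smallest sentinel at the end of `s`, sorting suffixes orders
// the rows exactly like sorting rotations, so the BWT is simply the byte that
// precedes each suffix (wrapping around for the suffix starting at 0).
std::string bwt_from_sa(const std::string& s, const std::vector<std::size_t>& sa) {
    assert(sa.size() == s.size());
    const std::size_t n = s.size();
    std::string out(n, '\0');
    for (std::size_t i = 0; i < n; ++i)
        out[i] = s[(sa[i] + n - 1) % n];
    return out;
}
```

For example, `bwt_from_sa("banana$", suffix_array("banana$"))` yields `"annb$aa"`. Only the construction of `sa` changes when switching to a faster algorithm; the derivation of the BWT from it stays the same.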