ryanku98/lost-whitespace-file-recreation-project

lost-whitespace-file-recreation-project

This is a C++ project that identifies every possible complete sequence of words that can be recovered from a sequence of characters with all whitespace removed. The premise is described in Project Instructions.txt.
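To make the premise concrete, here is a minimal sketch of enumerating every way to split a whitespace-less string into dictionary words. It is not taken from file_recreation.cpp; the function names `allSegmentations` and `segmentFrom` are hypothetical, and plain recursive backtracking is assumed rather than whatever approach the project actually uses.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the repository): extends the partial
// segmentation in `current`, starting at index `start` of `text`.
static void segmentFrom(const std::string& text,
                        const std::unordered_set<std::string>& dict,
                        std::size_t start,
                        std::vector<std::string>& current,
                        std::vector<std::string>& results) {
    if (start == text.size()) {
        // Every character is consumed: join the chosen words with spaces.
        std::string sentence;
        for (std::size_t i = 0; i < current.size(); ++i) {
            if (i > 0) sentence += ' ';
            sentence += current[i];
        }
        results.push_back(sentence);
        return;
    }
    // Try every prefix of the remaining text that is a dictionary word.
    for (std::size_t len = 1; start + len <= text.size(); ++len) {
        std::string candidate = text.substr(start, len);
        if (dict.count(candidate) > 0) {
            current.push_back(candidate);
            segmentFrom(text, dict, start + len, current, results);
            current.pop_back();  // backtrack and try a longer prefix
        }
    }
}

// Hypothetical entry point: returns every full segmentation of `text`.
std::vector<std::string> allSegmentations(
        const std::string& text,
        const std::unordered_set<std::string>& dict) {
    std::vector<std::string> results;
    if (text.empty()) return results;  // nothing to segment
    std::vector<std::string> current;
    segmentFrom(text, dict, 0, current, results);
    return results;
}
```

With the dictionary {"a", "ab", "b", "bc", "c"} and the input "abc", this sketch yields "a b c", "a bc", and "ab c". The number of segmentations can grow exponentially with input length, so this form is only practical for short inputs.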

I am offering this code unlicensed (it's just a project for a class that was cancelled at the last minute due to air quality issues), but if you use any part of my code or concepts, please credit the original source in your file (a link to this page will suffice). Thanks!

USE

Requirements

Run in a UNIX shell such as Bash, with g++ available for compiling.

Compile

For dictionary.txt:
g++ -o file_recreation file_recreation.cpp
For dictionary2.txt:
g++ -o file_recreation2 file_recreation2.cpp

Run

After the program name, pass two arguments: the path to the dictionary .txt file, then the path to a "compressed" (whitespace-stripped) input .txt file.

For dictionary.txt:
./file_recreation dictionary.txt Examples/Ex1.txt
For dictionary2.txt:
./file_recreation2 dictionary2.txt Examples/Ex2.txt
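As an illustration of how the two arguments might be consumed, here is a rough sketch of reading the dictionary and the compressed input. It rests on assumptions not confirmed by the repository: that the dictionary is a list of whitespace-separated words, and that any stray whitespace in the input should be ignored. The names `loadDictionary`, `loadCompressed`, and `loadInputs` are hypothetical.

```cpp
#include <fstream>
#include <iostream>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>

// Assumption: the dictionary holds whitespace-separated words.
// The real dictionary.txt format may differ.
std::unordered_set<std::string> loadDictionary(std::istream& in) {
    std::unordered_set<std::string> words;
    std::string word;
    while (in >> word) words.insert(word);
    return words;
}

// Reads the whole input and drops any whitespace (e.g. a trailing newline).
std::string loadCompressed(std::istream& in) {
    std::string text;
    std::string token;
    while (in >> token) text += token;
    return text;
}

// Hypothetical helper: validates the argument count, opens both files,
// and fills `dict` and `text`. Returns false on any failure.
bool loadInputs(int argc, char* argv[],
                std::unordered_set<std::string>& dict, std::string& text) {
    if (argc != 3) {
        std::cerr << "usage: " << (argc > 0 ? argv[0] : "file_recreation")
                  << " <dictionary.txt> <input.txt>\n";
        return false;
    }
    std::ifstream dictFile(argv[1]);   // first argument after the program name
    std::ifstream inputFile(argv[2]);  // second argument after the program name
    if (!dictFile || !inputFile) {
        std::cerr << "could not open one of the input files\n";
        return false;
    }
    dict = loadDictionary(dictFile);
    text = loadCompressed(inputFile);
    return true;
}
```

A `main(int argc, char* argv[])` could call `loadInputs(argc, argv, dict, text)` and exit with a non-zero status when it returns false. Taking `std::istream&` in the two loaders keeps them usable with string streams as well as files.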

Author Notes

The output files, written to a generated Output directory, represent every possible "original file", where the dictionary .txt file defines what counts as a "word".

The rankings.txt file (also in the Output directory) lists the rank of each file's content. In addition, the console prints up to 3 of the lowest-ranked files, which are the files most likely to be the original.

About

Mini-project for an algorithms midterm whose lecture was cancelled at the last second.
