This repository contains the solution code and data for the Cryptography and Secure Development (CSD) midterm. It demonstrates two attacks against an 8-rotor substitution-permutation style machine: a Known Plaintext Attack (KPA) and a Ciphertext-Only Attack (COA).
Files:
- `KPA.java`: Known Plaintext Attack implementation (dictionary search).
- `COA.java`: Ciphertext-Only Attack implementation (scoring + dictionary search).
- `Rotor96Crypto.java`: 8-rotor encryption/decryption engine used by both attacks.
- `CSVReader.java`: CSV helper used to load student ciphertexts.
- `passwords`: dictionary of candidate keys (one per line).
- `ciphertext1.txt`: ciphertext for the KPA task.
- `ciphertext2.txt`: ciphertext for the COA task.
- `decrypted_results.txt`: output generated by `KPA.java` (found key and plaintext).
- `password_scores.csv`: scoring output generated by `COA.java` for analysis.
- `report.pdf` / `report.tex`: written report with methodology, calculations, and results.
- `Midterm_Assignment.pdf`: assignment brief.
Approach:
- KPA: Brute-force the `passwords` list. Decrypt `ciphertext1.txt` with each key and check for the known prefix ("We"). The key `giveme` was recovered; the plaintext is in `decrypted_results.txt`.
- COA: Brute-force the `passwords` list. For each candidate key, decrypt `ciphertext2.txt` and score the result using English-language statistics (letter frequency and n-gram scores). The highest-scoring plaintext corresponds to the key `octopus`.
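The KPA loop described above can be sketched as follows. This is a minimal illustration, not the code in `KPA.java`: the class `KpaSketch`, the method names, and the toy shift cipher over 96 printable characters are all hypothetical stand-ins, used only so the example runs without `Rotor96Crypto.java`. It assumes Java 9 or later (for `List.of`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Sketch of a known-plaintext dictionary search; NOT the repository's KPA.java. */
final class KpaSketch {

    /** Toy stand-in cipher (hypothetical): per-character shift over the 96 symbols 32..127. */
    static String toyEncrypt(String key, String plaintext) {
        return shift(key, plaintext, +1);
    }

    static String toyDecrypt(String key, String ciphertext) {
        return shift(key, ciphertext, -1);
    }

    private static String shift(String key, String text, int sign) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            int t = text.charAt(i) - 32;                  // symbol index 0..95
            int k = key.charAt(i % key.length()) - 32;    // key symbol index
            out.append((char) (Math.floorMod(t + sign * k, 96) + 32));
        }
        return out.toString();
    }

    /** Returns every dictionary key whose decryption starts with the known prefix. */
    static List<String> candidates(List<String> dictionary, String ciphertext, String knownPrefix,
                                   BiFunction<String, String, String> decrypt) {
        List<String> hits = new ArrayList<>();
        for (String key : dictionary) {
            if (key.isEmpty()) {
                continue; // skip blank lines in the dictionary
            }
            if (decrypt.apply(key, ciphertext).startsWith(knownPrefix)) {
                hits.add(key);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("password", "giveme", "octopus");
        String ciphertext = toyEncrypt("giveme", "We attack at dawn");
        // Only keys that reproduce the prefix "We" survive the filter.
        System.out.println(candidates(dictionary, ciphertext, "We", KpaSketch::toyDecrypt)); // prints [giveme]
    }
}
```

The method returns a list rather than a single key because a short prefix can match more than one candidate; any survivors still need the fuller checks described in this README.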
Results:
- Known Plaintext Attack (KPA): key recovered = `giveme`. The full decrypted message (checked for the student ID 3043047J) is written to `decrypted_results.txt`.
- Ciphertext-Only Attack (COA): key recovered = `octopus`. The decrypted plaintext includes a timestamp and the student identifier; `COA.java` also writes per-key scores to `password_scores.csv` for analysis.
- Unicity distance (summary): the report calculates a theoretical unicity distance of approximately 8.8 characters (using H(K) ≈ 13.22 bits and H(P) ≈ 1.5 bits/char). Experimentally, using the implemented scoring and clustering, the COA required about 110 ciphertext characters to reach an unambiguous result for this dataset.
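As a quick check on the figures quoted above, the arithmetic below reproduces them under two assumptions that this README does not state explicitly: that keys are drawn uniformly from the roughly 9473-entry dictionary, and that the estimate is the ratio of H(K) to H(P). The report remains the authoritative derivation.

```latex
H(K) = \log_2 9473 \approx 13.21 \text{ bits} \quad (\text{quoted as } 13.22), \qquad
U \approx \frac{H(K)}{H(P)} = \frac{13.22}{1.5} \approx 8.8 \text{ characters}
```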
Notes:
- The report discusses why a very short known plaintext (e.g., the two characters "We") can lead to multiple candidate keys: the provided password list (≈9473 entries) is larger than the 96^2 = 9216 possible two-character combinations, so collisions are expected. The verification steps in `KPA.java` (checking full plaintext coherence and the student ID) disambiguate candidates.
- See `report.pdf` / `report.tex` for detailed derivations, plots, and the scoring algorithm used by `COA.java`.
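The collision argument in the notes above can be made concrete with a small calculation. The sketch below assumes that decrypting with a wrong key yields symbols that are uniformly random over the 96-character alphabet; that is a simplifying assumption, and the class and method names are hypothetical rather than taken from the repository.

```java
/** Sketch: expected number of wrong keys that match a known prefix by chance. */
final class PrefixCollisions {

    /**
     * Each of the (dictionarySize - 1) wrong keys matches a prefix of the given
     * length with probability alphabetSize^(-prefixLength) under the uniform assumption.
     */
    static double expectedFalseMatches(int dictionarySize, int alphabetSize, int prefixLength) {
        return (dictionarySize - 1) / Math.pow(alphabetSize, prefixLength);
    }

    public static void main(String[] args) {
        // 9473 is the approximate dictionary size quoted in this README.
        for (int length = 1; length <= 4; length++) {
            System.out.printf("prefix length %d: %.4f expected false matches%n",
                    length, expectedFalseMatches(9473, 96, length));
        }
    }
}
```

For a two-character prefix this gives 9472 / 9216, or about 1.03 expected false matches, which is why an extra verification step is needed; each additional known character divides the figure by 96.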
Usage:
- Compile (from the repository root): `javac KPA.java COA.java Rotor96Crypto.java CSVReader.java`
- Run the Known Plaintext Attack (KPA) to decrypt `ciphertext1.txt`: `java KPA`
- Output: `decrypted_results.txt` is created or updated and contains the discovered key and plaintext.
- Run the Ciphertext-Only Attack (COA) to decrypt `ciphertext2.txt` and generate scores: `java COA`
- Output: the best key and decrypted plaintext are printed to the console. `password_scores.csv` is written with per-key scores (useful for analysis and plotting).
Tips
- If the JVM runs out of memory while processing a very large `passwords` file, increase the heap size, e.g.: `java -Xmx2G COA`
- Use the provided `passwords` file and ciphertexts to reproduce the results in the report. Running `COA` produces `password_scores.csv`, which contains the per-password scoring data used for analysis (e.g., to estimate unicity distance and score separability).
- `report.pdf` contains derivations for the expected number of keys matching a known-plaintext prefix (KPA) and the unicity distance estimates used to reason about the COA.
- Scoring in `COA.java` combines letter frequency and n-gram statistics; inspect the source to tune weights or add a language model.
- The search is single-threaded. You can parallelize by splitting `passwords` and running multiple JVM instances, or by modifying `COA.java` to use a thread pool.
- Add a small unit test (or an integration test) that runs `Rotor96Crypto` with a known key/plaintext pair to validate encryption/decryption round-trips.
- If the program can't find `passwords` or the ciphertext files, make sure you run it from the repository root, where those files live.
- If the output differs from the report, confirm you are using the `passwords` file and ciphertext files provided in this repository.
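The thread-pool idea from the tips above could look roughly like the sketch below. It is not a change to `COA.java`: the class `ParallelSearchSketch`, the `bestKey` method, and the scoring callback are hypothetical, and the callback stands in for "decrypt `ciphertext2.txt` with this key and score the result". It assumes Java 9 or later (for `List.of` and `Map.entry`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

/** Sketch of a chunked, thread-pool based key search; NOT the repository's COA.java. */
final class ParallelSearchSketch {

    /** Returns the highest-scoring key and its score, or null if the key list is empty. */
    static Map.Entry<String, Double> bestKey(List<String> keys,
                                             ToDoubleFunction<String> scoreForKey,
                                             int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (keys.size() + threads - 1) / threads; // ceiling division
            List<Future<Map.Entry<String, Double>>> parts = new ArrayList<>();
            for (int start = 0; start < keys.size(); start += chunk) {
                List<String> slice = keys.subList(start, Math.min(start + chunk, keys.size()));
                Callable<Map.Entry<String, Double>> task = () -> bestInSlice(slice, scoreForKey);
                parts.add(pool.submit(task));
            }
            Map.Entry<String, Double> best = null;
            for (Future<Map.Entry<String, Double>> part : parts) {
                Map.Entry<String, Double> candidate = part.get(); // waits for the worker
                if (best == null || candidate.getValue() > best.getValue()) {
                    best = candidate;
                }
            }
            return best;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("scoring task failed", e.getCause());
        } finally {
            pool.shutdown(); // let the JVM exit once the workers are idle
        }
    }

    /** Sequential scan of one non-empty slice of the dictionary. */
    private static Map.Entry<String, Double> bestInSlice(List<String> slice,
                                                         ToDoubleFunction<String> scoreForKey) {
        Map.Entry<String, Double> best = null;
        for (String key : slice) {
            double score = scoreForKey.applyAsDouble(key);
            if (best == null || score > best.getValue()) {
                best = Map.entry(key, score);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("password", "giveme", "octopus", "letmein");
        // Placeholder score (key length); a real run would decrypt and score the plaintext.
        System.out.println(bestKey(keys, String::length, 2).getKey()); // prints password
    }
}
```

The scoring callback must be safe to call from several threads at once, so any shared cipher state would need to be created per task.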
`report.pdf` documents the approach and results. High-level findings:
- KPA: With the known prefix "We" and the provided password dictionary, the correct key `giveme` was recovered reliably. The report derives the expected number of keys that match a known prefix by chance; because a two-character prefix can leave more than one candidate for a dictionary of this size, the remaining candidates are resolved by checking the full plaintext and the student ID.
- COA: Using statistical scoring of decrypted candidates, the highest-scoring candidate corresponds to the key `octopus`. `password_scores.csv` shows a clear score separation for this dataset, and the report discusses unicity distance calculations that explain why the plaintext is recoverable from the ciphertext alone with high probability.
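To make the idea of statistical scoring concrete, here is a minimal sketch of one possible score: a chi-squared comparison of letter counts against approximate English letter frequencies, negated so that higher is better. It is deliberately simpler than what this README describes for `COA.java` (which also uses n-gram statistics); the class name, the frequency table, and the penalty weight are assumptions chosen for illustration.

```java
/** Sketch of a letter-frequency score for candidate plaintexts; NOT the scoring in COA.java. */
final class EnglishScoreSketch {

    // Approximate relative frequencies (percent) of the letters a..z in English text.
    private static final double[] FREQ = {
        8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
        6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07
    };

    // Arbitrary illustrative weight for characters that are neither letters nor spaces.
    private static final double NON_TEXT_PENALTY = 5.0;

    /** Higher is more English-like; returns negative infinity if the text has no letters. */
    static double score(String text) {
        int[] counts = new int[26];
        int letters = 0;
        int nonText = 0;
        for (char c : text.toCharArray()) {
            char lower = Character.toLowerCase(c);
            if (lower >= 'a' && lower <= 'z') {
                counts[lower - 'a']++;
                letters++;
            } else if (c != ' ') {
                nonText++;
            }
        }
        if (letters == 0) {
            return Double.NEGATIVE_INFINITY;
        }
        double chiSquared = 0.0;
        for (int i = 0; i < 26; i++) {
            double expected = letters * FREQ[i] / 100.0;
            double diff = counts[i] - expected;
            chiSquared += diff * diff / expected;
        }
        return -chiSquared - NON_TEXT_PENALTY * nonText;
    }

    public static void main(String[] args) {
        System.out.println(score("we attack the northern bridge at dawn"));
        System.out.println(score("zqxj kvzq xjqz wvkq zzxq jjqx"));
    }
}
```

Because the chi-squared statistic grows with text length, scores from this sketch are only comparable between candidates of the same length, which holds when every key decrypts the same ciphertext.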
For full details (math, code excerpts, and plots), see `report.pdf` and the LaTeX source `report.tex`.