This tool is a Python implementation that modifies an image's metadata until the file's hash begins with a desired prefix. It combines Simulated Annealing and Multiprocessing to explore candidate modifications in parallel. You can use it to experiment with cryptographic concepts like hash collisions and to explore metadata manipulation in images.
- Input:
  - The user provides:
    - An input image file: The image whose metadata will be modified.
    - A target hash prefix: The desired starting characters of the hash.
  - Optional parameters include:
    - Maximum attempts.
    - Number of workers.
    - Initial temperature for simulated annealing.
    - Cooling rate for temperature decay.
- Hash Calculation:
  - The image's hash is computed using the SHA-512 algorithm.
- Simulated Annealing:
  - The EXIF metadata of the image is randomly modified.
  - Each modification is evaluated against the target prefix.
  - A probabilistic acceptance criterion ensures that non-optimal changes are sometimes accepted, allowing the algorithm to escape local minima.
- Multiprocessing:
  - The simulated annealing process is distributed across multiple workers, each running independently with a unique random seed.
- Output:
  - Once the desired hash prefix is achieved, the tool saves the modified image to the specified location.
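As a rough illustration of the hash calculation and prefix check described above, the sketch below hashes a file with Python's standard `hashlib` and compares the digest against a prefix. The `algorithm` and `chunk_size` parameters and the `has_target_prefix` helper are assumptions made for this example; they are not taken from the tool's source.

```python
import hashlib


def calculate_file_hash(file_path, algorithm="sha512", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks.

    The extra parameters are illustrative assumptions; the README only
    documents calculate_file_hash(file_path).
    """
    hasher = hashlib.new(algorithm)
    with open(file_path, "rb") as handle:
        # Read fixed-size chunks so large images are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def has_target_prefix(file_path, target_prefix, algorithm="sha512"):
    """Hypothetical helper: True if the file's digest starts with the prefix."""
    digest = calculate_file_hash(file_path, algorithm)
    # hexdigest() is lowercase, so normalise the requested prefix.
    return digest.startswith(target_prefix.lower())
```

Reading in chunks keeps memory use flat regardless of image size; for small files, hashing the whole byte string in one call would give the same digest.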
git clone https://github.com/Wafulah/ai-image-detector.git
pip install -r requirements.txt
Run the script from the command line using the following syntax:
python image-spoofer.py <target_prefix> <input_image> <output_image> [--max_attempts N] [--num_workers W] [--initial_temp T] [--cooling_rate R] [--hash_algorithm ALGO]
Argument | Type | Description |
---|---|---|
`target_prefix` | String | Desired hash prefix (hexadecimal). |
`input_image` | String | Path to the input image. |
`output_image` | String | Path to save the modified image. |
`--max_attempts` | Integer | Maximum number of modification attempts (default: 100,000). |
`--num_workers` | Integer | Number of parallel workers for multiprocessing (default: 4). |
`--initial_temp` | Float | Initial temperature for simulated annealing (default: 1000). |
`--cooling_rate` | Float | Cooling rate for temperature decay (default: 0.99). |
`--hash_algorithm` | String | Hashing algorithm to use (e.g., `sha256`, `md5`). |
python image-spoofer.py f8b2 "input.jpg" "output.jpg" --max_attempts 50000 --num_workers 6 --initial_temp 1200 --cooling_rate 0.98 --hash_algorithm sha256
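For readers who want to see how such a command line could be parsed, here is a minimal `argparse` sketch that mirrors the documented arguments and defaults. It is not the script's actual parser: the `hex_prefix` validator and the `sha512` default for `--hash_algorithm` are assumptions, since the argument table does not state a default for that option.

```python
import argparse
import string


def hex_prefix(value):
    """Hypothetical validator: accept only non-empty hexadecimal strings."""
    if not value or any(ch not in string.hexdigits for ch in value):
        raise argparse.ArgumentTypeError(f"not a hexadecimal prefix: {value!r}")
    return value.lower()


def build_parser():
    parser = argparse.ArgumentParser(
        description="Modify image metadata until the file hash starts with a prefix."
    )
    parser.add_argument("target_prefix", type=hex_prefix,
                        help="Desired hash prefix (hexadecimal).")
    parser.add_argument("input_image", help="Path to the input image.")
    parser.add_argument("output_image", help="Path to save the modified image.")
    parser.add_argument("--max_attempts", type=int, default=100_000)
    parser.add_argument("--num_workers", type=int, default=4)
    parser.add_argument("--initial_temp", type=float, default=1000.0)
    parser.add_argument("--cooling_rate", type=float, default=0.99)
    # Assumed default: the README names SHA-512 elsewhere but gives no default here.
    parser.add_argument("--hash_algorithm", default="sha512")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Validating the prefix at parse time means a typo such as `g8b2` fails immediately instead of after a full search that can never succeed.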
- `calculate_file_hash(file_path)`
  - Reads the binary content of the image file.
  - Computes the SHA-512 hash.
  - Used to evaluate if the hash matches the desired prefix.
- `simulated_annealing_attempt()`
  - Performs Simulated Annealing:
    - Randomly modifies the `UserComment` EXIF field of the image metadata.
    - Evaluates the modified hash's similarity to the target prefix.
    - Uses probabilistic logic to accept changes based on temperature and similarity.
- `hash_similarity(current_hash, target_prefix)`
  - Measures the number of matching characters between the hash and the target prefix.
  - Guides the algorithm toward better solutions.
- `acceptance_probability(temperature, new_similarity, old_similarity)`
  - Implements the Simulated Annealing acceptance rule:
    - Accepts better solutions outright.
    - Accepts worse solutions probabilistically, depending on the temperature.
- `modify_metadata_with_multiprocessing()`
  - Divides the computational workload among multiple workers.
  - Each worker executes `simulated_annealing_attempt()` independently, using unique random seeds.
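The two scoring helpers listed above are small enough to sketch. The version below keeps the documented signatures but fills in the bodies under two assumptions: similarity is counted position by position over the length of the prefix, and worse candidates are accepted with the conventional Metropolis probability `exp(delta / temperature)`. The README does not specify either formula, so treat this as one plausible reading.

```python
import math


def hash_similarity(current_hash, target_prefix):
    """Count positions where the hash agrees with the target prefix.

    Assumption: characters are compared position by position, and only
    the first len(target_prefix) characters of the hash are considered.
    """
    return sum(1 for h, t in zip(current_hash, target_prefix) if h == t)


def acceptance_probability(temperature, new_similarity, old_similarity):
    """Metropolis-style acceptance rule (assumed, not confirmed by the source)."""
    if new_similarity >= old_similarity:
        return 1.0  # equal or better candidates are always accepted
    if temperature <= 0:
        return 0.0  # a fully cooled system never accepts a worse candidate
    # delta is negative here, so the result lies strictly between 0 and 1.
    return math.exp((new_similarity - old_similarity) / temperature)
```

With a high temperature the probability for a worse candidate is close to 1, and it shrinks toward 0 as the temperature decays, which is the behaviour the acceptance rule above describes.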
Simulated Annealing is an optimization technique inspired by the physical annealing process, where a material is heated and then slowly cooled to achieve a stable state.
- Initialization:
  - Start with the original hash and metadata.
  - Set an initial temperature.
- Random Modification:
  - Change the `UserComment` field with a random value.
- Evaluation:
  - Compare the modified hash to the target prefix using `hash_similarity()`.
- Acceptance Logic:
  - If the modification improves the hash, accept it.
  - If the modification worsens the hash, accept it with a probability determined by the current temperature.
- Cooling:
  - Gradually reduce the temperature, focusing the search on fine-tuned improvements.
- Termination:
  - Stop when the target prefix is achieved or the maximum attempts are reached.
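The steps above can be sketched as a single loop. To stay self-contained, this example does not touch real EXIF data: a 16-byte `comment` appended to the image bytes stands in for the `UserComment` field, and a "random modification" overwrites one random byte of it. The function parameters, the return tuple, the Metropolis acceptance formula, and the temperature floor are all assumptions made for illustration.

```python
import hashlib
import math
import random


def hash_similarity(current_hash, target_prefix):
    """Count positions where the hash matches the target prefix."""
    return sum(1 for h, t in zip(current_hash, target_prefix) if h == t)


def simulated_annealing_attempt(image_bytes, target_prefix, max_attempts=100_000,
                                initial_temp=1000.0, cooling_rate=0.99, seed=None):
    """Toy annealing loop; returns (comment, digest, attempts) or None."""
    rng = random.Random(seed)
    target_prefix = target_prefix.lower()

    # Initialization: start from an all-zero comment and the initial temperature.
    comment = bytearray(16)
    digest = hashlib.sha512(image_bytes + bytes(comment)).hexdigest()
    score = hash_similarity(digest, target_prefix)
    temperature = initial_temp

    for attempt in range(1, max_attempts + 1):
        # Random modification: overwrite one byte of the stand-in comment.
        candidate = bytearray(comment)
        candidate[rng.randrange(len(candidate))] = rng.randrange(256)

        # Evaluation: hash the candidate and score it against the prefix.
        digest = hashlib.sha512(image_bytes + bytes(candidate)).hexdigest()
        if digest.startswith(target_prefix):
            return bytes(candidate), digest, attempt  # termination: success
        candidate_score = hash_similarity(digest, target_prefix)

        # Acceptance logic: keep improvements, sometimes keep regressions.
        delta = candidate_score - score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            comment, score = candidate, candidate_score

        # Cooling: the floor avoids dividing by zero after many iterations.
        temperature = max(temperature * cooling_rate, 1e-9)

    return None  # termination: attempts exhausted
```

Calling `simulated_annealing_attempt(b"image-bytes", "ab", seed=1)` searches for a comment whose SHA-512 digest starts with `ab`. Each extra hexadecimal character in the prefix multiplies the expected number of attempts by roughly 16.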
Simulated Annealing can be computationally intensive. By running multiple independent annealing processes in parallel, the solution space is explored more efficiently.
- Task Division:
  - The total number of attempts is split evenly across multiple workers.
- Independent Execution:
  - Each worker runs `simulated_annealing_attempt()` with a unique random seed.
- Result Aggregation:
  - Workers report back as soon as one of them achieves the desired hash prefix.
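One way to arrange the division, seeding, and early exit described above is with `multiprocessing.Pool` and `imap_unordered`, which yields results in completion order. In this sketch the worker is a plain random search that stands in for `simulated_annealing_attempt()`; the names `split_attempts`, `worker`, `run_parallel`, and `base_seed` are hypothetical and chosen only for the example.

```python
import hashlib
import multiprocessing as mp
import random


def split_attempts(max_attempts, num_workers):
    """Split the attempt budget as evenly as possible across workers."""
    base, extra = divmod(max_attempts, num_workers)
    return [base + (1 if i < extra else 0) for i in range(num_workers)]


def worker(task):
    """Toy worker: seeded random search standing in for the annealing loop."""
    data, target_prefix, attempts, seed = task
    rng = random.Random(seed)  # each worker gets its own seed
    for _ in range(attempts):
        comment = rng.getrandbits(128).to_bytes(16, "big")
        digest = hashlib.sha512(data + comment).hexdigest()
        if digest.startswith(target_prefix):
            return comment, digest
    return None


def run_parallel(data, target_prefix, max_attempts=100_000, num_workers=4, base_seed=0):
    """Run workers in parallel and return the first successful result."""
    budgets = split_attempts(max_attempts, num_workers)
    tasks = [(data, target_prefix, n, base_seed + i) for i, n in enumerate(budgets)]
    with mp.Pool(processes=num_workers) as pool:
        # imap_unordered yields each result as soon as its worker finishes.
        for result in pool.imap_unordered(worker, tasks):
            if result is not None:
                # Leaving the with-block terminates the remaining workers.
                return result
    return None


if __name__ == "__main__":
    print(run_parallel(b"image-bytes", "ab"))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by importing the main module; without it, each worker would try to launch its own pool.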
- Simulated Annealing:
  - Focuses on promising areas of the solution space, avoiding exhaustive brute force.
- Multiprocessing:
  - Utilizes multiple CPU cores to explore independent solution paths simultaneously.
- Efficient Hash Evaluation:
  - Calculates and compares hashes incrementally, minimizing unnecessary computation.
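The README does not say how incremental hash evaluation is done. One common approach, shown here purely as an assumption, is to hash the unchanging leading bytes once and reuse that state through `hashlib`'s `copy()` method for every candidate. This only saves work for bytes that precede the modified region, so how much it helps depends on where the changed metadata sits in the file. The helper name `make_suffix_hasher` is hypothetical.

```python
import hashlib


def make_suffix_hasher(fixed_prefix, algorithm="sha512"):
    """Return a function that hashes fixed_prefix + suffix cheaply.

    The fixed part is fed to the hash object once; each call then copies
    that partially updated state instead of rehashing the fixed bytes.
    """
    base = hashlib.new(algorithm)
    base.update(fixed_prefix)

    def digest_with(suffix):
        hasher = base.copy()  # copy() leaves the shared base state untouched
        hasher.update(suffix)
        return hasher.hexdigest()

    return digest_with


if __name__ == "__main__":
    digest_with = make_suffix_hasher(b"unchanged leading bytes")
    print(digest_with(b"candidate-1"))
    print(digest_with(b"candidate-2"))
```

Each call produces the same digest as hashing the concatenated bytes from scratch, so the shortcut changes only the cost, not the result.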
1. Read the input image and calculate its hash.
2. Initialize Simulated Annealing parameters (temperature, cooling rate, etc.).
3. Divide tasks among multiple workers:
   - Each worker performs:
     a. Randomly modify image metadata.
     b. Calculate the hash of the modified image.
     c. Compare the hash with the target prefix.
     d. Accept or reject the change based on Simulated Annealing rules.
     e. Repeat until a match is found or attempts are exhausted.
4. Combine results from all workers.
5. Save the modified image if the target prefix is achieved.
- Saves the modified image to the specified `output_image` path.
- Prints the original and modified hashes.
- Reports failure after exhausting all attempts.
- Research:
  - Explore hash collision vulnerabilities.
- Security:
  - Test hash-based integrity checks.
- Digital Forensics:
  - Embed metadata for tracking or identification.
This project is licensed under the MIT License. See LICENSE for details.