Follow these steps to create and activate the Conda environment and set up the project. Use `snn-ndd` as the password when downloading resources.
git clone https://github.com/ast-fortiss-tum/near-duplicate-detection-siamese-networks.git
cd near-duplicate-detection-siamese-networks
The `environment.yml` file already pins compatible versions.
conda env create --name snn-ndd -f environment.yml
What this does:
- Creates a new Conda env called `snn-ndd`.
- Installs all Conda-managed packages.
conda activate snn-ndd
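To confirm the environment was created and is active, you can list your Conda environments (the active one is marked with an asterisk):

conda env list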
Applications evaluated in this work
App name | Version | Technology | Description |
---|---|---|---|
Addressbook | 8.2.5 | PHP | A simple, web-based address & phone book, contact manager, organizer. |
Claroline | 1.11.10 | PHP | A collaborative learning environment, allowing teachers or education institutions to create and administer courses through the web. |
Dimeshift | commit 261166d | Node.js, Backbone.js | Expense tracker. |
MantisBT | 1.1.8 | PHP | Bug Tracking system. |
MRBS | 1.4.9 | PHP | A Meeting Room Booking System. |
Pagekit | 1.0.16 | PHP, Vue.js | A modular and lightweight CMS. |
PetClinic | commit 6010d5 | Java Spring, Angular | Demo web application for managing a veterinary clinic. |
Phoenix | 1.1.0 | Elixir, Phoenix framework, React, Redux | Trello tribute done with Elixir, Phoenix Framework, Webpack, React and Redux. |
PPMA | 0.6.0 | PHP | A PHP Password MAnager. |
├── resources/                  # Downloaded baseline datasets, doms, baseline runners (jar) and doc2vec embedding models
├── scripts/                    # Python scripts for running evaluations
│   ├── rq1/
│   │   ├── within-app-classification/
│   │   │   ├── bert_contrastive_classification.py
│   │   │   ├── doc2vec_contrastive_classification.py
│   │   │   └── markuplm_contrastive_classification.py
│   │   ├── across-app-classification/
│   │   │   └── [...same naming as above...]
│   │   └── baseline-classification/
│   │       └── run_baseline.py
│   ├── rq2/                    # RQ2 evaluation scripts
│   ├── rq3/                    # RQ3 evaluation scripts
│   └── rq4/                    # RQ4 evaluation scripts
├── results/                    # Generated experiment results
│   ├── rq1/
│   ├── rq2/
│   ├── rq3/
│   └── rq4/
├── baseline/                   # Baseline results
├── models/                     # Saved trained models
├── embedding/                  # Cached embedding files
└── README.md
- Download `resources.zip` (password: `snn-ndd`) from:
  https://syncandshare.lrz.de/getlink/fiY4KcSXCwSLW8g8TVmnep/resources.zip
- Unpack the archive and move the `resources` folder to the project base:

  unzip resources.zip -d resources

  You should now have:

  resources/
  ├── baseline-dataset/
  ├── baseline-runner/
  ├── doms/
  └── embedding-models/
- Run the evaluation:

  python scripts/rq1/<evaluation_setting>-app-classification/<embedding_type>_contrastive_classification.py

  - `<evaluation_setting>`: `within` or `across`
  - `<embedding_type>`: `bert` (adjust the variant inside the script for ModernBERT), `doc2vec`, or `markuplm`
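For example, the within-app evaluation with BERT embeddings is run as:

python scripts/rq1/within-app-classification/bert_contrastive_classification.py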
Outputs
- Experiment results → `results/rq1/`
- Trained models → `models/`
- Cached embeddings → `embedding/`
- Re-runs will automatically reuse any existing models or embeddings; nothing is re-trained if already present. If you want to use pre-trained models and embeddings, obtain them from the link below, make sure they are placed separately at the project level (`./models`, `./embeddings`), and then generate the results:
  https://syncandshare.lrz.de/getlink/fi14MEw1swySBLPA6emQKP/models_and_embeddings.zip
Run the evaluation from the project base directory
python scripts/rq1/baseline-classification/within_app_fraggen.py
Run the evaluation from the project base directory
python scripts/rq1/baseline-classification/<evaluation_setting>_app_baseline.py
- `<evaluation_setting>`: `within` or `across`
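For example, with the `within` setting the command becomes:

python scripts/rq1/baseline-classification/within_app_baseline.py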
Run the command below from the project base directory. Results will be saved in `results/rq2/`.
python scripts/rq2/model-quality-snn.py
Run the command below from the project base directory. Results will be saved in `results/rq2/`.
python scripts/rq2/fraggen.py
Run the commands below from the project base directory, one after the other. The first script generates intermediate result files in `resources/csv_results_table/`; the second generates the RQ2 results and saves them in `results/rq2/`.
python scripts/rq2/a-webembed.py
python scripts/rq2/b-webembed.py
Run the commands below from the project base directory, one after the other. The first script generates intermediate result files in `resources/csv_results_table/`; the second generates the RQ2 results and saves them in `results/rq2/`.
python scripts/rq2/a-other-baselines.py
python scripts/rq2/b-other-baselines.py
To integrate the state abstraction function during crawling, follow the state-vertex implementation and point its endpoint at your Flask app. Whenever the crawler needs to decide whether two states are near-duplicates, it makes a POST request to the endpoint with the necessary data; the Flask app then acts as the State Abstraction Function (SAF) and returns the result. Run the appropriate command below to start the Flask app.
python scripts/rq3/saf-snn.py # SNN methods
python scripts/rq3/saf-other-baseline.py # WEBEMBED methods
python scripts/rq3/saf-other-baseline.py # RTED and PDIFF
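To make the request/response contract concrete, here is a minimal sketch of a SAF-style Flask endpoint. The route name (`/saf`), the payload fields (`dom1`, `dom2`), the `is_near_duplicate` helper, and the response shape are hypothetical placeholders for illustration; the actual scripts under `scripts/rq3/` define their own endpoint and fields.

```python
# saf_sketch.py -- illustrative sketch only, not the actual scripts/rq3 implementation.
from flask import Flask, jsonify, request

app = Flask(__name__)

def is_near_duplicate(dom1: str, dom2: str) -> bool:
    # Hypothetical placeholder: a real SAF would embed both DOMs with the
    # trained SNN (or a baseline) and threshold the resulting distance.
    return dom1 == dom2

@app.route("/saf", methods=["POST"])
def saf():
    payload = request.get_json(force=True)
    verdict = is_near_duplicate(payload["dom1"], payload["dom2"])
    return jsonify({"near_duplicate": verdict})

if __name__ == "__main__":
    app.run(port=5000)
```

From the crawler side (simulated here with `requests`), each state comparison is a single POST:

```python
import requests

# One state comparison == one POST request to the SAF endpoint.
resp = requests.post(
    "http://localhost:5000/saf",
    json={"dom1": "<html>a</html>", "dom2": "<html>a</html>"},
)
print(resp.json())  # e.g. {"near_duplicate": true}
```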
Use this example to evaluate the FragGen method directly, without an external SAF.
Code coverage measurement depends on the application type: for PHP applications, server-side code coverage is measured, while for JavaScript applications client-side coverage is measured. For crawling, we used Crawljax (5.2.4-SNAPSHOT), which has compatibility constraints with certain ChromeDriver versions. Alternative browsers (Firefox, for example) can be used if specified in the crawl configuration. Evaluating FragGen is only possible with Chrome, so we had to downgrade Chrome to 114.0.5735.90.
In PHP applications, code coverage measurement is conducted using the Xdebug 2.2.42 extension alongside the php-code-coverage 2.2.33 library. Both crawling and code coverage measurement occur simultaneously. It is important to start code coverage before starting the crawl and to stop it after crawling completion to generate the coverage report.
Refer to the PHP Code Coverage README for more details.
For JavaScript applications, the process operates separately. Crawljax was executed for Dante, which generates corresponding Selenium test cases. Once crawling is complete, Dante can utilize the generated test case files to produce JUnit test cases. For measuring coverage in JavaScript applications, cdp4j 3.0.81, a Java implementation of Chrome DevTools, was used.
Refer to the JavaScript Code Coverage README for more details.
SNN model training times are already recorded when you perform RQ1. In the results Excel file, you can find the training time for each model.
After crawling the application in RQ3, a crawling report is generated. You can find the crawling time in the `result.json` file under `statistics` → `duration`.
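As a small helper, a sketch along these lines reads that value, assuming `result.json` has a top-level `statistics` object containing `duration` as described above (the report path is a hypothetical example and depends on your crawl output folder):

```python
import json
from pathlib import Path

# Hypothetical location of a crawl report; adjust to your Crawljax output folder.
report = Path("crawl-output/result.json")

with report.open() as fh:
    data = json.load(fh)

print("Crawl duration:", data["statistics"]["duration"])
```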
Inference time is measured using 1,000 randomly selected pairs from our dataset.
- RTED and PDiff: Compute the respective distances with the Crawljax implementation, then add the classifier inference time to obtain the total inference time.
- FragGen: Use only the Crawljax classification time as the methodβs inference time.
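The measurement itself boils down to timing a classifier on randomly sampled pairs. Below is a minimal sketch of that idea; the `pairs` format and the `classify` callable are hypothetical stand-ins for the real dataset and model (the actual measurement lives in the `scripts/rq4/` scripts).

```python
import random
import time

def measure_inference_time(pairs, classify, n_samples=1000, seed=42):
    """Average per-pair inference time over randomly sampled pairs.

    `pairs` is a list of (state_a, state_b) tuples and `classify` is any
    callable returning a near-duplicate verdict for one pair -- both are
    hypothetical stand-ins for the real dataset and model.
    """
    random.seed(seed)
    sample = random.sample(pairs, min(n_samples, len(pairs)))
    start = time.perf_counter()
    for state_a, state_b in sample:
        classify(state_a, state_b)
    elapsed = time.perf_counter() - start
    return elapsed / len(sample)

# Example usage with a trivial dummy classifier:
dummy_pairs = [(f"dom_{i}", f"dom_{i + 1}") for i in range(2000)]
avg = measure_inference_time(dummy_pairs, classify=lambda a, b: a == b)
print(f"Average inference time per pair: {avg:.6f} s")
```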
The JAR file `resources/baseline-runner/BaseLineRunner-1.0-SNAPSHOT.jar`, generated from the Java project linked below, is used to obtain the distance-calculation times for RTED and PDiff and the total inference time for FragGen.
Baseline Runner project files
https://syncandshare.lrz.de/getlink/fiT5bUvN5DJfJ5uZ8JxgzC/baseline-runner.zip
Run the command below from the project base directory. Results will be saved in `results/rq4/`.
python scripts/rq4/snn_inference_time.py
Run the command below from the project base directory. Results will be saved in `results/rq4/`.
python scripts/rq4/webembed_baseline_inference_time.py
Run the command below from the project base directory. Results will be saved in `results/rq4/`.
python scripts/rq4/javabased_baseline_inference_time.py