Siamese Neural Network (SNN) for Near-Duplicate Detection in Web App Model Inference

🚀 Get Started

Follow these steps to set up the project by creating and activating the Conda environment. Use snn-ndd as the password when downloading resources.

1) Clone the repository

git clone https://github.com/ast-fortiss-tum/near-duplicate-detection-siamese-networks.git
cd near-duplicate-detection-siamese-networks

2) Create the Conda environment

The environment.yml already pins compatible versions.

conda env create --name snn-ndd -f environment.yml

What this does

  1. Creates a new Conda env called snn-ndd.
  2. Installs all Conda‑managed packages.

3) Activate the environment

conda activate snn-ndd
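
To sanity-check the installation, you can try the minimal sketch below. It is not part of the repository, and the package list is an assumption based on the embeddings used in this work (BERT, MarkupLM, doc2vec); consult environment.yml for the authoritative list.

# check_env.py - hypothetical helper, not part of the repository.
# Verifies that the core libraries assumed by the evaluation scripts import cleanly.
import importlib

# Assumed packages; adjust to match environment.yml.
for pkg in ("torch", "transformers", "gensim", "numpy", "pandas"):
    try:
        module = importlib.import_module(pkg)
        print(f"{pkg}: OK ({getattr(module, '__version__', 'unknown version')})")
    except ImportError as err:
        print(f"{pkg}: MISSING ({err})")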

💻 Apps for Web Testing Research

Applications evaluated in this work

| App name | Version | Technology | Description |
| --- | --- | --- | --- |
| Addressbook | 8.2.5 | PHP | A simple, web-based address & phone book, contact manager, and organizer. |
| Claroline | 1.11.10 | PHP | A collaborative learning environment, allowing teachers or education institutions to create and administer courses through the web. |
| Dimeshift | commit 261166d | Node.js, Backbone.js | Expense tracker. |
| MantisBT | 1.1.8 | PHP | Bug tracking system. |
| MRBS | 1.4.9 | PHP | A meeting room booking system. |
| Pagekit | 1.0.16 | PHP, Vue.js | A modular and lightweight CMS. |
| PetClinic | commit 6010d5 | Java Spring, Angular | Demo web application for managing a veterinary clinic. |
| Phoenix | 1.1.0 | Elixir, Phoenix Framework, React, Redux | Trello tribute built with Elixir, Phoenix Framework, Webpack, React, and Redux. |
| PPMA | 0.6.0 | PHP | A PHP password manager. |

📂 Repository Structure

├── resources/                # Downloaded baseline datasets, DOMs, baseline runners (JAR), and doc2vec embedding models
├── scripts/                  # Python scripts for running evaluations
│   ├── rq1/
│   │   ├── within-app-classification/
│   │   │   ├── bert_contrastive_classification.py
│   │   │   ├── doc2vec_contrastive_classification.py
│   │   │   └── markuplm_contrastive_classification.py
│   │   ├── across-app-classification/
│   │   │   └── [...same naming as above...]
│   │   └── baseline-classification/
│   │       └── run_baseline.py
│   ├── rq2/                   # RQ2 evaluation scripts
│   ├── rq3/                   # RQ3 evaluation scripts
│   └── rq4/                   # RQ4 evaluation scripts
├── results/                  # Generated experiment results
│   ├── rq1/
│   ├── rq2/
│   ├── rq3/
│   └── rq4/
├── baseline/                 # Baseline results
├── models/                   # Saved trained models
├── embedding/                # Cached embedding files
└── README.md

🖥️ Generate Reported Results

1) RQ1: Near-Duplicate Detection

a) SNN Evaluation

  1. Download resources.zip from the link below (password: snn-ndd):

    https://syncandshare.lrz.de/getlink/fiY4KcSXCwSLW8g8TVmnep/resources.zip
    
    1. Unpack the archive and move the resources folder to the project base directory

      unzip resources.zip -d resources

      You should now have the following folders:

      dataset
      scripts
      resources/
      ├── baseline-dataset/
      ├── baseline-runner/
      ├── doms/
      └── embedding-models/
      
    2. Run the evaluation (a batch-run sketch covering all combinations is shown at the end of this subsection)

      python scripts/rq1/<evaluation_setting>-app-classification/<embedding_type>_contrastive_classification.py

      • <evaluation_setting>:
        • within
        • across
      • <embedding_type>:
        • bert (adjust the variant inside the script for ModernBERT)
        • doc2vec
        • markuplm
    3. Outputs

      • Experiment results → results/rq1/
      • Trained models → models/
      • Cached embeddings → embedding/

Re-runs automatically reuse any existing models or embeddings; nothing is re-trained if the files are already present. If you want to use our trained models and embeddings, download them from the link below, place them separately at the top level of the project (./models, ./embeddings), and then generate the results.

https://syncandshare.lrz.de/getlink/fi14MEw1swySBLPA6emQKP/models_and_embeddings.zip
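
If you want to run all six setting/embedding combinations in one pass, you can adapt the batch-run sketch below. It is a minimal illustration, not part of the repository, and it assumes the scripts follow the naming scheme above and need no extra command-line arguments.

# run_all_rq1.py - hypothetical convenience wrapper, not part of the repository.
# Runs every <evaluation_setting>/<embedding_type> combination from the project base directory.
import subprocess
from pathlib import Path

SETTINGS = ("within", "across")
EMBEDDINGS = ("bert", "doc2vec", "markuplm")  # for ModernBERT, adjust the variant inside the bert script

for setting in SETTINGS:
    for embedding in EMBEDDINGS:
        script = Path("scripts/rq1") / f"{setting}-app-classification" / f"{embedding}_contrastive_classification.py"
        if not script.exists():
            print(f"skipping missing script: {script}")
            continue
        print(f"running {script} ...")
        subprocess.run(["python", str(script)], check=True)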

b) FragGen Evaluation

Run the evaluation from the project base directory

python scripts/rq1/baseline-classification/within_app_fraggen.py

c) Other Baseline Methods (WEBEMBED, RTED, PDIFF) Evaluation

Run the evaluation from the project base directory

python scripts/rq1/baseline-classification/<evaluation_setting>_app_baseline.py

  • <evaluation_setting>:
    • within
    • across

2) RQ2: Model Quality

a) SNN Evaluation

Run the command below from the project base directory. Results will be saved in results/rq2/.

python scripts/rq2/model-quality-snn.py

b) FragGen Evaluation

Run the command below from the project base directory. Results will be saved in results/rq2/.

python scripts/rq2/fraggen.py

c) WEBEMBED Evaluation

Run the commands below from the project base directory, one after the other. The first script generates intermediate result files in resources/csv_results_table/; the second script generates the RQ2 results and saves them in results/rq2/.

python scripts/rq2/a-webembed.py
python scripts/rq2/b-webembed.py

d) Other Baselines (RTED and PDIFF) Evaluation

Run the commands below from the project base directory, one after the other. The first script generates intermediate result files in resources/csv_results_table/; the second script generates the RQ2 results and saves them in results/rq2/. A sketch chaining both two-step pipelines follows these commands.

python scripts/rq2/a-other-baselines.py
python scripts/rq2/b-other-baselines.py
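
The pipelines in c) and d) follow the same two-step pattern, so a single wrapper can drive both. The sketch below is a hypothetical helper, not part of the repository; the intermediate directory comes from the text above, but the .csv extension of the intermediate files is an assumption.

# run_rq2_pipelines.py - hypothetical helper, not part of the repository.
# Chains each a-/b- script pair; the a-* scripts write intermediate files, the b-* scripts write results/rq2/.
import subprocess
from pathlib import Path

INTERMEDIATE_DIR = Path("resources/csv_results_table")

for first, second in [
    ("scripts/rq2/a-webembed.py", "scripts/rq2/b-webembed.py"),
    ("scripts/rq2/a-other-baselines.py", "scripts/rq2/b-other-baselines.py"),
]:
    subprocess.run(["python", first], check=True)
    csv_count = len(list(INTERMEDIATE_DIR.glob("*.csv"))) if INTERMEDIATE_DIR.exists() else 0
    print(f"{first}: {csv_count} intermediate CSV file(s) in {INTERMEDIATE_DIR}")
    subprocess.run(["python", second], check=True)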

3) RQ3: Code Coverage

To integrate the state abstraction function during crawling, follow the state vertex implementation and point its endpoint at your Flask app. Whenever the crawler needs to decide whether two states are near-duplicates, it makes a POST request to the endpoint with the necessary data; the Flask app acts as the State Abstraction Function (SAF) and returns the result. Run one of the commands below to start the Flask app.

python scripts/rq3/saf-snn.py # SNN methods
python scripts/rq3/saf-other-baseline.py # WEBEMBED methods
python scripts/rq3/saf-other-baseline.py # RTED and PDIFF
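
For reference, the sketch below shows how a client (for example, a modified state vertex in the crawler) could query the running SAF over HTTP. The port, route, payload fields, and response format are illustrative assumptions only; check the Flask app in scripts/rq3/ for the actual endpoint definition.

# saf_client_example.py - hypothetical client, not part of the repository.
# Illustrates the POST-based protocol between the crawler and the Flask SAF.
import requests  # assumes the requests package is available in the environment

# URL and JSON fields are placeholders; match them to the real Flask endpoint.
SAF_URL = "http://localhost:5000/equals"

payload = {
    "dom1": "<html><body>state A</body></html>",
    "dom2": "<html><body>state B</body></html>",
}

response = requests.post(SAF_URL, json=payload, timeout=30)
response.raise_for_status()
print(response.json())  # e.g. whether the two states are considered near-duplicates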

Use this example to evaluate the FragGen method directly, without an external SAF.

Code coverage varies depending on the application type: for PHP applications, server-side code coverage is measured, while for JavaScript applications, client-side coverage is measured. For crawling we used Crawljax (5.2.4-SNAPSHOT), which has compatibility constraints with certain versions of the Chrome driver. Alternative browsers (for example, Firefox) can be used if specified in the crawl configuration. Evaluating FragGen is only possible with Chrome, so we had to downgrade Chrome to 114.0.5735.90.


In PHP applications, code coverage measurement is conducted using the Xdebug 2.2.42 extension alongside the php-code-coverage 2.2.33 library. Crawling and code coverage measurement occur simultaneously: start code coverage before starting the crawl and stop it after the crawl completes to generate the coverage report.

Refer to the PHP Code Coverage README for more details.


For JavaScript applications, the process operates separately. Crawljax was executed for Dante, which generates the corresponding Selenium test cases. Once crawling is complete, Dante can use the generated test case files to produce JUnit test cases. Coverage in JavaScript applications is measured with cdp4j 3.0.81, a Java implementation of the Chrome DevTools protocol.

Refer to the JavaScript Code Coverage README for more details.

4) RQ4: Time Efficiency

a) SNN Training Times

SNN model training times are already recorded when you perform RQ1. In the results Excel file, you can find the training time for each model.
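
To collect those values programmatically, you could scan the RQ1 Excel files with pandas, as in the sketch below. The exact file and column names depend on the RQ1 scripts, so the "train" column filter is only an assumption; reading .xlsx files also requires an Excel engine such as openpyxl.

# read_training_times.py - hypothetical helper, not part of the repository.
# Prints any column mentioning training time from each Excel file under results/rq1/.
import pandas as pd
from pathlib import Path

for xlsx in Path("results/rq1").glob("*.xlsx"):
    df = pd.read_excel(xlsx)
    time_cols = [c for c in df.columns if "train" in str(c).lower()]
    if time_cols:
        print(xlsx.name)
        print(df[time_cols])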

b) Crawling Times

After crawling the application in RQ3, a crawling report is generated. You can find the crawling time in the result.json file under statistics → duration.
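
A minimal sketch for reading that value is shown below. The location of result.json depends on your Crawljax output directory (the path used here is a placeholder); the statistics → duration key path is as described above.

# read_crawl_duration.py - hypothetical helper, not part of the repository.
# Prints the crawl duration recorded by Crawljax in result.json (statistics -> duration).
import json
from pathlib import Path

report = Path("crawljax-output/result.json")  # placeholder: point this at your crawl report

with report.open() as f:
    data = json.load(f)

print("crawl duration:", data["statistics"]["duration"])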

c) Inference Times

Inference time is measured using 1,000 randomly selected pairs from our dataset.

  • RTED and PDiff: Compute the respective distances with the Crawljax implementation, then add the classifier inference time to obtain the total inference time.
  • FragGen: Use only the Crawljax classification time as the method's inference time.

The JAR file resources/baseline-runner/BaseLineRunner-1.0-SNAPSHOT.jar (generated from the Java project linked below) is used to obtain the distance-calculation times for RTED and PDiff and the total inference time for FragGen (a small arithmetic sketch follows the project link below).

Baseline Runner project files

https://syncandshare.lrz.de/getlink/fiT5bUvN5DJfJ5uZ8JxgzC/baseline-runner.zip
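
The sketch below only restates the arithmetic above for RTED and PDiff; the timing values are placeholders for the numbers you measure with the JAR (distance calculation) and with the classifier.

# combine_inference_times.py - hypothetical illustration, not part of the repository.
# Total RTED/PDiff inference time = distance-calculation time + classifier inference time.
distance_time_ms = 12.4    # placeholder: average per-pair distance time reported by the BaseLineRunner JAR
classifier_time_ms = 0.8   # placeholder: average per-pair classifier inference time

total_time_ms = distance_time_ms + classifier_time_ms
print(f"RTED/PDiff total inference time per pair: {total_time_ms:.1f} ms")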

SNN evaluation

Run the command below from the project base directory. Results will be saved in results/rq4/.

python scripts/rq4/snn_inference_time.py

WEBEMBED evaluation

Run the command below from the project base directory. Results will be saved in results/rq4/.

python scripts/rq4/webembed_baseline_inference_time.py

Other baselines (FragGen, RTED, PDiff) evaluation

Run the command below from the project base directory. Results will be saved in results/rq4/.

python scripts/rq4/javabased_baseline_inference_time.py
