Updated README.md #2

Open
wants to merge 6 commits into
base: main
35 changes: 35 additions & 0 deletions Documentation.md
@@ -0,0 +1,35 @@
Issues encountered while trying to run the Tattle-Tale GitHub repo on both Windows 11 (OS Build 22621.3007) and Ubuntu 20.04 LTS:
The first issue is that the Python scripts that generate testcases write their output to the parent folder of the GitHub repo. This breaks the `Experiment.java` class, which looks for the testcases folder inside the repo, so path resolution fails.
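
One way to confirm this mismatch is to compare the directory that `Experiment.java` reads with the location the scripts actually wrote to. The following is a minimal Python sketch, assuming the generated testcases are `.json` files and that the misplaced copy sits at `testdata/testcases` under the repo's parent folder. The function names are hypothetical and are not part of the repo.

```python
from pathlib import Path


def expected_testcase_dir(repo_root):
    """Directory Experiment.java reads when user.dir is the repo root."""
    return Path(repo_root) / "testdata" / "testcases"


def _has_json(directory):
    """True if the directory exists and holds at least one .json file."""
    return directory.is_dir() and any(directory.glob("*.json"))


def locate_testcases(repo_root):
    """Report where the generated testcases actually are.

    Returns "repo" if the expected directory holds .json files,
    "parent" if they are only under the parent folder (assumed layout),
    and "missing" otherwise.
    """
    repo_root = Path(repo_root)
    if _has_json(expected_testcase_dir(repo_root)):
        return "repo"
    # Assumption: the scripts wrote to <parent>/testdata/testcases instead.
    if _has_json(expected_testcase_dir(repo_root.parent)):
        return "parent"
    return "missing"
```

A result of `"parent"` would indicate that the files need to be moved into the repo, or that the scripts need to be run from a different folder.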
Issues encountered while trying to run the Tattle-Tale GitHub repo on Windows 11 (OS Build 22621.3007):
The connection to the MySQL database is set up in a way that fails when the connection is tested in `src/main/java/edu/policy/dbms/MySQLConnectionManagerDBCP.java` at line 77:
```
dataSource.setUrl(String.format("jdbc:mysql://%s:%s/mysql", SERVER, PORT));
```
Because this line looks for a database named `mysql`, the program crashes at runtime when it cannot connect. Even if a database with that name exists, there are still crashes at runtime: the Python test scripts set a database name (e.g. `hospitaldb`), and each test run then tries to use that name. If the name is not present on the running MySQL server, the program fails to connect. Either way, someone pulling the repo has to make these changes manually to get it to execute without crashing. I think I had this same issue when I initially tried to use the tax mysql script to
The readme doesn't fully explain which tables or permissions the default `Kirby` user needs to complete the experiment. If someone gives the default user only the ability to `SELECT` from tables in the database (as I did), the program tries to locate the table `temp`, fails to find it, and continues anyway. This creates confusing behavior: the program executes, but it always finds the exact same cuesets for each algorithm, full den and k-percentile. I haven't fully debugged this part yet, but I hope to have it finished before Monday.
It seems that these issues can be partially fixed by giving all privileges to the Kirby user. However, I ran into a problem that I have yet to fix: the program runs out of heap memory. I tried to solve this by increasing the available memory in the IntelliJ settings, but the program crashed before it reached the memory limit of about 10 GB.
I tried using the btm script to deal with the heap memory issues, but they still occurred even when I gave IntelliJ 22 GB of available memory.
Current Fixes:
1. Change line 77 of `src/main/java/edu/policy/dbms/MySQLConnectionManagerDBCP.java` from
```
dataSource.setUrl(String.format("jdbc:mysql://%s:%s/mysql", SERVER, PORT));
```
to
```
dataSource.setUrl(String.format("jdbc:mysql://%s:%s/<YOURDATABASENAME>", SERVER, PORT));
```
2. In addition, change the privileges of the default account named in the `mysql.properties` file so that it has all privileges on the database you set in step 1.

3. Change line 278 of your python

4. Edit the `.json` files generated by these Python scripts: each contains a value called `databaseName`, which must match the database name you used in step 1.

5. You'll need to change line 25 of `src/main/java/edu/policy/execution/Experiment.java` from
```
private static final File testCaseDir = new File(System.getProperty("user.dir") + "/testdata/testcases");
```
to
```
private static final File testCaseDir = new File(<ExactPathNameToTestCases>);
```

6. I heard from the developer that some of these settings could be fixed by running the testscripts from the testscript folder and by editing the testscripts to change the database name they use. I also learned that the taxes table shouldn't have heap size issues, so that should give us a way to move forward.
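
The manual edit of the generated `.json` files in step 4 could be automated. Below is a minimal Python sketch, assuming the testcases are plain JSON files and that `databaseName` may appear at any nesting level. The exact schema of the generated files is not confirmed here, and the function names are hypothetical.

```python
import json
from pathlib import Path


def set_database_name(node, new_name):
    """Recursively set every "databaseName" value; return how many changed."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "databaseName":
                if value != new_name:
                    node[key] = new_name
                    changed += 1
            else:
                changed += set_database_name(value, new_name)
    elif isinstance(node, list):
        for item in node:
            changed += set_database_name(item, new_name)
    return changed


def patch_testcases(testcase_dir, new_name):
    """Rewrite each .json file in testcase_dir; return {filename: changes}."""
    report = {}
    for path in sorted(Path(testcase_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        changed = set_database_name(data, new_name)
        if changed:
            # Only touch files that actually needed a change.
            path.write_text(json.dumps(data, indent=2), encoding="utf-8")
            report[path.name] = changed
    return report
```

Calling `patch_testcases("testdata/testcases", "<YOURDATABASENAME>")` from the repo root would then report which files were rewritten. Running it a second time should change nothing.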
96 changes: 94 additions & 2 deletions README.md
@@ -38,12 +38,92 @@ This repository contains the implementation of algorithms for detection and prev

#### Step 1: Configure Database

1. Create database in MySQL ([Export files](https://drive.google.com/drive/folders/1CiCXU08zWgzI2VUKp1vEcadBkTJA6Lbb?usp=sharing), enabling database index and creating hash indexes can improve performance)
<font color="red">

**2024 Update**
**You'll need to change line 77 of `src/main/java/edu/policy/dbms/MySQLConnectionManagerDBCP.java` from**
```
dataSource.setUrl(String.format("jdbc:mysql://%s:%s/mysql", SERVER, PORT));
```
**to**
```
dataSource.setUrl(String.format("jdbc:mysql://%s:%s/<YOURDATABASENAME>", SERVER, PORT));
```
**Alternatively, you can create two databases named `hospitaldb` and `taxdb` to store the hospital and taxes tables, respectively.**
</font>
2. Update corresponding database info (*username, password, server and port number*) in the `mysql.properties` file under `resources/credentials/` directory.
<font color="red">

**2024 Update**
**In addition, change the privileges of the default account named in the `mysql.properties` file so that it has all privileges on the database you set in step 1.**
</font>
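
As an illustration of the privilege change, the following Python sketch builds the corresponding MySQL `GRANT` statement. It assumes the account already exists on the server and that the statement is run by an administrator in a MySQL client. The helper name and the example account are hypothetical, so substitute the username from your `mysql.properties` file.

```python
import re

# Conservative whitelists so the generated statement cannot be broken by odd input.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")
_HOST = re.compile(r"^[A-Za-z0-9_.%-]+$")


def grant_all_statement(database, user, host="localhost"):
    """Return a GRANT statement giving `user` all privileges on `database`."""
    if not _NAME.match(database):
        raise ValueError(f"unexpected database name: {database!r}")
    if not _NAME.match(user):
        raise ValueError(f"unexpected user name: {user!r}")
    if not _HOST.match(host):
        raise ValueError(f"unexpected host: {host!r}")
    return f"GRANT ALL PRIVILEGES ON `{database}`.* TO '{user}'@'{host}';"


# Hypothetical account name, for illustration only.
print(grant_all_statement("hospitaldb", "tattle_user"))
```

Granting all privileges is broader than strictly necessary. It is used here only because the minimal set of privileges the experiment needs has not been determined.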

#### Step 2: Prepare Testcases

Use the test script **testcase_gen_tax.py** or **testcase_gen_hospital.py** to generate testcases on Tax or Hospital dataset. The generated testcases will be automatically placed under `testdata/testcases/` directory.
<font color="red">

**2024 Update**
**Be sure to run the testscripts from the testscript folder in the GitHub repo. You will also need to edit the Python testscript so that it sets the correct database for the Java classes to connect to.**
**If you're using**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**testcase_gen_tax.py**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**change line 13 from**
```
DCFileName = "/testdata/taxdb_constraints.txt"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
DCFileName = "/testdata/taxdb_constraints_noPBD.txt"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**and line 278 from**
```
database_name = "taxdb"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
database_name = "<YOURDATABASENAME>"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**testcase_gen_tax_btm.py**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**change line 235 from**
```
database_name = "taxdb"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
database_name = "<YOURDATABASENAME>"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**testcase_gen_hospital.py**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**change line 280 from**
```
database_name = "hospitaldb"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
database_name = "<YOURDATABASENAME>"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**Additionally, you need to create a table called `hospital10k` in your database. This table can be a duplicate of your hospital table restricted to 10,000 of its records; running `DELETE FROM hospital10k WHERE tid > 10000` on the duplicate works. I have not tested whether any other selection of 10,000 hospital records is also compatible. You must also configure the JVM with enough heap space: you will need to allocate over 10 GB (exact size not confirmed) to complete this test.**
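
The `hospital10k` setup described above can be written out as a short sequence of SQL statements. The Python sketch below only assembles those statements as strings. It assumes the source table is named `hospital` and has a numeric `tid` column, and the helper name is hypothetical.

```python
def hospital10k_statements(source="hospital", target="hospital10k", max_tid=10000):
    """Return the SQL statements that copy `source` and trim it to max_tid rows."""
    return [
        # Copy the table definition, including indexes.
        f"CREATE TABLE {target} LIKE {source};",
        # Duplicate every record from the source table.
        f"INSERT INTO {target} SELECT * FROM {source};",
        # Keep only the records with tid up to max_tid.
        f"DELETE FROM {target} WHERE tid > {max_tid};",
    ]


for statement in hospital10k_statements():
    print(statement)
```

The printed statements can be pasted into a MySQL client connected to your database.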

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**testcase_gen_hospital_btm.py**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**change line 236 from**
```
database_name = "hospitaldb"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
database_name = "<YOURDATABASENAME>"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**testcase_gen_hospital_scalability.py**
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**change line 192 from**
```
database_name = "hospitaldb"
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**to**
```
database_name = "<YOURDATABASENAME>"
```

</font>
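
Editing a hard-coded line in each script is brittle, because the line numbers shift whenever a script changes. One possible alternative is sketched below, under the assumption that each script assigns `database_name` once at module level: read the name from an environment variable and fall back to the script's default. The variable name `TATTLE_TALE_DB` and the helper are hypothetical, and the repo does not define them.

```python
import os


def resolve_database_name(default, env_var="TATTLE_TALE_DB"):
    """Return the environment override if set and non-empty, else the default."""
    value = os.environ.get(env_var, "").strip()
    return value or default


# In testcase_gen_tax.py this would replace the hard-coded assignment.
database_name = resolve_database_name("taxdb")
```

With this in place, setting `TATTLE_TALE_DB` before running a script would select the database without touching the source.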

> python testcase_gen_tax.py
>
@@ -59,6 +139,18 @@ Our codebase is forward-compatible with test cases without btm mode.
Under the working directory (`Tattle-Tale/`), use the following commands to install required dependencies and execute the program.

**Experimental setting**: requiring at least 64 GB RAM [if not possible for limited computing environment, use *btm* mode (as in the scalability experiment) to reduce the memory requirement]
<font color="red">

**2024 Update**
**You'll need to change line 25 of `src/main/java/edu/policy/execution/Experiment.java` from**
```
private static final File testCaseDir = new File(System.getProperty("user.dir") + "/testdata/testcases");
```
**to**
```
private static final File testCaseDir = new File(<ExactPathNameToTestCases>);
```
</font>

> mvn clean install
>