- Java JDK or JRE (Java Runtime Environment). This program uses one jar file to parse the query and generate the related information.
- Python version >= 3.9
- Python package requirements: docopt, requests
- Preprocessing [optional]. To generate new statistics (`cost.csv`), we offer the DuckDB-version scripts `preprocess.sh` and `gen_cost.sh`. Modify the configurations in them, and execute the following command:

  ```
  $ ./preprocess.sh
  ```
- Modify the path for `python` in `auto_rewrite.sh`.
- Execute the following commands to get the rewritten queries. The rewrite time is shown in `rewrite_time.txt`.
- OPTIONS
  - Mode: set the generated code mode, D (DuckDB) / M (MySQL) [default: D]
  - Yannakakis/Yannakakis-Plus: set Y for Yannakakis; N for Yannakakis-Plus [default: N]

  ```
  $ bash start_parser.sh
  Parser started.
  $ ./auto_rewrite.sh ${DDL_NAME} ${QUERY_DIR} [OPTIONS]
  ```

  e.g. `./auto_rewrite.sh lsqb lsqb M N`
- Modify the configurations in `load_XXX.sql` (load table schemas) and `auto_run_XXX.sh` (auto-run scripts for different DBMSs).
- Execute the following command to run the queries in the different DBMSs:

  ```
  $ ./auto_run_XXX.sh [OPTIONS]
  ```
- `./query/[graph|lsqb|tpch|job]`: plans for different DBMSs
- `./query/*.sh`: auto-run scripts
- `./query/*.sql`: load data scripts
- `./query/[src|Schema]`: files for auto-running SparkSQL
- `./*.py`: code for the rewriter and optimizer
- `./sparksql-plus-web-jar-with-dependencies.jar`: parser jar file
- For queries like `SELECT DISTINCT ...`, please remove the `DISTINCT` keyword before parsing.
- To stop the parser, use the `jps` command to get the parser's PID (its process name is `jar`), and then kill it.
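
The two notes above can be sketched as follows. This is only an illustration, not part of the repository: the query file path is made up, and the in-place `sed -i` form assumes GNU sed.

```shell
# 1) Strip the DISTINCT keyword in place before parsing
#    (file path is illustrative; assumes GNU sed for `sed -i`).
printf 'SELECT DISTINCT a, b FROM t;\n' > /tmp/q1.sql
sed -i 's/SELECT DISTINCT/SELECT/' /tmp/q1.sql
cat /tmp/q1.sql    # -> SELECT a, b FROM t;

# 2) Stop the parser: list JVM processes with jps, find the one
#    named "jar", and kill it (the PID below is an example):
# $ jps
# 31415 jar
# $ kill 31415
```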