Commit 897235d

committed
Update READMEs
1 parent c66ad81 commit 897235d

File tree

5 files changed

+294
-56
lines changed

README.md

Lines changed: 33 additions & 53 deletions
@@ -1,9 +1,12 @@
1-
[![CircleCI](https://circleci.com/gh/JetBrains-Research/CFPQ_PyAlgo/tree/master.svg?style=svg)](https://circleci.com/gh/JetBrains-Research/CFPQ_PyAlgo/tree/master)
1+
## CFPQ_PyAlgo
22

3-
# CFPQ_PyAlgo
4-
The CFPQ_PyAlgo is a repository for developing, testing and benchmarking algorithms that solve Formal-Language-Constrained Path Problems, such as Context-Free Path Queries and Regular Path Queries. All algorithms are based on the [GraphBLAS](http://graphblas.org/index.php?title=Graph_BLAS_Forum) framework that allows you to represent graphs as matrices and work with them in terms of linear algebra. For convenience, all the code is written in Python using [pygraphblas](https://github.com/michelp/pygraphblas) or in C/C++ using purely [SuiteSparse](https://github.com/DrTimothyAldenDavis/SuiteSparse/tree/master/GraphBLAS) with a Python wrapper.
3+
CFPQ_PyAlgo is a repository for developing, testing and evaluating solvers for
4+
Formal-Language-Constrained Path Problems, such as Context-Free Path Queries and Regular Path Queries.
55

6-
# Installation
6+
All algorithms are based on the [GraphBLAS](http://graphblas.org/index.php?title=Graph_BLAS_Forum) framework that allows you to represent graphs as matrices
7+
and work with them in terms of linear algebra.
8+
9+
## Installation
710
First of all, you need to clone the repository with its submodules:
811

912
```bash
@@ -12,26 +15,27 @@ cd CFPQ_PyAlgo/
1215
git submodule init
1316
git submodule update
1417
```
15-
Then the easiest way to get started is to use Docker. An alternative, which is more correct, is to install everything directly.
18+
Then the easiest way to get started is to use Docker. An alternative is to install everything directly.
1619

17-
## Using Docker
20+
### Using Docker
1821
The first way to start is to use Docker:
1922

2023
```bash
2124
# build docker image
22-
docker build --tag <some_tag> .
25+
docker build --tag cfpq_py_algo .
2326

2427
# run docker container
25-
docker run --rm -it -v ${PWD}:/CFPQ_PyAlgo <some_tag> bash
28+
docker run --rm -it -v ${PWD}:/CFPQ_PyAlgo cfpq_py_algo bash
2629
```
27-
After it you can develop everything locally and run tests and benchmarks inside the container. Also you can use PyCharm and [configure an interpreter using Docker]( https://www.jetbrains.com/help/pycharm/using-docker-as-a-remote-interpreter.html).
30+
After that, you can develop everything locally and run tests and benchmarks inside the container.
31+
Also, you can use PyCharm Professional and [configure an interpreter using Docker](https://www.jetbrains.com/help/pycharm/using-docker-as-a-remote-interpreter.html).
2832

29-
## Direct install
30-
The correct way is to install everything into your local python interpreter or virtual environment.
33+
### Direct install
34+
The other way is to install everything into your local Python interpreter or virtual environment.
3135

3236
First of all you need to install [pygraphblas](https://github.com/michelp/pygraphblas) package.
3337
```bash
34-
pip3 install pygraphblas
38+
pip3 install pygraphblas==5.1.8.0
3539
```
3640
Secondly you need to install cfpq_data_devtools package and other requirements:
3741

@@ -47,56 +51,32 @@ To check if the installation was successful you can run simple tests
4751
```bash
4852
python3 -m pytest test -v -m "CI"
4953
```
50-
# Benchmark
51-
Look please [Readme](https://github.com/JetBrains-Research/CFPQ_PyAlgo/blob/master/benchmark/README.md) in *benchmark*
52-
53-
# Usage
54-
55-
Let's describe an example of using the implementation outside this environment.
5654

57-
For example, you want to solve a basic problem CFPQ using the matrix algorithm. To do this, you need a context-free grammar (**Gr**), as well as a graph (**G**) in the format of "triplets".
55+
## CLI
56+
CFPQ_PyAlgo provides a command line interface for running
57+
an all-pairs CFPQ solver with relational query semantics.
5858

59-
Then the matrix algorithm can be run as follows, where *PATH_TO_GRAMMAR* --- path to file with **Gr**, *PATH_TO_GRAPH* --- path to file with **G**
59+
See [cfpq_cli/README](cfpq_cli/README.md) for more details.
6060

61-
```cython
62-
from src.problems.Base.algo.matrix_base.matrix_base import MatrixBaseAlgo
63-
from cfpq_data import cfg_from_txt
64-
from src.graph.graph import Graph
61+
## Evaluation
6562

66-
from pathlib import Path
63+
CFPQ_PyAlgo provides scripts for evaluating the performance
64+
of various CFPQ solvers (including third-party ones).
6765

68-
algo = MatrixBaseAlgo()
69-
algo.prepare(Graph.from_txt(Path(PATH_TO_GRAPH)), cfg_from_txt(Path(PATH_TO_GRAMMAR)))
70-
res = algo.solve()
71-
print(res.matrix_S.nvals)
72-
```
73-
The given fragment displays the number of pairs of vertices between which the desired path exists.
74-
75-
More examples can be found in *test*
66+
See [cfpq_eval/README](cfpq_eval/README.md) for more details.
7667

77-
# Project structure
68+
## Project structure
7869
The global project structure is the following:
7970

8071
```
81-
.
72+
├── cfpq_algo - new optimized CFPQ algorithm implementations
73+
├── cfpq_cli - scripts for running CFPQ algorithms
74+
├── cfpq_eval - scripts for evaluating performance of various CFPQ solvers (including third-party ones)
75+
├── cfpq_matrix - matrix wrappers that improve performance of operations with matrices
76+
├── cfpq_model - graph & grammar representations
8277
├── deps
8378
│ └── CFPQ_Data - repository with graphs and grammars suites
84-
├───benchmark - directory for performance measurements of implementations
85-
├── src
86-
│ ├── problems - directory where all the problems CFPQ that we know how to solve
87-
│ │ ├───AllPaths
88-
│ │ ├───Base
89-
│ │ ├───MultipleSource
90-
│ │ └───SinglePath
91-
│ ├── grammar - directory for all grammar formats representation and its loading
92-
│ ├── graph - directory for all graph formats representation and its loading
93-
│ └── utils - directory for other useful classes and methods
94-
└── test
95-
├───AllPaths - tests for implementations in src.problems.AllPaths
96-
├───Base - tests for implementations in src.problems.Base
97-
├───data - dataset for tests
98-
├───MultipleSource - tests for implementations in src.problems.MultipleSource
99-
├───SinglePath - tests for implementations in src.problems.SinglePath
100-
└───suites
101-
79+
├── benchmark - directory for performance measurements of legacy CFPQ implementations
80+
├── src - legacy CFPQ implementations
81+
└── test - tests
10282
```

benchmark/README.md

Lines changed: 15 additions & 3 deletions
@@ -1,4 +1,14 @@
1-
# How to start
1+
# LEGACY WARNING
2+
3+
This folder contains benchmarks for legacy CFPQ implementations.
4+
5+
New optimized CFPQ implementations are being developed in the [cfpq_algo](../cfpq_algo) folder
6+
and can be evaluated with the scripts in the [cfpq_eval](../cfpq_eval) folder.
7+
8+
<details>
9+
<summary>Full README</summary>
10+
11+
## How to start
212
First, create a directory for the dataset. It should have two subdirectories for graphs (Graphs) and grammars (Grammars). In the second step, select an algorithm for benchmarking. Then run the command:
313
```
414
python3 -m benchmark.start_benchmark -algo ALGO -data_dir DATA_DIR
@@ -10,8 +20,10 @@ There are also a number of optional parameters:
1020
+ -result_dir --- specify a directory for uploading the results
1121
+ -max_len_paths --- Limit on the length of the retrieved paths
1222

13-
# Add new algorithm
23+
## Add new algorithm
1424
To add a new implementation of the algorithm to the list of available measurements, you must:
1525
1. Add your algorithm in *algo_impl.ALGO_PROBLEM*
1626
2. Add your implementation in *algo_impl.ALGO_IMPL*
17-
3. Create a new pipeline or use an existing one in *bench.benchmark*
27+
3. Create a new pipeline or use an existing one in *bench.benchmark*
28+
29+
</details>

cfpq_cli/README.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
# CFPQ_CLI
2+
3+
The `cfpq_cli` module provides a Command Line Interface (CLI) for solving
4+
the Context-Free Language Reachability (CFL-R) problem for all vertex pairs
5+
in a graph with respect to a specified context-free grammar.
6+
7+
## Getting Started
8+
9+
Ensure the CFPQ_PyAlgo project is properly set up on your system before using the CLI.
10+
Setup instructions are available in the project's main [README](../README.md).
11+
12+
## Usage
13+
14+
### Running the Script
15+
16+
For detailed information on script options, execute the following command:
17+
18+
```bash
19+
cd .. # Should be run from CFPQ_PyAlgo project root directory
20+
python3 -m cfpq_cli.run_all_pairs_cflr --help
21+
```
22+
23+
The basic command usage is as follows:
24+
25+
```
26+
python3 -m cfpq_cli.run_all_pairs_cflr [OPTIONS] ALGORITHM GRAPH GRAMMAR
27+
```
28+
29+
- `ALGORITHM` selects the algorithm. The available options are `IncrementalAllPairsCFLReachabilityMatrix` and `NonIncrementalAllPairsCFLReachabilityMatrix`.
30+
- `GRAPH` specifies the path to the graph file.
31+
- `GRAMMAR` indicates the path to the grammar file.
32+
33+
#### Optional Arguments
34+
35+
- `--time-limit TIME_LIMIT` sets the maximum execution time in seconds.
36+
- `--out OUT` specifies the output file for saving vertex pairs.
37+
- `--disable-optimize-block-matrix` disables the optimization of block matrices.
38+
- `--disable-optimize-empty` disables the optimization for empty matrices.
39+
- `--disable-lazy-add` disables lazy addition optimization.
40+
- `--disable-optimize-format` disables optimization of matrix formats.
41+
42+
### Example
43+
44+
To solve the CFL-R problem using an incremental algorithm with a 60-second time limit for
45+
[indexed_tree.g](../test/pocr_data/indexed_an_bn/indexed_tree.g) and
46+
[an_bn_indexed.cnf](../test/pocr_data/indexed_an_bn/an_bn_indexed.cnf) and save the results to
47+
[results.txt](../results.txt), execute:
48+
49+
```bash
50+
cd .. # Should be run from CFPQ_PyAlgo project root directory
51+
python3 -m cfpq_cli.run_all_pairs_cflr \
52+
IncrementalAllPairsCFLReachabilityMatrix \
53+
test/pocr_data/indexed_an_bn/indexed_tree.g \
54+
test/pocr_data/indexed_an_bn/an_bn_indexed.cnf \
55+
--time-limit 60 \
56+
--out results.txt
57+
```
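
When the CLI is driven from a script, the same invocation can be assembled as an argument list. The following is a minimal sketch, not part of CFPQ_PyAlgo: `build_cflr_command` is a hypothetical helper, and it uses only the positional arguments and the `--time-limit` and `--out` options described above.

```python
import shlex
from typing import List, Optional


def build_cflr_command(
    algorithm: str,
    graph: str,
    grammar: str,
    time_limit: Optional[int] = None,
    out: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for cfpq_cli.run_all_pairs_cflr (hypothetical helper)."""
    command = ["python3", "-m", "cfpq_cli.run_all_pairs_cflr", algorithm, graph, grammar]
    if time_limit is not None:
        command += ["--time-limit", str(time_limit)]
    if out is not None:
        command += ["--out", out]
    return command


command = build_cflr_command(
    "IncrementalAllPairsCFLReachabilityMatrix",
    "test/pocr_data/indexed_an_bn/indexed_tree.g",
    "test/pocr_data/indexed_an_bn/an_bn_indexed.cnf",
    time_limit=60,
    out="results.txt",
)
# Print the command in a shell-safe form for inspection.
print(shlex.join(command))
```

To actually run the solver, the list can be passed to `subprocess.run(command, check=True)` from the CFPQ_PyAlgo project root.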
58+
59+
### Grammar Format
60+
61+
The grammar file should be formatted with each production rule on a separate line, adhering to the following schema:
62+
63+
```
64+
<LEFT_SYMBOL> [RIGHT_SYMBOL_1] [RIGHT_SYMBOL_2]
65+
```
66+
67+
- `<LEFT_SYMBOL>`: the symbol on the left-hand side of a production rule.
68+
- `[RIGHT_SYMBOL_1]` and `[RIGHT_SYMBOL_2]`: the symbols on the right-hand side of the production rule; each of them is optional.
69+
- The symbols must be separated by whitespace.
70+
- The last two lines specify the start symbol in the format:
71+
```
72+
Count:
73+
<START_SYMBOL>
74+
```
75+
76+
#### Example
77+
```
78+
S AS_i b_i
79+
AS_i a_i S
80+
S c
81+
82+
Count:
83+
S
84+
```
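
To make the grammar format concrete, here is a minimal reader sketch in Python. It is not the parser used by CFPQ_PyAlgo: `parse_grammar` is a hypothetical name, and the handling of blank lines (skipped) is an assumption based on the example above.

```python
from typing import List, Tuple


def parse_grammar(text: str) -> Tuple[str, List[Tuple[str, ...]]]:
    """Return (start_symbol, rules); each rule is (left, *right) with 0 to 2 right symbols."""
    # Assumption: blank lines carry no meaning and can be skipped.
    lines = [line.split() for line in text.splitlines() if line.strip()]
    # The last two non-empty lines must be "Count:" and the start symbol.
    if len(lines) < 2 or lines[-2] != ["Count:"] or len(lines[-1]) != 1:
        raise ValueError("grammar must end with 'Count:' followed by the start symbol")
    rules = []
    for symbols in lines[:-2]:
        if not 1 <= len(symbols) <= 3:
            raise ValueError(f"expected 1 to 3 symbols per rule, got: {symbols}")
        rules.append(tuple(symbols))
    return lines[-1][0], rules


EXAMPLE_GRAMMAR = """\
S AS_i b_i
AS_i a_i S
S c

Count:
S
"""

start, rules = parse_grammar(EXAMPLE_GRAMMAR)
print(start)
for rule in rules:
    print(rule[0], "->", " ".join(rule[1:]))
```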
85+
86+
### Graph Format
87+
88+
The graph file should represent edges using the format:
89+
90+
```
91+
<EDGE_SOURCE> <EDGE_DESTINATION> <EDGE_LABEL> [LABEL_INDEX]
92+
```
93+
94+
- `<EDGE_SOURCE>` and `<EDGE_DESTINATION>`: specify the source and destination nodes of an edge.
95+
- `<EDGE_LABEL>`: the label associated with the edge.
96+
- `[LABEL_INDEX]`: an optional index for labels with subscripts, indicating the subscript value.
97+
- The fields must be separated by whitespace.
98+
- Labels with subscripts must end with "\_i". For example, an edge $1 \xrightarrow{x_{10}} 2$ is denoted by `1 2 x_i 10`.
99+
100+
#### Example
101+
```
102+
1 2 a_i 1
103+
2 3 b_i 1
104+
2 4 b_i 2
105+
1 5 c
106+
```
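
The graph format can be read along the same lines. This is a sketch under stated assumptions rather than the loader shipped with CFPQ_PyAlgo: `Edge` and `parse_graph` are hypothetical names, vertices are assumed to be integers as in the example, and a line with a fourth field is assumed to require a label ending in `_i`.

```python
from typing import List, NamedTuple, Optional


class Edge(NamedTuple):
    source: int
    destination: int
    label: str
    index: Optional[int]  # subscript value, or None for labels without subscripts


def parse_graph(text: str) -> List[Edge]:
    edges = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) not in (3, 4):
            raise ValueError(f"line {line_number}: expected 3 or 4 fields")
        source, destination, label = int(fields[0]), int(fields[1]), fields[2]
        index = None
        if len(fields) == 4:
            # Assumption: an index is only valid for labels that end with "_i".
            if not label.endswith("_i"):
                raise ValueError(f"line {line_number}: indexed label must end with '_i'")
            index = int(fields[3])
        edges.append(Edge(source, destination, label, index))
    return edges


EXAMPLE_GRAPH = """\
1 2 a_i 1
2 3 b_i 1
2 4 b_i 2
1 5 c
"""

edges = parse_graph(EXAMPLE_GRAPH)
for edge in edges:
    print(edge)
```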

cfpq_eval/README.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
# CFPQ Evaluation
2+
3+
The `cfpq_eval` module in CFPQ_PyAlgo evaluates the performance of CFPQ solvers,
4+
integrating with both CFPQ_PyAlgo itself and third-party tools.
5+
6+
## Setting up the environment
7+
8+
Build and run a Docker container for evaluation using [Dockerfile-all-tools](../Dockerfile-all-tools).
9+
10+
Build Docker image:
11+
```
12+
docker build -f Dockerfile-all-tools -t cfpq_eval .
13+
```
14+
15+
Run Docker container:
16+
```
17+
docker run -it cfpq_eval bash
18+
```
19+
20+
## Running the Script
21+
22+
For detailed information on evaluation script options, execute the following command:
23+
24+
```bash
25+
cd .. # Should be run from CFPQ_PyAlgo project root directory
26+
python3 -m cfpq_eval.eval_all_pairs_cflr --help
27+
```
28+
29+
The basic command usage is as follows:
30+
31+
```
32+
python3 -m cfpq_eval.eval_all_pairs_cflr algo_config.csv data_config.csv results_path [--rounds ROUNDS] [--timeout TIMEOUT]
33+
```
34+
35+
- `algo_config.csv` specifies algorithm configurations.
36+
- `data_config.csv` specifies the dataset.
37+
- `results_path` specifies path for saving raw results.
38+
- `--rounds` sets the number of runs per configuration (default is 1).
39+
- `--timeout` limits each configuration's execution time in seconds (optional).
40+
41+
## Configuration Files
42+
43+
### Algorithm Configuration
44+
45+
The `algo_config.csv` outlines algorithms and settings. Supported algorithms:
46+
47+
- `IncrementalAllPairsCFLReachabilityMatrix`
48+
- `NonIncrementalAllPairsCFLReachabilityMatrix`
49+
- `pocr`
50+
- `pearl`
51+
- `graspan`
52+
- `gigascale`
53+
54+
For matrix-based algorithms, the options described in [cfpq_cli/README](../cfpq_cli/README.md)
55+
can be used to alter the behaviour.
56+
57+
#### Example
58+
59+
```
60+
algo_name,algo_settings
61+
"Matrix (some optimizations disabled)",IncrementalAllPairsCFLReachabilityMatrix --disable-optimize-empty --disable-lazy-add
62+
"pocr",pocr
63+
```
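
To illustrate how the two columns relate, the sketch below reads the example with Python's standard `csv` module and splits `algo_settings` into an algorithm name and its options. It is an illustration only: `read_algo_config` is a hypothetical helper, and splitting the settings with `shlex` is an assumption, not necessarily what `cfpq_eval` does internally.

```python
import csv
import io
import shlex
from typing import List, Tuple

ALGO_CONFIG = """\
algo_name,algo_settings
"Matrix (some optimizations disabled)",IncrementalAllPairsCFLReachabilityMatrix --disable-optimize-empty --disable-lazy-add
"pocr",pocr
"""


def read_algo_config(text: str) -> List[Tuple[str, str, List[str]]]:
    """Return (display name, algorithm, options) for every row (hypothetical helper)."""
    configs = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: the first token names the algorithm, the rest are its options.
        algorithm, *options = shlex.split(row["algo_settings"])
        configs.append((row["algo_name"], algorithm, options))
    return configs


configs = read_algo_config(ALGO_CONFIG)
for name, algorithm, options in configs:
    print(f"{name}: {algorithm} {options}")
```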
64+
65+
### Data Configuration
66+
67+
The `data_config.csv` pairs graph and grammar files;
68+
the referenced files should be in the format described in [cfpq_cli/README](../cfpq_cli/README.md).
69+
70+
#### Example
71+
72+
```
73+
graph_path,grammar_path
74+
data/graphs/aa/leela.g,data/grammars/aa.cnf
75+
data/graphs/java/eclipse.g,data/grammars/java_points_to.cnf
76+
```
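
A data configuration like the one above can be checked before a long evaluation run. The following sketch, with the hypothetical helper `read_data_config`, only reads the two documented columns into path pairs; it assumes nothing about how `cfpq_eval` itself loads the file.

```python
import csv
import io
from pathlib import Path
from typing import List, Tuple

DATA_CONFIG = """\
graph_path,grammar_path
data/graphs/aa/leela.g,data/grammars/aa.cnf
data/graphs/java/eclipse.g,data/grammars/java_points_to.cnf
"""


def read_data_config(text: str) -> List[Tuple[Path, Path]]:
    """Return (graph path, grammar path) pairs in file order (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(Path(row["graph_path"]), Path(row["grammar_path"])) for row in reader]


pairs = read_data_config(DATA_CONFIG)
for graph_path, grammar_path in pairs:
    # Report missing files up front instead of failing in the middle of an evaluation.
    status = "ok" if graph_path.exists() and grammar_path.exists() else "missing"
    print(graph_path, grammar_path, status)
```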
77+
78+
## Interpreting Results
79+
80+
Raw data is saved to `results_path`, while a quick summary, including mean execution time,
81+
memory usage, and output size, is printed to the standard output stream.
82+
83+
## Custom Tools Integration
84+
85+
Additional CFPQ solvers can be added to the evaluation by implementing the `AllPairsCflrToolRunner` interface
86+
and updating the `run_appropriate_all_pairs_cflr_tool()` function.
