Commit be2a2bb

Update Neural Solution Docs (#1026)
Signed-off-by: yiliu30 <yi4.liu@intel.com>
1 parent 1c7ef9e commit be2a2bb

13 files changed, +442 -174 lines changed

.azure-pipelines/scripts/codeScan/pyspelling/inc_dict.txt

Lines changed: 3 additions & 0 deletions
````diff
@@ -2642,7 +2642,10 @@ itemStyle
 NewDataloader
 subprocesses
 LayoutLM
+bfc
+cb
 CCE
 CCFF
 FFFFFF
 classDef
+bdf
````

neural_solution/README.md

Lines changed: 4 additions & 4 deletions
````diff
@@ -4,7 +4,8 @@ Neural Solution is a flexible and easy to use tool that brings the capabilities
 
 # Why Neural Solution?
 
-- Efficiency: Neural Solution accelerates the optimization process by seamlessly parallelizing the tuning across multiple nodes.
+- Task Parallelism: Neural Solution automatically schedules the optimization task queue by coordinating available resources and allows execution of multiple optimization tasks simultaneously.
+- Tuning Parallelism: Neural Solution accelerates the optimization process by seamlessly parallelizing the tuning across multiple nodes.
 - APIs Support: Neural Solution supports both RESTful and gRPC APIs, enabling users to conveniently submit optimization tasks.
 - Code Less: When working with Hugging Face models, Neural Solution seamlessly integrates the functionality of the [Neural Coder](https://github.com/intel/neural-compressor/tree/master/neural_coder), eliminating the need for any code modifications during the optimization process.
 
@@ -13,13 +14,12 @@ Neural Solution is a flexible and easy to use tool that brings the capabilities
 
 # Get Started
 ## Installation
-<details>
-<summary>Prerequisites</summary>
+### Prerequisites
 
 - Install [Anaconda](https://docs.anaconda.com/free/anaconda/install/)
 - Install [Open MPI](https://www.open-mpi.org/faq/?category=building#easy-build)
 - Python 3.8 or later
-</details>
+
 
 There are two ways to install the neural solution:
 ### Method 1. Using pip:
````

neural_solution/docs/source/README.md

Lines changed: 109 additions & 105 deletions
````diff
@@ -1,124 +1,128 @@
-## Design Doc for Optimization as a Service [WIP]
-
-
-
-### Contents
-
-- [Design Doc for Optimization as a Service \[WIP\]](#design-doc-for-optimization-as-a-service-wip)
-  - [Contents](#contents)
-  - [Overview](#overview)
-  - [Workflow of OaaS](#workflow-of-oaas)
-  - [Class definition diagram](#class-definition-diagram)
-  - [Extensibility](#extensibility)
-
-### Overview
-
-Optimization as a service(OaaS) is a platform that enables users to submit quantization tasks for their models and automatically dispatches these tasks to one or multiple nodes for accuracy-aware tuning. OaaS is designed to parallelize the tuning process in two levels: tuning and model. At the tuning level, OaaS execute the tuning process across multiple nodes for one model. At the model level, OaaS allocate free nodes to incoming requests automatically.
-
-
-### Workflow of OaaS
-
-```mermaid
-sequenceDiagram
-    participant Studio
-    participant TaskMonitor
-    participant Scheduler
-    participant Cluster
-    participant TaskLauncher
-    participant ResultMonitor
-    Par receive task
-    Studio ->> TaskMonitor: P1-1. Post quantization Request
-    TaskMonitor ->> TaskMonitor: P1-2. Add task to task DB
-    TaskMonitor ->> Studio: P1-3. Task received notification
-    and Schedule task
-    loop
-    Scheduler ->> Scheduler: P2-1. Pop task from task DB
-    Scheduler ->> Cluster: P2-2. Apply for resources
-    Note over Scheduler, Cluster: the number of Nodes
-    Cluster ->> Cluster: P2-3. Check the status of nodes in cluster
-    Cluster ->> Scheduler: P2-4. Resources info
-    Note over Scheduler, Cluster: host:socket list
-    Scheduler ->> TaskLauncher: P2-5. Dispatch task
-    end
-    and Run task
-    TaskLauncher ->> TaskLauncher: P3-1. Run task
-    Note over TaskLauncher, TaskLauncher: mpirun -np 4 -hostfile hostfile python main.py
-    TaskLauncher ->> TaskLauncher: P3-2. Wait task to finish...
-    TaskLauncher ->> Cluster: P3-3. Free resource
-    TaskLauncher ->> ResultMonitor: P3-4. Report the Acc and Perf
-    ResultMonitor ->> Studio: P3-5. Post result to Studio
-    and Query task status
-    Studio ->> ResultMonitor: P4-1. Query the status of the submitted task
-    ResultMonitor ->> Studio: P4-2. Post the status of queried task
-    End
-
+# Get started
+
+- [Get started](#get-started)
+  - [Install Neural Solution](#install-neural-solution)
+    - [Prerequisites](#prerequisites)
+    - [Method 1. Using pip](#method-1-using-pip)
+    - [Method 2. Building from source](#method-2-building-from-source)
+  - [Start service](#start-service)
+  - [Submit task](#submit-task)
+  - [Query task status](#query-task-status)
+  - [Stop service](#stop-service)
+  - [Inspect logs](#inspect-logs)
+
+## Install Neural Solution
+### Prerequisites
+- Install [Anaconda](https://docs.anaconda.com/free/anaconda/install/)
+- Install [Open MPI](https://www.open-mpi.org/faq/?category=building#easy-build)
+- Python 3.8 or later
+
+There are two ways to install the neural solution:
+### Method 1. Using pip
 ```
+pip install neural-solution
+```
+### Method 2. Building from source
 
-The optimization process is divided into four parts, each executed in separate threads.
-
-- Part 1. Posting new quantization task. (P1-1 -> P1-2 -> P1-3)
-
-- Part 2. Resource allocation and scheduling. (P2-1 -> P2-2 -> P2-3 -> P2-4 -> P2-5)
-
-- Part 3. Task execution and reporting. (P3-1 -> P3-2 -> P3-3 -> P3-4 -> P3-5)
+```shell
+# get source code
+git clone https://github.com/intel/neural-compressor
+cd neural-compressor
 
-- Part 4. Updating the status. (P4-1 -> P4-2)
+# install neural compressor
+pip install -r requirements.txt
+python setup.py install
 
-### Class definition diagram
+# install neural solution
+pip install -r neural_solution/requirements.txt
+python setup.py neural_solution install
+```
 
+## Start service
+
+```shell
+# Start neural solution service with custom configuration
+neural_solution start --task_monitor_port=22222 --result_monitor_port=33333 --restful_api_port=8001
+
+# Help Manual
+neural_solution -h
+# Help output
+
+usage: neural_solution {start,stop} [-h] [--hostfile HOSTFILE] [--restful_api_port RESTFUL_API_PORT] [--grpc_api_port GRPC_API_PORT]
+                       [--result_monitor_port RESULT_MONITOR_PORT] [--task_monitor_port TASK_MONITOR_PORT] [--api_type API_TYPE]
+                       [--workspace WORKSPACE] [--conda_env CONDA_ENV] [--upload_path UPLOAD_PATH]
+
+Neural Solution
+
+positional arguments:
+  {start,stop}          start/stop service
+
+optional arguments:
+  -h, --help            show this help message and exit
+  --hostfile HOSTFILE   start backend serve host file which contains all available nodes
+  --restful_api_port RESTFUL_API_PORT
+                        start restful serve with {restful_api_port}, default 8000
+  --grpc_api_port GRPC_API_PORT
+                        start gRPC with {restful_api_port}, default 8000
+  --result_monitor_port RESULT_MONITOR_PORT
+                        start serve for result monitor at {result_monitor_port}, default 3333
+  --task_monitor_port TASK_MONITOR_PORT
+                        start serve for task monitor at {task_monitor_port}, default 2222
+  --api_type API_TYPE   start web serve with all/grpc/restful, default all
+  --workspace WORKSPACE
+                        neural solution workspace, default "./ns_workspace"
+  --conda_env CONDA_ENV
+                        specify the running environment for the task
+  --upload_path UPLOAD_PATH
+                        specify the file path for the tasks
 
+```
 
-```mermaid
-classDiagram
+## Submit task
 
+- For RESTful API: `[user@server hf_model]$ curl -H "Content-Type: application/json" --data @./task.json http://localhost:8000/task/submit/`
+- For gRPC API: `python -m neural_solution.frontend.gRPC.client submit --request="test.json"`
 
+> For more details, please reference the [API description](./description_api.md) and [examples](../../examples/README.md).
 
-TaskDB "1" --> "*" Task
-TaskMonitor --> TaskDB
-ResultMonitor --> TaskDB
-Scheduler --> TaskDB
-Scheduler --> Cluster
+## Query task status
 
+Query the task status and result according to the `task_id`.
 
-class Task{
-  + status
-  + get_status()
-  + update_status()
-}
+- For RESTful API: `[user@server hf_model]$ curl -X GET http://localhost:8000/task/status/{task_id}`
+- For gRPC API: `python -m neural_solution.frontend.gRPC.client query --task_id={task_id}`
 
-class TaskDB{
-  - task_collections
-  + append_task()
-  + get_all_pending_tasks()
-  + update_task_status()
-}
-class TaskMonitor{
-  - task_db
-  + wait_new_task()
-}
-class Scheduler{
-  - task_db
-  - cluster
-  + schedule_tasks()
-  + dispatch_task()
-  + launch_task()
-}
+> For more details, please reference the [API description](./description_api.md) and [examples](../../examples/README.md).
 
-class ResultMonitor{
-  - task_db
-  + query_task_status()
-}
-class Cluster{
-  - node_list
-  + free()
-  + reserve_resource()
-  + get_node_status()
-}
+## Stop service
 
+```shell
+# Stop neural solution service with default configuration
+neural_solution stop
 ```
 
+## Inspect logs
+
+The default logs locate in `./ns_workspace/`. Users can specify a custom workspace by using `neural_solution ---workspace=/path/to/custom/workspace`.
+
+There are several logs under workspace:
+
+```shell
+(ns) [username@servers ns_workspace]$ tree
+.
+├── db
+│   └── task.db # database to save the task-related information
+├── serve_log # service running log
+│   ├── backend.log # backend log
+│   ├── frontend_grpc.log # grpc frontend log
+│   └── frontend.log # HTTP/RESTful frontend log
+├── task_log # overall log for each task
+│   ├── task_bdf0bd1b2cc14bc19bce12d4f9b333c7.txt # task log
+│   └── ...
+└── task_workspace # the log for each task
+    ...
+    ├── bdf0bd1b2cc14bc19bce12d4f9b333c7 # task_id
+    ...
 
-### Extensibility
-
-- The service can be deployed on various resource pool, including a set of worker nodes, such as a local cluster or cloud cluster (AWS and GCP).
+```
 
````
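Editor's note on the new "Submit task" section above: the diff shows how to post `task.json` but not what goes inside it. Below is a minimal, hypothetical submission sketch in Python. Only the endpoint URL and Content-Type header come from the diff itself; the payload field names are placeholders, and the real task schema is defined in the updated description_api.md and the examples.

```python
# Minimal sketch of a task submission; assumptions are flagged inline.
import json

import requests

# Placeholder payload: these field names are illustrative only. Consult
# neural_solution/docs/source/description_api.md for the actual task schema.
task = {
    "script_url": "https://example.com/my_quantization_script",  # hypothetical field
    "arguments": [],                                             # hypothetical field
    "workers": 1,                                                # hypothetical field
}

# Endpoint and header taken verbatim from the Submit task section of the diff.
resp = requests.post(
    "http://localhost:8000/task/submit/",
    headers={"Content-Type": "application/json"},
    data=json.dumps(task),
    timeout=30,
)
print(resp.status_code, resp.text)  # response schema is not shown in this diff
```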

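Similarly, for the new "Query task status" section: a small polling loop against the status endpoint. The `task_id` here is the example id that appears in the log tree above; the response format is not documented in this diff, so the sketch prints the raw body and uses an assumed terminal-status marker.

```python
# Minimal status-polling sketch; assumptions are flagged inline.
import time

import requests

# Example id borrowed from the task_log tree in the diff above.
task_id = "bdf0bd1b2cc14bc19bce12d4f9b333c7"

# Endpoint taken from the Query task status section; the response body is
# printed as-is because its schema is not shown in this diff.
for _ in range(30):  # poll for at most ~5 minutes
    resp = requests.get(f"http://localhost:8000/task/status/{task_id}", timeout=30)
    print(resp.text)
    if "done" in resp.text.lower():  # assumption: a terminal status string
        break
    time.sleep(10)  # re-check every 10 seconds
```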
neural_solution/docs/source/description_api.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -1,4 +1,4 @@
-# Neural Solution OaaS API Documentation
+# Neural Solution API
 
 Welcome to Neural Solution OaaS API documentation. This API documentation provides a detailed description of all the endpoints available in Neural Solution OaaS API.
 
````

neural_solution/docs/source/get_started.md

Lines changed: 0 additions & 18 deletions
This file was deleted.
