If all the above steps execute successfully, FastDeploy is installed correctly.
-
-## Quick start
-
-The P800 supports the deployment of the ```ERNIE-4.5-300B-A47B-Paddle``` model using the following configurations (Note: Different configurations may result in variations in performance).
-- 32K WINT4 with 8 XPUs (Recommended)
-- 128K WINT4 with 8 XPUs
-- 32K WINT4 with 4 XPUs
-
-### Online serving (OpenAI API-Compatible server)
-
-Deploy an OpenAI API-compatible server using FastDeploy with the following commands:
-
-#### Start service
-
-**Deploy the ERNIE-4.5-300B-A47B-Paddle model with WINT4 precision and 32K context length on 8 XPUs (Recommended)**
+Alternatively, you can download the latest versions of XTDK and XVLLM (Not recommended)
{"role": "user", "content": "Where is the capital of China?"},
-    ],
-    stream=True,
-)
-for chunk in response:
-    if chunk.choices[0].delta:
-        print(chunk.choices[0].delta.content, end='')
-print('\n')
-```
+If all the above steps execute successfully, FastDeploy is installed correctly.

-For detailed OpenAI protocol specifications, see [OpenAI Chat Completion API](https://platform.openai.com/docs/api-reference/chat/create). Differences from the standard OpenAI protocol are documented in [OpenAI Protocol-Compatible API Server](../../online_serving/README.md).
+## How to deploy services on Kunlunxin XPU
+Refer to [**Supported Models and Service Deployment**](../../usage/kunlunxin_xpu_deployment.md) for details on the supported models and how to deploy services on Kunlunxin XPU.
{"role": "user", "content": "Where is the capital of China?"},
+    ],
+    stream=True,
+)
+for chunk in response:
+    if chunk.choices[0].delta:
+        print(chunk.choices[0].delta.content, end='')
+print('\n')
+```
+
+For detailed OpenAI protocol specifications, see [OpenAI Chat Completion API](https://platform.openai.com/docs/api-reference/chat/create). Differences from the standard OpenAI protocol are documented in [OpenAI Protocol-Compatible API Server](../../online_serving/README.md).
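
The streaming loop in this change prints `delta.content` for every chunk. With OpenAI-style streaming, some chunks may carry no text (for example a role-only or final chunk), in which case `content` can be `None`. The sketch below shows one way a client could accumulate only the text-bearing chunks. The helper names `collect_stream` and `fake_chunk` are hypothetical and exist only for this illustration; the fake chunks stand in for a real server response so the example runs without a deployed service.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text carried by a stream of chat-completion-style chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # skip chunks that carry no choices at all
        delta = chunk.choices[0].delta
        content = getattr(delta, "content", None) if delta is not None else None
        if content:  # ignore None and empty strings
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    # Hypothetical stand-in for a streamed chunk; not a FastDeploy or openai type.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream: two text chunks, one chunk without text, one without choices.
demo = [fake_chunk("Bei"), fake_chunk(None), SimpleNamespace(choices=[]), fake_chunk("jing")]
print(collect_stream(demo))  # prints "Beijing"
```

Against a real server, the same function could presumably be passed the `response` iterator from the snippet in the diff, assuming its chunks expose `choices[0].delta.content` as the original loop expects.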