`max_split_size_mb` prevents the native allocator from splitting blocks larger than this size (in MB). This can reduce fragmentation and may allow some borderline workloads to complete without running out of memory. You can find more details [<u>here</u>](https://www.hiascend.com/document/detail/zh/CANNCommunityEdition/800alpha003/apiref/envref/envref_07_0061.html).
:::
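The allocator option is configured through an environment variable before launching inference. As a minimal sketch, assuming the option is exposed via `PYTORCH_NPU_ALLOC_CONF` as described in the linked CANN documentation (the exact variable name, syntax, and value should be checked there; the size below is only an illustration):

```bash
# Assumption: max_split_size_mb is set through PYTORCH_NPU_ALLOC_CONF;
# see the linked CANN documentation for the authoritative syntax.
# 256 MB is an arbitrary starting point; tune it for your workload.
export PYTORCH_NPU_ALLOC_CONF="max_split_size_mb:256"
```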
Install the packages required for audio processing:
```bash
pip config set global.index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
pip install librosa soundfile
```
Run the following script to execute offline inference on a single NPU:
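A minimal sketch of such a script is shown below, assuming the `Qwen/Qwen2-Audio-7B-Instruct` model and the audio samples bundled with vLLM (`mary_had_lamb` and `winning_call`); these specifics are assumptions chosen to match the expected output that follows, not confirmed details of the original tutorial:

```python
from vllm import LLM, SamplingParams
from vllm.assets.audio import AudioAsset

# Two sample clips shipped with vLLM: a nursery rhyme and a baseball call.
audio_assets = [AudioAsset("mary_had_lamb"), AudioAsset("winning_call")]
question = "What sport and what nursery rhyme are referenced?"

# Qwen2-Audio chat template with one <|AUDIO|> placeholder per clip.
audio_in_prompt = "".join(
    f"Audio {i + 1}: <|audio_bos|><|AUDIO|><|audio_eos|>\n"
    for i in range(len(audio_assets)))
prompt = ("<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
          "<|im_start|>user\n"
          f"{audio_in_prompt}{question}<|im_end|>\n"
          "<|im_start|>assistant\n")

llm = LLM(
    model="Qwen/Qwen2-Audio-7B-Instruct",
    max_model_len=4096,
    max_num_seqs=5,
    # Allow up to two audio items per prompt.
    limit_mm_per_prompt={"audio": len(audio_assets)},
)

outputs = llm.generate(
    {
        "prompt": prompt,
        "multi_modal_data": {
            "audio": [asset.audio_and_sample_rate for asset in audio_assets]
        },
    },
    SamplingParams(temperature=0.2, max_tokens=64),
)

print(outputs[0].outputs[0].text)
```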
If the script runs successfully, you should see output similar to the following:
```bash
The sport referenced is baseball, and the nursery rhyme is 'Mary Had a Little Lamb'.
```
### Online Serving on Single NPU
Currently, vLLM's OpenAI-compatible server does not support audio inputs. You can find more details [<u>here</u>](https://github.com/vllm-project/vllm/issues/19977).