Merge pull request #48 from mutablelogic/v1

djthorpe · web-flow · commit a3d6833b2fe2 · 2024-07-31T09:47:30.000+02:00
V1
diff --git a/README.md b/README.md
@@ -38,7 +38,7 @@ available at `http://localhost:8080/v1` and it generally conforms to the
 In order to download a model, you can use the following command (for example):
 
 ```bash
-curl -X POST -H "Content-Type: application/json" -d '{"Path" : "ggml-medium-q5_0.bin" }' localhost:8080/v1/models  
+curl -X POST -H "Content-Type: application/json" -d '{"Path" : "ggml-medium-q5_0.bin" }' localhost:8080/v1/models\?stream=true
 ```
 
 To list the models available, you can use the following command:
@@ -56,13 +56,13 @@ curl -X DELETE localhost:8080/v1/models/ggml-medium-q5_0
 To transcribe a media file into its original language, you can use the following command:
 
 ```bash
-curl -F model=ggml-medium-q5_0 -F file=@samples/jfk.wav localhost:8080/v1/audio/transcriptions
+curl -F model=ggml-medium-q5_0 -F file=@samples/jfk.wav localhost:8080/v1/audio/transcriptions\?stream=true
 ```
 
 To translate a media file into a different language, you can use the following command:
 
 ```bash
-curl -F model=ggml-medium-q5_0 -F file=@samples/de-podcast.wav -F language=en localhost:8080/v1/audio/transcriptions\?stream=true
+curl -F model=ggml-medium-q5_0 -F file=@samples/ge-podcast.wav -F language=en localhost:8080/v1/audio/translations\?stream=true
 ```
 
 There's more information on the API [here](doc/API.md).
diff --git a/doc/API.md b/doc/API.md
@@ -59,6 +59,24 @@ Downloads a model from remote huggingface repository. If the optional `stream` a
 the progress is streamed back to the client as a series of [text/event-stream](https://html.spec.whatwg.org/multipage/server-sent-events.html) events.
 
 If the model is already downloaded, a 200 OK status is returned. If the model was downloaded by this request, a 201 Created status is returned.
+Example streaming response:
+
+```text
+event: ping
+
+event: progress
+data: {"status":"downloading ggml-medium-q5_0.bin","total":539212467,"completed":10159256}
+
+event: progress
+data: {"status":"downloading ggml-medium-q5_0.bin","total":539212467,"completed":21895036}
+
+event: progress
+data: {"status":"downloading ggml-medium-q5_0.bin","total":539212467,"completed":33540592}
+
+event: ok
+data: {"id":"ggml-medium-q5_0","object":"model","path":"ggml-medium-q5_0.bin","created":1722411778}
+```
+
 
 ### Delete Model
 
@@ -102,6 +120,28 @@ Transcribes audio into the input language.
 
 If the optional `stream` argument is true, the segments of the transcription are returned as a series of [text/event-stream](https://html.spec.whatwg.org/multipage/server-sent-events.html) events. Otherwise, the full transcription is returned in the response body.
 
+Example streaming response:
+  
+```text
+event: ping
+
+event: task
+data: {"task":"translate","language":"en","duration":62.6155}
+
+event: ping
+
+event: segment
+data: {"id":0,"start":0,"end":14.2,"text":" What do you think about new media like Facebook, emails and cell phones?"}
+
+event: segment
+data: {"id":1,"start":14.2,"end":18.2,"text":" The new media make our life much easier."}
+
+event: segment
+data: {"id":2,"start":18.2,"end":23,"text":" You can get in touch with people much faster than before."}
+
+event: ok
+```
+
 ### Translation
 
 This is the same as transcription (above) except that the `language` parameter is not optional, and should be the language to translate the audio into.
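
The streaming responses documented in this change are plain `text/event-stream` bodies, so a client can split them into events with a few lines of code. The following is a minimal sketch in Python, not code from this repository: `parse_events`, `transcript`, and `SAMPLE` are hypothetical names chosen for illustration, and the sample body is copied from the transcription example above. It assumes every `data:` payload is JSON, as in the examples. Unlike a browser `EventSource`, which does not dispatch events that carry no data, this sketch keeps data-less events such as `ping` and `ok` so that the end of the stream is visible.

```python
import json

# Sample body copied from the streaming transcription example (shortened).
SAMPLE = """event: ping

event: task
data: {"task":"translate","language":"en","duration":62.6155}

event: segment
data: {"id":0,"start":0,"end":14.2,"text":" What do you think about new media like Facebook, emails and cell phones?"}

event: segment
data: {"id":1,"start":14.2,"end":18.2,"text":" The new media make our life much easier."}

event: ok
"""


def parse_events(text):
    """Split a text/event-stream body into (event_name, data) pairs.

    data is the decoded JSON payload, or None when the event has no data.
    Assumption: all data payloads are JSON, as in the documented examples.
    """
    events = []
    name, data_lines = None, []
    # The extra "" guarantees the final event is flushed even if the body
    # does not end with a blank line.
    for line in text.splitlines() + [""]:
        if line == "":
            # A blank line terminates the current event, if there is one.
            if name is not None or data_lines:
                payload = json.loads("\n".join(data_lines)) if data_lines else None
                events.append((name or "message", payload))
            name, data_lines = None, []
        elif line.startswith(":"):
            continue  # comment line, ignored
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # drop the single optional space after ":"
            if field == "event":
                name = value
            elif field == "data":
                data_lines.append(value)
    return events


def transcript(events):
    """Join the text of all segment events into one string."""
    return "".join(d["text"] for e, d in events if e == "segment").strip()


if __name__ == "__main__":
    for name, data in parse_events(SAMPLE):
        print(name, data)
    print(transcript(parse_events(SAMPLE)))
```

The same `parse_events` function should also work for the model download stream, where the `progress` payloads carry `total` and `completed` byte counts. When reading from a live connection rather than a complete string, the loop body would be applied line by line as the response arrives.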