Commit 19ba977 (parent 81a0d81) — Updated API docs: docs/guides/API.md, +215 lines

# LocalLab API Documentation

## Base URL

When making API requests, use one of the following base URLs:

- **Local development**: `http://localhost:8000`
- **Remote access**: Use your ngrok URL (e.g., `https://abcd1234.ngrok.io`)

For all examples below, replace `{BASE_URL}` with your actual base URL.

```bash
# For local development
export BASE_URL=http://localhost:8000

# For remote access via ngrok
export BASE_URL=https://your-ngrok-url.ngrok.io
```
## REST API Endpoints
### Text Generation
#### POST `/generate`

Generate text using the loaded model.

**Example (curl):**

```bash
# Basic generation with minimal parameters
curl -X POST "${BASE_URL}/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms"
  }'

# Generation with all parameters
curl -X POST "${BASE_URL}/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms",
    "model_id": null,
    "stream": false,
    "max_length": 8192,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 80,
    "repetition_penalty": 1.15
  }'

# Streaming generation
curl -X POST "${BASE_URL}/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms",
    "stream": true
  }'
```
**Error Responses:**

- `400 Bad Request`: Invalid parameters
#### POST `/chat`

Chat completion endpoint similar to OpenAI's API.

**Example (curl):**

```bash
# Basic chat with minimal parameters
curl -X POST "${BASE_URL}/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

# Chat with all parameters
curl -X POST "${BASE_URL}/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "model_id": null,
    "stream": false,
    "max_length": 8192,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 80,
    "repetition_penalty": 1.15
  }'

# Streaming chat
curl -X POST "${BASE_URL}/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "stream": true
  }'
```
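For longer conversations, inlining the JSON after `-d` gets hard to read. A quoted here-document keeps the payload legible; this is a sketch using curl's standard `@-` syntax for reading the request body from stdin (the message content here is just an illustration):

```bash
# Build the request payload with a quoted here-document (quoting EOF
# prevents shell expansion inside the JSON), sent via curl's @- stdin form.
curl -X POST "${BASE_URL}/chat" \
  -H "Content-Type: application/json" \
  -d @- <<'EOF'
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."}
  ]
}
EOF
```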
### Batch Generation
#### POST `/generate/batch`
Generate text for multiple prompts in parallel.

**Example (curl):**

```bash
# Basic batch generation with minimal parameters
curl -X POST "${BASE_URL}/generate/batch" \
  -H "Content-Type: application/json" \
  -d '{
    "prompts": [
      "Write a haiku about nature",
      "Tell a short joke",
      "Give a fun fact about space"
    ]
  }'

# Batch generation with all parameters
curl -X POST "${BASE_URL}/generate/batch" \
  -H "Content-Type: application/json" \
  -d '{
    "prompts": [
      "Write a haiku about nature",
      "Tell a short joke",
      "Give a fun fact about space"
    ],
    "model_id": null,
    "max_length": 8192,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 80,
    "repetition_penalty": 1.15
  }'
```
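If your prompts live in a file rather than in the script, you can assemble the `prompts` array from the shell. A sketch using `python3` for safe JSON quoting (the filename `prompts.txt` is hypothetical; `jq` would work equally well):

```bash
# Build a /generate/batch payload from a file of prompts, one per line.
# python3 handles the JSON escaping, so quotes inside prompts are safe.
payload=$(python3 -c '
import json, sys
prompts = [line.rstrip("\n") for line in sys.stdin if line.strip()]
print(json.dumps({"prompts": prompts}))
' < prompts.txt)

curl -X POST "${BASE_URL}/generate/batch" \
  -H "Content-Type: application/json" \
  -d "$payload"
```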
### Model Management
#### POST `/models/load`
Load a specific model.

**Example (curl):**

```bash
# Load a specific model
curl -X POST "${BASE_URL}/models/load" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "microsoft/phi-2"
  }'
```
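After a load request, you can confirm the switch took effect by inspecting `/models/current`. A minimal sketch that makes no assumptions about the response's field names, using Python's stdlib pretty-printer:

```bash
# Pretty-print the current model info to confirm the load took effect
curl -s -X GET "${BASE_URL}/models/current" | python3 -m json.tool
```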
#### GET `/models/current`
Get information about the currently loaded model.

**Example (curl):**

```bash
# Get current model information
curl -X GET "${BASE_URL}/models/current"
```
#### GET `/models/available`
List all available models in the registry.

**Example (curl):**

```bash
# List all available models
curl -X GET "${BASE_URL}/models/available"
```

#### POST `/models/unload`

Unload the current model to free up resources.

**Example (curl):**

```bash
# Unload the current model
curl -X POST "${BASE_URL}/models/unload"
```
### System Information
#### GET `/system/info`

Get detailed system information.

**Example (curl):**

```bash
# Get system information
curl -X GET "${BASE_URL}/system/info"
```

#### GET `/health`

Check the health status of the server.

**Example (curl):**

```bash
# Check server health
curl -X GET "${BASE_URL}/health"
```
## Error Handling
All endpoints return appropriate HTTP status codes, and error responses include a detail message.

Rate limits:

- 60 requests per minute
- Burst size of 10 requests

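Status codes can be inspected separately from the response body, which helps when scripting against these limits. A sketch (it assumes the server is running at `${BASE_URL}`, and that rate-limited requests are rejected with `429` — an assumption, not something these docs confirm):

```bash
# Capture the HTTP status code while saving the body to a file.
status=$(curl -s -o /tmp/locallab_resp.json -w "%{http_code}" \
  -X POST "${BASE_URL}/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}')

if [ "$status" = "200" ]; then
  cat /tmp/locallab_resp.json
elif [ "$status" = "429" ]; then
  # Assumed rate-limit status; pause before retrying.
  echo "Rate limited; retry after a pause" >&2
  sleep 2
else
  echo "Request failed with status $status; body:" >&2
  cat /tmp/locallab_resp.json >&2
fi
```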
## Tips for Using the API

### Default Parameters

All generation endpoints have sensible defaults for the response-quality parameters:

- `max_length`: 8192 tokens
- `temperature`: 0.7
- `top_p`: 0.9
- `top_k`: 80
- `repetition_penalty`: 1.15

You can omit any or all of these parameters in your requests, and the server will use these defaults.

### Testing with Different Parameters

When experimenting with different parameter values, here's what to try:

- For more creative responses: increase `temperature` (0.8-1.0) and `top_p` (0.95-1.0)
- For more focused responses: decrease `temperature` (0.3-0.5) and `top_p` (0.5-0.7)
- For less repetition: increase `repetition_penalty` (1.2-1.5)
- For longer responses: increase `max_length` (up to 16384)

### Handling Streaming Responses

When using streaming endpoints (`stream: true`), the response is sent as a series of Server-Sent Events (SSE). Each event starts with `data: ` followed by the token or chunk; the end of the stream is marked with `data: [DONE]`.

```bash
# Example of processing streaming responses with bash.
# -s silences the progress meter; -N disables output buffering so
# tokens are printed as soon as they arrive.
curl -sN -X POST "${BASE_URL}/generate" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello", "stream": true}' | while read -r line; do
  if [[ $line == data:* ]]; then
    content=${line#data: }
    if [[ $content != "[DONE]" ]]; then
      echo -n "$content"
    fi
  fi
done
echo  # final newline
```
## Related Documentation

- [Getting Started](./getting-started.md)
