2. [Performance Optimization](#performance-optimization)
3. [Model Management](#model-management)
4. [System Configuration](#system-configuration)
5. [CLI Configuration](#cli-configuration)
6. [Best Practices](#best-practices)

## Advanced Features

### Custom Model Loading

**Using CLI (New!)**

```bash
# Load a custom model with the CLI
locallab start --model meta-llama/Llama-2-7b-chat-hf
```

**Using Environment Variables**

```python
import os
from locallab import start_server
# ...
```

### 1. Memory Optimization

**Using CLI (New!)**

```bash
# Enable memory optimizations via CLI
locallab start --quantize --quantize-type int8 --attention-slicing
```

**Using Environment Variables**

```python
# Enable memory optimizations
os.environ["LOCALLAB_ENABLE_QUANTIZATION"] = "true"
# ...
```

### 2. Speed Optimization

**Using CLI (New!)**

```bash
# Enable speed optimizations via CLI
locallab start --flash-attention --better-transformer
```

**Using Environment Variables**

```python
# Enable speed optimizations
os.environ["LOCALLAB_ENABLE_FLASH_ATTENTION"] = "true"
# ...
```

```python
# Configure file logging
os.environ["LOCALLAB_ENABLE_FILE_LOGGING"] = "true"
os.environ["LOCALLAB_LOG_FILE"] = "locallab.log"
```

## CLI Configuration

The LocalLab CLI provides a powerful way to configure and manage your server. Here are some advanced CLI features:

### Interactive Configuration Wizard

```bash
# Run the configuration wizard
locallab config
```

### System Information

```bash
# Get detailed system information
locallab info
```

### Advanced CLI Options

```bash
# Start with advanced configuration
locallab start \
  --model microsoft/phi-2 \
  --port 8080 \
  --quantize \
  --quantize-type int4 \
  --attention-slicing \
  --flash-attention \
  --better-transformer
```

### Persistent Configuration

The CLI stores your configuration in `~/.locallab/config.json`. You can edit this file directly for advanced configuration:

```json
{
  "model_id": "microsoft/phi-2",
  "port": 8080,
  "enable_quantization": true,
  "quantization_type": "int8",
  "enable_attention_slicing": true,
  "enable_flash_attention": true,
  "enable_better_transformer": true
}
```
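
Because the file is plain JSON, it can also be read and updated from a script. The snippet below is a minimal sketch rather than an official LocalLab API; it assumes only the default path and the keys shown in the example above:

```python
import json
from pathlib import Path

# Default location used by the CLI, as described above.
config_path = Path.home() / ".locallab" / "config.json"

# Load the current configuration.
config = json.loads(config_path.read_text())

# Example tweak: switch to 4-bit quantization (mirrors --quantize-type int4).
config["quantization_type"] = "int4"

# Write the updated configuration back.
config_path.write_text(json.dumps(config, indent=2))
```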

For more details, see the [CLI Guide](./cli.md).

## Best Practices

1. **Resource Management**

   - Monitor system resources (see the sketch below)
   - Use appropriate quantization
   - Enable optimizations based on hardware
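
   A minimal sketch of how you might act on these points, using `psutil` and `torch` to inspect RAM and VRAM before picking a quantization type. Neither library is required by LocalLab for this, and the thresholds are illustrative assumptions:

   ```python
   import psutil
   import torch

   # Free system RAM in GiB.
   free_ram_gib = psutil.virtual_memory().available / 1024**3

   # Total GPU memory in GiB, if a CUDA device is present.
   gpu_gib = 0.0
   if torch.cuda.is_available():
       gpu_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3

   # Choose a quantization type from the available memory (illustrative thresholds).
   if gpu_gib >= 16:
       quantize_type = None      # enough VRAM to run the model unquantized
   elif gpu_gib >= 8:
       quantize_type = "int8"
   else:
       quantize_type = "int4"    # small GPU or CPU-only machine

   print(f"RAM free: {free_ram_gib:.1f} GiB, GPU: {gpu_gib:.1f} GiB -> {quantize_type}")
   ```

   The `int8`/`int4` values match the CLI's `--quantize-type` options shown elsewhere in this guide.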

2. **Error Handling**

   ```python
   try:
       response = await client.generate("Hello")
   except Exception as e:
       # A generic handler to keep the snippet self-contained; adapt to your needs.
       print(f"Generation failed: {e}")
   ```

## Related Resources

- [CLI Guide](./cli.md)
- [API Reference](./api.md)
- [Configuration Guide](../features/configuration.md)
- [Troubleshooting](./troubleshooting.md)