README.md (+47 -55: 47 additions & 55 deletions)
@@ -176,13 +176,54 @@ The learning process includes:
 - **Strategic Enhancement**: Develop robustness against manipulation
 - **Production Deployment**: Full capability with ongoing adaptation
 
-## Requirements
+## Order Dependency in Online Learning
+
+When using the adaptive classifier for true online learning (adding examples incrementally), be aware that the order in which examples are added can affect predictions. This is inherent to incremental neural network training.
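As a rough illustration of why incremental training is order-sensitive, the sketch below uses a tiny perceptron as a stand-in for the classifier's neural component. This is an assumption made for illustration only: `train_online`, the toy data, and the learning rate are hypothetical and are not part of the library.

```python
def train_online(examples, lr=0.5):
    """Perceptron-style online training: one pass, at most one update per example."""
    weights = [0.0, 0.0]
    for features, target in examples:  # target is +1 or -1
        score = sum(w * x for w, x in zip(weights, features))
        if target * score <= 0:  # misclassified, or exactly on the boundary
            weights = [w + lr * target * x for w, x in zip(weights, features)]
    return weights


# Hypothetical toy data: (feature vector, target) pairs.
data = [([1.0, 0.0], 1), ([2.0, 0.0], 1), ([0.0, 1.0], -1)]

w_forward = train_online(data)
w_reverse = train_online(list(reversed(data)))

# Same examples, different arrival order, different final weights.
print(w_forward, w_reverse)
```

Whether an example triggers an update depends on the weights at the moment it arrives, so the same data in a different order can leave the model in a different state.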
+
+### The Challenge
+
+```python
+# These two scenarios may produce slightly different models:
[some of new lines 187-197 omitted in this view]
+While we've implemented sorted label ID assignment to minimize this effect, the neural network component still learns incrementally, which can lead to order-dependent behavior.
+
+### Solution: Prototype-Only Predictions
+
+For applications requiring strict order independence, you can configure the classifier to rely solely on prototype-based predictions:
+
+```python
+# Configure to use only prototypes (order-independent)
[some of new lines 205-214 omitted in this view]
+- Predictions are based solely on similarity to class prototypes (mean embeddings)
+- Results are completely order-independent
+- Trade-off: May have slightly lower accuracy than the hybrid approach
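The order independence of mean-embedding prototypes can be checked with a minimal, library-free sketch. Everything here is an assumption for illustration: the two-dimensional "embeddings", the labels, the helper names `build_prototypes`, `cosine`, and `predict`, and the choice of cosine similarity (the README text only says "similarity").

```python
import math
from collections import defaultdict


def build_prototypes(examples):
    """Map each label to the mean of its embedding vectors."""
    by_label = defaultdict(list)
    for vector, label in examples:
        by_label[label].append(vector)
    # math.fsum returns a correctly rounded sum, so each mean is identical
    # no matter in which order the vectors were collected.
    return {
        label: [math.fsum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = math.fsum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(math.fsum(x * x for x in a))
    norm_b = math.sqrt(math.fsum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict(prototypes, vector):
    """Return the label whose prototype is most similar to the vector."""
    return max(prototypes, key=lambda label: cosine(prototypes[label], vector))


# Hypothetical toy embeddings with hypothetical labels.
examples = [
    ([1.0, 0.0], "positive"),
    ([0.8, 0.2], "positive"),
    ([0.0, 1.0], "negative"),
    ([0.2, 0.8], "negative"),
]

forward = build_prototypes(examples)
backward = build_prototypes(list(reversed(examples)))
```

Because a mean depends only on which vectors belong to a class, `forward` and `backward` hold the same prototypes, and `predict` gives the same answer for either.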
+
+### Best Practices
 
-- Python ≥ 3.8
-- PyTorch ≥ 2.0
-- transformers ≥ 4.30.0
-- safetensors ≥ 0.3.1
-- faiss-cpu ≥ 1.7.4 (or faiss-gpu for GPU support)
+1. **For maximum consistency**: Use prototype-only configuration
+2. **For maximum accuracy**: Accept some order dependency with the default hybrid approach
+3. **For production systems**: Consider batching updates and retraining periodically if strict consistency is required
+4. **Model selection matters**: Some models (e.g., `google-bert/bert-large-cased`) may produce poor embeddings for single words. For better results with short inputs, consider:
+   - `bert-base-uncased`
+   - `sentence-transformers/all-MiniLM-L6-v2`
+   - Or any model specifically trained for semantic similarity
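One way to read the "batching updates and retraining periodically" practice is sketched below. The `BatchedUpdater` class, its `retrain` callback, and the batch size are hypothetical names introduced for illustration, not part of the library. The sketch also assumes that retraining from scratch is itself deterministic (for example, with fixed seeds).

```python
class BatchedUpdater:
    """Buffer incoming examples and retrain from scratch once enough arrive."""

    def __init__(self, retrain, batch_size=3):
        self.retrain = retrain      # callable that takes the full, sorted dataset
        self.batch_size = batch_size
        self.seen = []              # every example received so far
        self.pending = 0            # examples received since the last retrain
        self.model = None

    def add(self, text, label):
        self.seen.append((text, label))
        self.pending += 1
        if self.pending >= self.batch_size:
            self.flush()

    def flush(self):
        # Sorting gives a canonical order, so the retrain step sees the same
        # sequence regardless of the order in which examples arrived.
        self.model = self.retrain(sorted(self.seen))
        self.pending = 0


def run(arrivals):
    # `tuple` stands in for a real training function in this sketch.
    updater = BatchedUpdater(retrain=tuple, batch_size=3)
    for text, label in arrivals:
        updater.add(text, label)
    return updater.model


arrivals = [("x", "pos"), ("y", "neg"), ("z", "pos")]
model_a = run(arrivals)
model_b = run(list(reversed(arrivals)))
```

Both runs hand the same sorted dataset to the retrain step, so `model_a` and `model_b` match. The cost is that predictions made between retrains do not reflect the buffered examples.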
 
 ## Adaptive Classification with LLMs
 
@@ -388,55 +429,6 @@ This real-world evaluation demonstrates that adaptive classification can signifi
 - [RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models](https://arxiv.org/abs/2401.00396)
 - [LettuceDetect: A Hallucination Detection Framework for RAG Applications](https://arxiv.org/abs/2502.17125)
 
-## Order Dependency in Online Learning
-
-When using the adaptive classifier for true online learning (adding examples incrementally), be aware that the order in which examples are added can affect predictions. This is inherent to incremental neural network training.
-
-### The Challenge
-
-```python
-# These two scenarios may produce slightly different models:
[some of old lines 399-409 omitted in this view]
-While we've implemented sorted label ID assignment to minimize this effect, the neural network component still learns incrementally, which can lead to order-dependent behavior.
-
-### Solution: Prototype-Only Predictions
-
-For applications requiring strict order independence, you can configure the classifier to rely solely on prototype-based predictions:
-
-```python
-# Configure to use only prototypes (order-independent)
[some of old lines 417-426 omitted in this view]
-- Predictions are based solely on similarity to class prototypes (mean embeddings)
-- Results are completely order-independent
-- Trade-off: May have slightly lower accuracy than the hybrid approach
-
-### Best Practices
-
-1. **For maximum consistency**: Use prototype-only configuration
-2. **For maximum accuracy**: Accept some order dependency with the default hybrid approach
-3. **For production systems**: Consider batching updates and retraining periodically if strict consistency is required
-4. **Model selection matters**: Some models (e.g., `google-bert/bert-large-cased`) may produce poor embeddings for single words. For better results with short inputs, consider:
-   - `bert-base-uncased`
-   - `sentence-transformers/all-MiniLM-L6-v2`
-   - Or any model specifically trained for semantic similarity
-
 ## Citation
 
 If you use this library in your research, please cite: