Update README.md #1171

# llmware

**Next-Gen Framework for Building Private, Specialized AI Solutions** - *"Where Enterprise Knowledge Meets Efficient AI"*
![Static Badge](https://img.shields.io/badge/python-3.9_%7C_3.10_%7C_3.11_%7C_3.12_%7C_3.13-blue?color=blue)
![PyPI - Version](https://img.shields.io/pypi/v/llmware?color=blue)
[![discord](https://img.shields.io/badge/Chat%20on-Discord-blue?logo=discord&logoColor=white)](https://discord.gg/MhZn5Nc39h)
[![Documentation](https://github.com/llmware-ai/llmware/actions/workflows/pages.yml/badge.svg)](https://github.com/llmware-ai/llmware/actions/workflows/pages.yml)

🆕 Check out [Model Depot](https://medium.com/@darrenoberst/model-depot-9e6625c5fc55)

Are you using a Windows/Linux x86 machine?
- Getting started with [OpenVino example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_openvino_models.py)
- Getting started with [ONNX example](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_onnx_models.py)

## Table of Contents

- [Building Enterprise RAG Pipelines with Small, Specialized Models](#%EF%B8%8Fbuilding-enterprise-rag-pipelines-with-small-specialized-models)
- [Key Features](#--key-features)
- [What's New](#️-whats-new)
- [Getting Started](#-getting-started)
- [Working with the llmware Github repository](#%EF%B8%8F-working-with-the-llmware-github-repository)
- [Data Store Options](#data-store-options)
- [Meet our Models](#meet-our-models)
- [Using LLMs and setting-up API keys & secrets](#using-llms-and-setting-up-api-keys--secrets)
- [Release notes and Change Log](#--release-notes-and-change-log)

## 🧰🛠️🔩 Building Enterprise RAG Pipelines with Small, Specialized Models

`llmware` provides a unified framework for building LLM-based applications (e.g., RAG, Agents), using small, specialized models that can be deployed privately, integrated with enterprise knowledge sources safely and securely, and cost-effectively tuned and adapted for any business process.

`llmware` has two main components:

1. **RAG Pipeline** - integrated components for the full lifecycle of connecting knowledge sources to generative AI models; and

2. **50+ small, specialized models** fine-tuned for key tasks in enterprise process automation, including fact-based question-answering, classification, summarization, and extraction.

By bringing together both of these components, along with integrating leading open source models and underlying technologies, `llmware` offers a comprehensive set of tools to rapidly build knowledge-based enterprise LLM applications.
Most of our examples can be run without a GPU server - get started right away on your laptop.

[Join us on Discord](https://discord.gg/MhZn5Nc39h) | [Watch Youtube Tutorials](https://www.youtube.com/@llmware) | [Explore our Model Families on Huggingface](https://www.huggingface.co/llmware)

New to Agents? [Check out the Agent Fast Start series](https://github.com/llmware-ai/llmware/tree/main/fast_start/agents)

New to RAG? [Check out the Fast Start video series](https://www.youtube.com/playlist?list=PL1-dn33KwsmD7SB9iSO6vx4ZLRAWea1DB)

## 📦 Installation

```bash
# Install from PyPI
pip3 install llmware
```

```python
# Access 150+ specialized models
from llmware.models import ModelCatalog

models = ModelCatalog().list_all_models()
```

🔥🔥🔥 [**Multi-Model Agents with SLIM Models**](examples/SLIM-Agents/) - [**Intro-Video**](https://www.youtube.com/watch?v=cQfdaTcmBpY) 🔥🔥🔥

[Intro to SLIM Function Call Models](https://github.com/llmware-ai/llmware/blob/main/examples/Models/using_function_calls.py)
Can't wait? Get SLIMs right away:

```python
from llmware.models import ModelCatalog

ModelCatalog().get_llm_toolkit() # get all SLIM models, delivered as small, fast quantized tools
ModelCatalog().tool_test_run("slim-sentiment-tool") # see the model in action with test script included
```
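
For a feel of how the SLIM function-calling models are used, here is a minimal sketch - `function_call` and the structured sentiment output follow the SLIM-Agents examples in this repo, but treat the exact output fields as an assumption:

```python
from llmware.models import ModelCatalog

# load a SLIM classifier tool and run a structured function call over a text passage
model = ModelCatalog().load_model("slim-sentiment-tool")
response = model.function_call("The quarterly results beat expectations and the stock rallied.")

# SLIM tools return structured output (e.g., {"sentiment": ["positive"]}) rather than free text
print(response["llm_response"])
```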

## 🎯 Key Features

Writing code with `llmware` is based on a few main concepts:

<details>
<summary><b>Model Catalog</b>: Access all models the same way with easy lookup, regardless of underlying implementation.
</summary>


```python
# 150+ Models in Catalog with 50+ RAG-optimized BLING, DRAGON and Industry BERT models
# Full support for GGUF, HuggingFace, Sentence Transformers and major API-based models
# Easy to extend to add custom models - see examples

from llmware.models import ModelCatalog
from llmware.prompts import Prompt

# all models accessed through the ModelCatalog
models = ModelCatalog().list_all_models()

# to use any model in the ModelCatalog - "load_model" method and pass the model_name parameter
my_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")
output = my_model.inference("what is the future of AI?", add_context="Here is the article to read")

# to integrate model into a Prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")
response = prompter.prompt_main("what is the future of AI?", context="Insert Sources of information")
```
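
Because every model sits behind the same catalog interface, swapping a local GGUF model for an API-based one is a one-line change. A sketch under stated assumptions - it assumes an OpenAI key is available, and `gpt-4` is used purely illustratively:

```python
from llmware.models import ModelCatalog

# same load/inference pattern, regardless of the underlying implementation
local_model = ModelCatalog().load_model("llmware/bling-phi-3-gguf")
api_model = ModelCatalog().load_model("gpt-4", api_key="<insert-openai-key>")  # illustrative API model

for model in (local_model, api_model):
    print(model.inference("what is the future of AI?", add_context="Here is the article to read"))
```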

</details>

<details>
<summary><b>Library</b>: ingest, organize and index a collection of knowledge at scale - Parse, Text Chunk and Embed. </summary>

```python
from llmware.library import Library

# to parse and text chunk a set of documents (pdf, pptx, docx, xlsx, txt, csv, md, json/jsonl, wav, png, jpg, html)

# step 1 - create a library, which is the 'knowledge-base container' construct
# - libraries have both text collection (DB) resources, and file resources (e.g., llmware_data/accounts/{library_name})
# - embeddings and queries are run against a library

lib = Library().create_new_library("my_library")

# step 2 - add_files is the universal ingestion function - point it at a local file folder with mixed file types
# - files will be routed by file extension to the correct parser, parsed, text chunked and indexed in text collection DB

lib.add_files("/folder/path/to/my/files")

# to install an embedding on a library - pick an embedding model and vector_db
lib.install_new_embedding(embedding_model_name="mini-lm-sbert", vector_db="milvus", batch_size=500)

# to add a second embedding to the same library (mix-and-match models + vector db)
lib.install_new_embedding(embedding_model_name="industry-bert-sec", vector_db="chromadb", batch_size=100)

# easy to create multiple libraries for different projects and groups

finance_lib = Library().create_new_library("finance_q4_2023")
finance_lib.add_files("/finance_folder/")

hr_lib = Library().create_new_library("hr_policies")
hr_lib.add_files("/hr_folder/")

# pull library card with key metadata - documents, text chunks, images, tables, embedding record
lib_card = Library().get_library_card("my_library")

# see all libraries
all_my_libs = Library().get_all_library_cards()
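
# the library card also carries the embedding record - a hedged peek at it
# (the "embedding" field name follows the repo's library examples; treat it as an assumption)
print(lib_card["embedding"])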

```
</details>

<details>
<summary><b>Query</b>: query libraries with a mix of text, semantic, hybrid, metadata, and custom filters. </summary>

```python
from llmware.retrieval import Query
from llmware.library import Library

# step 1 - load the previously created library
lib = Library().load_library("my_library")

# step 2 - create a query object and pass the library
q = Query(lib)

# step 3 - run lots of different queries (many other options in the examples)

# basic text query
results1 = q.text_query("text query", result_count=20, exact_mode=False)

# semantic query
results2 = q.semantic_query("semantic query", result_count=10)

# combining a text query restricted to only certain documents in the library and "exact" match to the query
results3 = q.text_query_with_document_filter("new query", {"file_name": "selected file name"}, exact_mode=True)

# to apply a specific embedding (if multiple on library), pass the names when creating the query object
q2 = Query(lib, embedding_model_name="mini-lm-sbert", vector_db="milvus")
results4 = q2.semantic_query("new semantic query")
```
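
Each query returns a list of dictionaries - a text chunk plus its provenance metadata. A short sketch of consuming the results from the example above (field names such as `file_source` and `page_num` follow the repo's query examples and may vary by parser and database):

```python
# continuing from the example above - inspect the retrieved chunks and their provenance
for res in results1:
    print(res["file_source"], res["page_num"], res["text"][:100])
```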

</details>

<details>
<summary><b>Prompt with Sources</b>: the easiest way to combine knowledge retrieval with LLM inference. </summary>

```python
from llmware.prompts import Prompt
from llmware.retrieval import Query
from llmware.library import Library

# build a prompt
prompter = Prompt().load_model("llmware/bling-tiny-llama-v0")

# add a file -> file is parsed, text chunked, filtered by query, and then packaged as model-ready context,
# including in batches, if needed, to fit the model context window

source = prompter.add_source_document("/folder/to/one/doc/", "filename", query="fast query")

# attach query results (from a Query) into a Prompt
my_lib = Library().load_library("my_library")
results = Query(my_lib).query("my query")
source2 = prompter.add_source_query_results(results)

# run a new query against a library and load directly into a prompt
source3 = prompter.add_source_new_query(my_lib, query="my new query", query_type="semantic", result_count=15)

# to run inference with 'prompt with sources'
responses = prompter.prompt_with_source("my query")

# to run fact-checks - post inference
fact_check = prompter.evidence_check_sources(responses)

# to view source materials (batched 'model-ready' and attached to prompt)
source_materials = prompter.review_sources_summary()

# to see the full prompt history
prompt_history = prompter.get_current_history()
```
</details>
<details>
<summary><b>RAG-Optimized Models</b> - 1-7B parameter models designed for RAG workflow integration and running locally. </summary>

```python
""" This 'Hello World' example demonstrates how to get started using local BLING models with provided context, using both
Pytorch and GGUF versions. """

import time
from llmware.prompts import Prompt

def hello_world_questions():
    # a set of sample questions, each paired with a context passage - see the full
    # example in the examples directory for the complete list
    ...
```

</details>

## 🚀 Getting Started

5-Minute Tutorial:

```python
# hello_rag.py
from llmware.prompts import Prompt

def quickstart():
    # load a small local model, attach a document as source context, and run an inference
    prompter = Prompt().load_model("slim-sentiment-tool")
    prompter.add_source_document("/docs/", "annual_report.pdf")
    return prompter.prompt_with_source("Identify key financial risks:")
```
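To try it, a quick usage sketch (assuming llmware's usual response format - a list of dictionaries with an `llm_response` field):

```python
if __name__ == "__main__":
    for response in quickstart():
        print(response["llm_response"])
```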

## 🌟 Key Features at a Glance

| Category | Capabilities |
|----------|--------------|
| Model Support | 150+ models • GGUF/HuggingFace • SLIMs • Multi-modal |
| Data Ingestion | PDF, DOCX, PPTX, XLSX, CSV, JSON, HTML, Images, Audio |
| RAG Tools | Hybrid Search • Dynamic Chunking • Fact-Checking • Source Attribution |
| Deployment | Docker • Kubernetes • Serverless • On-prem • Cloud-native |

## 🛠 Enterprise Use Cases

- **Contract Analysis** - automated clause extraction + risk assessment
- **Financial Reporting** - earnings call analysis + trend forecasting
- **Compliance Monitoring** - real-time regulatory alignment checks
- **Technical Support** - knowledge-aware troubleshooting agents

## 📊 Performance Benchmarks

| Model | Accuracy | Speed | Memory | Use Case |
|-------|----------|-------|--------|----------|
| BLING-Phi-3 | 92% | 85ms | 2.1GB | Legal Docs |
| DRAGON-13B | 89% | 210ms | 5.8GB | Financial Analysis |
| Industry-BERT | 95% | 45ms | 1.2GB | Compliance Checks |

## 🤝 Contributing

We welcome contributions through:

1. **GitHub Issues** - report bugs and request features
2. **Pull Requests** - follow the CONTRIBUTING.md guidelines
3. **Community Discord** - join design discussions

## 📚 Resources

- YouTube Tutorials: [LLMWare Academy](https://www.youtube.com/@llmware)
- API Reference: docs/api_reference.md
- Example Repo: examples/ directory

## License

Apache 2.0 - see LICENSE

## 💬 Need Help?

Open an Issue or join our [Discord community](https://discord.gg/MhZn5Nc39h)

## 🌍 Enterprise Support

Contact: enterprise@llmware.ai • Schedule Demo
For complete history of release notes, please open the Change log tab.

- **Acceptable Use Policy** - [Acceptable Use Policy for Model HQ by AI BLOKS LLC.docx](https://github.com/user-attachments/files/18291481/Acceptable.Use.Policy.for.Model.HQ.by.AI.BLOKS.LLC.docx)