A provider plugin for LangExtract that supports 100+ LLM models through LiteLLM's unified API, including OpenAI GPT models, Anthropic Claude, Google PaLM, Azure OpenAI, and many open-source models.
Note: This is a third-party provider plugin for LangExtract. For the main LangExtract library, visit google/langextract.
Install from PyPI:

```bash
pip install langextract-litellm
```

Or install from source for development:

```bash
git clone https://github.com/JustStas/langextract-litellm
cd langextract-litellm
pip install -e .
```
This provider handles model IDs that start with `litellm` and supports a wide range of models through LiteLLM's unified API:

- OpenAI models: `litellm/gpt-4`, `litellm/gpt-4o`, `litellm/gpt-3.5-turbo`, etc.
- Anthropic models: `litellm/claude-3-opus`, `litellm/claude-3-sonnet`, `litellm/claude-3-haiku`, etc.
- Google models: `litellm/gemini-1.5-pro`, `litellm/palm-2`, etc.
- Azure OpenAI: `litellm/azure/your-deployment-name`
- Open-source models: `litellm/llama-2-7b-chat`, `litellm/mistral-7b`, `litellm/codellama-34b`, etc.
- And many more: see LiteLLM's supported models

Note: All model IDs must be prefixed with `litellm/` or `litellm-` to be handled by this provider.
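The routing rule above amounts to a simple prefix check. A minimal sketch (the helper name is illustrative, not part of the plugin's API):

```python
def is_litellm_model_id(model_id: str) -> bool:
    """Return True if a model ID would be routed to this provider."""
    # Both prefix forms shown in the note above are accepted.
    return model_id.startswith(("litellm/", "litellm-"))
```

Any model ID without one of these prefixes falls through to whatever other provider LangExtract has registered for it.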
Configure authentication using LiteLLM's standard environment variable format. Set the appropriate variables based on your provider:

```bash
# OpenAI
export OPENAI_API_KEY="your-api-key"

# Anthropic
export ANTHROPIC_API_KEY="your-api-key"

# Hugging Face
export HUGGINGFACE_API_KEY="your-api-key"

# Azure OpenAI
export AZURE_API_KEY="your-azure-key"
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-02-01"

# Google Vertex AI
export VERTEXAI_PROJECT="your-project-id"
export VERTEXAI_LOCATION="us-central1"
# Also run: gcloud auth application-default login
```
See the LiteLLM documentation on environment variables for other providers such as Cohere, AI21, and more.
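If you prefer to configure credentials from Python (for example, in a notebook), the same variables can be assigned via `os.environ` before any extraction calls are made; the key value here is a placeholder:

```python
import os

# Equivalent to `export OPENAI_API_KEY=...` in the shell;
# substitute a real key in practice.
os.environ["OPENAI_API_KEY"] = "your-api-key"
```

Set these before the first call that reaches the provider, since LiteLLM reads them at request time.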
```python
import langextract as lx

# Create model configuration
config = lx.factory.ModelConfig(
    model_id="litellm/azure/gpt-4o",  # or "litellm/gpt-4", "litellm/claude-3-sonnet", etc.
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

# Extract entities
result = lx.extract(
    text_or_documents="Lady Juliet gazed longingly at the stars, her heart aching for Romeo",
    model=model,
    prompt_description="Extract characters, emotions, and relationships in order of appearance.",
    examples=[...],
)
```
```python
import langextract as lx
import textwrap

# Define extraction prompt
prompt = textwrap.dedent("""\
    Extract characters, emotions, and relationships in order of appearance.
    Use exact text for extractions. Do not paraphrase or overlap entities.
    Provide meaningful attributes for each entity to add context.""")

# Provide high-quality examples to guide the model
examples = [
    lx.data.ExampleData(
        text="ROMEO. But soft! What light through yonder window breaks? It is the east, and Juliet is the sun.",
        extractions=[
            lx.data.Extraction(
                extraction_class="character",
                extraction_text="ROMEO",
                attributes={"emotional_state": "wonder"},
            ),
            lx.data.Extraction(
                extraction_class="emotion",
                extraction_text="But soft!",
                attributes={"feeling": "gentle awe"},
            ),
            lx.data.Extraction(
                extraction_class="relationship",
                extraction_text="Juliet is the sun",
                attributes={"type": "metaphor"},
            ),
        ],
    )
]

# Create model configuration
config = lx.factory.ModelConfig(
    model_id="litellm/azure/gpt-4o",
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

# Extract entities
result = lx.extract(
    text_or_documents="Lady Juliet gazed longingly at the stars, her heart aching for Romeo",
    model=model,
    prompt_description=prompt,
    examples=examples,
)

print("✅ Extraction successful!")
print(f"Results: {result}")
```
The model ID must start with `litellm/` or `litellm-` to be handled by this provider.

```python
# Explicit LiteLLM prefix
model_id = "litellm/azure/gpt-4o"
model_id = "litellm/gpt-4"
model_id = "litellm/claude-3-sonnet"

# Alternative prefix format
model_id = "litellm-gpt-4o"
model_id = "litellm-claude-3-sonnet"
```
You can pass additional parameters supported by LiteLLM:

```python
config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4",
    provider="LiteLLMLanguageModel",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    timeout=30,
)
```
The extraction will return structured data with precise character intervals:

```python
AnnotatedDocument(
    extractions=[
        Extraction(
            extraction_class='character',
            extraction_text='Lady Juliet',
            char_interval=CharInterval(start_pos=0, end_pos=11),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={'emotional_state': 'longing'}
        ),
        Extraction(
            extraction_class='emotion',
            extraction_text='aching',
            char_interval=CharInterval(start_pos=52, end_pos=58),
            alignment_status=<AlignmentStatus.MATCH_FUZZY: 'match_fuzzy'>,
            attributes={'feeling': 'heartfelt yearning'}
        ),
        Extraction(
            extraction_class='relationship',
            extraction_text='her heart aching for Romeo',
            char_interval=CharInterval(start_pos=42, end_pos=68),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={'type': 'romantic longing'}
        )
    ],
    text='Lady Juliet gazed longingly at the stars, her heart aching for Romeo'
)
```
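The character intervals above can be used to map each extraction back onto the source text. A sketch of that check, using plain dicts as stand-ins for the `Extraction` objects (in real use you would iterate over `result.extractions` directly):

```python
text = "Lady Juliet gazed longingly at the stars, her heart aching for Romeo"

# Stand-ins mirroring two of the extractions shown above.
extractions = [
    {"extraction_text": "Lady Juliet", "char_interval": (0, 11)},
    {"extraction_text": "aching", "char_interval": (52, 58)},
]

# Each interval slices out exactly the extracted text.
for e in extractions:
    start, end = e["char_interval"]
    assert text[start:end] == e["extraction_text"]
```

Exact slicing like this holds for `MATCH_EXACT` alignments; fuzzily aligned spans may differ slightly from the model's raw output.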
The provider includes robust error handling and will return error messages instead of raising exceptions:

```python
# If an API call fails, you'll get:
ScoredOutput(score=0.0, output="LiteLLM API error: [error details]")
```
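A caller can detect a failed call by inspecting the score. The sketch below models `ScoredOutput` with a namedtuple, assuming only the two fields shown above; the `is_error` helper is illustrative, not part of the plugin's API:

```python
from collections import namedtuple

# Minimal stand-in for the ScoredOutput returned on failure.
ScoredOutput = namedtuple("ScoredOutput", ["score", "output"])

def is_error(result) -> bool:
    """A zero score with an error-prefixed output signals a failed call."""
    return result.score == 0.0 and result.output.startswith("LiteLLM API error:")

failed = ScoredOutput(score=0.0, output="LiteLLM API error: [error details]")
```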
- Install in development mode: `pip install -e .`
- Run tests: `python test_plugin.py`
- Build package: `python -m build`
- Publish to PyPI: `twine upload dist/*`
Dependencies: `langextract`, `litellm`

License: Apache License 2.0