`smolmodels` is a Python library that lets you create machine learning models by describing what you want them to do in plain English. Instead of wrestling with model architectures and hyperparameters, you simply describe your intent, define your inputs and outputs, and let `smolmodels` handle the rest.
```python
import smolmodels as sm

# Define a house price predictor in terms of inputs, outputs, and expected behaviour
model = sm.Model(
    intent="Predict house prices based on property features",
    input_schema={
        "square_feet": float,
        "bedrooms": int,
        "location": str,
        "year_built": int
    },
    output_schema={
        "predicted_price": float
    }
)

# Build the model, using the backend of your choice; optionally generate synthetic training data
model.build("house-prices.csv", generate_samples=1000, provider="openai:gpt-4o-mini")

# Make predictions
price = model.predict({
    "square_feet": 2500,
    "bedrooms": 4,
    "location": "San Francisco",
    "year_built": 1985
})

# Save the model for later use
sm.save_model(model, "house-price-predictor")
```
`smolmodels` combines graph search with LLMs to generate candidate models that meet the specified intent, then selects the best candidate based on performance and constraints (a simplified sketch of this generate-and-select loop follows the phase list below). The process consists of four main phases:
- Intent Analysis: the problem description is analysed to determine the type of model needed and which metric to optimise for.
- Data Generation: synthetic data can be generated to enable a model build when no training data is available, or when the existing data has insufficient coverage of the feature space.
- Model Building:
  - Selects appropriate model architectures
  - Handles feature engineering
  - Manages training and validation
- Validation & Refinement: the model is tested against constraints and refined using directives (like "optimize for speed" or "prioritize model types with better explainability").
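To make the idea concrete, here is a deliberately simplified, illustrative sketch of a generate-and-select loop. This is not smolmodels' internal implementation; `propose_candidate` and `evaluate` are hypothetical stand-ins for the LLM-driven generation and validation steps:

```python
# Illustrative only: a simplified generate-and-select loop.
# propose_candidate() and evaluate() are hypothetical stand-ins for
# the LLM-driven candidate generation and validation described above.
def search(intent, n_candidates, propose_candidate, evaluate):
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        candidate = propose_candidate(intent)  # e.g. an LLM proposes a model spec
        score = evaluate(candidate)            # train and validate, return a metric
        if score > best_score:
            best, best_score = candidate, score
    return best
```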
- Models are defined using natural language descriptions and schema specifications, abstracting away machine learning specifics.
- Built-in synthetic data generation for training and validation.
Guide the model building process with high-level directives:
```python
from smolmodels import Directive

model.build(directives=[
    Directive("Optimize for inference speed"),
    Directive("Prioritize interpretability")
])
```
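Directives steer the search, while constraints (below) act as hard requirements. A hypothetical combined call, assuming directives compose with the dataset and provider arguments shown in the earlier examples, might look like:

```python
# Hypothetical combination, assuming directives compose with the
# dataset/provider arguments shown in the earlier examples.
model.build(
    "house-prices.csv",
    provider="openai:gpt-4o-mini",
    directives=[Directive("Optimize for inference speed")],
)
```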
Optional declarative constraints for model validation:
```python
from smolmodels import Constraint, Model

# Ensure predictions are always positive
positive_constraint = Constraint(
    lambda inputs, outputs: outputs["predicted_price"] > 0,
    description="Predictions must be positive"
)

model = Model(
    intent="Predict house prices...",
    constraints=[positive_constraint],
    ...
)
```
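Constraints are checked during the Validation & Refinement phase described above. Since a `Constraint` wraps an arbitrary predicate over inputs and outputs, you can encode domain rules directly; for example (an illustrative constraint, with a made-up bound, following the lambda signature shown above):

```python
# Illustrative only: an upper bound on price per square foot.
bounded_constraint = Constraint(
    lambda inputs, outputs: outputs["predicted_price"] < inputs["square_feet"] * 10_000,
    description="Price must stay below $10k per square foot"
)
```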
You can use multiple LLM providers as a backend for model generation. You can specify the provider and model in the format `provider:[model]` when calling `build()`:
model.build("house-prices.csv", provider="openai:gpt-4o-mini")
Currently supported providers are `openai`, `anthropic`, `google`, and `deepseek`. You need to configure the appropriate API keys for each provider as environment variables (see installation instructions).
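Switching backends is then a matter of changing the provider string. The model identifiers below are illustrative assumptions, not names confirmed by the docs; substitute whatever your account has access to:

```python
# Illustrative provider strings; the model names are assumptions and
# should be replaced with models available to your account.
model.build("house-prices.csv", provider="anthropic:claude-3-5-sonnet-20241022")
model.build("house-prices.csv", provider="deepseek:deepseek-chat")
```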
```bash
pip install smolmodels
```
Set the required API keys as environment variables; which keys you need depends on the provider you are using.
```bash
export OPENAI_API_KEY=<your-API-key>
export ANTHROPIC_API_KEY=<your-API-key>
export GOOGLE_API_KEY=<your-API-key>
export DEEPSEEK_API_KEY=<your-API-key>
```
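If you prefer, the keys can also be set from Python before building; a minimal sketch, assuming smolmodels reads them from the process environment as the export-based setup above implies:

```python
import os

# Assumes smolmodels picks up API keys from the process environment,
# as implied by the export-based setup above.
os.environ["OPENAI_API_KEY"] = "<your-API-key>"
```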
- Define the model:

```python
import smolmodels as sm

model = sm.Model(
    intent="Classify customer feedback as positive, negative, or neutral",
    input_schema={"text": str},
    output_schema={"sentiment": str}
)
```
- Build and save:

```python
# Build with existing data
model.build(dataset="feedback.csv", provider="openai:gpt-4o-mini")

# Or generate synthetic data
model.build(generate_samples=1000)

# Save the model for later use
sm.save_model(model, "sentiment_model")
```
- Load and use:

```python
# Load an existing model
loaded_model = sm.load_model("sentiment_model")

# Make predictions
result = loaded_model.predict({"text": "Great service, highly recommend!"})
print(result["sentiment"])  # "positive"
```
Performance was evaluated on 20 OpenML benchmark datasets and 12 Kaggle competitions. smolmodels achieved higher performance than the baseline on 12 of the 20 OpenML datasets, with the remaining datasets within 0.005 of baseline performance. Experiments were conducted on standard infrastructure (8 vCPUs, 30 GB RAM) with a 1-hour runtime limit per dataset.
Complete code and results are available at plexe-ai/plexe-results.
For full documentation, visit docs.plexe.ai.
We welcome contributions! See CONTRIBUTING.md for guidelines.
Apache-2.0 License - see LICENSE for details.