# Tapestry: Open-Source Web Search Backend Framework via Plug-and-Play Configuration
- Overview
- Support
- Quick Start
- API Reference
- Client Tests
- Demo
- How Does Tapestry Work?
- Project Structure
## Overview

**Tapestry** is an open-source backend framework for building customizable AI web search pipelines. It lets developers flexibly combine plug-and-play modules, including search engines, domain-specific crawling, LLMs, and algorithms for improving search performance (e.g., deduplication, query rewriting).
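As a rough illustration of the plug-and-play idea (every name in this snippet is hypothetical, not Tapestry's actual API), a pipeline is just an ordered selection of interchangeable parts:

```python
# Illustrative only: these helpers and the pipeline dict are hypothetical,
# not Tapestry's actual API.

def rewrite_query(q: str) -> str:
    """Toy query rewriter: collapse whitespace and lowercase."""
    return " ".join(q.lower().split())

def dedup(results: list[dict]) -> list[dict]:
    """Drop results with duplicate URLs, preserving order."""
    seen = set()
    return [r for r in results if not (r["url"] in seen or seen.add(r["url"]))]

# A pipeline is an ordered choice of interchangeable modules.
pipeline = {
    "engine": "serper",             # or "serp", "brave", "duckduckgo"
    "llm": "openai",                # or "anthropic", "gemini"
    "query_rewriter": rewrite_query,
    "post_processing": [dedup],     # e.g. deduplication, reranking
}

query = pipeline["query_rewriter"]("  What is an AI   search engine? ")
hits = [
    {"url": "https://a.example", "title": "A"},
    {"url": "https://a.example", "title": "A (duplicate)"},
    {"url": "https://b.example", "title": "B"},
]
for step in pipeline["post_processing"]:
    hits = step(hits)
```

Swapping the engine or LLM is then a configuration change rather than a code change, which is the point of the plug-and-play design.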
## Support

| Engine | API Key | Search | YouTube Search | News Search | Scholar Search | Shopping |
|---|---|---|---|---|---|---|
| Serper | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Serp | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Brave | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DuckDuckGo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Supported LLM providers: OpenAI, Anthropic, Gemini
## Quick Start

This guide provides instructions for running the Tapestry service using Docker or Kubernetes.

### Environment Configuration

Before launching the service, you must configure your environment variables. All settings are managed through a `.env` file in the `envs` directory.

1. **Copy the example configuration:** Create your environment file by copying the provided template.

   ```bash
   cp envs/example.env envs/.env
   ```

2. **Edit the `.env` file:** Open the newly created `.env` file and fill in the required values, such as your API keys and database credentials.
   - For a detailed explanation of each variable, refer to the guide at `envs/README.md`.
   - **Important:** The `POSTGRES_HOST` variable must be set correctly for your deployment environment:
     - For Docker: `POSTGRES_HOST=postgres`
     - For Kubernetes: `POSTGRES_HOST=tapestry-postgres`
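For orientation, a filled-in file might look like the sketch below. Only `POSTGRES_HOST`, `APP_PORT`, `LOG_DIR`, and `POSTGRES_DATA_DIR` are variables documented in this README; everything else is a placeholder, and `envs/README.md` remains the authoritative reference.

```env
# Sketch of envs/.env -- placeholder values, see envs/README.md
POSTGRES_HOST=postgres            # use tapestry-postgres for Kubernetes
APP_PORT=9012
LOG_DIR=/absolute/path/to/logs
POSTGRES_DATA_DIR=/absolute/path/to/pgdata
# API keys for your chosen search engine and LLM provider go here as well.
```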
### Docker

This is the recommended method for local development and testing.

1. Ensure your `.env` file is configured as described above, with `POSTGRES_HOST=postgres`.
2. **Run the launch script:** The script handles directory setup and starts all services using Docker Compose.

   ```bash
   bash scripts/run.sh
   ```

   The script uses the `.env` file in the project root by default.
3. **Access the service:** The API will be available at `http://localhost:9012`. You can change the port via the `APP_PORT` variable in your `.env` file.
### Kubernetes

For deployment in a Kubernetes cluster:

1. Ensure your `.env` file is configured as described above, with `POSTGRES_HOST=tapestry-postgres`. Also ensure `LOG_DIR` and `POSTGRES_DATA_DIR` are absolute paths that exist on your Kubernetes nodes.
2. **Run the deployment script:** This script automates the entire deployment process.

   ```bash
   bash scripts/run_k8s.sh [K8S_IP] [SERVICE_PORT] [POSTGRES_PORT] [NODE_PORT]
   ```

   Script arguments:
   - `K8S_IP`: The IP address of your Kubernetes cluster (defaults to `127.0.0.1`).
   - `SERVICE_PORT`: The internal port for the application service (defaults to `9012`).
   - `POSTGRES_PORT`: The port for the PostgreSQL service (defaults to `5432`).
   - `NODE_PORT`: The external port (NodePort) used to access the service (defaults to `30800`).

   Example:

   ```bash
   bash scripts/run_k8s.sh 127.0.0.1 9012 5432 30800
   ```

3. **Access the service:** The API will be available at `http://[K8S_IP]:[NODE_PORT]` (e.g., `http://127.0.0.1:30800`).
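For readers less familiar with NodePort services, the port arguments map onto a Kubernetes Service roughly as follows. This is a hand-written sketch with assumed names and labels, not the actual manifest shipped in `k8s/`:

```yaml
# Illustrative NodePort Service only -- the real manifests live in k8s/.
apiVersion: v1
kind: Service
metadata:
  name: tapestry        # assumed name
spec:
  type: NodePort
  selector:
    app: tapestry       # assumed label
  ports:
    - port: 9012        # SERVICE_PORT: internal service port
      targetPort: 9012
      nodePort: 30800   # NODE_PORT: external access port
```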
## API Reference

### `POST /websearch`

**Request parameters:**

- `query` | `string` | **Required**: The search query string.
- `language` | `string` | Optional, defaults to `"en"`: Response language, as an ISO 639-1 two-letter code (e.g., `en`, `ko`, `ja`).
- `search_type` | `string` | Optional, defaults to `"auto"`:
  - `auto`: The LLM automatically infers the search type from the query.
  - `general`: Uses only indexed content from general search results for answering.
  - `news`: Uses only indexed content from news search results for answering.
  - `scholar`: Uses only indexed content from scholarly search results for answering. If the search engine does not support this, it falls back to `general` search.
  - `youtube`: Extracts and uses only YouTube video links from video search results for answering. If the search engine does not support this, it falls back to `general` search.
- `persona_prompt` | `string` | Optional, defaults to `None`: Persona instructions for the LLM.
- `custom_prompt` | `string` | Optional, defaults to `None`: Additional custom instructions to inject into the LLM.
- `messages` | `array` | Optional, defaults to `None`: Previous conversation history. Must follow the format `[{"role": "user", "content": ""}, {"role": "assistant", "content": ""}, ...]`.
- `target_nuance` | `string` | Optional, defaults to `"Natural"`: Desired response nuance.
- `use_youtube_transcript` | `bool` | Optional, defaults to `False`: If YouTube results are included, use transcript information.
- `top_k` | `int` | Optional, defaults to `None`: Use the top `k` search results.
- `stream` | `bool` | Optional, defaults to `True`: Return the response as a streaming output.
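Putting the parameters together, a minimal request body might look like the following. The field names come from the list above; the values are illustrative:

```python
import json

# Example /websearch request body built from the documented parameters.
payload = {
    "query": "what is an ai search engine?",
    "language": "en",
    "search_type": "auto",
    "target_nuance": "Natural",
    "use_youtube_transcript": False,
    "top_k": 5,
    "stream": True,
}

body = json.dumps(payload)
print(body)
```

Send `body` with any HTTP client as the JSON body of `POST /websearch`.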
**Response:**

The API returns a streaming JSON response with the following status types:

- `processing`: Indicates the current processing step.
- `streaming`: Returns incremental answer tokens as they are generated (if `stream=true`).
- `complete`: Final answer and metadata.

Processing:

```json
{"status": "processing", "message": {"title": "Web search completed"}}
```

Streaming:

```json
{"status": "streaming", "delta": {"content": "token_text"}}
```

Complete:

```json
{
  "status": "complete",
  "message": {
    "content": "<final_answer_string>",
    "metadata": {
      "queries": ["<query1>", "<query2>", ...],
      "sub_titles": ["<subtitle1>", "<subtitle2>", ...]
    },
    "models": [
      {
        "model": {"model_name": "<model_name>", "model_vendor": "<model_vendor>", "model_type": "<model_type>"},
        "usage": {"input_token_count": 0, "output_token_count": 0}
      },
      ...
    ]
  }
}
```
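Assuming each status object arrives as one JSON document on the stream, a client can reassemble the answer by dispatching on `status`. A minimal sketch over events shaped like the examples above:

```python
import json

# Sample stream shaped like the documented status types.
raw_events = [
    '{"status": "processing", "message": {"title": "Web search completed"}}',
    '{"status": "streaming", "delta": {"content": "AI search "}}',
    '{"status": "streaming", "delta": {"content": "engines combine retrieval and LLMs."}}',
    '{"status": "complete", "message": {"content": "AI search engines combine retrieval and LLMs.", '
    '"metadata": {"queries": [], "sub_titles": []}, "models": []}}',
]

tokens, final = [], None
for line in raw_events:
    event = json.loads(line)
    if event["status"] == "processing":
        progress = event["message"]["title"]      # progress update
    elif event["status"] == "streaming":
        tokens.append(event["delta"]["content"])  # incremental answer token
    elif event["status"] == "complete":
        final = event["message"]["content"]       # authoritative final answer
```

With `stream=true`, concatenating the `streaming` deltas yields the same text the `complete` event carries, so a UI can render tokens live and then swap in the final message with its metadata.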
## Client Tests

You can test the API using the provided client script.

1. **Configure the endpoint:** Open `tests/client.py` and ensure the `SERVER_URL` variable points to the correct endpoint for your environment:
   - Docker: `http://127.0.0.1:9012/websearch`
   - Kubernetes: `http://127.0.0.1:30800/websearch` (or your `K8S_IP` and `NODE_PORT`).
2. **Run the client:**

   ```bash
   python tests/client.py --query "what is an ai search engine?"
   ```
## Demo

> **Note:** GitHub does not support embedded YouTube videos in README files. Please click the image below to watch the demo on YouTube.

Tapestry provides a Gradio-based Web UI for an interactive web search and chatbot experience.

- **Local run:**

  ```bash
  bash gradio/run_demo.sh
  ```

  You can set the port and API address:

  ```bash
  GRADIO_PORT=8888 API_URL=http://my-api:9000/websearch bash gradio/run_demo.sh
  ```

- **Docker run:**

  ```bash
  bash gradio/run_docker_demo.sh
  ```

  You can also set the port and API address:

  ```bash
  GRADIO_PORT=8888 API_URL=http://my-api:9000/websearch bash gradio/run_docker_demo.sh
  ```

For more details, please refer to `gradio/README.md`.
## How Does Tapestry Work?

- `GET /health`: Health check endpoint.
- `POST /websearch`: Main QA endpoint with a streaming response.
## Project Structure

```
Tapestry/
├── main.py              # Main FastAPI server
├── src/                 # Core source code (models, search, db, utils, etc.)
├── gradio/              # Gradio Web UI
├── tests/               # Test clients & API guide
├── envs/                # Environment variable examples and docs
├── configs/             # Configuration files
├── k8s/                 # Kubernetes manifests
├── scripts/             # Automation scripts (run.sh, run_k8s.sh)
├── benchmark/           # Benchmark scripts
├── misc/                # Miscellaneous (images, gifs)
├── requirements.txt     # Python dependencies
├── Dockerfile           # Docker build file
├── docker-compose.yaml  # Docker Compose file
├── LICENSE              # License
└── .gitignore           # Git ignore rules
```