Commit 6c5b4b5

mongodben (Ben Perlmutter) and nlarew authored
(EAI-930): Retrieval tool call (feature branch) (#763)
* (EAI-988): Refactor `GenerateResponse` for tool call support (#687)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* (EAI-989): Refactor verified answers to wrap `GenerateResponse` (#688)
verified answer generate response
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>
* handle streaming
* separate generateresponse
* typo fix
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* (EAI-990): Refactor search as a tool (#705)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* get started
* nominally working generate res w/ search
* small refactors
* aint pretty but fully functional
* hacky if more functional
* more hack
* tools
* functional if not pretty
* Add processing
* working tool calling
* making progress
* keepin on
* Clean config
* working e2e
* update model version
* Remove no longer used stuff
* decouple search results for references and whats shown to model
* fix scripts build errs
* fix broken tests
* update default ref links
* fix broken tests
* Apply suggestions from code review
Co-authored-by: Nick Larew <nick.larew@mongodb.com>
* revert default reference links
* adding missing test
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>
Co-authored-by: Nick Larew <nick.larew@mongodb.com>

* (EAI-992): Remove `ChatLlm` (#751)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* get started
* nominally working generate res w/ search
* small refactors
* aint pretty but fully functional
* hacky if more functional
* more hack
* tools
* functional if not pretty
* Add processing
* working tool calling
* making progress
* keepin on
* Clean config
* working e2e
* update model version
* Remove no longer used stuff
* decouple search results for references and whats shown to model
* fix scripts build errs
* remove ChatLlm
* lite fixes
* Remove stub
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* (EAI-993): deprecate framework (#752)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* get started
* nominally working generate res w/ search
* small refactors
* aint pretty but fully functional
* hacky if more functional
* more hack
* tools
* functional if not pretty
* Add processing
* working tool calling
* making progress
* keepin on
* Clean config
* working e2e
* update model version
* Remove no longer used stuff
* decouple search results for references and whats shown to model
* fix scripts build errs
* fix broken tests
* deprecation
* build out docs following last week convo
* clean up spec + contact
* fix merge funk
* docs updates
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* (EAI-1071): Fix broken Atlas OpenAPI ingest (#765)
* update to fix broken test
* Update packages/ingest-mongodb-public/src/sources/snooty/snootyAstToOpenApiSpec.ts
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* (EAI-995): add guardrail (#755)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* get started
* nominally working generate res w/ search
* small refactors
* aint pretty but fully functional
* hacky if more functional
* more hack
* tools
* functional if not pretty
* Add processing
* working tool calling
* making progress
* keepin on
* Clean config
* working e2e
* update model version
* Remove no longer used stuff
* decouple search results for references and whats shown to model
* fix scripts build errs
* fix broken tests
* update default ref links
* fix broken tests
* input guardrail refactor
* guardrail works well
* simpler validity metric
* add guardrail to server
* add next step todo
* llm refusal msg
* remove TODO comment
* merge fix
* fix unnec changes
* NL feedback
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* fix type in text

* (EAI-991 & EAI-1050): Evaluate and clean up retrieval as a tool (#757)
* refactor GenerateRespose
* Clean up imports
* consolidate generate user prompt to the legacy file
* update test config imports
* Fix broken tests
* get started
* nominally working generate res w/ search
* small refactors
* aint pretty but fully functional
* hacky if more functional
* more hack
* tools
* functional if not pretty
* Add processing
* working tool calling
* making progress
* keepin on
* Clean config
* working e2e
* update model version
* Remove no longer used stuff
* decouple search results for references and whats shown to model
* fix scripts build errs
* fix broken tests
* update default ref links
* fix broken tests
* input guardrail refactor
* guardrail works well
* simpler validity metric
* add guardrail to server
* add next step todo
* llm refusal msg
* remove TODO comment
* evals on new architecture
* Get urls in a way that supports verified answers
* dont eval on retrieved elems if no context
* Cleaner handling
* update trace handling
* update trace handling
* undo git funk
* handle undefined case
* Fix tracing test
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>

* remove console logs + redunancies
---------
Co-authored-by: Ben Perlmutter <mongodben@mongodb.com>
Co-authored-by: Nick Larew <nick.larew@mongodb.com>
1 parent 7e2930a commit 6c5b4b5
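
The commit message above mentions refactoring search into a tool and decoupling "search results for references and whats shown to model". The following is a minimal, library-free sketch of that general pattern, not the code from this commit. Every name in it (`SearchResult`, `ToolCall`, `runSearchTool`, `toySearch`) is hypothetical, and the in-memory search stands in for a real vector search.

```typescript
// Hypothetical types for illustration; none of these names come from the repository.
interface SearchResult {
  url: string;
  title: string;
  text: string;
}

interface ToolCall {
  name: string;
  args: { query: string };
}

interface ToolOutput {
  // Text handed back to the model as the tool result.
  modelContext: string;
  // Links surfaced to the user, deduplicated by URL.
  references: { url: string; title: string }[];
}

type SearchFn = (query: string) => SearchResult[];

// Run a "search" tool call and keep the two views of the results separate.
function runSearchTool(call: ToolCall, search: SearchFn): ToolOutput {
  if (call.name !== "search") {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  const results = search(call.args.query);

  // The model sees every retrieved chunk, numbered for citation.
  const modelContext = results
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.text}`)
    .join("\n\n");

  // The user sees each source page only once.
  const seen = new Set<string>();
  const references: { url: string; title: string }[] = [];
  for (const r of results) {
    if (!seen.has(r.url)) {
      seen.add(r.url);
      references.push({ url: r.url, title: r.title });
    }
  }
  return { modelContext, references };
}

// Toy in-memory search (an assumption) standing in for a real vector search.
const toySearch: SearchFn = (query) =>
  [
    { url: "https://example.com/a", title: "Page A", text: "find() returns a cursor." },
    { url: "https://example.com/a", title: "Page A", text: "Cursors can be iterated." },
    { url: "https://example.com/b", title: "Page B", text: "Indexes speed up queries." },
  ].filter((r) => r.text.toLowerCase().includes(query.toLowerCase()));

const toolOutput = runSearchTool({ name: "search", args: { query: "cursor" } }, toySearch);
console.log(toolOutput.references);
```

Keeping `modelContext` and `references` as separate fields lets the two change independently, for example truncating what the model reads without changing the links shown to the user.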

171 files changed: +2927 -61638 lines changed

README.md

Lines changed: 16 additions & 31 deletions
@@ -1,47 +1,32 @@
1-
# MongoDB Chatbot Framework
1+
# MongoDB Knowledge Service
22

3-
The MongoDB Chatbot Framework is a set of libraries that you can use to build
4-
full-stack intelligent chatbot applications using MongoDB and [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/).
5-
The MongoDB Chatbot Framework includes first class support for
6-
retrieval-augmented generation (RAG).
3+
This repo contains the work of the MongoDB Education AI team.
74

8-
The framework can take your chatbot application from prototype to production.
5+
## MongoDB Knowledge Service
96

10-
You can quickly get an AI chatbot enhanced with your data up and running using
11-
the framework's built-in data ingest process, chatbot server, and web UI. As you
12-
refine your application and scale to more users, you can modify the chatbot's
13-
behavior to meet your needs.
7+
The MongoDB Knowledge Service lets you learn about MongoDB using generative AI. To learn more about it, refer to the [MongoDB Knowledge Service documentation](https://mongodb.github.io/chatbot).
148

15-
The framework is flexible and customizable. It supports multiple AI models and
16-
complex prompting strategies. It also includes tools for programmatic evaluation of your chatbot's AI components.
9+
## MongoDB Chatbot Framework (deprecated)
1710

18-
## Documentation
11+
The team building the MongoDB Knowledge Service previously developed the MongoDB Chatbot Framework. This consisted of the npm packages:
1912

20-
To learn how to use the MongoDB Chatbot Framework, refer to the documentation:
21-
<https://mongodb.github.io/chatbot/>.
13+
- `mongodb-chatbot-server`
14+
- `mongodb-chatbot-ui` (still used, refer to [UI](https://mongodb.github.io/chatbot/ui))
15+
- `mongodb-rag-core`
16+
- `mongodb-rag-ingest`
2217

23-
You can also check out the following articles and videos about the framework:
18+
The MongoDB Chatbot Framework is now deprecated. We will no longer be maintaining it.
2419

25-
- [[Video] MongoDB Chatbot Framework Learning Byte](https://learn.mongodb.com/courses/mongodb-chatbot-framework)
26-
- [[Article] Build a Production-Ready, Intelligent Chatbot With the MongoDB Chatbot Framework](https://dev.to/mongodb/build-a-production-ready-intelligent-chatbot-with-the-mongodb-chatbot-framework-4dd)
27-
- [[Article] Taking RAG to Production with the MongoDB Documentation AI Chatbot](https://www.mongodb.com/developer/products/atlas/taking-rag-to-production-documentation-ai-chatbot/)
20+
To learn more about the framework, refer to the blog post [Build a Production-Ready, Intelligent Chatbot With the MongoDB Chatbot Framework](https://dev.to/mongodb/build-a-production-ready-intelligent-chatbot-with-the-mongodb-chatbot-framework-4dd).
2821

29-
## MongoDB Docs AI Chatbot Implementation
22+
### Why Deprecate the MongoDB Chatbot Framework?
3023

31-
This repo also contains the implementation of the MongoDB Docs Chatbot,
32-
which uses the MongoDB Chatbot Framework.
24+
Since we first launched the framework a year and a half ago, there's been a lot of progress in the TypeScript ecosystem for AI frameworks. We have decided that these frameworks remove the need for the MongoDB Chatbot Framework. Additionally, supporting the framework in addition to the Knowledge Service has been a maintenance burden on our small team.
3325

34-
The MongoDB Docs Chatbot uses the MongoDB [documentation](https://www.mongodb.com/docs/) and [Developer Center](https://www.mongodb.com/developer/) as its sources of truth.
26+
In particular, we've been very impressed by the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). It has a great developer experience, is well maintained, and has a robust feature set. We've moved most of our LLM call logic to the AI SDK. You can refer to our [mongodb/chatbot repository](https://github.com/mongodb/chatbot) to see how we're using it. For a tutorial on building with MongoDB Atlas and the AI SDK, refer to the blog post [Building a Chat Application That Doesn't Forget!](https://dev.to/mongodb/building-a-chat-application-with-mongodb-memory-provider-for-vercel-ai-sdk-56ap) by MongoDB's own Jesse Hall.
3527

36-
The chatbot builds on the following technologies:
28+
For building more agentic applications in TypeScript, [Mastra](https://mastra.ai/en/docs) (itself built on the AI SDK), [LangGraph.js](https://langchain-ai.github.io/langgraphjs/), and the [OpenAI Agents SDK](https://openai.github.io/openai-agents-js/) all seem to be solid options.
3729

38-
- Atlas Vector Search: Indexes and queries content for use in project.
39-
- MongoDB Atlas: Persists conversations and content.
40-
- ChatGPT API: LLM to pre-process user queries and summarize responses to user queries.
41-
- OpenAI Embeddings API: Create vector embeddings for user queries and content. Used by Atlas Vector Search.
42-
43-
To learn more about how we built the chatbot, check out the MongoDB Developer Center blog post
44-
[Taking RAG to Production with the MongoDB Documentation AI Chatbot](https://www.mongodb.com/developer/products/atlas/taking-rag-to-production-documentation-ai-chatbot/).
4530

4631
## Contributing
4732

docs/docs/contact.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
# Contact
2+
3+
4+
- MongoDB employees can reach out to the Education AI team on Slack at `#ask-education-ai`
5+
- External users can create an issue on the [mongodb/chatbot GitHub repo](https://github.com/mongodb/chatbot/issues/new)

docs/docs/data-sources.md

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
1+
# Data Sources
2+
3+
The MongoDB Knowledge Service uses [retrieval augmented generation](https://www.mongodb.com/resources/basics/artificial-intelligence/retrieval-augmented-generation) to answer user queries.
4+
5+
All the data sources are public on the web.
6+
7+
8+
## Sources
9+
10+
The Knowledge Service ingests the following data sources:
11+
12+
- MongoDB Technical Documentation (https://mongodb.com/docs)
13+
- MongoDB Developer Center blog (https://mongodb.com/developer)
14+
- MongoDB University transcripts and landing pages (https://learn.mongodb.com)
15+
- Select marketing and sales pages from https://mongodb.com
16+
- Select external data sources:
17+
  - Mongoose.js docs (https://mongoosejs.com)
18+
  - Prisma MongoDB connector docs (https://www.prisma.io/docs/orm/overview/databases/mongodb)
19+
  - Terraform MongoDB Provider docs (https://registry.terraform.io/providers/mongodb/mongodbatlas/latest)
20+
  - WiredTiger docs (https://source.wiredtiger.com/)
21+
  - Practical MongoDB Aggregations book (https://www.practical-mongodb-aggregations.com/)
22+
23+
24+
## Source Code
25+
26+
You can see the source code for the data source ingestion here: https://github.com/mongodb/chatbot/tree/main/packages/ingest-mongodb-public
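
As a rough illustration of the retrieval step in retrieval augmented generation, the sketch below ranks a few toy content chunks by cosine similarity to a query embedding and builds a prompt from the best matches. It is not the Knowledge Service's implementation: the `Chunk` shape, the three-dimensional embeddings, the chunk text, and the function names are all made up for the example. A real system would get embeddings from an embedding model and run the search in a vector index.

```typescript
// Hypothetical chunk shape; real ingested content carries more metadata.
interface Chunk {
  sourceUrl: string;
  text: string;
  embedding: number[];
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const normA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const normB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  const denom = normA * normB;
  return denom === 0 ? 0 : dot / denom;
}

// Return the k chunks most similar to the query embedding, best first.
function retrieveTopK(queryEmbedding: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Combine the retrieved text with the user's question for the model.
function buildPrompt(question: string, retrieved: Chunk[]): string {
  const context = retrieved
    .map((c) => `Source: ${c.sourceUrl}\n${c.text}`)
    .join("\n\n");
  return `Answer using only the context below.\n\n${context}\n\nQuestion: ${question}`;
}

// Toy data: the embeddings and text are invented, only the URLs come from the list above.
const toyChunks: Chunk[] = [
  { sourceUrl: "https://mongodb.com/docs", text: "Toy docs chunk.", embedding: [0.9, 0.1, 0] },
  { sourceUrl: "https://mongodb.com/developer", text: "Toy blog chunk.", embedding: [0, 1, 0] },
  { sourceUrl: "https://mongoosejs.com", text: "Toy Mongoose chunk.", embedding: [0.7, 0.7, 0] },
];

const topChunks = retrieveTopK([1, 0, 0], toyChunks, 2);
console.log(buildPrompt("How do I query a collection?", topChunks));
```

The ranking is brute force, which is fine for a handful of chunks. At the scale of the sources listed above, the same idea is delegated to a vector search index.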

docs/docs/datasets.md

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
# Datasets
2+
3+
The Education AI team maintains various datasets for use with AI systems. All datasets are available from the [MongoDB Education AI organization on Hugging Face](https://huggingface.co/mongodb-eai).
4+
5+
## Content
6+
7+
Content datasets can be useful for building RAG systems and training models.
8+
9+
| Name | Type | Description | Visibility | Use Cases | Links |
10+
| :---- | :---- | :---- | :---- | :---- | :---- |
11+
| Public documentation | Long-form content | Markdown version of docs and developer center content. | Public | RAG, model training | https://huggingface.co/datasets/mongodb-eai/docs |
12+
| Code example dataset | Prompt-completion | Code examples extracted from the MongoDB docs and developer center with prompts that could be used to generate the code. | Public | Model fine-tuning | https://huggingface.co/datasets/mongodb-eai/code-example-prompts |
13+
14+
## Benchmarks
15+
16+
| Name | Type | Description | Visibility | Links |
17+
| :---- | :---- | :---- | :---- | :---- |
18+
| Natural language-to-Node.js Mongosh | Code generation | Assess how well LLMs generate `mongosh` code given a natural language prompt and information about a database. | External | https://huggingface.co/datasets/mongodb-eai/natural-language-to-mongosh |
19+

docs/docs/framework.md

Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
1+
# MongoDB Chatbot Framework (deprecated)
2+
3+
The team building the MongoDB Knowledge Service previously developed the MongoDB Chatbot Framework. This consisted of the npm packages:
4+
5+
- `mongodb-chatbot-server`
6+
- `mongodb-chatbot-ui` (still used, refer to [UI](./ui.md))
7+
- `mongodb-rag-core`
8+
- `mongodb-rag-ingest`
9+
10+
The MongoDB Chatbot Framework is now deprecated. We will no longer be maintaining it.
11+
12+
To learn more about the framework, refer to the blog post [Build a Production-Ready, Intelligent Chatbot With the MongoDB Chatbot Framework](https://dev.to/mongodb/build-a-production-ready-intelligent-chatbot-with-the-mongodb-chatbot-framework-4dd).
13+
14+
## Why Deprecate the MongoDB Chatbot Framework?
15+
16+
Since we first launched the framework a year and a half ago, there's been a lot of progress in the TypeScript ecosystem for AI frameworks. We have decided that these frameworks remove the need for the MongoDB Chatbot Framework. Additionally, supporting the framework in addition to the Knowledge Service has been a maintenance burden on our small team.
17+
18+
In particular, we've been very impressed by the [Vercel AI SDK](https://ai-sdk.dev/docs/introduction). It has a great developer experience, is well maintained, and has a robust feature set. We've moved most of our LLM call logic to the AI SDK. You can refer to our [mongodb/chatbot repository](https://github.com/mongodb/chatbot) to see how we're using it. For a tutorial on building with MongoDB Atlas and the AI SDK, refer to the blog post [Building a Chat Application That Doesn't Forget!](https://dev.to/mongodb/building-a-chat-application-with-mongodb-memory-provider-for-vercel-ai-sdk-56ap) by MongoDB's own Jesse Hall.
19+
20+
For building more agentic applications in TypeScript, [Mastra](https://mastra.ai/en/docs) (itself built on the AI SDK), [LangGraph.js](https://langchain-ai.github.io/langgraphjs/), and the [OpenAI Agents SDK](https://openai.github.io/openai-agents-js/) all seem to be solid options.

docs/docs/index.md

Lines changed: 7 additions & 68 deletions
@@ -1,77 +1,16 @@
11
---
22
title: Home
3-
description: Build full-stack intelligent chatbot applications using MongoDB and Atlas Vector Search.
3+
description: MongoDB Knowledge Service
44
---
55

6-
# MongoDB Chatbot Framework
6+
# MongoDB Knowledge Service
77

8-
:::warning[👷‍♂️ Work In Progress 👷‍♂️]
8+
The MongoDB Knowledge Service lets you learn about MongoDB using generative AI.
99

10-
The MongoDB Chatbot Framework is under active development
11-
and may undergo breaking changes.
10+
## Server
1211

13-
We aim to keep the documentation up to date with the latest version.
12+
To call the MongoDB Knowledge Service API, refer to the [OpenAPI specification](/server/openapi/).
1413

15-
:::
14+
## UI Library
1615

17-
Build full-stack intelligent chatbot applications using MongoDB
18-
and [Atlas Vector Search](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/).
19-
20-
The MongoDB Chatbot Framework is a set of libraries that you can use to build a
21-
production-ready chatbot application. The framework provides first-class support
22-
for retrieval augmented generation (RAG), and is extensible to support other
23-
patterns for building intelligent chatbots.
24-
25-
The framework can take your chatbot application from prototype to production.
26-
27-
You can quickly get an AI chatbot enhanced with your data up and running using
28-
the framework's built-in data ingest process, chatbot server, and web UI. As you
29-
refine your application and scale to more users, you can modify the chatbot's
30-
behavior to meet your needs.
31-
32-
The framework is flexible and customizable. It supports multiple AI models and
33-
complex prompting strategies.
34-
35-
## How It Works
36-
37-
The MongoDB Chatbot Framework has the following core components:
38-
39-
- [MongoDB Atlas](./mongodb.md): Database for the application that stores content and conversation.
40-
Indexes content using Atlas Vector Search.
41-
- [Ingest CLI](./ingest/configure.md): Configurable CLI application that you can use to ingest content into a MongoDB collection for use with Atlas Vector Search.
42-
- [Chatbot Server](./server/configure.md): Express.js server routes that you can use to build a chatbot application.
43-
- [Chatbot UI](./ui.md): React.js UI components that you can use to build a chatbot application.
44-
45-
## Quick Start
46-
47-
To get started using the MongoDB Chatbot Framework, refer to the [Quick Start](./quick-start.md) guide.
48-
49-
## Design Principles
50-
51-
The MongoDB Chatbot Framework is designed around the following principles:
52-
53-
- Composability: You can use components of the chatbot framework independently of each other.
54-
For example, we have some users who are using only our ingestion CLI to ingest content into MongoDB Atlas, but use other tools to build their chatbot and UI.
55-
- Pluggability: You can plug in your own implementations of components.
56-
For example, you can plug in your own implementations of the `DataSource` interface
57-
to ingest content from different data sources.
58-
- Inversion of Control: The framework makes decisions about boilerplate aspects
59-
of intelligent chatbot systems so that you can focus on building logic unique to your application.
60-
61-
## MongoDB Docs Chatbot
62-
63-
This framework is used to build the MongoDB Docs Chatbot, a RAG chatbot that answers questions about the MongoDB documentation. You can try it out on [mongodb.com/docs](https://www.mongodb.com/docs/).
64-
65-
Here's a reference architecture for how the MongoDB Chatbot Framework system works for the MongoDB Docs Chatbot.
66-
67-
Data ingestion:
68-
69-
![Data Ingestion Architecture](/img/ingest-diagram.webp)
70-
71-
Chat Server:
72-
73-
![Chat Server Architecture](/img/server-diagram.webp)
74-
75-
### How We Built It
76-
77-
- To learn more about how we built the chatbot, check out the MongoDB Developer Center blog post [Taking RAG to Production with the MongoDB Documentation AI Chatbot](https://www.mongodb.com/developer/products/atlas/taking-rag-to-production-documentation-ai-chatbot/).
16+
You can call the MongoDB Knowledge Service API from a web application using the `mongodb-chatbot-ui` npm package. The UI library provides a React component that you can use to build a chat interface. Learn more in the [UI documentation](./ui.md).
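
For callers that are not using the React component, a request to the API might be assembled along the lines below. This is a sketch under assumptions, not the documented contract: the `/conversations` and `/conversations/{conversationId}/messages` paths, the `{ message }` request body, and the `https://example.com/api/v1` base URL are illustrative placeholders, and the helper names are hypothetical. The linked OpenAPI specification is the authority on the real routes and payloads.

```typescript
// Plain description of an HTTP request, so the helpers stay free of network calls.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body?: string;
}

// Remove trailing slashes so path joining does not produce "//".
function normalizeBaseUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// Assumed route: create a new conversation.
function createConversationRequest(baseUrl: string): HttpRequest {
  return {
    url: `${normalizeBaseUrl(baseUrl)}/conversations`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
  };
}

// Assumed route and body shape: add a user message to an existing conversation.
function addMessageRequest(
  baseUrl: string,
  conversationId: string,
  message: string
): HttpRequest {
  const id = encodeURIComponent(conversationId);
  return {
    url: `${normalizeBaseUrl(baseUrl)}/conversations/${id}/messages`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ message }),
  };
}

// Placeholder base URL and conversation ID, for illustration only.
const exampleRequest = addMessageRequest(
  "https://example.com/api/v1/",
  "abc123",
  "How do I create an index?"
);
console.log(exampleRequest.url);
```

Each returned object can be passed to `fetch(request.url, request)`. Building the request separately from sending it keeps the URL and body logic easy to test.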

docs/docs/ingest/command-reference.md

Lines changed: 0 additions & 16 deletions
This file was deleted.
