Commit 9c2e39c

Delete spec prompt
1 parent e93912e commit 9c2e39c

6 files changed, with 49 additions and 50 deletions

README.md

Lines changed: 10 additions & 7 deletions
@@ -22,7 +22,11 @@

 ### Introduction

-* Easily build an AI model using Langchain and Streamlit.
+* Project Purpose:
+    * Build a powerful "LLM" model using langchain and streamlit, **enabling your LLM model to do what ChatGPT can't**:
+        * **Connect with external data** by using PDF documents as an example, allowing the LLM model to understand the uploaded files through RetrievalQA techniques.
+        * Integrate LLM with other tools to achieve **internet connectivity**. For instance, using Serp API as an example, leverage the Langchain framework to enable querying the model for **current issues** (i.e., **Google search engine**).
+        * Integrate LLM with the **LLM Math model**, enabling accurate **mathematical calculations**.

 * This project consists of three main components:
     * [`DataConnection`](../model/data_connection.py): Allows LLM to communicate with external data, i.e., read PDF files and perform text segmentation for large PDFs to avoid exceeding OPENAI's 4000-token limit.
@@ -34,17 +38,16 @@


 * `docGPT` is developed based on **Langchain** and **Streamlit**.
-    * `Langchain`: LangChain is a framework for **developing applications supported by language models**. It supports the following applications:
-        1. Connecting LLM models with external data sources.
-        2. Allowing interaction with LLM models.
-    * `Streamlit`: Streamlit enables fast and free deployment of Python applications.
-

 ---

 ### What's LangChain?

-For an introduction to LangChain, it is recommended to refer to the official documentation or the GitHub [repository](https://github.com/hwchase17/langchain).
+* LangChain is a framework for developing applications powered by language models. It supports the following applications:
+    1. Connecting LLM models with external data sources.
+    2. Enabling interactions with LLM models.
+
+* For an introduction to LangChain, it is recommended to refer to the official documentation or the GitHub [repository](https://github.com/hwchase17/langchain).

 **Questions that ChatGPT cannot answer can be handled by Langchain!**
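
The `DataConnection` entry in the README above mentions splitting large PDFs so that a request stays under the model's token limit. A minimal sketch of that idea follows. It is not the project's implementation: `split_text`, `chunk_size`, and `overlap` are hypothetical names, and the sketch counts characters as a rough stand-in for tokens.

```python
def split_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters.

    Assumption: characters stand in for tokens here; a real splitter
    would measure length with the model's tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


# Placeholder for text extracted from a PDF (6000 characters).
pages = "lorem ipsum " * 500
chunks = split_text(pages, chunk_size=1000, overlap=100)
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a boundary still appears whole in at least one of them.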

READM.zh-TW.md renamed to README.zh-TW.md

Lines changed: 10 additions & 10 deletions
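@@ -18,31 +18,31 @@

 ---

-
 ### Introduction

-* Easily build an AI model using langchain and streamlit
-
+* Project purpose:
+    * Easily build a powerful "LLM" model using langchain and streamlit, **enabling your LLM model to do what ChatGPT cannot**:
+        * **External data connection**: this project uses **PDF documents** as an example, letting the LLM understand the files you upload through RetrievalQA techniques
+        * Integrate the LLM with other tools to achieve **internet connectivity**: this project uses Serp API as an example and, through the Langchain framework, lets you ask the model about **current issues** (i.e., the **Google search engine**)
+        * Integrate the LLM with the **LLM Math model** so that the model can perform **mathematical calculations** accurately
 * The design of this project has three main elements:
     * [`DataConnection`](../model/data_connection.py): Lets the LLM communicate with external data, i.e., reads PDF files and splits the text of large PDFs to avoid exceeding OPENAI's 4000-token limit
     * [`docGPT`](../docGPT/): This element is the core that lets the model understand the PDF content, including embedding the PDF text as vectors and building langchain's retrievalQA model. For a detailed introduction, see this [reference](https://python.langchain.com/docs/modules/chains/popular/vector_db_qa)
     * [`agent`](../agent/agent.py): Manages the tools used by the model and **automatically decides** which tool to use based on the user's question. The tools include
         * `SerpAI`: When the user's question concerns "**current issues**", this tool can perform a **Google search**
         * `llm_math_chain`: When the user's question concerns "**mathematical calculations**", this tool can perform the calculation
         * `docGPT`: When the user asks about the content of the PDF document, this tool can answer (this tool is also one we built with retrievalQA)
-
-
 * `docGPT` is developed based on **langchain** and **streamlit**
-    * `langchain`: LangChain is a **framework for developing applications powered by language models**. It supports the following applications
-        1. Connecting LLM models with external data sources
-        2. Allowing interaction with LLM models
-    * `streamlit`: streamlit lets you deploy your own Python applications **quickly and for free**

 ---

 ### What's LangChain?

-For an introduction to langchain, see the official documentation or the [GitHub source project](https://github.com/hwchase17/langchain)
+* LangChain is a **framework for developing applications powered by language models**. It supports the following applications
+    1. Connecting LLM models with external data sources
+    2. Allowing interaction with LLM models
+* For an introduction to langchain, see the official documentation or the [GitHub source project](https://github.com/hwchase17/langchain)
+

 **Leave the questions ChatGPT cannot answer to Langchain!**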
agent/agent.py

Lines changed: 0 additions & 1 deletion
@@ -47,7 +47,6 @@ def create_doc_chat(self, docGPT) -> Tool:
             func=docGPT.run,
             description="""
             useful for when you need to answer questions from the context of PDF,
-            especially ask the specification of display.
             """
         )
         return tool

app.py

Lines changed: 12 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -89,7 +89,7 @@ def load_api_key() -> None:
8989
if temp_file_path:
9090
os.remove(temp_file_path)
9191

92-
docGPT, docGPT_spec, calculate_tool, search_tool = None, None, None, None
92+
docGPT, calculate_tool, search_tool = None, None, None
9393

9494
try:
9595
agent_ = AgentHelper()
@@ -99,14 +99,8 @@ def load_api_key() -> None:
         )
         docGPT_tool = agent_.create_doc_chat(docGPT)

-        docGPT_spec = DocGPT(docs=docs)
-        docGPT_spec.create_qa_chain(
-            chain_type='refine',
-        )
-        docGPT_spec_tool = agent_.create_doc_chat(docGPT_spec)
     except Exception as e:
-        print(e)
-        pass
+        st.write(e)

     try:
         search_tool = agent_.get_searp_chain
@@ -117,12 +111,12 @@ def load_api_key() -> None:
         calculate_tool = agent_.get_calculate_chain

         tools = [
-            docGPT_tool, docGPT_spec_tool,
-            calculate_tool, search_tool
+            docGPT_tool,
+            search_tool
         ]
         agent_.initialize(tools)
     except Exception as e:
-        print(e)
+        st.write(e)


 if not st.session_state['openai_api_key']:
@@ -139,10 +133,12 @@ def load_api_key() -> None:

 @lru_cache(maxsize=20)
 async def get_response(query: str):
-    if agent_ and query and query != '':
-        response = agent_.query(query)
-        return response
-
+    try:
+        if agent_.agent_ is not None:
+            response = agent_.query(query)
+            return response
+    except Exception as e:
+        pass

 query = st.text_input(
     "#### Question:",
@@ -153,7 +149,7 @@ async def get_response(query: str):
 user_container = st.container()

 with user_container:
-    if query:
+    if query and query != '':
         response = asyncio.run(get_response(query))
         st.session_state.query.append(query)
         st.session_state.response.append(response)
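
A caveat about the `get_response` hunk above: `functools.lru_cache` applied to an `async def` caches the coroutine object, not its result, so a repeated query would try to await a coroutine that has already finished. The sketch below caches the awaited result instead. It is an illustration only: `cache_async_results` is a hypothetical helper, the function body is a placeholder for `agent_.query(query)`, and unlike `lru_cache(maxsize=20)` this cache is unbounded.

```python
import asyncio
from functools import wraps


def cache_async_results(func):
    """Cache awaited results (not coroutine objects), keyed by positional args."""
    results = {}

    @wraps(func)
    async def wrapper(*args):
        if args not in results:
            # Await once and store the plain result for later calls.
            results[args] = await func(*args)
        return results[args]

    return wrapper


calls = []  # records how often the wrapped body actually runs


@cache_async_results
async def get_response(query: str) -> str:
    calls.append(query)
    # Placeholder for the real agent call, e.g. agent_.query(query).
    return f"answer to {query!r}"


first = asyncio.run(get_response("What is docGPT?"))
second = asyncio.run(get_response("What is docGPT?"))
```

Each call to the wrapper creates a fresh coroutine, so calling `asyncio.run` repeatedly with the same query is safe and the underlying function runs only once per distinct query.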

docGPT/docGPT.py

Lines changed: 13 additions & 14 deletions
@@ -9,6 +9,7 @@
 from langchain.memory import ConversationBufferMemory
 from langchain.prompts import PromptTemplate
 from langchain.vectorstores import Chroma
+from langchain.chat_models import ChatOpenAI


 openai.api_key = os.getenv('OPENAI_API_KEY')
@@ -44,7 +45,7 @@ def __init__(
     @property
     def create_qa_chain(self) -> RetrievalQA:
         qa_chain = RetrievalQA.from_chain_type(
-            llm=OpenAI(temperature=0),
+            llm=self.llm,
             chain_type=self.chain_type,
             retriever=self.retriever,
             chain_type_kwargs=self.chain_type_kwargs
@@ -61,12 +62,6 @@ def __init__(
     ) -> None:
         super().__init__(chain_type, retriever, llm)

-    def _get_chat_history(self, inputs) -> str:
-        res = []
-        for human, ai in inputs:
-            res.append(f"Human:{human}\nAI:{ai}")
-        return "\n".join(res)
-
     @property
     def create_qa_chain(self) -> ConversationalRetrievalChain:
         # TODO: cannot use conversation qa chain
@@ -75,11 +70,10 @@ def create_qa_chain(self) -> ConversationalRetrievalChain:
             return_messages=True
         )
         qa_chain = ConversationalRetrievalChain.from_llm(
-            llm=OpenAI(temperature=0),
+            llm=self.llm,
             chain_type=self.chain_type,
             retriever=self.retriever,
-            memory=memory,
-            get_chat_history=self._get_chat_history
+            memory=memory
         )
         return qa_chain


@@ -88,11 +82,16 @@ class DocGPT:
8882
def __init__(self, docs):
8983
self.docs = docs
9084
self.qa_chain = None
85+
self.llm = ChatOpenAI(
86+
temperature=0.2,
87+
max_tokens=2000,
88+
model_name='gpt-3.5-turbo'
89+
)
9190

9291
self.prompt_template = """
93-
Cite each reference using [Page Number] notation (every result has this number at the beginning).
94-
Only answer what is asked. The answer should be short and concise. Answer step-by-step.
92+
Only answer what is asked. Answer step-by-step.
9593
If the content has sections, please summarize them in order and present them in a bulleted format.
94+
Utilize line breaks for better readability.
9695
For example, sequentially summarize the introduction, methods, results, and so on.
9796
9897
{context}
@@ -154,14 +153,14 @@ def create_qa_chain(
             self.qa_chain = RChain(
                 chain_type=chain_type,
                 retriever=retriever,
-                llm=OpenAI(temperature=0),
+                llm=self.llm,
                 chain_type_kwargs=chain_type_kwargs
             ).create_qa_chain
         else:
             self.qa_chain = CRChain(
                 chain_type=chain_type,
                 retriever=retriever,
-                llm=OpenAI(temperature=0)
+                llm=self.llm
             ).create_qa_chain

     def run(self, query: str) -> str:
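
The hunks above replace several inline `OpenAI(temperature=0)` constructions with one `self.llm` created in `DocGPT.__init__`. The pattern can be sketched with stand-in classes and no LangChain dependency. Everything here is hypothetical except the three parameter values, which are copied from the diff: `FakeChatModel` stands in for `ChatOpenAI`, `ChainStub` for the chain classes, and `DocGPTSketch` for `DocGPT`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeChatModel:
    """Hypothetical stand-in for ChatOpenAI; holds configuration only."""
    temperature: float
    max_tokens: int
    model_name: str


@dataclass
class ChainStub:
    """Hypothetical stand-in for a QA chain that receives an injected model."""
    chain_type: str
    llm: FakeChatModel


class DocGPTSketch:
    def __init__(self) -> None:
        # Configure the model once; values copied from the diff above.
        self.llm = FakeChatModel(
            temperature=0.2,
            max_tokens=2000,
            model_name="gpt-3.5-turbo",
        )

    def create_qa_chain(self, chain_type: str = "stuff") -> ChainStub:
        # Every chain reuses the same model object instead of building its own.
        return ChainStub(chain_type=chain_type, llm=self.llm)


doc = DocGPTSketch()
stuff_chain = doc.create_qa_chain("stuff")
refine_chain = doc.create_qa_chain("refine")
```

Keeping the model configuration in one place means a later change to the temperature or the model name applies to every chain built from the same instance.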

requirements.txt

Lines changed: 4 additions & 2 deletions
@@ -1,5 +1,7 @@
-langchain==0.0.224
+langchain==0.0.228
 openai==0.27.8
-streamlit==1.24.0
+streamlit==1.24.1
 streamlit_chat==0.1.1
 pymupdf
+chromadb
+tiktoken
