24-LLM-Project

This Repository contains codebase for Enhancing Large Language Models (LLMs) provided by Upstage (Solar). The project offers the pipeline that acheives high performance on given ewha.pdf document and MMLU-Pro dataset. The optimal prompts and multiple knowledge base(KB)s for the Question-and-Answering Tasks are also included in ./db/* and prompts.py.

🌟 Overview

📍Getting Started

Requirements

Clone this repository and Create conda environment and install all the dependencies:

git clone https://github.com/finallyupper/24-LLM-Project.git

conda create --name pj python=3.12

conda activate pj

pip install -r requirements.txt

Note for CPU-only Environments
If you do not have a GPU, edit the requirements.txt file as follows before running the installation:

# faiss-gpu-cu12==1.9.0.0
faiss-cpu==1.9.0

You can change various hyperparameters including top k and thresholds in config.yaml. Before start, don't forget to make .env file in the repository and place the following information:

UPSTAGE_API_KEY = ""
LANGCHAIN_TRACING_V2=false
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=""
LANGCHAIN_PROJECT="24-nlp-0"

How to play 💭

Run the following command to start testing the model:

python main.py

NOTE ) If you get the debugging messages in your terminal as following, it means the codes work correctly!

Directory Structure

24-LLM-Project/
│
├── README.md         
├── main.py   
├── prompts.py           
├── __init__.py      
│
├── engine/           
│   ├── langchain_engine.py
│   ├── raptor.py
│   └── utils.py
│   └── preprocess/
|
├── config.yaml 
|
├── data/           
│   ├── testset.csv
│   ├── ewha.pdf
│   └── ewha_chunk_doc_fix.json 
│   └── ...
├── db/           
│   ├── raptor/
│   └──── business/
|   └──── history/  
|   └──── law/
|   └──── philosophy/  
|   └──── psychology/  
|   └──── RAPTOR_faiss_fix_overlap/    
|
├── assets/           
│   ├── pipeline.png
│   ├── skeleton.ipynb       
│   └── ...

Results

Best Template

References 🔎

Huggingface datasets

Prompt Engineering

https://www.promptingguide.ai/kr/techniques/cot
https://github.com/run-llama/llama_index/tree/main
https://ko.upstage.ai/blog/insight/prompt-engineering-guide-maximizing-the-use-of-llm-with-prompt-design https://python.langchain.com/v0.1/docs/modules/model_io/prompts/few_shot_examples_chat/

Codes/Others

Langchain API , Upstage API
https://smith.langchain.com/hub/
https://console.upstage.ai/api/chat
https://wikidocs.net/book/14314 https://github.com/teddylee777/langchain-kr/tree/main
https://bcho.tistory.com/1419
https://rudaks.tistory.com/entry/langchain-%EB%8B%A4%EC%A4%91-%EB%B2%A1%ED%84%B0%EC%A0%80%EC%9E%A5%EC%86%8C-%EA%B2%80%EC%83%89%EA%B8%B0MultiVector-Retriever langchain-ai/langchain#13447

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

24-LLM-Project

🌟 Overview

📍Getting Started

Requirements

How to play 💭

Directory Structure

Results

Best Template

References 🔎

Huggingface datasets

Prompt Engineering

Codes/Others

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 86 Commits
assets		assets
data		data
db		db
engine		engine
.gitignore		.gitignore
README.md		README.md
__init__.py		__init__.py
config.yaml		config.yaml
main.py		main.py
prompts.py		prompts.py
requirements.txt		requirements.txt

sml09181/Upstage-QA

Folders and files

Latest commit

History

Repository files navigation

24-LLM-Project

🌟 Overview

📍Getting Started

Requirements

How to play 💭

Directory Structure

Results

Best Template

References 🔎

Huggingface datasets

Prompt Engineering

Codes/Others

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages