feat: Add dotenv support and improve setup documentation #43

Merged: 3 commits, Jan 24, 2025
4 changes: 2 additions & 2 deletions 02-household-queries/dspy_engine.py
@@ -104,7 +104,7 @@ def run_retrieval(query, retrieve_k):

     print(f"Top {retrieve.k} passages for query: {query} \n", "-" * 30, "\n")
     for i, passage in enumerate(topK_passages):
-        print(f"[{i+1}]", passage, "\n")
+        print(f"[{i + 1}]", passage, "\n")
     return retrieval


@@ -315,7 +315,7 @@ def print_eval_table(eval_score, results):
         print(score, "|", ex.q_id, "|", ex.answer, "|", pred.get("answer", "")[:20])
     print()
     print("score:", eval_score)
-    print(f"{correct_count} ({int(correct_count/len(results)*100)}%) correct")
+    print(f"{correct_count} ({int(correct_count / len(results) * 100)}%) correct")
     print("--------------------------------------")


55 changes: 54 additions & 1 deletion 03-document-classification/README.md
@@ -6,4 +6,57 @@ This prototype uses a multimodal LLM (GPT-4o) to automatically:
 - Determine what kind of evidence the document provides (e.g., proof of identity, residence, expenses, etc.)
 - Extract arbitrary key/value pairs from the document

-To run it, set OPENAI_API_KEY in your environment and then `streamlit run app.py`.
+## Prerequisites
+
+- An OpenAI API key with access to GPT-4o
+
+## Setup Instructions
+
+1. Clone this repository and navigate to the project directory:
+```bash
+cd 03-document-classification
+```
+
+2. Create and activate a Python virtual environment:
+```bash
+# Create the virtual environment
+python -m venv venv
+
+# Activate it on macOS
+source venv/bin/activate
+```
+
+3. Install the required packages:
+```bash
+pip install python-dotenv streamlit openai
+```
+
+4. Create a `.env` file in the project directory and add your OpenAI API key:
+```bash
+echo "OPENAI_API_KEY=your-api-key-here" > .env
+```
+Replace `your-api-key-here` with your actual OpenAI API key.
+
+5. Run the Streamlit app:
+```bash
+streamlit run app.py
+```
+
+The app should now be running and accessible at http://localhost:8501 (or another port if 8501 is in use).
+
+## Using the App
+
+1. Open the app in your web browser
+2. Use the file uploader to upload one or more document images
+3. Click "Process documents" to analyze them
+
+## Troubleshooting
+
+If you encounter any issues:
+
+1. Make sure your virtual environment is activated (you should see `(venv)` in your terminal)
+2. Verify your OpenAI API key is correct and has access to GPT-4o
+3. Check that all required packages are installed by running:
+```bash
+pip install -r requirements.txt
+```
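
Reviewer note: the troubleshooting checks listed in the README change could be scripted. The following is a minimal sketch and not part of this PR. The function names (`in_virtualenv`, `missing_modules`, `check_setup`) are hypothetical, and the sketch assumes the three packages from setup step 3 are imported as `dotenv`, `streamlit`, and `openai`. It only inspects the current process environment, so a key that exists only in `.env` is reported as missing unless it has been loaded first, and it cannot confirm that the key has access to GPT-4o.

```python
import importlib.util
import os
import sys

# Import names for the packages the README installs. The python-dotenv
# distribution is imported as `dotenv`.
REQUIRED_MODULES = ("dotenv", "streamlit", "openai")


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_setup(environ=None):
    """Collect readable problems; an empty list means every check passed."""
    environ = os.environ if environ is None else environ
    problems = []
    if not in_virtualenv():
        problems.append("virtual environment is not activated")
    for name in missing_modules():
        problems.append(f"package providing '{name}' is not installed")
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    # Print one line per detected problem; print nothing when all is well.
    for problem in check_setup():
        print("-", problem)
```

Saved under a name of your choice and run from the project directory, the script prints one line per problem it detects.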
7 changes: 6 additions & 1 deletion 03-document-classification/app.py
@@ -9,10 +9,15 @@
 import json
 import time
 from functools import wraps
+from dotenv import load_dotenv
+import os

 import openai
 import streamlit as st

+# Load environment variables from .env file
+load_dotenv()
+
 if __name__ == "__main__":
     if "__streamlitmagic__" not in locals():
         import streamlit.web.bootstrap
@@ -25,7 +30,7 @@

 @st.cache_resource
 def get_client():
-    return openai.OpenAI()
+    return openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


 PROMPT = """Please review the attached document and respond with a JSON object matching the DocumentAnalysis type definition provided below. Do not respond with anything else besides the DocumentAnalysis JSON object.
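
Reviewer note on the `app.py` change: two library defaults matter here. `load_dotenv()` does not overwrite variables that are already set in the environment unless `override=True` is passed, and `openai.OpenAI()` already falls back to the `OPENAI_API_KEY` environment variable when no `api_key` argument is given. The sketch below illustrates the first point with the standard library only. `load_env_file` is a hypothetical, simplified stand-in written for this note, not python-dotenv's implementation, and it handles only plain `KEY=VALUE` lines.

```python
import os


def load_env_file(path=".env", override=False, environ=None):
    """Hypothetical, simplified stand-in for dotenv.load_dotenv().

    Reads KEY=VALUE lines from `path` and copies them into `environ`.
    With override=False, variables that are already set are left alone.
    Returns True if at least one assignment was found in the file.
    """
    environ = os.environ if environ is None else environ
    try:
        with open(path, encoding="utf-8") as handle:
            lines = handle.read().splitlines()
    except FileNotFoundError:
        return False  # no .env file, so there is nothing to load

    found = False
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop simple surrounding quotes
        if override or key not in environ:
            environ[key] = value
        found = True
    return found
```

Under this behaviour, a key exported in the shell takes precedence over the one written to `.env` in setup step 4, which is worth knowing when a stale exported key appears to be ignored.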
3 changes: 2 additions & 1 deletion 03-document-classification/requirements.in
@@ -1,2 +1,3 @@
 openai
-streamlit
+streamlit
+python-dotenv