This repository contains a Jupyter Notebook that uses the LangChain and OpenAI APIs to build a semantic search engine over journals from the Brazilian National Congress, applying Retrieval-Augmented Generation (RAG) to uncover and keep track of political events. The notebook works with a single journal, dated May 1, 2025.
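For orientation, the retrieval step behind a RAG-style semantic search can be sketched in a few lines. This is only a toy illustration, not the notebook's code: word counts stand in for OpenAI embeddings, the function names (`embed`, `retrieve`, `build_prompt`) are hypothetical, and the sample chunks are made-up placeholder text rather than content from the journal.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to a language model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Placeholder chunks, invented for illustration only.
sample_chunks = [
    "The committee voted on the budget bill.",
    "A senator gave a speech about education.",
    "The session was adjourned until next week.",
]
top = retrieve("budget vote", sample_chunks, k=1)
prompt = build_prompt("budget vote", top)
```

In a real pipeline the `embed` step would call an embedding model and the ranked chunks would come from a vector store; the flow of embed, rank, then prompt stays the same.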
To install the required Python packages, use uv:
- Initialize the project (if not already done):
  uv init
- Install dependencies from pyproject.toml:
  uv sync
The notebook needs your OpenAI API key in an environment variable. If you run the notebook in Visual Studio Code with the Jupyter extension, you can create a .env file and place your API key there:
OPENAI_API_KEY=YOUR_API_KEY_HERE
The variable should be automatically loaded by the extension.
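If the variable is not picked up, for example when running outside VS Code, it can be loaded by hand. The following is a minimal sketch using only the standard library; `load_env_file` and `require_openai_key` are hypothetical helpers for illustration and are not part of this repository. It assumes a simple .env file with one KEY=VALUE pair per line.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read KEY=VALUE lines from a .env-style file into os.environ.

    Variables that are already set in the environment are left untouched.
    Returns only the variables that were newly set.
    """
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip('"').strip("'")
        if key and key not in os.environ:
            os.environ[key] = value
            loaded[key] = value
    return loaded


def require_openai_key() -> str:
    """Return the API key, or raise a clear error if it is missing."""
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it.")
    return key
```

Calling `load_env_file()` followed by `require_openai_key()` at the top of a script fails early with a readable message instead of a later authentication error.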
To run the Streamlit app, use the following command:
streamlit run main.py
For more details, here's the related Medium post I wrote: Keeping Up With Congress With Retrieval-Augmented Generation (RAG)