ETH-UDK Navigator is an AI-powered web application developed by the ETH Library to support subject indexing and discovery using the ETH Zurich adaptation of the Universal Decimal Classification (UDC).
The application combines classical classification data with modern AI tools, allowing users to explore subject hierarchies, interactively visualize semantic relationships, and search classification terms using natural language via semantic vector search.
Use the Explorer view to browse top-level classification terms and drill down into narrower or related concepts.
Use the Graph view to display an interactive graph of broader, narrower, and related terms. This helps you understand the semantic structure of a concept within ETH-UDK.
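The broader, narrower, and related links behind both views can be pictured as a small graph. The sketch below is illustrative only: the layout of data.json is not documented in this README, so the `TERMS` dictionary, its field names, and the term ids are hypothetical stand-ins rather than the app's actual data model.

```python
from collections import deque

# Hypothetical records: field names and ids are assumptions, not the data.json schema.
TERMS = {
    "A": {"label": "Top-level term", "broader": [], "narrower": ["A1", "A2"], "related": []},
    "A1": {"label": "First child", "broader": ["A"], "narrower": ["A1a"], "related": ["A2"]},
    "A2": {"label": "Second child", "broader": ["A"], "narrower": [], "related": ["A1"]},
    "A1a": {"label": "Grandchild", "broader": ["A1"], "narrower": [], "related": []},
}


def neighbours(terms, term_id):
    """List (source, relation, target) edges that a graph view could draw for one term."""
    term = terms[term_id]
    return [
        (term_id, relation, other)
        for relation in ("broader", "narrower", "related")
        for other in term[relation]
    ]


def descendants(terms, term_id):
    """Walk narrower terms breadth-first, similar to drilling down in an explorer."""
    seen = []
    queue = deque(terms[term_id]["narrower"])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.append(current)
        queue.extend(terms[current]["narrower"])
    return seen


print(neighbours(TERMS, "A1"))
print(descendants(TERMS, "A"))
```

Keeping the three relation types as separate lists makes it straightforward to style or filter edges by type when rendering.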
Use the Vector Query tool (requires login) to:
- Paste in a title, abstract, or table of contents from a document
- Select a classification namespace and level range
- Submit the form to see the most semantically relevant ETH-UDK terms
This tool uses OpenAI embeddings and Pinecone vector search to find terms that best match the meaning of your input.
⚠️ The vector query functionality is experimental and will serve as a foundation for future AI-based workflows for subject indexing.
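To make the idea behind the query concrete, here is a minimal sketch of similarity ranking with a namespace and level filter. It deliberately avoids the OpenAI and Pinecone APIs: the three-dimensional vectors are toy stand-ins for real embeddings, and `rank_terms`, the term ids, and the `namespace` / `level` fields are hypothetical names chosen for illustration. The app's actual logic lives in main.py and may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_terms(query_vec, terms, namespace, min_level, max_level, top_k=3):
    """Filter by namespace and level range, then return the top_k (score, id) pairs."""
    candidates = [
        t for t in terms
        if t["namespace"] == namespace and min_level <= t["level"] <= max_level
    ]
    scored = [(cosine(query_vec, t["vector"]), t["id"]) for t in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy index: real embeddings have far more dimensions than these placeholders.
TERMS = [
    {"id": "term-a", "namespace": "ns1", "level": 1, "vector": [1.0, 0.0, 0.0]},
    {"id": "term-b", "namespace": "ns1", "level": 2, "vector": [0.8, 0.6, 0.0]},
    {"id": "term-c", "namespace": "ns1", "level": 3, "vector": [0.0, 1.0, 0.0]},
    {"id": "term-d", "namespace": "ns2", "level": 2, "vector": [1.0, 0.0, 0.0]},
]

query = [0.9, 0.1, 0.0]  # stands in for the embedding of the pasted text
for score, term_id in rank_terms(query, TERMS, "ns1", 1, 2):
    print(f"{term_id}: {score:.3f}")
```

In a hosted vector database, the filtering and ranking steps run server-side; the sketch only shows what the result ordering means.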
🧪 This project is also Replit-compatible. You can run it directly in Replit by forking this repo.
- Python 3.11+
- pip
- Git
git clone https://github.com/your-username/eth-udk-navigator.git
cd eth-udk-navigator
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Copy the template:
cp .env.example .env
- Fill in your .env file with the correct secrets:
FLASK_SECRET_KEY=your-generated-secret-key
VECTOR_QUERY_PASSWORD=your-password
PINECONE_API_KEY=your-pinecone-key
OPENAI_API_KEY=your-openai-key
To generate a secure FLASK_SECRET_KEY:
python -c "import secrets; print(secrets.token_hex(32))"
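Before starting the app, it can help to confirm that all four variables are set and that none still holds a template placeholder. The following is a small sketch, not part of the project: `missing_settings` is a hypothetical helper, and it only inspects the mapping it is given. How main.py loads the .env file is not described here, so the values may need to be exported or loaded into the process environment first.

```python
import os

# Variable names taken from the .env template above.
REQUIRED = (
    "FLASK_SECRET_KEY",
    "VECTOR_QUERY_PASSWORD",
    "PINECONE_API_KEY",
    "OPENAI_API_KEY",
)

# The template values all start with "your-", so treat that prefix as "not filled in".
PLACEHOLDER_PREFIX = "your-"


def missing_settings(env):
    """Return the names that are unset, empty, or still set to a template placeholder."""
    return [
        name for name in REQUIRED
        if not env.get(name) or env[name].startswith(PLACEHOLDER_PREFIX)
    ]


problems = missing_settings(os.environ)
if problems:
    print("Missing or placeholder settings:", ", ".join(problems))
else:
    print("All required settings are present.")
```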
python main.py
The app will be accessible at: http://localhost:5000
eth-udk-navigator/
├── templates/ # HTML templates (Jinja2)
│ ├── home.html
│ ├── index.html
│ ├── graph.html
│ ├── vector_query.html
│ ├── login.html
│ ├── _footer.html
│ └── _navbar.html
├── static/ # Static files (CSS, JS)
│ ├── styles.css
│ └── nav.css
├── data.json # ETH-UDK classification data
├── main.py # Flask app
├── requirements.txt # Python dependencies
├── .env.example # Environment variable template
└── README.md # This file
The Vector Query page is protected via password login. Users must enter a password defined in the environment variable VECTOR_QUERY_PASSWORD. Sessions are managed securely via Flask sessions.
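One way to compare a submitted password against the configured one is a constant-time check, sketched below with the standard library. This is an assumption about a reasonable approach, not a description of what main.py does; `password_matches` is a hypothetical helper name.

```python
import hmac
import os


def password_matches(submitted, expected):
    """Compare two passwords in constant time; reject if no password is configured."""
    if not expected:
        return False
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))


# Example: check a candidate against the configured value (empty string if unset).
configured = os.environ.get("VECTOR_QUERY_PASSWORD", "")
print(password_matches("candidate-password", configured))
```

In a Flask login route, a successful check would typically be followed by setting a flag in `session` (for example a hypothetical `session["authenticated"] = True`), which Flask stores in a cookie signed with the app's secret key.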
- Update the data.json file if you want to load a different classification structure.
- To change the model or vector search logic, look into the vector_query route in main.py.
- Semantic embeddings are created using OpenAI's text-embedding-3-large model. You may adapt this if you use a different provider or model.
- The app uses Pinecone for vector search. You can swap this out for another provider or a local vector DB (e.g. faiss) with some adjustments.
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
Created and maintained by the ETH Library team, part of the AI Library Automation initiative.
Questions? Feedback? Contact us at: api@library.ethz.ch
Happy hacking ✨