An interactive Streamlit application that combines the official SPICE metric with a conceptual scene-graph tuple visualizer. Evaluate your image captions end-to-end, inspect SPICE precision/recall/F₁, see live logs of the Java subprocess, and explore extracted (object–relation–object / object–attribute) scene-graph tuples as draggable force-directed graphs.
P. Anderson, B. Fernando, M. Johnson, and S. Gould, “SPICE: Semantic Propositional Image Caption Evaluation,” arXiv preprint arXiv:1607.08822, Jul. 2016.
- SPICE Evaluation: Invokes the SPICE-1.0 Java JAR in “detailed” mode to compute the official SPICE metric (precision, recall, F₁).
- Live Logging: Streams the SPICE subprocess stdout/stderr into a collapsible, scrollable terminal box in the UI.
- Interactive Scene-Graph Visualization: Extracts simple tuples from captions and renders them with PyVis: draggable nodes, curved edges, physics controls.
- Extracted Tuple Preview: View simplified lists of extracted tuples (candidate & references) in collapsible code blocks.
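To make the precision/recall/F₁ numbers concrete, here is a minimal sketch of how such scores can be computed from two sets of tuples. It is an illustration only: the function name `spice_style_scores` and the example tuples are made up for this README, and it uses exact tuple matching, whereas the real SPICE JAR also matches lemmas and WordNet synonyms.

```python
def spice_style_scores(candidate_tuples, reference_tuples):
    """Return (precision, recall, f1) over two collections of tuples.

    Simplification: tuples match only if they are exactly equal.
    The official SPICE metric is more lenient (WordNet synonyms).
    """
    cand = set(candidate_tuples)
    ref = set(reference_tuples)
    matched = cand & ref

    precision = len(matched) / len(cand) if cand else 0.0
    recall = len(matched) / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical tuples: (object,), (object, attribute), (subject, relation, object)
candidate = [("dog",), ("dog", "brown"), ("dog", "on", "grass")]
reference = [("dog",), ("grass",), ("dog", "on", "grass"), ("grass", "green")]

p, r, f1 = spice_style_scores(candidate, reference)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# prints: precision=0.667 recall=0.500 f1=0.571
```

Two of the three candidate tuples appear in the reference set, so precision is 2/3, recall is 2/4, and F₁ is their harmonic mean.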
spice/
┣ SPICE-1.0/
┃ ┣ lib/ ← SPICE’s dependencies
┃ ┣ spice-1.0.jar
┃ ┗ get_stanford_models.sh
┣ lib/ ← Frontend JS/CSS assets (tom-select, vis-network)
┣ spice_cache ← Temp files: JSON in/out, HTML graphs, LMDB
┣ scene_graph_visualizer.py ← PyVis graph-builder
┣ evaluator.py ← SPICE invocation + tuple extraction + WordNet matching
┣ app.py ← Streamlit frontend
┗ requirements.txt ← Python dependencies
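As a rough idea of what a graph-builder such as scene_graph_visualizer.py has to do before rendering, the sketch below turns scene-graph tuples into node and edge lists. The function name `tuples_to_graph`, the "has attribute" edge label, and the tuple shapes are assumptions made for illustration, not the repository's actual code.

```python
def tuples_to_graph(tuples):
    """Convert scene-graph tuples into (nodes, edges).

    nodes: dict mapping label -> "object" or "attribute"
    edges: list of (source, target, edge_label)
    """
    nodes = {}
    edges = []
    for t in tuples:
        if len(t) == 1:            # (object,)
            nodes[t[0]] = "object"
        elif len(t) == 2:          # (object, attribute)
            obj, attr = t
            nodes[obj] = "object"
            # Keep an existing "object" role if the label was seen before.
            nodes.setdefault(attr, "attribute")
            edges.append((obj, attr, "has attribute"))
        elif len(t) == 3:          # (subject, relation, object)
            subj, rel, obj = t
            nodes[subj] = "object"
            nodes[obj] = "object"
            edges.append((subj, obj, rel))
        else:
            raise ValueError(f"unexpected tuple shape: {t!r}")
    return nodes, edges


nodes, edges = tuples_to_graph(
    [("dog",), ("dog", "brown"), ("dog", "on", "grass")]
)
print(nodes)
print(edges)
```

The resulting lists could then be fed to PyVis, for example through `Network.add_node` and `Network.add_edge`, with the node kind deciding color or shape.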
Follow these steps to set up and run the dashboard locally.
- Java 8+ (to run the SPICE JAR)
- Conda or Miniconda
On Debian/Ubuntu you may need:
sudo apt update
sudo apt install -y default-jre unzip
git clone https://github.com/Gayanukaa/SPICE-Evaluator.git
cd experiments/spice/
- Download SPICE-1.0
  - From panderson’s SPICE site
  - Place the spice-1.0.jar and lib/ folder under the SPICE-1.0/ directory.
- Fetch Stanford CoreNLP Models
  cd SPICE-1.0
  chmod +x get_stanford_models.sh
  ./get_stanford_models.sh
  This downloads the Stanford CoreNLP JARs (tokenizer, parser, NER, etc.) into SPICE-1.0/lib/.
conda create -n env python=3.11.8 -y
conda activate env
pip install -r requirements.txt
From the project root:
streamlit run app.py
The app will open in your browser (usually at http://localhost:8501). Enter your Candidate Caption and one or more Reference Captions in the sidebar, then click Evaluate.
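The live log shown during evaluation comes down to reading a subprocess's output line by line while it is still running. The following is a minimal sketch of that pattern, not the app's actual implementation: `stream_command` is a hypothetical helper, and a small Python one-liner stands in for the Java command so the example runs without SPICE installed.

```python
import subprocess
import sys


def stream_command(cmd):
    """Yield output lines from cmd as they appear (stderr merged into stdout)."""
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # one combined stream, like a terminal
        text=True,
        bufsize=1,                 # line-buffered reads
    )
    try:
        for line in process.stdout:
            yield line.rstrip("\n")
    finally:
        process.stdout.close()
        return_code = process.wait()
    if return_code != 0:
        raise subprocess.CalledProcessError(return_code, cmd)


# Stand-in for the real "java -jar ..." invocation.
demo_cmd = [sys.executable, "-c", "print('parsing captions'); print('done')"]
log_lines = list(stream_command(demo_cmd))
print(log_lines)
# prints: ['parsing captions', 'done']
```

In a Streamlit app, each yielded line could be appended to a buffer and re-rendered into a placeholder (for instance one created with `st.empty()`), which gives the scrolling-terminal effect.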
If you want to test SPICE outside of Streamlit, you can run:
java -Xmx8G \
-jar SPICE-1.0/spice-1.0.jar \
sample_input.json \
-detailed
This produces detailed SPICE output (stdout and a JSON file).
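The command above expects sample_input.json to exist. One way to create it is sketched below. The field names `image_id`, `test`, and `refs` follow the convention used by common Python wrappers around the SPICE JAR and are an assumption here; check them against the SPICE version you downloaded. The captions are made-up examples.

```python
import json
from pathlib import Path

# One entry per caption to score: a candidate ("test") and its references ("refs").
entries = [
    {
        "image_id": 0,
        "test": "a brown dog runs on the grass",
        "refs": [
            "a dog is running across a green lawn",
            "a brown dog playing outside",
        ],
    }
]

input_path = Path("sample_input.json")
input_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")

# Read the file back to confirm it is valid JSON with the expected shape.
loaded = json.loads(input_path.read_text(encoding="utf-8"))
print(f"wrote {len(loaded)} entry to {input_path}")
```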
- SPICE JAR not found: Ensure SPICE-1.0/spice-1.0.jar exists and is executable.
- CoreNLP model errors / NoClassDefFoundError: You may encounter the issue described here: Labbeti/aac-metrics#7. Resolution: manually download the Stanford CoreNLP ZIP and extract it into SPICE-1.0/lib/: http://nlp.stanford.edu/software/stanford-corenlp-full-2015-12-09.zip
- Missing NLTK data: NLTK auto-downloads WordNet & OMW on first run. To install manually:
  import nltk
  nltk.download('wordnet')
  nltk.download('omw-1.4')
- PyVis graphs not rendering: Ensure your network allows CDN loading, or switch cdn_resources='local'/'remote' in scene_graph_visualizer.py.
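Several of the problems above can be caught before launching the app. The sketch below is one possible preflight check, not part of the repository: the `preflight` function is hypothetical, and the `stanford-corenlp-*.jar` filename pattern is an assumption about what get_stanford_models.sh places in SPICE-1.0/lib/.

```python
import shutil
from pathlib import Path


def preflight(spice_dir="SPICE-1.0"):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    root = Path(spice_dir)
    problems = []

    if shutil.which("java") is None:
        problems.append("java not found on PATH (Java 8+ is required)")

    if not (root / "spice-1.0.jar").is_file():
        problems.append(f"missing {root / 'spice-1.0.jar'}")

    lib = root / "lib"
    # Assumed naming pattern for the CoreNLP JARs; adjust if yours differ.
    if not lib.is_dir() or not any(lib.glob("stanford-corenlp-*.jar")):
        problems.append(f"no Stanford CoreNLP JARs found in {lib}")

    return problems


for problem in preflight():
    print("WARNING:", problem)
```

Run from the project root, it prints nothing when Java, the SPICE JAR, and the CoreNLP JARs are all in place.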
Open to contributions. Feel free to fork the repo, create a feature branch, commit your changes, push to your fork, and open a Pull Request.
This project is licensed under the GNU Affero General Public License v3.0.