Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models
📄 ACL 2025 Findings Paper — Math2Visual
📘 Annotated Visual Language and Visual Dataset
🤖 Visual Language Generation Model
In this project, we present Math2Visual, an automatic framework for generating pedagogically meaningful visuals from math word problem text descriptions. Math2Visual leverages a pre-defined visual language and a design space grounded in interviews with math teachers to illustrate the core mathematical relationships in math word problems. Using Math2Visual, we construct an annotated dataset of 1,903 visuals and evaluate Text-to-Image (TTI) models for their ability to generate visuals that align with our design. We further fine-tune several TTI models with our dataset, demonstrating improvements in educational visual generation. Our work establishes a new benchmark for automated generation of pedagogically meaningful visuals and offers insights into key challenges in producing multimodal educational content, such as the misrepresentation of mathematical relationships and the omission of essential visual elements.
We have released the full dataset on Hugging Face, including:
- Annotated visual language with corresponding math word problems
- Generated formal and intuitive visuals in both `.svg` and `.png` formats
👉 Browse the dataset on Hugging Face
You can preview images and download files directly from the Hugging Face web interface.
```shell
git clone https://github.com/eth-lre/math2visual.git
conda create -n math2visual python=3.12.4
conda activate math2visual
cd math2visual
pip install -r requirements_a.txt
pip install -r requirements_b.txt
touch .env
echo "OPENAI_API_KEY=<your_openai_key>" >> .env
```
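The last two commands create a `.env` file that holds the OpenAI API key as a `KEY=VALUE` line. As a rough sketch of how such a file is typically consumed (the `load_env` helper below is hypothetical and not part of this repo; the scripts here may rely on a library such as python-dotenv instead):

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ."""
    loaded = {}
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip()
    os.environ.update(loaded)
    return loaded
```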
Download our model adapter on Hugging Face
Place the adapter_model.safetensors into model/check-point/
Download the base model meta-llama/Llama-3.1-8B on Hugging Face
Place the downloaded folder into model/base_model/
Replace the 'mwp' and 'formula' fields with your own math word problem content in generate_visual_language_with_our_model.py (around line 102). Then run:
```shell
python3 generate_visual_language_with_our_model.py
```
It will print out the generated visual language and save it in /output_visual_language/visual_langauge.txt
Replace the 'mwp' and 'formula' fields with your own math word problem content in generate_visual_language_with_gpt.py (around line 196). Then run:
```shell
python3 generate_visual_language_with_gpt.py
```
It will print out the generated visual language and save it in /output_visual_language/visual_langauge.txt
Replace the 'visual_language' field with your own generated visual language in generate_visual_formal.py (around line 1406). Then run:
```shell
python3 generate_visual_formal.py
```
It will generate the visual and save it in /output_visual_formal/01.svg
Replace the 'visual_language' field with your own generated visual language in generate_visual_intuitive.py (around line 4263). Then run:
```shell
python3 generate_visual_intuitive.py
```
It will generate the visual and save it in /output_visual_intuitive/01.svg
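Since both generation scripts write SVG output, a quick sanity check can confirm the file is well-formed before it is converted or embedded in teaching materials. The standalone helper below is not part of the repo and uses only the standard library:

```python
import xml.etree.ElementTree as ET

def is_valid_svg(path):
    """Return True if the file parses as XML with an <svg> root element."""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        return False
    # SVG roots carry a namespace, e.g. '{http://www.w3.org/2000/svg}svg'
    return root.tag.rsplit("}", 1)[-1] == "svg"
```

For example, `is_valid_svg("output_visual_intuitive/01.svg")` should return `True` after a successful run.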
Junling Wang, Anna Rutkiewicz, April Wang, and Mrinmaya Sachan. 2025. Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models. In Findings of the Association for Computational Linguistics: ACL 2025, pages 11229–11257, Vienna, Austria. Association for Computational Linguistics.
@inproceedings{wang-etal-2025-generating-pedagogically,
title = "Generating Pedagogically Meaningful Visuals for Math Word Problems: A New Benchmark and Analysis of Text-to-Image Models",
author = "Wang, Junling and
Rutkiewicz, Anna and
Wang, April and
Sachan, Mrinmaya",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.findings-acl.586/",
pages = "11229--11257",
ISBN = "979-8-89176-256-5",
abstract = "Visuals are valuable tools for teaching math word problems (MWPs), helping young learners interpret textual descriptions into mathematical expressions before solving them. However, creating such visuals is labor-intensive and there is a lack of automated methods to support this process. In this paper, we present Math2Visual, an automatic framework for generating pedagogically meaningful visuals from MWP text descriptions. Math2Visual leverages a pre-defined visual language and a design space grounded in interviews with math teachers to illustrate the core mathematical relationships in MWPs. Using Math2Visual, we construct an annotated dataset of 1,903 visuals and evaluate Text-to-Image (TTI) models for their ability to generate visuals that align with our design. We further fine-tune several TTI models with our dataset, demonstrating improvements in educational visual generation. Our work establishes a new benchmark for automated generation of pedagogically meaningful visuals and offers insights into key challenges in producing multimodal educational content, such as the misrepresentation of mathematical relationships and the omission of essential visual elements."
}
This work is licensed under the Apache License 2.0.
For research inquiries, please contact: Junling Wang — wangjun [at] ethz [dot] ch