mcts-langchain

This is a fork of BrendanGraham14/mcts-llm. I have integrated LangChain implementations for model calls and refactored the code.

MCTSr

Based on "Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B" by Zhang et al.

At a high level, MCTSr iteratively generates solutions to a specified (math) problem.

In an MCTSr tree, nodes correspond to attempted answers, and edges correspond to attempts to improve the answer.

Initialize

Generate an initial solution to the problem. The paper uses a "dummy" solution (e.g. "I don't know").

Select a node to expand

We gather a set of candidate nodes which haven't been fully expanded.

A node is fully expanded if either:

it already has max_children children
any of its children has a Q value greater than its own
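
The two conditions above can be sketched as follows. This is a minimal illustration, not the code in this repo: the Node class, its attribute names, and the candidate_nodes helper are hypothetical, and the default max_children=2 is only an assumption.

```python
from collections import deque


class Node:
    """Hypothetical tree node: one attempted answer plus its Q value."""

    def __init__(self, answer, Q=0.0, parent=None):
        self.answer = answer
        self.Q = Q
        self.parent = parent
        self.children = []


def is_fully_expanded(node, max_children=2):
    """True if the node has max_children children, or if any child beats its Q."""
    has_enough_children = len(node.children) >= max_children
    child_is_better = any(child.Q > node.Q for child in node.children)
    return has_enough_children or child_is_better


def candidate_nodes(root, max_children=2):
    """Walk the tree breadth-first and collect nodes that can still be expanded."""
    candidates = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if not is_fully_expanded(node, max_children):
            candidates.append(node)
        queue.extend(node.children)
    return candidates
```

With this sketch, a root whose only child has a higher Q is no longer a candidate, while the child itself still is.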

Once we've gathered the candidates, we compute UCT scores for each candidate node. There are a few ways we can make our selection:

Greedily (choose the node with the highest UCT)
Importance sampling (sample from the set of candidates, weighted by their UCT score)
Pairwise importance sampling (sample the max from a pair of nodes from the set of candidates, weighted by the difference between the pair's UCT scores)

The authors mention that they perform greedy selection in the paper. In their repo, they also perform pairwise sampling and save the (question, answer1, answer2) tuples for use in DPO.
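
The three selection strategies can be sketched over a plain list of UCT scores. Everything here is illustrative: the function names, the exploration constant c, and the small eps term are assumptions, and the UCT expression is one common form rather than necessarily the exact one used in this repo. Because UCT scores may be negative, the sampling variants shift the weights to be positive, which is also an assumption of this sketch.

```python
import math
import random
from itertools import combinations


def uct(q, visits, parent_visits, c=math.sqrt(2), eps=1e-8):
    """One common UCT form: exploitation (Q) plus an exploration bonus."""
    return q + c * math.sqrt(math.log(parent_visits + 1) / (visits + eps))


def select_greedy(scores):
    """Return the index of the candidate with the highest UCT score."""
    return max(range(len(scores)), key=scores.__getitem__)


def select_importance(scores, rng=random):
    """Sample one candidate index, weighted by its (shifted) UCT score."""
    low = min(scores)
    weights = [s - low + 1e-8 for s in scores]  # shift so all weights are positive
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


def select_pairwise(scores, rng=random):
    """Sample a pair weighted by its UCT gap, then return the better of the two."""
    pairs = list(combinations(range(len(scores)), 2))
    if not pairs:  # a single candidate has no pair to compare against
        return 0
    weights = [abs(scores[i] - scores[j]) + 1e-8 for i, j in pairs]
    i, j = rng.choices(pairs, weights=weights, k=1)[0]
    return i if scores[i] >= scores[j] else j
```

Passing a seeded random.Random instance as rng makes the sampled variants reproducible. In the pairwise variant, the losing member of the sampled pair is what would be kept as the second answer when building preference tuples.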

Expand the node

Expansion involves several steps:

Generate a critique of the current solution.
Refine the solution based on the critique.
Add a new child, corresponding to the refined solution.
Self-evaluate the reward of the new child.
Backpropagate the reward from the new child through its parents, through to the root.
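
The expansion steps above can be sketched as a single function. This is a hedged illustration, not the repo's implementation: Node, expand, backpropagate, and the three callables (critique_fn, refine_fn, reward_fn) are hypothetical stand-ins for the LLM calls. The backpropagation rule, averaging a parent's Q with the best Q among its children, is an assumption chosen for the sketch.

```python
class Node:
    """Hypothetical tree node holding an answer, its Q value, and a visit count."""

    def __init__(self, answer, parent=None):
        self.answer = answer
        self.parent = parent
        self.children = []
        self.Q = 0.0
        self.visits = 0


def backpropagate(node):
    """Propagate a new child's reward up through every ancestor to the root."""
    parent = node.parent
    while parent is not None:
        best_child_q = max(child.Q for child in parent.children)
        # Assumed update rule: average the parent's Q with its best child's Q.
        parent.Q = (parent.Q + best_child_q) / 2
        parent.visits += 1
        parent = parent.parent


def expand(node, problem, critique_fn, refine_fn, reward_fn):
    """Critique, refine, attach the new child, score it, then backpropagate."""
    critique = critique_fn(problem, node.answer)         # 1. critique current answer
    refined = refine_fn(problem, node.answer, critique)  # 2. refine using the critique
    child = Node(refined, parent=node)                   # 3. add the new child
    node.children.append(child)
    child.Q = reward_fn(problem, refined)                # 4. self-evaluate the reward
    child.visits = 1
    backpropagate(child)                                 # 5. update the ancestors
    return child
```

In practice the three callables would wrap prompts sent to the model. Plain lambdas returning fixed strings and scores are enough to exercise the tree logic without an LLM.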

Usage

Imports

from mcts_llm.mctsr import MCTSr
from mcts_llm.prompt_configs import llama_3_8b_prompt_config

Instantiate the LLM

from langchain_community.llms import Ollama

model = Ollama(model="llama3:8b")

Just Run It!!

llama = MCTSr(model=model, problem="what are the cube roots of unity", prompt_config=llama_3_8b_prompt_config)
llama.run()
