Better construction of reflection data in o1-journey?

It seems that some of the o1-journey reflection data published on Hugging Face is actually **correcting reasoning steps that were already correct**. Is there still room for improvement in how the reasoning tree is traversed and how the reflections are constructed?

For example, I randomly sampled instances containing the keyword "wait". Every reflection I checked (rows 19, 39, 56, 75) appeared to be unnecessary.

https://huggingface.co/datasets/GAIR/o1-journey/
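For anyone who wants to repeat this kind of spot check, here is a minimal sketch of a keyword filter. It assumes the rows are available as dictionaries (for example, by iterating over a split loaded with the Hugging Face `datasets` library) and that the reasoning text lives in a single string column. The function name `rows_with_keyword` and the `field` argument are illustrative, not part of the dataset or any library; check the dataset card for the actual column name.

```python
import re
from typing import Any, Iterable, List, Mapping, Tuple


def rows_with_keyword(
    rows: Iterable[Mapping[str, Any]],
    field: str,
    keyword: str = "wait",
    context: int = 80,
) -> List[Tuple[int, str]]:
    """Return (row_index, snippet) for rows whose `field` contains `keyword`.

    `field` is an assumption: pass whichever column holds the reasoning text.
    The match is case-insensitive and whole-word, so "waiting" is not a hit.
    """
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits: List[Tuple[int, str]] = []
    for index, row in enumerate(rows):
        text = str(row.get(field, ""))
        match = pattern.search(text)
        if match:
            # Keep some surrounding text so the reflection can be judged in context.
            start = max(0, match.start() - context)
            hits.append((index, text[start:match.end() + context]))
    return hits
```

Each returned snippet still has to be read by hand to decide whether the step being "corrected" was wrong in the first place; the filter only narrows down which rows to look at.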
