Hi, @Yifan-Song793
Thanks for releasing this dataset and code.
I have a question: how should the correct path rate be calculated from the tool execution log? For example, given this query and golden solution path:
{
"query": "give me the number of movies directed by Sofia Coppola",
"solution": [
"GET /search/person",
"GET /person/{person_id}/movie_credits"
]
},
I have the following execution log:
Query: give me the number of movies directed by Sofia Coppola
Planner: search for the person with the name "Sofia Coppola"
API response: Sofia Coppola's person_id is 4567
collect the list of movies directed by Sofia Coppola whose person_id is 4567
API response: The list of movies directed by Sofia Coppola includes The Virgin Suicides, Lost in Translation, Marie Antoinette, Somewhere, The Bling Ring, A Very Murray Christmas, and The Beguiled.
count the number of movies in the list directed by Sofia Coppola
API response: The number of movies directed by Sofia Coppola is 7.
Thought: I am finished executing a plan and have the information the user asked for or the data the user asked to create
Final Answer: Sofia Coppola has directed a total of 7 movies.
API Selector: GET /search/person to search for Sofia Coppola
API response: Sofia Coppola's person_id is 4567
Instruction: Continue. Collect the list of movies directed by Sofia Coppola whose person_id is 4567.
GET /person/4567/movie_credits to collect the list of movies directed by Sofia Coppola.
API response: The list of movies directed by Sofia Coppola includes The Virgin Suicides, Lost in Translation, Marie Antoinette, Somewhere, The Bling Ring, A Very Murray Christmas, and The Beguiled.
Instruction: Continue. Count the number of movies in the list directed by Sofia Coppola.
No API call needed.
Final Answer: Sofia Coppola has directed a total of 7 movies.
So how should I determine whether the actual execution path matches the golden solution path?
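To make the question concrete, here is a minimal sketch of one possible check. It is not taken from the repository. It assumes that a path counts as correct when the golden solution appears as an in-order (not necessarily contiguous) subsequence of the API calls found in the log, and that every call is logged as `METHOD /path`. The helper names (`extract_calls`, `template_to_regex`, `contains_golden_path`, `correct_path_rate`) are hypothetical.

```python
import re

# Assumption: every API call appears in the log as "METHOD /path".
METHODS = "GET|POST|PUT|DELETE|PATCH"
CALL_RE = re.compile(rf"\b({METHODS})\s+(/\S+)")


def extract_calls(log):
    """Return every 'METHOD /path' found in the log, in order of appearance."""
    return [f"{method} {path.rstrip('.,;')}" for method, path in CALL_RE.findall(log)]


def template_to_regex(template):
    """Turn 'GET /person/{person_id}/movie_credits' into a regex that also
    matches concrete calls such as 'GET /person/4567/movie_credits'."""
    method, path = template.split(" ", 1)
    literal_parts = re.split(r"\{[^}]+\}", path)
    pattern = "[^/?]+".join(re.escape(part) for part in literal_parts)
    # Allow an optional query string after the path.
    return re.compile(rf"{method} {pattern}(\?.*)?")


def contains_golden_path(calls, solution):
    """True if the golden steps occur in `calls` as an in-order subsequence."""
    remaining = iter(calls)  # shared iterator, so each step resumes after the last match
    return all(
        any(template_to_regex(step).fullmatch(call) for call in remaining)
        for step in solution
    )


def correct_path_rate(samples):
    """Fraction of (log, solution) pairs whose log contains the golden path."""
    hits = sum(contains_golden_path(extract_calls(log), sol) for log, sol in samples)
    return hits / len(samples) if samples else 0.0
```

Applied to the API Selector lines in the log above, `extract_calls` would yield `GET /search/person` followed by `GET /person/4567/movie_credits`, and both golden steps match in order, so this example would count as a correct path under the stated assumption. Whether extra calls, repeated calls, or a different order should also count is exactly what I am unsure about.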