
Teaching the AI assistant to call tools #586

@DavidMStraub


(NB: this is a feature request, but also the start of a discussion - I think we need some good ideas first.)

Currently, the AI assistant is not very smart: it can only retrieve individual Gramps objects and knows nothing about relationships, so you can't even ask it who your grandfather is.

To solve that, we need to teach it how to call tools/functions.

In approaching that, there are several questions to answer:

  • which functions should it call?
  • how (if at all) can we make tool calling work not only with OpenAI models, but also with open-source models for people running chat locally?

One challenge I see is that the number of possible functions is quite large:

  • retrieve a person by some filter
  • retrieve an event by some filter
  • find people with a certain relationship
  • ...

Although I haven't tried it myself yet, common lore is that LLMs only reliably pick the right function to call when the number of functions is small, probably below 10.

What I find quite promising is leveraging query languages like GQL or @dsblank's Object QL; I suspect the latter is the better choice.

What could be done is the following (a rough Python sketch follows the list):

  1. Create a large number of possible queries that are considered useful for the assistant
  2. Describe what the query does
  3. Use an LLM to generate questions based on the description of what the query does
  4. Use an embedding model to compute vector embeddings for each of the questions and store them with the query
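
To make this concrete, here is a minimal sketch of the offline indexing step, assuming sentence-transformers as the embedding library. The query catalog, its placeholder query syntax, and the generate_questions helper are purely illustrative, not actual Gramps Web API code:

```python
# Offline indexing (steps 1-4 above). Hypothetical sketch: the query
# strings and generate_questions are placeholders, not real API code.
import numpy as np
from sentence_transformers import SentenceTransformer

# 1. + 2. A hand-curated catalog of useful queries with descriptions.
QUERY_CATALOG = [
    {
        "query": "get_person(gramps_id=?)",  # placeholder query syntax
        "description": "Retrieve a single person by their Gramps ID.",
    },
    {
        "query": "find_relatives(person=?, relationship=?)",
        "description": "Find people with a given relationship to a person, "
        "e.g. the grandfather of a given person.",
    },
]


def generate_questions(description: str) -> list[str]:
    # 3. In practice, an LLM call that paraphrases the description into
    # several user questions; degenerate fallback: the description itself.
    return [description]


model = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

vectors, query_ids = [], []
for i, entry in enumerate(QUERY_CATALOG):
    for question in generate_questions(entry["description"]):
        # 4. One embedding per generated question, each pointing back
        # to the catalog entry it was generated from.
        vectors.append(model.encode(question, normalize_embeddings=True))
        query_ids.append(i)

np.savez(
    "query_index.npz",
    embeddings=np.stack(vectors),
    query_ids=np.array(query_ids),
)
```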

Now, with these embeddings at hand, when the assistant gets a question, it could (again sketched in code below the list):

  1. Calculate the embedding for the question with the same embedding model used for the query language questions
  2. Use vector similarity to identify the 5 most likely queries
  3. Feed these 5 queries as function calls to the LLM and let it decide which function to use
  4. Execute the query recommended by the LLM and feed the results back to the LLM
  5. Generate the answer
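
A companion sketch of that answering flow, assuming the query_index.npz and QUERY_CATALOG from the previous snippet and OpenAI's chat-completions tool-calling API; execute_query and the query_<id> tool naming scheme are again hypothetical. Since several generated questions map to the same query, the shortlist deduplicates by catalog entry before exposing the top 5 as tools:

```python
# Answering flow (steps 1-5 above). Assumes query_index.npz and
# QUERY_CATALOG from the previous sketch; execute_query and the
# query_<id> tool naming scheme are hypothetical.
import json

import numpy as np
from openai import OpenAI
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # must match the indexing model
data = np.load("query_index.npz")
embeddings, query_ids = data["embeddings"], data["query_ids"]


def shortlist(question: str, k: int = 5) -> list[int]:
    # 1. + 2. Embed the question and rank by cosine similarity (the stored
    # embeddings are normalized, so a dot product suffices), deduplicating
    # by catalog entry.
    q = model.encode(question, normalize_embeddings=True)
    ranked = np.argsort(embeddings @ q)[::-1]
    seen: list[int] = []
    for idx in ranked:
        qid = int(query_ids[idx])
        if qid not in seen:
            seen.append(qid)
        if len(seen) == k:
            break
    return seen


def execute_query(name: str, arguments: dict) -> dict:
    # Hypothetical: run the chosen query against the Gramps database
    # and return a JSON-serializable result.
    return {"result": f"ran {name} with {arguments}"}


def answer(question: str) -> str:
    client = OpenAI()
    # 3. Expose the top-5 candidate queries as tool definitions and let
    # the model decide which one fits the question.
    tools = [
        {
            "type": "function",
            "function": {
                "name": f"query_{qid}",  # hypothetical naming scheme
                "description": QUERY_CATALOG[qid]["description"],
                "parameters": {"type": "object", "properties": {}},
            },
        }
        for qid in shortlist(question)
    ]
    messages = [{"role": "user", "content": question}]
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools, tool_choice="required"
    )
    call = response.choices[0].message.tool_calls[0]
    # 4. Execute the query the model picked and feed the result back.
    result = execute_query(call.function.name, json.loads(call.function.arguments))
    messages.append(response.choices[0].message)
    messages.append(
        {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
    )
    # 5. A second round trip turns the raw result into the final answer.
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return final.choices[0].message.content
```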

Funnily enough, this would even be less resource-intensive than the current retrieval-based answers, since it only needs a vector index of queries that can be computed in advance, once and for all.

I don't think I'll have time to work on this myself in the next 2 months or so, but if anyone experiments with this or has other ideas, please share here!

🤖
