
# Multimodal Live API - Web console - Customized Guide by Kno2gether

**Now updated to work with the Anthropic MCP server.**

πŸŽ₯ DeepDive Video

Watch the DeepDive Session on Our YouTube Channel:

Gemini 2.0 DeepDive


πŸŽ₯ Improved JARVIS Video

Watch the Improved Gemini 2.0 with MCP Coding Deepdive Session on Our YouTube Channel:

Gemini 2.0 DeepDive



## 📺 Video Tutorials

Follow along with our detailed video tutorials on the Kno2gether YouTube Channel for step-by-step guidance and best practices.

## 🤝 Contributing

We welcome contributions! Please feel free to submit a Pull Request.

πŸ“ License

This project is licensed under the MIT License - see the LICENSE file for details.

## Overview

This repository contains a React-based starter app for using the Multimodal Live API over a WebSocket. It provides modules for streaming audio playback, recording user media (microphone, webcam, or screen capture), and a unified log view to aid in developing your application.

We have provided several example applications on other branches of this repository. To get started, create a free Gemini API key.
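Since this is a Create React App project, the key is typically supplied through an environment file read at build time. The variable name below matches the upstream console; verify it against your checkout:

```bash
# .env at the project root (variable name assumed from the upstream console)
REACT_APP_GEMINI_API_KEY='YOUR_API_KEY'
```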

Below is an example of an entire application that uses Google Search grounding and then renders graphs using vega-embed:

```typescript
import { type FunctionDeclaration, SchemaType } from "@google/generative-ai";
import { useEffect, useRef, useState } from "react";
import vegaEmbed from "vega-embed";
import { type ToolCall } from "../../multimodal-live-types";
import { useLiveAPIContext } from "../../contexts/LiveAPIContext";

export const declaration: FunctionDeclaration = {
  name: "render_altair",
  description: "Displays an altair graph in json format.",
  parameters: {
    type: SchemaType.OBJECT,
    properties: {
      json_graph: {
        type: SchemaType.STRING,
        description:
          "JSON STRING representation of the graph to render. Must be a string, not a json object",
      },
    },
    required: ["json_graph"],
  },
};

export function Altair() {
  const [jsonString, setJSONString] = useState<string>("");
  const { client, setConfig } = useLiveAPIContext();

  useEffect(() => {
    setConfig({
      model: "models/gemini-2.0-flash-exp",
      systemInstruction: {
        parts: [
          {
            text: 'You are my helpful assistant. Any time I ask you for a graph, call the "render_altair" function I have provided you. Don\'t ask for additional information, just use your best judgement.',
          },
        ],
      },
      tools: [{ googleSearch: {} }, { functionDeclarations: [declaration] }],
    });
  }, [setConfig]);

  useEffect(() => {
    const onToolCall = (toolCall: ToolCall) => {
      console.log(`got toolcall`, toolCall);
      const fc = toolCall.functionCalls.find(
        (fc) => fc.name === declaration.name
      );
      if (fc) {
        const str = (fc.args as any).json_graph;
        setJSONString(str);
      }
    };
    client.on("toolcall", onToolCall);
    return () => {
      client.off("toolcall", onToolCall);
    };
  }, [client]);

  const embedRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    if (embedRef.current && jsonString) {
      vegaEmbed(embedRef.current, JSON.parse(jsonString));
    }
  }, [embedRef, jsonString]);
  return <div className="vega-embed" ref={embedRef} />;
}
```
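One detail worth adding inside `onToolCall`: the Live API expects a response for every tool call. The upstream console example acknowledges each call with `client.sendToolResponse`, along these lines:

```typescript
// Acknowledge the tool call so the model can continue; this mirrors the
// upstream console example (the short delay lets state updates settle first).
if (toolCall.functionCalls.length) {
  setTimeout(
    () =>
      client.sendToolResponse({
        functionResponses: toolCall.functionCalls.map((fc) => ({
          response: { output: { success: true } },
          id: fc.id,
        })),
      }),
    200
  );
}
```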

## Configuration

The application requires a configuration file for MCP tools. To set up:

1. Copy the example config:
   ```bash
   cp mcp-config.example.json mcp-config.json
   ```

2. Edit the config file with your settings:
   - `mcp-config.json`: configuration for MCP tools and server settings

Note: The actual config file is gitignored to prevent committing sensitive information.
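The exact schema depends on how this repo loads the file, but MCP client configs conventionally map server names to launch commands. A purely illustrative `mcp-config.json` along those lines (server name, command, and args are placeholders):

```json
{
  "mcpServers": {
    "example-server": {
      "command": "npx",
      "args": ["-y", "@example/your-mcp-server"]
    }
  }
}
```

Check `mcp-config.example.json` for the fields this project actually expects.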

## Development

This project was bootstrapped with Create React App. The project consists of:

- an event-emitting WebSocket client to ease communication between the WebSocket and the front end (see the sketch below)
- a communication layer for processing audio in and out
- a boilerplate view for starting to build your apps and view logs
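As a sketch of how these pieces fit together: a provider component owns the WebSocket client, and anything rendered inside it can call `useLiveAPIContext()` to get `{ client, setConfig }`, as the Altair example does. The `LiveAPIProvider` name and its props below are assumptions inferred from that hook; the real provider may take additional props (such as the WebSocket URL), so check `src/contexts/LiveAPIContext` for the exact API.

```typescript
import { LiveAPIProvider } from "./contexts/LiveAPIContext"; // assumed export name
import { Altair } from "./components/altair/Altair";

// Assumed environment variable; see the .env note above.
const API_KEY = process.env.REACT_APP_GEMINI_API_KEY as string;

export default function App() {
  // Everything rendered inside the provider shares one Live API client instance.
  return (
    <LiveAPIProvider apiKey={API_KEY}>
      <Altair />
    </LiveAPIProvider>
  );
}
```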

## Available Scripts

In the project directory, you can run:

### `npm start`

Runs the app in the development mode.
Open http://localhost:3000 to view it in the browser.

The page will reload if you make edits.
You will also see any lint errors in the console.

### `npm run build`

Builds the app for production to the `build` folder.
It correctly bundles React in production mode and optimizes the build for the best performance.

The build is minified and the filenames include the hashes.
Your app is ready to be deployed!
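To sanity-check the production bundle locally, any static file server works; for example, using the `serve` package:

```bash
# Serve the production build on a local port (serve is fetched on demand)
npx serve -s build
```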

See the Create React App documentation on deployment for more information.
