MediaPipe LLM Inference task for web

Overview

This web sample demonstrates how to use the MediaPipe LLM Inference API to run common text-to-text generation tasks, such as information retrieval, email drafting, and document summarization, in the browser.

Prerequisites

  • A browser with WebGPU support (e.g., Chrome on macOS or Windows). You can verify support before loading the task, as shown below.
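
Since the demo depends on WebGPU, it can help to fail fast when the API is missing. A minimal sketch, assuming you want to surface the error before initializing the task (the message text is illustrative):

```js
// navigator.gpu is only defined in WebGPU-capable browsers.
if (!navigator.gpu) {
  throw new Error('WebGPU is not supported in this browser.');
}
```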

Running the demo

Follow these instructions to run the sample on your device:

  1. Create a folder for the task named llm_task, and copy the index.html and index.js files into it.
  2. Download Gemma 2B (TensorFlow Lite 2b-it-gpu-int4 or 2b-it-gpu-int8), or convert an external LLM (Phi-2, Falcon, or StableLM) by following the conversion guide, and place the model file in the llm_task folder. Only the GPU backend is currently supported.
  3. In your index.js file, update modelFileName with your model file's name (a sketch of the relevant code follows this list).
  4. Run python3 -m http.server 8000 from the llm_task folder to host the three files (or python -m SimpleHTTPServer 8000 for older Python versions).
  5. Open localhost:8000 in Chrome. The button on the page will be enabled once the task is ready (about 10 seconds).
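
For reference, here is a minimal sketch of what index.js does once modelFileName is set. The imports match the published @mediapipe/tasks-genai package, but the model file name and element IDs below are illustrative assumptions, not necessarily the sample's exact code:

```js
import {FilesetResolver, LlmInference} from
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai';

// Assumed example name; use the file you placed in llm_task (step 2).
const modelFileName = 'gemma-2b-it-gpu-int4.bin';

// Hypothetical element IDs; match them to your index.html.
const input = document.getElementById('input');
const output = document.getElementById('output');
const submit = document.getElementById('submit');

async function runDemo() {
  // Resolve the WASM assets for GenAI tasks, then create the
  // LLM Inference task from the locally hosted model file.
  const genai = await FilesetResolver.forGenAiTasks(
      'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm');
  const llmInference = await LlmInference.createFromOptions(genai, {
    baseOptions: {modelAssetPath: modelFileName},
  });

  // Enable the button once the task is ready (~10 seconds).
  submit.disabled = false;
  submit.onclick = () => {
    output.textContent = '';
    // Stream partial results into the page as they arrive.
    llmInference.generateResponse(input.value, (partial, done) => {
      output.textContent += partial;
    });
  };
}

runDemo();
```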
