Dialog

A modular VoIP ➞ STT ➞ AI Agent ➞ TTS ➞ VoIP implementation.

Introduction

Dialog provides a framework and a set of interfaces for building VoIP Agent applications.

Features

An easy-to-understand, extensible modular framework
Event driven architecture
Facilities for multithreaded deployments
Talk-over interruption of the agent
Conversation history
Agent-driven STT and TTS selection

NB: Dialog is still undergoing active refactoring. Prior to 1.0.0, public interfaces may change between minor versions, and commit messages will be minimal.

Installation

Development Installation

These instructions describe how to clone the Dialog repository and build the package.

Clone the repository.

git clone https://github.com/faranalytics/dialog.git

Change directory into the Dialog repository.

cd dialog

Install the package dependencies.

npm install && npm update

Build the Dialog package.

You can use the clean:build script in order to do a clean build.

npm run clean:build

Alternatively, you can use the watch script to rebuild the package each time you change a file in ./src. Because the watch script keeps running in its terminal, you will need to open a new terminal in order to build and run your application.

npm run watch

Install Dialog into your package

Change directory into your package directory and install the package.

npm install <path-to-the-dialog-repository> --save

You should now be able to import Dialog artifacts into your package.

Usage

Example applications are provided in the example subpackages.

How it works

When a call is initiated, a Controller (e.g., a Twilio or Telnyx Controller) emits an init event. The init handler is called with a VoIP instance as its single argument. The VoIP instance handles the websocket connection that is set on it by the Controller. In the init handler, an instance of a Dialog application is constructed by passing a VoIP, STT, Agent, and TTS implementation into a Dialog constructor and calling its start method. The start method of the Dialog instance connects the component interfaces that comprise the application.

An important characteristic of the architecture is that a new instance of each component of a Dialog application - a VoIP, STT, TTS, and an Agent - is created on each call; this means each instance may maintain state relevant to its respective call.

Excerpted from src/main.ts.

controller.on("init", (voip: VoIP) => {
  const stt = new DeepgramSTT({ apiKey: DEEPGRAM_API_KEY });
  const tts = new CartesiaTTS({ apiKey: CARTESIA_API_KEY });
  const agent = new OpenAIAgent({
    apiKey: OPENAI_API_KEY,
    system: OPENAI_SYSTEM_MESSAGE,
    greeting: OPENAI_GREETING_MESSAGE,
    model: OPENAI_MODEL,
  });
  const dialog = new Dialog({ voip, stt, tts, agent });
  dialog.start();
});
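
To illustrate the idea of connecting components and creating them anew on each call, here is a minimal sketch that assumes nothing about Dialog's real interfaces. The StubVoIP, StubSTT, StubAgent, and StubTTS classes, the event names, and the connect and handleCall functions are all hypothetical stand-ins built on Node's EventEmitter; the actual Dialog start method may wire its components differently.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical stand-ins for the four components; these are NOT Dialog's real classes.
class StubVoIP extends EventEmitter {
  public played: string[] = [];
  // Simulate caller audio arriving over the websocket.
  public receive(audio: string): void {
    this.emit("audio", audio);
  }
  // Simulate sending synthesized audio back to the caller.
  public play(audio: string): void {
    this.played.push(audio);
  }
}

class StubSTT extends EventEmitter {
  // "Transcribe" by upper-casing the input.
  public onAudio = (audio: string): void => {
    this.emit("transcript", audio.toUpperCase());
  };
}

class StubAgent extends EventEmitter {
  public history: string[] = []; // State that belongs to a single call.
  public onTranscript = (transcript: string): void => {
    this.history.push(transcript);
    this.emit("message", `reply ${this.history.length}: ${transcript}`);
  };
}

class StubTTS extends EventEmitter {
  public onMessage = (message: string): void => {
    this.emit("audio", `<audio:${message}>`);
  };
}

// Hypothetical counterpart of a start method: connect each stage to the next.
function connect(voip: StubVoIP, stt: StubSTT, agent: StubAgent, tts: StubTTS): void {
  voip.on("audio", stt.onAudio);
  stt.on("transcript", agent.onTranscript);
  agent.on("message", tts.onMessage);
  tts.on("audio", (audio: string) => voip.play(audio));
}

// Each call gets its own set of components, so no state is shared between calls.
function handleCall(): { voip: StubVoIP; agent: StubAgent } {
  const voip = new StubVoIP();
  const agent = new StubAgent();
  connect(voip, new StubSTT(), agent, new StubTTS());
  return { voip, agent };
}

const callA = handleCall();
const callB = handleCall();
callA.voip.receive("hello");
callA.voip.receive("goodbye");
callB.voip.receive("hi");
```

In this sketch, callA's agent history holds two entries while callB's holds one, which is the per-call isolation the paragraph above describes.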

Implementations

Dialog provides example implementations for each of the artifacts that comprise a VoIP Agent application.

VoIP

A VoIP implementation is provided that uses the Twilio API.

A VoIP implementation is provided that uses the Telnyx API.

Speech to text (STT)

An STT implementation is provided that uses the Deepgram API.

Text to speech (TTS)

A TTS implementation is provided that uses the Cartesia API.

AI agent

An Agent implementation is provided that uses the OpenAI API.

Custom Implementations

Dialog provides VoIP, STT, Agent, and TTS example implementations. You can use a provided implementation as-is, subclass it, or implement your own. If you plan to implement your own VoIP, STT, Agent, or TTS component, interfaces are provided for each component of the VoIP application.

Custom Agents

A custom Agent implementation will allow you to manage conversation history, turn of speech, agent interruption, STT and TTS selection, and other nuances.

You can extend the provided OpenAIAgent class, as in the example below, or just implement the Agent interface. The straightforward openai_agent.ts implementation can be used as a guide.

A custom `Agent` based on `openai_agent.ts`.

This custom Agent implementation adds a timestamp to each user message.

import { randomUUID } from "node:crypto";
import { log, Agent, OpenAIAgent, OpenAIAgentOptions } from "@farar/dialog";

export class CustomAgent extends OpenAIAgent implements Agent {
  // A promise chain used to process transcripts one at a time.
  protected mutex: Promise<void>;

  constructor(options: OpenAIAgentOptions) {
    super(options);
    this.mutex = Promise.resolve();
  }
  public onTranscript = (transcript: string): void => {
    this.mutex = (async () => {
      try {
        // Wait for the previously queued transcript handler to finish.
        await this.mutex;
        // Assign a fresh UUID; it is passed to dispatchStream below.
        this.uuid = randomUUID();
        log.notice(`User message: ${transcript}`);
        // Prefix the user message with an ISO 8601 timestamp.
        this.history.push({ role: "user", content: `${new Date().toISOString()}\n${transcript}` });
        this.stream = await this.openAI.chat.completions.create({
          model: "gpt-4o-mini",
          messages: this.history,
          temperature: 0,
          stream: true
        });
        await this.dispatchStream(this.uuid, this.stream);
      }
      catch (err) {
        // Log the error so that one failed turn does not break the chain.
        log.error(err);
      }
    })();
  };
}
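
The mutex property in the example above is a promise chain: each handler awaits the previous handler's promise before doing its own work, so transcripts are processed in arrival order. The following self-contained sketch isolates that pattern. The SerialHandler class, its log array, the delayMs parameter, and the idle method are hypothetical names introduced for illustration and are not part of Dialog.

```typescript
// A hypothetical class; only the promise-chaining pattern mirrors the CustomAgent example.
class SerialHandler {
  public log: string[] = [];
  protected mutex: Promise<void> = Promise.resolve();

  public onTranscript = (transcript: string, delayMs: number): void => {
    // The async function starts running before the assignment completes,
    // so `this.mutex` inside it still refers to the previous promise.
    this.mutex = (async () => {
      try {
        await this.mutex; // Wait for the previously queued handler.
        this.log.push(`start ${transcript}`);
        // Simulate asynchronous work such as a streaming API call.
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
        this.log.push(`end ${transcript}`);
      } catch (err) {
        // Swallow the error so that later handlers still run.
        console.error(err);
      }
    })();
  };

  // Resolves once every handler queued so far has finished.
  public idle(): Promise<void> {
    return this.mutex;
  }
}

const handler = new SerialHandler();
handler.onTranscript("first", 20); // Slow handler, queued first.
handler.onTranscript("second", 0); // Fast handler, must still wait its turn.
const done = handler.idle().then(() => handler.log);
// done resolves to: ["start first", "end first", "start second", "end second"]
```

Catching errors inside the chained function matters here: if a handler rejected, every later handler awaiting the chain would reject as well.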

Support

If you have a feature request or run into any issues, feel free to submit an issue or start a discussion. You’re also welcome to reach out directly to one of the authors.

Adam Patterson
