This is a serverless Telegram bot that transcribes voice, audio, and video notes sent to it using the Groq Whisper API. It also stores and retrieves transcriptions using AWS DynamoDB.
- The bot receives a voice, audio, or video note message from a user.
- It downloads the file from Telegram and checks its duration. If the duration is above a specified limit, it sends a warning message to the user and exits.
- The bot checks if the transcription already exists in AWS DynamoDB. If it does, the saved transcription is sent back to the user.
- If no transcription is found, the bot transcribes the audio using the Groq Whisper API.
- The transcription is sent back to the user as a text message and stored in DynamoDB for future reference.
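The flow above can be sketched with standard-library types only. This is a minimal illustration, not the bot's actual code: `Store` is a hypothetical in-memory stand-in for the DynamoDB table, `MAX_DURATION_SECS` is an assumed limit, keying by a file identifier is an assumption, and the `transcribe` closure stands in for the call to the Groq Whisper API.

```rust
use std::collections::HashMap;

/// Assumed duration cap in seconds; the real limit may differ.
const MAX_DURATION_SECS: u32 = 30 * 60;

/// Outcome of handling one voice, audio, or video note.
#[derive(Debug, PartialEq)]
enum Reply {
    /// The file exceeded the duration limit; only a warning is sent.
    TooLong,
    /// A stored transcription was found and reused.
    Cached(String),
    /// The file was transcribed now and the result was stored.
    Fresh(String),
}

/// Hypothetical in-memory stand-in for the DynamoDB table.
struct Store {
    items: HashMap<String, String>,
}

impl Store {
    fn new() -> Self {
        Store { items: HashMap::new() }
    }
}

/// Applies the steps in order: duration check, cache lookup, transcription, store.
fn handle_media<F>(store: &mut Store, file_id: &str, duration_secs: u32, transcribe: F) -> Reply
where
    F: FnOnce() -> String,
{
    if duration_secs > MAX_DURATION_SECS {
        return Reply::TooLong;
    }
    if let Some(text) = store.items.get(file_id) {
        return Reply::Cached(text.clone());
    }
    // Only reached on a cache miss, so the API is called at most once per file.
    let text = transcribe();
    store.items.insert(file_id.to_string(), text.clone());
    Reply::Fresh(text)
}
```

Passing the transcription step as a closure keeps the expensive call lazy: it runs only when the duration check passes and no stored transcription exists.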
- /start: Initializes the bot and provides a welcome message
- /help: Provides information on how to use the bot and its features
- /transcribe: Transcribes the voice, audio, or video note in the reply message
- /translate: Translates (into English) the voice, audio, or video note in the reply message
- /summarize: Summarizes the voice, audio, or video note in the reply message
- /caveman: Transcribes the voice, audio, or video note in the reply message in a "caveman" style
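As an illustration of how these commands might be recognized, here is a hand-rolled parser using only the standard library. It is a sketch, not the bot's implementation: teloxide ships its own command-parsing support, and the `Command` enum and `parse_command` function below are hypothetical names. The sketch assumes commands may carry an `@botname` suffix, as Telegram allows in group chats.

```rust
/// User-facing commands listed above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Command {
    Start,
    Help,
    Transcribe,
    Translate,
    Summarize,
    Caveman,
}

/// Parses the first token of a message into a command, if it is one.
fn parse_command(text: &str) -> Option<Command> {
    // Only the first whitespace-delimited token can be the command.
    let token = text.split_whitespace().next()?;
    // Commands must start with a slash.
    let name = token.strip_prefix('/')?;
    // Drop an optional "@botname" suffix.
    let name = name.split('@').next().unwrap_or(name);
    match name.to_ascii_lowercase().as_str() {
        "start" => Some(Command::Start),
        "help" => Some(Command::Help),
        "transcribe" => Some(Command::Transcribe),
        "translate" => Some(Command::Translate),
        "summarize" => Some(Command::Summarize),
        "caveman" => Some(Command::Caveman),
        _ => None,
    }
}
```

Returning `None` for anything unrecognized lets the caller fall through to ordinary message handling.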
- /check: Reports the Telegram webhook queue status (pending updates, last error)
- /reset: Resets the webhook with drop_pending_updates=true
These commands are only available to the developer whose Telegram user ID is set via the DEV_TELEGRAM_ID environment variable. If DEV_TELEGRAM_ID is not set, the bot ignores these commands.
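A minimal sketch of that gate, assuming the variable holds a numeric user ID. The function name `is_developer` is hypothetical, and treating an unparsable value the same as an unset one is an assumption rather than documented behavior.

```rust
/// Returns true only when DEV_TELEGRAM_ID is set, parses as a number,
/// and matches the sender of the message.
fn is_developer(dev_id_env: Option<&str>, sender_id: u64) -> bool {
    match dev_id_env.and_then(|raw| raw.trim().parse::<u64>().ok()) {
        Some(dev_id) => dev_id == sender_id,
        // Unset or malformed: developer commands are ignored for everyone.
        None => false,
    }
}
```

In a handler this could be called as `is_developer(std::env::var("DEV_TELEGRAM_ID").ok().as_deref(), sender_id)`.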
Why this exists: when the Lambda handler returns a non-200 response, Telegram will retry the same webhook delivery. This can create a resend loop and queue backlog. The developer commands allow monitoring the queue and forcefully resetting the webhook with drop_pending_updates=true to clear the backlog.
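One common way to keep such a loop from starting is to log handler failures and still answer the webhook with 200. The sketch below shows that idea only; it is an assumption about a reasonable mitigation, not a description of what this bot does, and `webhook_status` is a hypothetical helper.

```rust
use std::fmt::Display;

/// Maps the outcome of processing an update to the HTTP status returned to Telegram.
/// Errors are logged but still acknowledged, so the same update is not redelivered.
fn webhook_status<E: Display>(result: Result<(), E>) -> u16 {
    if let Err(err) = result {
        // On AWS Lambda, stderr output typically ends up in CloudWatch Logs.
        eprintln!("update handling failed: {}", err);
    }
    200
}
```

The trade-off is that a failed update is dropped rather than retried, which is usually acceptable for a chat bot where the user can simply resend the message.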
Maybe in the future, I'll make a fork of Teloxide that would be specially made to be run on serverless functions.
- The bot is built using the teloxide crate for interacting with the Telegram API.
- The transcription is done using the reqwest crate to send a request to the Groq Whisper API.
- The file is first converted to a 16 kHz mono FLAC file using FFmpeg before being sent to the Groq Whisper API.
- The bot uses AWS DynamoDB to store and retrieve transcriptions, ensuring that repeated requests for the same audio do not require retranscription.
- The bot is deployed as a serverless function using AWS Lambda.
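The conversion step can be sketched with `std::process::Command`. The exact flags the bot passes are not documented here, so the ones below are an assumption: `-ar 16000` sets the sample rate, `-ac 1` downmixes to mono, `-c:a flac` selects the FLAC encoder, and `-y` overwrites an existing output file. The function name and paths are hypothetical.

```rust
use std::path::Path;
use std::process::Command;

/// Builds an FFmpeg invocation that converts `input` to 16 kHz mono FLAC at `output`.
/// The command is returned unexecuted so the caller decides how to run it.
fn ffmpeg_flac_command(ffmpeg_bin: &Path, input: &Path, output: &Path) -> Command {
    let mut cmd = Command::new(ffmpeg_bin);
    cmd.arg("-y") // overwrite the output file if it already exists
        .arg("-i")
        .arg(input)
        .arg("-ar") // audio sample rate in Hz
        .arg("16000")
        .arg("-ac") // number of audio channels
        .arg("1")
        .arg("-c:a") // audio codec
        .arg("flac")
        .arg(output);
    cmd
}
```

Calling `.status()` on the returned command runs FFmpeg and yields its exit status, which should be checked before the output file is uploaded.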
- TELEGRAM_BOT_TOKEN: the token for the Telegram bot.
- GROQ_API_KEY: the API key for the Groq Whisper API.
- DYNAMODB_TABLE: the name of the DynamoDB table where transcriptions are stored.
- DEV_TELEGRAM_ID: the Telegram user ID allowed to run developer commands (/check, /reset). If unset, the commands are ignored.
- LAMBDA_URL: the URL of the Lambda function. This is used to set the webhook in the Telegram API when resetting it.
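A possible way to load these variables at startup is shown below. The `Config` struct and its methods are hypothetical. Treating the first three variables as required and LAMBDA_URL as optional is an assumption; only DEV_TELEGRAM_ID is documented above as optional.

```rust
/// Runtime configuration read from the environment.
#[derive(Debug, PartialEq)]
struct Config {
    telegram_bot_token: String,
    groq_api_key: String,
    dynamodb_table: String,
    dev_telegram_id: Option<u64>,
    lambda_url: Option<String>,
}

impl Config {
    /// Builds the configuration from any name-to-value lookup, which keeps it testable.
    fn from_lookup<F>(lookup: F) -> Result<Self, String>
    where
        F: Fn(&str) -> Option<String>,
    {
        // Required variables must be present and non-empty.
        let required = |name: &str| {
            lookup(name)
                .filter(|value| !value.is_empty())
                .ok_or_else(|| format!("missing environment variable {}", name))
        };
        Ok(Config {
            telegram_bot_token: required("TELEGRAM_BOT_TOKEN")?,
            groq_api_key: required("GROQ_API_KEY")?,
            dynamodb_table: required("DYNAMODB_TABLE")?,
            // A malformed ID is treated like an unset one (assumption).
            dev_telegram_id: lookup("DEV_TELEGRAM_ID").and_then(|v| v.trim().parse().ok()),
            lambda_url: lookup("LAMBDA_URL"),
        })
    }

    /// Reads the configuration from the process environment.
    fn from_env() -> Result<Self, String> {
        Self::from_lookup(|name| std::env::var(name).ok())
    }
}
```

Failing fast on a missing required variable surfaces misconfiguration at cold start instead of on the first user message.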
Before deploying this bot, ensure you have the following prerequisites installed:
- AWS CLI: Follow the instructions here to install the AWS Command Line Interface.
- cargo-lambda: Install cargo-lambda by following the instructions here. Do not install it with cargo, because a cargo-installed build doesn't support cross compilation.
To build:
cargo lambda build --release --arm64
To deploy:
cargo lambda deploy
- Download FFmpeg Static Build: Download the ARM64 static build of FFmpeg for Amazon Linux 2023:
  wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-arm64-static.tar.xz
  tar -xf ffmpeg-release-arm64-static.tar.xz
  mv ffmpeg-*-arm64-static/ffmpeg ./ffmpeg
  chmod +x ffmpeg
- Create the Layer Directory: Create a directory structure for the Lambda layer:
  mkdir -p ffmpeg-layer/bin
  mv ffmpeg ffmpeg-layer/bin/
- Zip the Layer: Compress the ffmpeg-layer directory:
  zip -r ffmpeg-layer.zip ffmpeg-layer
- Publish the Layer: Publish the layer to AWS Lambda using the AWS CLI:
  aws lambda publish-layer-version \
    --layer-name ffmpeg-arm64 \
    --description "FFmpeg static binary for ARM64 on Amazon Linux 2023" \
    --zip-file fileb://ffmpeg-layer.zip \
    --compatible-runtimes provided.al2023 \
    --compatible-architectures arm64
- Attach the Layer to Your Lambda Function: Update the Cargo.toml file to include the ARN of the FFmpeg Lambda layer.
Ensure that your AWS Lambda function has the necessary permissions to access DynamoDB. You will need to attach a policy that grants the Lambda function read and write permissions to the DynamoDB table. Note that the AWSLambdaDynamoDBExecutionRole managed policy covers reading DynamoDB Streams and writing CloudWatch logs, not reading and writing table items, so the reliable option is a custom policy that allows actions such as dynamodb:GetItem and dynamodb:PutItem on the table.
Do literally whatever you want with this code. I don't care.
Contributions are welcome! If you'd like to help improve this bot, please open a pull request with your changes.