
Overseer is a lightweight command-line tool for monitoring GPU memory usage and receiving real-time alerts via Telegram. Designed for researchers, developers, and system admins working with shared or remote GPU servers, Overseer notifies you when GPUs become available or when processes start or stop. With a simple setup and minimal overhead, it helps you make the most of your computing resources without constantly checking nvidia-smi.
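To make the idea of "checking nvidia-smi for you" concrete, here is a minimal Python sketch of polling GPU memory and picking out idle GPUs. It is an illustration only and not Overseer's actual implementation: the function names (`parse_gpu_memory`, `free_gpus`, `query_gpus`) and the 500 MiB threshold are assumptions made for this example. The `nvidia-smi` query flags used are standard ones.

```python
import subprocess

# Standard nvidia-smi query; with "nounits" the memory values are plain MiB numbers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse lines such as '0, 1024, 24576' into a list of dicts (values in MiB)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, used, total = (field.strip() for field in line.split(","))
        gpus.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return gpus


def free_gpus(gpus, max_used_mib=500):
    """Return indices of GPUs using at most max_used_mib (hypothetical threshold)."""
    return [gpu["index"] for gpu in gpus if gpu["used_mib"] <= max_used_mib]


def query_gpus():
    """Run nvidia-smi once and return the parsed memory readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

On a machine with an NVIDIA driver installed, `free_gpus(query_gpus())` would list the GPUs that currently look idle; a monitor could call it in a loop and send a notification whenever the list changes.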
- Set up your bot following the instructions here and get its API token.
- Start a dialog with your bot from the account that should receive notifications. IMPORTANT: you need to send the bot at least one message (this is a limitation of the Telegram Bot API).
- Install overseer:
pip install gpu-overseer
- Start monitoring your GPUs:
TELEGRAM_API_TOKEN=<your API token> TELEGRAM_API_URL=<relevant bot API URL> overseer monitor
You should immediately get a notification with the current GPU utilization status.
- If you do not get any notifications, try:
overseer notify <any_message>
This will send your message to all chat ids known to the bot. If you do not get this message either, try messaging your bot again, just as you did during setup.
- If you move your notifier to a new machine or otherwise change its location, it may forget all chat ids, and you will have to message it again.
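The requirement to message the bot first follows from how the Telegram Bot API works: a bot cannot start a conversation, so it only learns a chat id once that chat has sent it something. The Python sketch below shows one way a notifier could collect chat ids and broadcast a message using the public `getUpdates` and `sendMessage` methods. It is not Overseer's code. The helper names (`extract_chat_ids`, `call_api`, `notify_all`) are hypothetical, and `API_BASE` is assumed to be the public Telegram endpoint; how Overseer itself uses `TELEGRAM_API_URL` is not specified here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Telegram Bot API endpoint.
API_BASE = "https://api.telegram.org"


def extract_chat_ids(updates):
    """Collect the unique chat ids found in a decoded getUpdates response."""
    chat_ids = set()
    for update in updates.get("result", []):
        message = update.get("message") or {}
        chat = message.get("chat") or {}
        if "id" in chat:
            chat_ids.add(chat["id"])
    return sorted(chat_ids)


def call_api(token, method, params=None):
    """POST a Bot API method and return the decoded JSON reply."""
    url = f"{API_BASE}/bot{token}/{method}"
    data = urllib.parse.urlencode(params or {}).encode()
    with urllib.request.urlopen(url, data=data, timeout=10) as response:
        return json.load(response)


def notify_all(token, text):
    """Send text to every chat that has recently messaged the bot."""
    chat_ids = extract_chat_ids(call_api(token, "getUpdates"))
    for chat_id in chat_ids:
        call_api(token, "sendMessage", {"chat_id": chat_id, "text": text})
    return chat_ids
```

Because `getUpdates` only returns recent, unconfirmed updates, a notifier built this way would have to store the chat ids it has seen. If that stored state is lost, for example after a move to a new machine, the ids are gone until each user messages the bot again, which would match the behaviour described in the last step.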