Grounded tracking of anything in real time using natural-language queries, built on Grounding DINO, Grounding DINO 1.5, and Ollama, which provides easy access to a variety of LLMs.
Make sure you have Anaconda installed. If not, you can download it from the Anaconda website.
Prepare your environment:
- Create the conda environment with Python 3.10 and activate it:

conda create -n dino-llama python=3.10 -y
conda activate dino-llama
Install PyTorch by following the official installation instructions for your specific system.
Install the remaining dependencies from the repository:

git clone https://github.com/Jshulgach/Grounded-SAM-2-Stream.git
cd Grounded-SAM-2-Stream
pip install -e .
Prepare the LLM model:
- Install Ollama for your OS (Linux, macOS, or Windows) by following the instructions in the Ollama README.
- Create the LLM model with the customized prompt:

ollama create dino_llama -f <path/to/repo>/llm/Modelfile
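The Modelfile shipped in `llm/` defines the base model and the customized prompt; use that file, not this one. As a rough, hypothetical sketch of how an Ollama Modelfile of this kind is structured (the base model, parameter value, and prompt text below are illustrative assumptions, not the repository's actual contents):

```
# Illustrative only -- the real Modelfile lives in the repo's llm/ directory.
FROM llama3                    # assumed base model
PARAMETER temperature 0.2      # assumed sampling setting
SYSTEM """Rewrite the user's description of an object as a short
noun-phrase prompt suitable for Grounding DINO."""
```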
Optionally, set up CUDA for GPU usage with Grounding DINO. The CUDA Toolkit can be downloaded from the NVIDIA website.
A demo is included that handles a single video directory (or a device number for a webcam). The DINOStream application creates a server and listens for messages from a client:
python dino_stream.py 0
Open a separate terminal and send a message to the server. The default server address is `localhost` and the default port is `15555`; both can be changed in the `dino_stream.py` file:
import socket

# Connect to the DINOStream server (default address and port)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('localhost', 15555))
# Send a natural-language query describing the object to track
s.send(b'I am looking for something blue that squirts water')
s.close()
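The snippet above only sends a query. Below is a self-contained round-trip sketch: it spins up a stand-in echo server on a free port so it runs without `dino_stream.py`, and whether (and in what format) the real server replies is an assumption, not something the repository documents here.

```python
import socket
import threading

HOST = "localhost"  # default server address used by dino_stream.py

def send_query(text, host, port):
    """Send a natural-language query and return the server's reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(text.encode())
        return s.recv(1024).decode()

# Stand-in server so this sketch is runnable on its own; the real server
# listens on port 15555 and is started with `python dino_stream.py 0`.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind((HOST, 0))            # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

def _echo_once():
    conn, _ = srv.accept()
    query = conn.recv(1024)
    conn.sendall(b"received: " + query)  # reply format is an assumption
    conn.close()

threading.Thread(target=_echo_once, daemon=True).start()
reply = send_query("I am looking for something blue that squirts water", HOST, port)
print(reply)
srv.close()
```

When talking to the real server, drop the stand-in server block and call `send_query(text, "localhost", 15555)` directly.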
To run the multi-camera streaming demo, use the following command, which opens multiple webcams (assuming more than one is connected):
python multi_stream.py