newocean-group/T-Rex2

This repository contains an unofficial implementation of T-Rex2. Currently, only the visual encoder has been implemented.

Deepwiki docs: https://deepwiki.com/newocean-group/T-Rex2.

📖 Model Architecture:

[Figure: model architecture]

📃 Datasets used for training the model:

  • Objects365
  • Open Images V7
  • CrowdHuman
  • HierText
  • LVIS

Because of hardware limitations, I train the model without text prompts and with a batch size of 1, using the following training process:

🖼️ Visual Results:

Note: This model has been trained for approximately 2.7M steps (batch size = 1) and is still being trained.

⚙️ Installation

To use the model, follow these steps:

  1. Clone the repository:

    git clone https://github.com/newocean-group/T-Rex2.git
  2. Download and install the CUDA toolkit:

    # Make sure you have the correct version installed. For example, I installed CUDA 11.8.

  3. Compile the CUDA operators:

    cd ops
    python setup.py install
  4. Install other dependencies:

    pip install -r requirements.txt
  5. Log in to your HuggingFace account on your device so that the model weights can be downloaded automatically:

    huggingface-cli login
    # Enter your token when prompted
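
Before compiling the CUDA operators in step 3, it can help to confirm which CUDA toolkit release is actually on your PATH. The following is a minimal sketch using only the Python standard library; the helper names `parse_nvcc_release` and `installed_cuda_release` are hypothetical and are not part of this repository. It assumes the usual `nvcc --version` output, which contains a fragment such as `release 11.8`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_output):
    """Return the CUDA release (e.g. "11.8") found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", version_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Return the CUDA toolkit release reported by nvcc, or None if nvcc is not on PATH."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"], capture_output=True, text=True, check=False
    )
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found on PATH; install the CUDA toolkit before building ops/")
    else:
        print(f"Found CUDA toolkit release {release}")
```

If the script reports that `nvcc` is missing, the `python setup.py install` step in `ops` is unlikely to succeed until the toolkit is installed and on your PATH.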

🔍 Demo

The repository includes a .ipynb notebook that shows how to use the model.

A second .ipynb notebook illustrates the process of learning class embeddings for the model.

Note: You may need to adjust the detection threshold to get the best results.
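
To make the threshold note concrete, here is a minimal sketch of how a score threshold trades off the number of detections kept. It is illustrative only: `filter_by_threshold`, the box format, and the sample scores are assumptions for this example and do not reflect the notebook's actual API or outputs.

```python
def filter_by_threshold(boxes, scores, threshold=0.3):
    """Keep only the (box, score) pairs whose score is at least `threshold`."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    return [(box, score) for box, score in zip(boxes, scores) if score >= threshold]


# Hypothetical detections: (x1, y1, x2, y2) boxes with confidence scores.
boxes = [(10, 10, 50, 50), (20, 30, 80, 90), (5, 5, 15, 15)]
scores = [0.92, 0.41, 0.12]

# Sweep a few thresholds to see how many detections survive each one.
for threshold in (0.1, 0.3, 0.5):
    kept = filter_by_threshold(boxes, scores, threshold)
    print(f"threshold={threshold}: {len(kept)} detections kept")
```

A lower threshold keeps more candidate boxes (including more false positives), while a higher one keeps fewer but more confident boxes, so a small sweep like this on your own images is a reasonable way to pick a value.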

💡 Conclusion

This model has been implemented based on my current knowledge and can be further improved with future research.

Additionally, the model can be modified for instance segmentation based on the approach described in this paper. The modified model architecture would resemble the following:

[Figure: modified model architecture for instance segmentation]

About

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
