Monocular Deep Learning Multimodal with Object Relevance Estimation for Real-Time Navigation of Visually Impaired Individuals (MMOR)
A real-time deep learning assistant for visually impaired people. The architecture fuses panoptic segmentation (Panoptic FPN) with monocular depth estimation (MiDaS). From video captured on a mobile device, the system generates spoken descriptions of the user's environment to facilitate navigation, applying a heuristic algorithm that adapts predictions to the user's environment expectations. The model has been tested with members of Asociación Cultural y Recreativa para la Proyección del Invidente Puebla, A.C. (ACRIP) and proved effective in a user-experience analysis. For more details, see our Article or Presentation.
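The pipeline can be pictured as: segment the frame, estimate per-pixel depth, then rank segments by how close and how large they appear before speaking them. Below is a minimal sketch of that fusion, assuming Detectron2's COCO Panoptic FPN and the torch.hub MiDaS small model; the closeness × area relevance score and the `describe_frame` helper are illustrative stand-ins, not this repo's actual heuristic.

```python
# Sketch of Panoptic FPN + MiDaS fusion (illustrative, not the repo's code).
import cv2
import numpy as np
import torch
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor

device = "cuda" if torch.cuda.is_available() else "cpu"

# Panoptic segmentation: Panoptic FPN from the Detectron2 model zoo.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-PanopticSegmentation/panoptic_fpn_R_50_3x.yaml")
cfg.MODEL.DEVICE = device
predictor = DefaultPredictor(cfg)
metadata = MetadataCatalog.get(cfg.DATASETS.TRAIN[0])

# Monocular depth: MiDaS small model via torch.hub.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small").to(device).eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

def describe_frame(frame_bgr):
    """Fuse segmentation and depth; return class names ranked by relevance."""
    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

    # Per-pixel segment ids plus per-segment class info.
    panoptic_seg, segments_info = predictor(frame_bgr)["panoptic_seg"]

    # Relative inverse depth, resized to the frame resolution.
    with torch.no_grad():
        pred = midas(transform(rgb).to(device))
        depth = torch.nn.functional.interpolate(
            pred.unsqueeze(1), size=rgb.shape[:2],
            mode="bicubic", align_corners=False).squeeze()
    depth = depth.cpu().numpy()
    seg_ids = panoptic_seg.cpu().numpy()

    scored = []
    for seg in segments_info:
        mask = seg_ids == seg["id"]
        closeness = float(np.median(depth[mask]))  # MiDaS: larger = closer
        area = mask.mean()                         # fraction of the frame
        names = (metadata.thing_classes if seg["isthing"]
                 else metadata.stuff_classes)
        scored.append((closeness * area, names[seg["category_id"]]))
    # Highest score first: large, nearby segments get announced first.
    return [name for score, name in sorted(scored, reverse=True)]
```

The ranked names could then be passed to any text-to-speech engine (for example, pyttsx3's `say`) to produce the spoken description mentioned above.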
- Python >= 3.7
- Run with a GPU accelerator
- DroidCam >= 6.5.2
- Download the DroidCam Client on Windows, Mac, or Linux (https://www.dev47apps.com/droidcam/linux/).
- Download the DroidCam - Webcam app on your smartphone.
- Connect the DroidCam Client on your computer to your smartphone: connect both devices to the same WiFi > copy the Device IP and DroidCam Port from the smartphone to the laptop > Start (see the Python sketch after this list to verify the stream).
- Run the Colab Notebook to learn about basic usage.
- For more information, check the documentation.
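To sanity-check the connection step above from Python, here is a minimal sketch using OpenCV. It is not part of this repo's code: the IP and port are placeholders you replace with the values shown in the app, and `/video` is DroidCam's standard MJPEG stream endpoint.

```python
# Verify the DroidCam stream is reachable (illustrative sketch).
import cv2

DEVICE_IP = "192.168.0.42"  # replace with the Device IP shown in the app
DROIDCAM_PORT = 4747        # DroidCam's default port

cap = cv2.VideoCapture(f"http://{DEVICE_IP}:{DROIDCAM_PORT}/video")
if not cap.isOpened():
    raise RuntimeError("Could not open DroidCam stream; check IP, port, and WiFi.")

while True:
    ok, frame = cap.read()  # one BGR frame from the phone camera
    if not ok:
        break
    cv2.imshow("DroidCam", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
        break

cap.release()
cv2.destroyAllWindows()
```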
- Enrique García enriquegv001@gmail.com
- Rafael Espinosa rafael.espinosa.castaneda@tec.mx