What-can-A-I-see 👀

An AI tool to assist visually impaired people. It takes a voice prompt and an image, and generates an audio description of the image that takes the user's prompt into account.

Our project aims to provide an assistive technology tool that supports visually impaired people by describing the scene they are in. It integrates image description, speech-to-text, and voice synthesis models. Given an image and a voice prompt, the tool generates a description that takes the user's prompt into account and outputs it as audio. The goal is to use deep learning techniques to improve daily quality of life and increase the autonomy of sight-impaired people.

At the moment our tool works as follows:

1. The webcam starts, and pressing the SPACE bar takes a picture.
2. You then speak your prompt, which contains your request for the description.
3. The tool generates a description of the image based on that prompt.
4. The description is played back as audio.
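
The four steps above can be sketched as a small pipeline. This is only an illustrative outline, not the repository's actual code: the `Pipeline` class and its four callables (`capture_image`, `transcribe_prompt`, `describe`, `speak`) are hypothetical names chosen for this example, and the real modules may expose different functions. Each stage is passed in as a plain function so the flow can be tried without a webcam, microphone, or model.

```python
from dataclasses import dataclass
from typing import Callable

# All names here are hypothetical placeholders, not the repository's real API.
Image = bytes  # stand-in for whatever image type the webcam step returns


@dataclass
class Pipeline:
    capture_image: Callable[[], Image]      # e.g. a webcam frame taken on SPACE
    transcribe_prompt: Callable[[], str]    # speech-to-text on the spoken request
    describe: Callable[[Image, str], str]   # description conditioned on the prompt
    speak: Callable[[str], None]            # text-to-speech output

    def run(self) -> str:
        """Run the four steps in order and return the generated description."""
        image = self.capture_image()
        prompt = self.transcribe_prompt()
        description = self.describe(image, prompt)
        self.speak(description)
        return description


if __name__ == "__main__":
    # Stub stages so the sketch runs without any hardware or models.
    demo = Pipeline(
        capture_image=lambda: b"fake-image-bytes",
        transcribe_prompt=lambda: "what is in front of me?",
        describe=lambda img, prompt: f"Description for '{prompt}' ({len(img)} bytes)",
        speak=print,
    )
    demo.run()
```

Keeping the stages as injected functions is one possible design; it would let the webcam, speech-to-text, description, and text-to-speech parts be swapped or tested independently.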

Repository files:

Image description model.py
README.md
demo_tk.py
description_model.py
image_from_webcam.py
image_from_webcam_V2.py
main.py
requirements.txt
speech_to_text.py
text_to_speech.py
web_app.py
web_app_audioactiv.py

Santilopezc/What-can-A-I-see
