Devpost link: https://devpost.com/software/holoflash
Servers have been shut down because the hackathon is over and we can't pay for the operating costs.
Our inspiration came from the recent push toward augmented and mixed reality, coupled with our desire as college students to make studying more efficient and enjoyable. We were also inspired by a paper from the University of Delaware's own HCI lab that highlights the beneficial impact AI-based virtual assistants can have on learning (arXiv:2306.17278).
Our application empowers students by using Google's Gemini AI to swiftly convert audio files, PDFs, and images into flashcards on the input's subject matter. Users can also review their flashcards in augmented reality using the Microsoft HoloLens, hence the name of our project, "Holo-Flash". This gives students a new, interactive way to study that hasn't been explored much in the past. With this method, students may be more willing to study because it makes the experience simple, fun, and engaging.
We started by creating a Python Flask server, deployed on a Debian virtual machine, to keep the frontend and backend codebases cleanly isolated. On this virtual machine we set up the Google Cloud Vision API to extract text from images, the Google Cloud Speech-to-Text API to transcribe audio files, and Google Gemini to generate flashcards from the extracted text. We also used MongoDB Atlas to store the flashcards and user data. Within the Flask server, we built a RESTful API so that our frontend could interface with these models as well as our MongoDB Atlas database.
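As a rough illustration of the backend flow described above, here is a minimal sketch of how an uploaded file could be routed to the right text extractor before flashcard generation. It is not the hackathon code: the function names, the extension table, and the PDF extractor are all hypothetical, and the extractor bodies are placeholders standing in for the real Cloud Vision and Speech-to-Text calls.

```python
from pathlib import Path
from typing import Callable, Dict


# Hypothetical stand-ins for the real service calls. Each takes the raw
# bytes of an upload and returns plain text. The bodies are placeholders.
def extract_text_from_image(data: bytes) -> str:
    return "<text recognized in image>"


def transcribe_audio(data: bytes) -> str:
    return "<transcript of audio>"


def extract_text_from_pdf(data: bytes) -> str:
    return "<text extracted from PDF>"


# Assumed mapping from file extension to extractor.
EXTRACTORS: Dict[str, Callable[[bytes], str]] = {
    ".png": extract_text_from_image,
    ".jpg": extract_text_from_image,
    ".jpeg": extract_text_from_image,
    ".wav": transcribe_audio,
    ".mp3": transcribe_audio,
    ".pdf": extract_text_from_pdf,
}


def extract_text(filename: str, data: bytes) -> str:
    """Pick an extractor from the file extension and run it on the upload."""
    suffix = Path(filename).suffix.lower()
    extractor = EXTRACTORS.get(suffix)
    if extractor is None:
        raise ValueError(f"unsupported file type: {filename!r}")
    return extractor(data)
```

In a design like this, a Flask route would only need to read the uploaded file, call `extract_text`, and pass the result on to the flashcard generator.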
For the frontend, we created a React app that allows users to upload files to be fed to the models, and subsequently receive their flashcards. We also created a user authentication system, which bridged the gap between our server and React app. Since our app is designed to be an interactive medium for learning, we decided to integrate the Microsoft HoloLens 2 into our project. Using Unity, we created a 3D environment in which our users could use their login credentials to access their flashcards in augmented reality.
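The writeup does not say how credentials were stored, so the following is only a sketch of one common approach that would let both the React app and the HoloLens client log in against the same backend: salted password hashing with PBKDF2 from the Python standard library. The function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); generate a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The server would store the salt and digest with the user record at sign-up and call `verify_password` at login, regardless of which client sent the credentials.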
- AI Integration
  - Google Gemini
    - Learning how to use the Google Gemini API
    - Deploying the Google Cloud models on a virtual machine, using a C3D (AMD Genoa) machine to decrease model runtime and make the user experience more enjoyable
- Backend Development
  - MongoDB Atlas
    - We had to learn how to use MongoDB Atlas to store the flashcards and user data in a secure and efficient manner.
    - Manipulating the outputs of the models to fit into our database schema was difficult, to say the least.
- Frontend Development
  - Unity and Augmented Reality
    - Implementing backend communication in an augmented reality program running on a game engine
  - React integration
    - Creating a user authentication system
    - Routing webpages and storing components in an organized manner
- Time
  - We had to learn how to utilize these technologies in harmony in only 24 hours, and none of us slept lol
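To make the "model output to database schema" challenge above concrete, here is a minimal sketch of the kind of cleanup involved. It assumes the model was asked to reply with a JSON array of question/answer objects, possibly wrapped in a Markdown code fence. The field names (`user_id`, `question`, `answer`) are a hypothetical schema, not the one we actually used.

```python
import json
import re
from typing import Dict, List

FENCE = "`" * 3  # a Markdown code fence, built this way to keep this block readable
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_flashcards(model_output: str, user_id: str) -> List[Dict[str, str]]:
    """Turn raw model text into a list of documents ready for insertion."""
    # Unwrap the JSON if the model put it inside a code fence.
    match = FENCED_JSON.search(model_output)
    payload = match.group(1) if match else model_output

    cards = json.loads(payload)
    if not isinstance(cards, list):
        raise ValueError("expected a JSON array of flashcards")

    documents = []
    for card in cards:
        if not isinstance(card, dict):
            continue  # skip anything that is not an object
        question = str(card.get("question", "")).strip()
        answer = str(card.get("answer", "")).strip()
        # Drop incomplete cards rather than storing empty fields.
        if question and answer:
            documents.append(
                {"user_id": user_id, "question": question, "answer": answer}
            )
    return documents
```

The resulting list of dictionaries is in the shape a document database expects, so it could be handed to a bulk insert in one call.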
We built a full-stack application using technologies that we discovered on the day of the hackathon. We were full-stack engineers, and that is enough to be proud of.
Major Learnings
- Full Stack Development
  - MongoDB Atlas
  - Google Cloud Platform
  - Python Flask
- AI Integration
  - Google Gemini
  - ChatGPT API
  - Google Cloud Speech-to-Text API
  - Google Cloud Vision API
- Frontend Development
  - React
  - Node.js
  - Unity
  - CSS
- Augmented Reality
  - Microsoft HoloLens
  - Unity
  - C#
- Version Control
  - Git
  - GitHub
- Teamwork
- Communication
- Collaboration
- Task Delegation
- Time Management
- Determination
Our next step would be to bring support to virtual reality, mainly on platforms like the Meta Quest 3 and Meta Quest 2.