This repository contains the AI Ambassadors Project of the Microsoft Learn Student Ambassadors: a DeepFake image detection model. The project is divided into two parts, model development and a web application:
- Inspired by "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Dosovitskiy et al. (2020). The Vision Transformer model from that paper is used for transfer learning on the DeepFake Detection Dataset.
- A subset of the Open Forensics Dataset (Deepfake Dataset) consisting of approx. 1.9 million images is used for training the model. The dataset is divided into 3 sets: Training, Validation, and Testing, in a 14:4:1 ratio. Transfer learning on this data adapts the pretrained model to detecting DeepFake images.
- The trained model is then deployed to the Hugging Face Model Hub for public use. The model is available at: DeepFake Image Detection Model
- Achieved an accuracy of 97% on the Testing Set.
- The model was trained in Azure ML Studio notebooks on a compute VM with 16 cores and 128 GB RAM, for a period of 6 hours.
- The model is available for use at: DeepFake Image Detection Model
- The Web Application is developed using the Django framework.
- The Web Application takes an image from the user through a user-friendly interface. During deployment, the model is retrieved from the Hugging Face Model Hub via the API, so the user does not need to download the model into the project directory.
- The Web Application can be further deployed on Azure App Services for public use.
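
The 14:4:1 Training/Validation/Testing split mentioned above can be illustrated with a small sketch. This is not the project's actual preprocessing code: the function name `split_by_ratio`, the fixed seed, and the file names are assumptions made for illustration only.

```python
import random


def split_by_ratio(items, ratios=(14, 4, 1), seed=0):
    """Shuffle items and cut them into consecutive parts sized by `ratios`."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    total = sum(ratios)
    parts, start, cumulative = [], 0, 0
    for r in ratios[:-1]:
        cumulative += r
        # Integer boundary of this part within the shuffled list.
        end = len(shuffled) * cumulative // total
        parts.append(shuffled[start:end])
        start = end
    parts.append(shuffled[start:])  # the last part takes whatever remains
    return parts


# Hypothetical file names standing in for the real image paths.
paths = [f"img_{i:04d}.jpg" for i in range(1900)]
train, val, test = split_by_ratio(paths)
print(len(train), len(val), len(test))  # 1400 400 100
```

Computing cumulative boundaries, rather than rounding each part separately, keeps every image in exactly one set even when the total is not a multiple of 19.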
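
Retrieving the model from the Hugging Face Model Hub at run time, as described above, might look roughly like the sketch below. It assumes the `transformers` library and its `image-classification` pipeline; the model ID is a hypothetical placeholder, and the helper names `load_classifier`, `top_prediction`, and `classify_image` are not taken from this repository.

```python
# Hypothetical placeholder: substitute the real model ID from the Hugging Face Model Hub.
MODEL_ID = "your-username/your-deepfake-model"


def load_classifier(model_id=MODEL_ID):
    """Download (or reuse the cached copy of) the model and wrap it in a pipeline."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    return pipeline("image-classification", model=model_id)


def top_prediction(results):
    """Return (label, score) for the highest-scoring entry of a pipeline result."""
    best = max(results, key=lambda r: r["score"])
    return best["label"], best["score"]


def classify_image(image_path, model_id=MODEL_ID):
    """Classify one image file and return its most likely label with its score."""
    classifier = load_classifier(model_id)
    return top_prediction(classifier(image_path))


# Example usage (requires network access and a real model ID):
# label, score = classify_image("uploaded_image.jpg")
# print(f"{label}: {score:.2%}")
```

In a Django view, a function like `classify_image` would typically be called on the uploaded file, with the classifier loaded once and reused across requests instead of per call.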
Demo video: Deepfake.Analysis.Demo.mp4
The presentation for the project can be found at: Team DeepShield - Deepfake Analysis Presentation. (Access restricted to Microsoft Learn Student Ambassadors)
- Clone the repository using the following command:
git clone https://github.com/Polymath-Saksh/deepfake.git
- Install the required dependencies using the following command:
pip install -r requirements.txt
- Run the Django server using the following command:
python manage.py runserver
- Dmytro Lakubovskyi's Kaggle for ideas on the Vision Transformer Model and Deepfake Model Development.
- "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale" by Dosovitskiy et al. (2020) for the Vision Transformer Model.
This project is licensed under the MIT License - see the LICENSE file for details.