
An audio video-based multi-modal fusion approach for speech emotion recognition

Status: Submitted for Review

Authors: S M Jishanul Islam, Sahid Hossain Mustakim, Musfirat Hossain, Mysun Mashira, Nur Islam Shourav, Md. Rayhan Ahmed, Salekul Islam, A.K.M. Muzahidul Islam, and Swakkhar Shatabda

Requirements

Prerequisite: A CUDA-enabled GPU is preferred. If you run this code on a CPU instead, adjust the batch size in the hyperparams.py file to match your hardware capacity before running the notebooks. If the installations fail, kindly refer to: conda instructions.

If you run into frequent problems in a local environment, follow the instructions for cloud notebooks and run on any cloud platform instead.
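For reference, the batch-size setting in hyperparams.py might look like the sketch below. The variable names here are assumptions for illustration, so check the actual file for the names this repository uses.

```python
# Hypothetical sketch of hyperparams.py — the real variable names may differ.
BATCH_SIZE = 16  # GPU default; lower this to fit your hardware

# On a CPU-only machine, a much smaller batch keeps memory usage manageable:
RUNNING_ON_CPU = True
if RUNNING_ON_CPU:
    BATCH_SIZE = 4
```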


Step-2: Clone this repository:

git clone https://github.com/S-M-J-I/Multimodal-Emotion-Recognition

If you have SSH configured:

git clone git@github.com:S-M-J-I/Multimodal-Emotion-Recognition.git

Step-3: Install pipenv. Skip this step if it is already installed on your system.

pip3 install --user pipenv

Step-4: Install the modules. Run the following command in the terminal:

pipenv install -r requirements.txt

Run the model

To load the model, you can use torch.hub.load() without needing the model's class definition in your codebase:

import torch

# example: load the savee model
model = torch.hub.load("S-M-J-I/Multimodal-SER",'savee_model', pretrained=True, num_classes=7, fine_tune_limit=3)

After this, the model is loaded and ready for use!

Alternatively, this repo contains a weights directory.

Run the pipelines

To run the notebooks on SAVEE and RAVDESS, we recommend you download the datasets and unpack them in this directory. Then set the path to the directory in the respective notebooks.
Note: when setting the file path, ensure the extra '/' is added at the end. Example: /path_to_dir/
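Since the notebooks expect that trailing '/', a small helper like the one below (hypothetical, not part of this repository) can guard against forgetting it:

```python
def normalize_dataset_path(path: str) -> str:
    """Ensure a dataset directory path ends with '/'."""
    return path if path.endswith("/") else path + "/"

print(normalize_dataset_path("/path_to_dir"))  # → /path_to_dir/
```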

To run the model on the datasets, navigate to the individual notebooks made for them in the explore directory.

Run the following command in the terminal to start the local server:

pipenv run jupyter notebook

For any assistance or issues, kindly open an Issue in this repository.

Contributions

This repository is not accepting contributions from anyone outside the author list above. For any issues related to the code, please open an Issue.
