Tech stack: Python · TensorFlow/Keras (tf.keras) · Keras Tuner · Basic Pitch · Streamlit · CSS

AI-Powered Maestro Finder Streamlit App


This project is part of the Neural Networks and Deep Learning (AAI-511-02) course in the Applied Artificial Intelligence Master's Program at the University of San Diego (USD).

Project Status: Completed

Introduction

AI-Powered Maestro Finder turns symbolic music into quick, explainable predictions. It’s a complete pipeline—from raw MIDIs to trained models to an interactive Streamlit app—built to be readable, reproducible, and easy to extend.

Objective

  • Make composer recognition accessible to students, musicians, and curious listeners.
  • Demonstrate an end-to-end ML workflow: data wrangling → augmentation → feature extraction → modeling → friendly UI.
  • Provide a clean template you can extend with new composers or tasks (style, period, instrument).

Features

  • Upload MIDI or record audio (auto-transcribed to MIDI with Basic Pitch; see the sketch after this list).
  • Confidence bars and piano-roll visualization.
  • Optional sheet-music view (MusicXML rendered via OpenSheetMusicDisplay).
  • On-device inference using a compact CNN.
  • Clear, modular code: utils/ for I/O, features, inference, and visualization.
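
The audio path relies on Basic Pitch for transcription before the usual MIDI pipeline takes over. A minimal sketch of that step, with placeholder file names (the app's actual wiring lives in utils/):

```python
# Sketch: transcribe a recording to MIDI with Basic Pitch.
# "recording.wav" / "recording.mid" are placeholder paths.
from basic_pitch.inference import predict

# predict() returns the raw model output, a PrettyMIDI object,
# and a list of note events.
model_output, midi_data, note_events = predict("recording.wav")

# Save the transcription; from here the app treats it like an uploaded MIDI.
midi_data.write("recording.mid")
```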

Methods

| Stage | Key steps |
| --- | --- |
| Data wrangling | De-duplicate and clean raw MIDIs; trim to the piano range (A0–C8); remove empty or corrupted tracks. |
| Class balancing | Pitch-shift, time-stretch, and velocity-jitter minority classes to balance counts. |
| Feature extraction | Two parallel views: (1) an 88 × 512 piano roll at 8 fps for the CNN; (2) event tokens for the LSTM (used for research and comparison). |
| Modeling | Train the CNN on piano rolls and the LSTM on sequences; evaluate both. |
| Final selection | CNN chosen for the app based on accuracy, stability, and speed. |
| Inference | MIDI → piano roll → CNN → probabilities. For audio, run Basic Pitch → MIDI → same pipeline. |
| Front end | Streamlit UI with confidence bars, a piano-roll view, and optional sheet music. |
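
To make the inference row concrete, here is a hedged sketch of the MIDI → piano roll → CNN path using pretty_midi, assuming the 88 × 512 @ 8 fps layout above. The file name maestro_cnn.keras, the binarization, and the padding policy are illustrative assumptions, not necessarily the repo's exact code:

```python
import numpy as np
import pretty_midi
import tensorflow as tf

N_PITCHES, N_FRAMES, FPS = 88, 512, 8  # 88 keys, 512 frames at 8 fps (~64 s)

def midi_to_piano_roll(path: str) -> np.ndarray:
    """Render a MIDI file into a fixed-size 88 x 512 piano roll."""
    pm = pretty_midi.PrettyMIDI(path)
    roll = pm.get_piano_roll(fs=FPS)       # (128, T) velocity matrix
    roll = roll[21:109, :]                 # piano range A0-C8 = MIDI notes 21-108
    roll = (roll > 0).astype(np.float32)   # note-on mask (assumption: binarized input)
    if roll.shape[1] < N_FRAMES:           # zero-pad short pieces on the right
        roll = np.pad(roll, ((0, 0), (0, N_FRAMES - roll.shape[1])))
    return roll[:, :N_FRAMES]              # crop long pieces to the first window

# "maestro_cnn.keras" is a hypothetical model path; the CNN maps the
# piano roll to one probability per composer.
model = tf.keras.models.load_model("maestro_cnn.keras")
x = midi_to_piano_roll("example.mid")[np.newaxis, ..., np.newaxis]  # (1, 88, 512, 1)
probs = model.predict(x)[0]
```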

Results (test split)

| Model | Input window | Accuracy | Macro F1 |
| --- | --- | --- | --- |
| CNN | Piano-roll 88 × 512 | 0.984 | 0.984 |
| LSTM | Piano-roll 88 × 512 | 0.830 | 0.832 |

Takeaway: with piano-roll inputs, the CNN's inductive bias (2D convolutions over time × pitch, plus residual connections and squeeze-and-excitation (SE) blocks) aligns better with musical texture and converges faster. We use the CNN as the production model in the app.
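
For readers curious about the "residual connections and SE blocks" phrasing, a minimal tf.keras sketch of one such block follows; the filter counts and exact layout are assumptions for illustration, not the repo's architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

def se_residual_block(x, filters: int, se_ratio: int = 8):
    """One 2D residual conv block gated by squeeze-and-excitation."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)

    # Squeeze: pool over time x pitch. Excite: learn a per-channel gate.
    se = layers.GlobalAveragePooling2D()(y)
    se = layers.Dense(filters // se_ratio, activation="relu")(se)
    se = layers.Dense(filters, activation="sigmoid")(se)
    y = layers.Multiply()([y, layers.Reshape((1, 1, filters))(se)])

    # Project the shortcut if the channel count changed.
    if shortcut.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))

# Example: one block over an 88 x 512 piano-roll input.
inputs = tf.keras.Input(shape=(88, 512, 1))
outputs = se_residual_block(inputs, filters=32)
```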
