PercGAN

PercGAN: A computationally inexpensive approach to high-fidelity stereo audio sample generation.

Pretrained Models

Kick Drum

Kick drum generation model trained on curated kick drum samples.

Kick vs. Generated Comparison:

Snare Drum

Snare drum generation model, focused on producing punchy, tight snare sounds.

Snare vs. Generated Comparison:

Model Usage

1. Prerequisites

Optional but highly recommended: set up a Python virtual environment.
- The audio loading package librosa requires an older version of NumPy, so an isolated environment keeps it from conflicting with other projects.
Install the requirements by running pip3 install -r requirements.txt

2. Generate audio from pretrained models

Run the generation script with command-line arguments:

# Generate 2 kick drum samples with default output path
python src/generate.py --type kick --count 2

# Generate 5 snare samples with custom output path
python src/generate.py --type snare --count 5 --output_path my_samples

Parameters:

--type: Type of audio to generate (kick or snare)
--count: Number of samples to generate (integer value)
--output_path: Directory to save generated audio files (default: "outputs")
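
For readers who want to see how these three flags might be wired together, here is a minimal argparse sketch. It is not the actual contents of src/generate.py: the function name `build_parser`, the required `--type`, and the default of 1 for `--count` are assumptions made for illustration. Only the flag names, the kick/snare choices, and the "outputs" default come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the three documented flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Generate percussion samples")
    # Assumption: --type is required; the README only states the allowed values.
    parser.add_argument("--type", choices=["kick", "snare"], required=True,
                        help="Type of audio to generate")
    # Assumption: a default of 1; the README does not state a default count.
    parser.add_argument("--count", type=int, default=1,
                        help="Number of samples to generate")
    parser.add_argument("--output_path", default="outputs",
                        help="Directory to save generated audio files")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--type", "snare", "--count", "5"])
print(args.type, args.count, args.output_path)  # prints: snare 5 outputs
```

Restricting `--type` with `choices` means a typo such as `--type snar` fails immediately with a usage message instead of partway through generation.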

Technical Approach

PercGAN combines these audio generation techniques:

Mel-Spectrogram Representation: For efficient learning of frequency patterns
Progressive Growing: Training on increasingly higher resolution spectrograms
Style-Based Generation: Using StyleGAN for better style control
Multi-Scale Spectral Losses: Specialized frequency, decay, and coherence losses
Griffin-Lim Reconstruction: Converting spectrograms back to audio
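
To give a feel for why a mel-spectrogram is a compact representation, the sketch below computes the center frequencies of bands spaced evenly on the mel scale. It uses the common HTK-style mel formula. The band count and frequency range are illustrative assumptions, not the values in src/utils/model_params.json, and libraries such as librosa default to a slightly different mel variant.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_mels bands evenly spaced on the mel scale."""
    mel_low, mel_high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_high - mel_low) / (n_mels + 1)
    return [mel_to_hz(mel_low + step * (i + 1)) for i in range(n_mels)]


# Illustrative values only: 8 bands between 20 Hz and 20 kHz.
for center in mel_band_centers(8, 20.0, 20000.0):
    print(f"{center:8.1f} Hz")
```

The bands sit close together at low frequencies and spread out at high frequencies. That gives a percussive sound's low end more resolution than a linear frequency axis would at the same band count.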

Custom Training

To train on your own audio samples:

1. Collect one-shot audio samples (shorter than 1.5 seconds each)
2. Update the model parameters in src/utils/model_params.json
3. Encode your samples with src/data_processing/encode_audio_data.py
4. Train with src/stereo_sample_gan.py
5. Generate new samples with the unified generation script (src/generate.py)
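
Before encoding, it can help to confirm that every collected sample meets the 1.5-second limit from the first step above. The sketch below uses only the standard library. The helper names `wav_duration_seconds` and `find_usable_samples` and the folder name are hypothetical and not part of the repository, whose encoding script may apply its own checks.

```python
import wave
from pathlib import Path
from typing import List

MAX_SECONDS = 1.5  # limit stated in the steps above


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def find_usable_samples(folder: Path) -> List[Path]:
    """List the .wav files in folder that are shorter than MAX_SECONDS."""
    return sorted(
        p for p in folder.glob("*.wav")
        if wav_duration_seconds(p) < MAX_SECONDS
    )


if __name__ == "__main__":
    # "my_training_samples" is a placeholder folder name.
    for sample in find_usable_samples(Path("my_training_samples")):
        print(sample.name, f"{wav_duration_seconds(sample):.2f}s")
```

The standard `wave` module reads only uncompressed PCM WAV files, so other formats (or 32-bit float WAVs) would need an audio library such as librosa instead.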

License

This project is licensed under the MIT License - see the LICENSE file for details.
