
Commit 2c2acf8

Update README
1 parent 2992d94 commit 2c2acf8

File tree

1 file changed (+9, -5 lines)


README.md

Lines changed: 9 additions & 5 deletions
@@ -1,9 +1,11 @@
 # Resemble Enhance
 
+[![PyPI](https://img.shields.io/pypi/v/resemble-enhance.svg)](https://pypi.org/project/resemble-enhance/)
+[![Hugging Face Space](https://img.shields.io/badge/Hugging%20Face%20%F0%9F%A4%97-Space-yellow)](https://huggingface.co/spaces/ResembleAI/resemble-enhance)
+[![License](https://img.shields.io/github/license/resemble-ai/Resemble-Enhance.svg)](https://github.com/resemble-ai/resemble-enhance/blob/main/LICENSE)
 
 https://github.com/resemble-ai/resemble-enhance/assets/660224/bc3ec943-e795-4646-b119-cce327c810f1
 
-
 Resemble Enhance is an AI-powered tool that improves overall speech quality through denoising and enhancement. It consists of two modules: a denoiser, which separates speech from noisy audio, and an enhancer, which further boosts perceptual audio quality by restoring distortions and extending the audio bandwidth. Both models are trained on high-quality 44.1 kHz speech data, ensuring high-quality enhancement output.
 
 ## Usage
@@ -26,9 +28,9 @@ resemble_enhance in_dir out_dir
 resemble_enhance in_dir out_dir --denoise_only
 ```
 
-### Gradio
+### Web Demo
 
-To serve the gradio demo, run:
+We provide a web demo built with Gradio; you can try it out [here](https://huggingface.co/spaces/ResembleAI/resemble-enhance) or run it locally:
 
 ```
 python app.py
@@ -38,7 +40,7 @@ python app.py
 
 ### Data Preparation
 
-You need to prepare a foreground speech dataset and a background non-speech dataset. In addition, you need to prepare a RIR dataset.
+You need to prepare a foreground speech dataset, a background non-speech dataset, and a room impulse response (RIR) dataset ([examples](https://github.com/RoyJames/room-impulse-responses)).
 
 ```bash
 data
@@ -49,7 +51,7 @@ data
 │   ├── 00001.wav
 │   └── ...
 └── rir
-    ├── 00001.wav
+    ├── 00001.npy
     └── ...
 ```

@@ -65,6 +67,8 @@ python -m resemble_enhance.denoiser.train --yaml config/denoiser.yaml
 
 #### Enhancer
 
+The enhancer is then trained in two stages: stage 1 trains the autoencoder and vocoder, and stage 2 trains the latent conditional flow matching (CFM) model.
+
 ##### Stage 1
 
 ```bash
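The data-preparation hunk above switches the RIR entries from `.wav` to `.npy`, i.e. raw NumPy arrays rather than audio files. As a minimal sketch of producing that layout (the helper name `save_rir`, the synthetic impulse response, and the exact shape/dtype the training pipeline expects are all assumptions, not taken from this commit):

```python
import numpy as np
from pathlib import Path

def save_rir(rir: np.ndarray, rir_dir: Path, index: int) -> Path:
    """Save a 1-D room impulse response as <rir_dir>/<index>.npy (float32)."""
    rir_dir.mkdir(parents=True, exist_ok=True)
    path = rir_dir / f"{index:05d}.npy"
    np.save(path, rir.astype(np.float32))
    return path

# Synthetic RIR for illustration only: exponentially decaying noise at 44.1 kHz.
sr = 44100
t = np.arange(sr // 2) / sr                       # 0.5 s tail -> 22050 samples
rir = np.random.randn(t.size) * np.exp(-t / 0.1)

path = save_rir(rir, Path("data/rir"), 1)
print(path, np.load(path).shape)                  # data/rir/00001.npy (22050,)
```

A real dataset would instead load measured impulse responses (e.g. from the RIR collections linked in the diff) and convert each one the same way.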
