Request: voice enhancement #126

nachazo · 2024-01-02T10:54:37Z

Hello!
This is a totally fantastic toolbox, congrats & thanks!

The one thing I miss is a feature I use regularly: AI voice enhancement. Normally, I use "Enhance speech" from Adobe (podcast.adobe.com/enhance).

I don't know whether anyone is doing something similar with open AI models, but here is my request and wish for the next version!

Many thanks!

LWinterberg · 2024-01-02T12:45:11Z

I have a suspicion that the "enhance speech" of Adobe podcast is just an EQ + compressor slapped onto your voice. Those are things you can do in Audacity as well, either using the native plugins or one of the many VSTs. AI models will likely be way overkill for that.
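
For readers who have not met that chain before, here is a minimal sketch of what "EQ + compressor" can mean in practice. It is not what Adobe or Audacity actually do: the one-pole high-pass filter is only a crude stand-in for an EQ, the compressor is memoryless (no attack or release), and the function names, cutoff, threshold, and ratio are all assumptions chosen for illustration.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz=80.0):
    """One-pole high-pass filter, a very crude stand-in for a voice EQ.

    Removes low-frequency rumble below roughly cutoff_hz (assumed value).
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_in = 0.0
    prev_out = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out


def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Static compressor: samples louder than the threshold are turned down.

    Real compressors smooth the gain over time; this sketch does not.
    """
    out = []
    for x in samples:
        level = abs(x)
        if level == 0.0:
            out.append(0.0)
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold, output rises only 1/ratio dB per input dB.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10.0 ** ((target_db - level_db) / 20.0)
        else:
            gain = 1.0
        out.append(x * gain)
    return out


def simple_voice_chain(samples, sample_rate):
    """Hypothetical helper: apply the filter first, then the compressor."""
    return compress(high_pass(samples, sample_rate))
```

With samples as floats in the range -1.0 to 1.0, `simple_voice_chain(samples, 48000)` returns a list of the same length. Neither stage removes noise or echo.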

nachazo (Author) · 2024-01-02T12:52:05Z

> I have a suspicion that the "enhance speech" of Adobe podcast is just an EQ + compressor slapped onto your voice. Those are things you can do in Audacity as well, either using the native plugins or one of the many VSTs. AI models will likely be way overkill for that.

Hi! In my experience, the process also applies noise reduction and echo removal, then EQ, and it seems to apply some compression and limiting as well.

In my opinion, the most interesting part is the AI echo removal (since noise reduction is already available in the OpenVINO plugins).

RyanMetcalfeInt8 (Maintainer) · 2024-01-02T13:26:22Z

Hi @nachazo,

Thanks for the feedback! I'll keep an eye out for more open source noise suppression / voice enhancement models that we could potentially add support for. I have been looking around a bit for noise suppression models that work better than the default one included (dense-unet), as I'm not too thrilled with the quality that it produces... nothing has jumped out yet but I need to keep looking. Ideally I'd like to support a set of noise suppression models that work well for various situations / environments.

Regards,
Ryan

RyanMetcalfeInt8 (Maintainer) · 2024-01-10T19:54:36Z

Just a quick update here -- I was looking into this open source project ( https://github.com/resemble-ai/resemble-enhance ). A nice writeup is here:

It provides two models: one for denoising and one for enhancement. This could be worth porting over / pulling into the set of plugins if the quality is good enough.

I was trying out the web-based demo on some noisy audio that I had lying around. I thought that the denoise output was similar to running 'noise suppression + normalize', so I wasn't too impressed with that one. I found the 'enhanced' audio sounded a bit overprocessed. All in all, pretty far away from Adobe's speech enhance output (which is what I'm looking to find).
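
As a reference point for the "normalize" half of that comparison, here is a minimal sketch of peak normalization, assuming samples are floats in the range -1.0 to 1.0. The function name and the default target are assumptions for illustration, and this is not Audacity's implementation. It shows that normalizing only applies one gain to the whole clip, so any perceived cleanup has to come from the noise suppression step.

```python
def normalize_peak(samples, target_peak=0.9):
    """Scale the whole clip so its loudest sample reaches target_peak.

    A single gain is applied to every sample, so the balance between
    voice and background noise does not change.
    """
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        # Silent or empty clip: nothing to scale.
        return list(samples)
    gain = target_peak / peak
    return [x * gain for x in samples]
```

For example, `normalize_peak(denoised, target_peak=0.9)` would bring a quiet denoised clip up to a consistent level before comparing it against another tool's output (`denoised` is a hypothetical list of samples).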

Anyway, my initial conclusion on resemble-enhance is that I'll wait and see whether the project improves (it's only a few weeks old, after all).

It's possible I was expecting too much, or used samples that it wasn't trained to enhance. Let me know if anyone has good luck with the demo and thinks it would indeed be useful as a new plugin feature. Happy to take a second look.

Also, feel free to point out open source projects that I might have missed.

Ryan

nachazo (Author) · 2024-01-15T11:23:54Z

Hi! I don't know what the license is, or whether the code only works on NVIDIA hardware, but have a look here: https://github.com/NVIDIA/MAXINE-AFX-SDK
It seems to contain some of the effects from the "NVIDIA Broadcast" app (https://www.nvidia.com/en-us/geforce/broadcasting/broadcast-app/), such as Room Echo Cancellation and Background Noise Suppression. Video demo: https://www.youtube.com/watch?v=_kHFTeL1RVU
See you!!

RyanMetcalfeInt8 (Maintainer) · 2024-01-16T01:37:15Z

Hi @nachazo,

Right, it looks like the model itself is proprietary, as far as I can tell -- so unfortunately I can't do much with that.

The part of the code that is open source (MIT) seems to be the upper layer of (control) software that moves audio samples in / out of the broadcast SDK.

Thanks,
Ryan
