
Add audio processing module #99


Open · wants to merge 16 commits into main
Conversation

@ladvoc (Contributor) commented Apr 10, 2025

This PR adds support for the WebRTC audio processing module (APM) and enables acoustic echo cancellation (AEC) for microphone tracks.
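For reference, a minimal construction sketch: the parameter names follow the AudioProcessingModule constructor signature that appears in a stack trace further down this thread, while the flag values and everything around them are illustrative assumptions.

// Sketch only: construct the APM with echo cancellation (AEC) enabled.
// How the module is wired into a microphone track is not shown here.
var apm = new AudioProcessingModule(
    echoCancellationEnabled: true,
    noiseSuppressionEnabled: true,
    highPassFilterEnabled: true,
    gainControllerEnabled: true);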

@ladvoc marked this pull request as ready for review April 25, 2025 23:30
@ladvoc requested a review from theomonnom April 25, 2025 23:30
@holofermes (Contributor) commented Apr 27, 2025

Hey all, I've been keeping an eye on this PR and just gave it a spin. While it works on macOS/Windows, it fails on Android arm64 (I assume it would be the same for the other arm architectures).

I think the latest Android .so is not up to date:

NullReferenceException: Object reference not set to an instance of an object.
 at LiveKit.AudioProcessingModule..ctor (System.Boolean echoCancellationEnabled, System.Boolean noiseSuppressionEnabled, System.Boolean highPassFilterEnabled, System.Boolean gainControllerEnabled) [0x00000] in <00000000000000000000000000000000>:0 
 at LiveKit.RtcAudioSource..ctor (System.Int32 channels, LiveKit.RtcAudioSourceType audioSourceType) [0x00000] in <00000000000000000000000000000000>:0 

I noticed install.py does not have android listed in its platforms, so I manually downloaded and replaced ffi-android-arm64/liblivekit_ffi.so and ran a build with it. That doesn't work either: the build fails to load the .so, which I suspect is why install.py does not grab the android builds.

@ladvoc (Contributor, Author) commented Apr 28, 2025

Hi @holofermes, thank you for reporting this. Android should definitely be included as one of the platforms in install.py. I will look into this and make the necessary changes in a separate PR.

{
while (true)
{
Thread.Sleep(Constants.TASK_DELAY);
@theomonnom (Member) commented Apr 28, 2025

I think we're likely going to have a skew here (this will impact the AEC a lot for long room durations).
Is there a way to process the frames directly as we receive them?


private void OnAudioRead(float[] data, int channels, int sampleRate)
{
_captureBuffer.Write(data, (uint)channels, (uint)sampleRate);
Member


We could directly use ProcessReverseStream here?

@@ -101,78 +103,67 @@ private void Update()
while (true)
{
Thread.Sleep(Constants.TASK_DELAY);
Member


We will also get a skew here, so as soon as we're a bit late, we're going to hear bad-quality input (jittery audio).

It's OK to push faster than realtime; the Rust SDKs will handle it in a high-precision queue.
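A rough sketch of what that could look like, draining every complete frame that is already buffered instead of pacing with Thread.Sleep; ReadDuration and FRAME_DURATION_MS come from this PR, while PushCapturedFrame is a placeholder for whatever call actually hands the frame to the SDK:

// Sketch only: send everything that's ready rather than pacing to realtime;
// the Rust SDK's high-precision queue is expected to smooth out delivery.
while (true)
{
    using var frame = _captureBuffer.ReadDuration(AudioProcessingModule.FRAME_DURATION_MS);
    if (frame == null)
        break;                   // no complete frame buffered right now
    PushCapturedFrame(frame);    // placeholder, not an actual API from this PR
}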

@theomonnom (Member)

I see the TASK_DELAY is 5 ms.
We're sending 10 ms frames; this works because Rust will buffer, so maybe it's fine?

@ladvoc (Contributor, Author) commented Apr 30, 2025

Hi @theomonnom, thank you for your feedback. Yes, there does appear to be a skew for longer room durations. I've moved the calls to the APM methods directly into the audio filter callbacks; however, this seems to introduce some audio artifacts that I haven't been able to explain yet. I think the issue is related to the forward stream being processed before the reverse stream, but I need to do more investigation to confirm.
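For reference, the per-frame ordering AEC generally expects from the WebRTC APM, sketched below; ProcessStream is an assumed name for the forward-stream counterpart of ProcessReverseStream and may differ in this binding:

// Assumption: feed the reverse (playback/far-end) frame to the APM before
// processing the corresponding forward (capture/near-end) frame, so AEC has
// the reference signal it needs to subtract the echo.
_apm.ProcessReverseStream(playbackFrame); // playbackFrame: illustrative frame from the playback path
_apm.ProcessStream(captureFrame);         // assumed method name; captureFrame: illustrative frame from the microphone path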

@theomonnom (Member)

> I've moved the calls to the APM methods directly into the audio filter callbacks; however, this seems to introduce some audio artifacts that I haven't been able to explain yet.

I think it's most likely because this function is too slow:

private void OnAudioRead(float[] data, int channels, int sampleRate)

You could also try to increase the default DSP buffer size in Unity.
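A minimal sketch of raising the DSP buffer size at runtime, assuming a standard Unity setup; the value is illustrative, and the same setting can be changed under Project Settings > Audio:

using UnityEngine;

public static class DspBufferConfig
{
    // Sketch: enlarge Unity's DSP buffer before any audio callbacks run.
    // A larger buffer gives the audio callbacks more headroom per invocation,
    // at the cost of some added output latency.
    [RuntimeInitializeOnLoadMethod(RuntimeInitializeLoadType.BeforeSceneLoad)]
    private static void ConfigureDspBuffer()
    {
        var config = AudioSettings.GetConfiguration();
        config.dspBufferSize = 1024;   // illustrative; defaults vary by platform
        AudioSettings.Reset(config);   // note: Reset reinitializes Unity's audio system
    }
}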

Comment on lines +41 to +51
private void OnAudioRead(float[] data, int channels, int sampleRate)
{
    _captureBuffer.Write(data, (uint)channels, (uint)sampleRate);
    while (true)
    {
        using var frame = _captureBuffer.ReadDuration(AudioProcessingModule.FRAME_DURATION_MS);
        if (frame == null) break;

        _apm.ProcessReverseStream(frame);
    }
}
Member


Maybe this one too?
