speech communication detection #822

aeschimannr · 2023-05-03T20:04:18Z

aeschimannr
May 3, 2023

Hello,

First of all, I want to thank you for your work! It's really incredible! I come from the world of STM MCUs and I'm not used to libraries written with this level of quality =)

I'm working on a small project: a portable device that lets a group of people communicate.

I am successfully using your implementation of ESP-NOW and the SBC codec, with an I2S microphone as the source and an I2S amplifier for the output.
I currently have half-duplex communication. I am looking for a full-duplex implementation, but my attempts so far have not been conclusive.

I think that right now the I2S stream is continuously pushing data to ESP-NOW and overloading it. I would like a way to detect the presence of human speech so that data is pushed to ESP-NOW only when needed.

I have searched a lot but I did not find anything on the subject. Would anyone be able to direct me to a possible solution?

Here is my current implementation of my half duplex communication:

#include "AudioTools.h"
#include "AudioLibs/Communication.h"
#include "AudioCodecs/CodecSBC.h"

/*
  1 --> Emitter
  2 --> Receiver 
  */
#define MENUCHOICE      2

#if MENUCHOICE == 1
const char *peers[] = {"A8:48:FA:0B:93:01"};
int menuChoice = 1;
#elif MENUCHOICE == 2
const char *peers[] = {"A8:48:FA:0B:93:02"};
int menuChoice = 2;
#endif 

#define CHANNELS               1

#define BITS_PER_SAMPLE        16
#define SAMPLE_RATE            16000

uint8_t switchMode = 1;
const int32_t max_buffer_len = 512;
int32_t buffer[max_buffer_len][2];


I2SStream i2sOut; // access I2S as stream
I2SStream i2sIn; 

ESPNowStream now;

// audio decode/encode stream
EncodedAudioStream encoder(&now, new SBCEncoder()); // encode and write to esp-now
EncodedAudioStream decoder(&i2sOut, new SBCDecoder(256)); // decode and write to I2S - esp-now is limited to 256 bytes

// esp send/receive stream implementation
StreamCopy espSend(encoder, i2sIn); // encode sound and send it to esp-now

// Variable to store if sending data was successful
String success;


// Callback when data is received
void OnDataRecv(const uint8_t * mac, const uint8_t *incomingData, int len) 
{
  // write audio data to the decoder
  decoder.write(incomingData, len);
}

void setup()
{
  Serial.begin(115200);
  AudioLogger::instance().begin(Serial, AudioLogger::Warning); // Debug, Info

  // setup esp-now
  auto config_esp = now.defaultConfig();
  config_esp.recveive_cb = OnDataRecv;

  // setup i2s send to speaker
  auto config_out = i2sOut.defaultConfig(TX_MODE);
  config_out.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
  config_out.is_master = true;
  config_out.port_no = 1;
  config_out.pin_bck = 18;
  config_out.pin_ws = 19;
  config_out.pin_data = 23;
  config_out.bits_per_sample = BITS_PER_SAMPLE;
  config_out.channels = CHANNELS;
  config_out.sample_rate = SAMPLE_RATE;
  i2sOut.begin(config_out);

  // setup i2s mic input for transmit to esp-now
  auto config_in = i2sIn.defaultConfig(RX_MODE);
  config_in.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
  config_in.is_master = true;
  config_in.port_no = 0;
  config_in.pin_bck = 14;
  config_in.pin_ws = 15;
  config_in.pin_data = 22;
  config_in.bits_per_sample = BITS_PER_SAMPLE;
  config_in.channels = CHANNELS;
  config_in.sample_rate = SAMPLE_RATE;
  i2sIn.begin(config_in);

  // start encoding from sound converted
  encoder.begin(config_in);

  // start decode
  decoder.begin();

  switch(menuChoice)
  {
    case 1:  // Emitter

      // setup esp-now
      config_esp.mac_address = "A8:48:FA:0B:93:02";
      now.begin(config_esp);
      now.addPeers(peers);

      break;
    case 2: // Receiver

      // setup esp-now
      config_esp.mac_address = "A8:48:FA:0B:93:01";
      now.begin(config_esp);
      now.addPeers(peers);

      break;
  }
}

void loop()
{
  espSend.copy();
}

pschatzmann · 2023-05-04T01:12:07Z

pschatzmann
May 4, 2023
Maintainer

You can use the VolumeOutput to determine if you have a signal and based on this, switch the forwarding to the ESPNow on or off.
I just added a new class OnOffOutput, so you can use the following output chain:

                                      -> VolumeOutput   
I2S - copy -> MultiOutput   
                                      -> OnOffOutput -> ESPNowStream
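
As a library-independent illustration of the gating idea in this chain, here is a minimal sketch in plain C++. It measures the peak of each block of 16-bit samples and reports whether the block should be forwarded. The class name VoiceGate, the threshold and the hangover count are hypothetical and are not part of arduino-audio-tools; in a sketch, the result of update() would decide whether the OnOffOutput is switched on or off.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of arduino-audio-tools):
// a peak-level gate with a short "hangover" so that word endings
// and brief pauses are not cut off.
class VoiceGate {
 public:
  // threshold: minimum absolute sample value that counts as signal
  // hangoverBlocks: number of quiet blocks still forwarded after the last loud one
  VoiceGate(int32_t threshold, int hangoverBlocks)
      : threshold_(threshold), hangover_(hangoverBlocks) {}

  // Returns true if this block of samples should be forwarded.
  bool update(const int16_t* samples, size_t count) {
    int32_t peak = 0;
    for (size_t i = 0; i < count; ++i) {
      int32_t v = samples[i];
      if (v < 0) v = -v;  // absolute value, computed in 32 bits
      if (v > peak) peak = v;
    }
    if (peak >= threshold_) {
      remaining_ = hangover_;  // restart the hangover window
      return true;
    }
    if (remaining_ > 0) {
      --remaining_;  // quiet block, but still inside the hangover window
      return true;
    }
    return false;
  }

 private:
  int32_t threshold_;
  int hangover_;
  int remaining_ = 0;
};
```

A peak detector is only a rough stand-in for real speech detection: it reacts to any loud sound, so the threshold has to be tuned against the background noise of the microphone.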

To figure out if you have a capacity problem, I suggest switching to the L8 codec and reducing the sample rate, e.g. to 8000. This should produce less data, and it gives you more flexibility to play with the sample rate. Using 8 bits might even be good enough for the final solution if it is just for voice.
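
To put numbers on the capacity question, the raw PCM data rate can be computed directly from the audio settings. The helper name pcmBytesPerSecond is hypothetical; the figures are for uncompressed samples only, so a compressing codec such as SBC would send less than the 16-bit figure.

```cpp
#include <cstdint>

// Raw (uncompressed) PCM data rate in bytes per second.
constexpr uint32_t pcmBytesPerSecond(uint32_t sampleRate,
                                     uint32_t bitsPerSample,
                                     uint32_t channels) {
  return sampleRate * (bitsPerSample / 8) * channels;
}

// Settings used in this thread: 16 kHz, 16 bit, mono.
static_assert(pcmBytesPerSecond(16000, 16, 1) == 32000, "32000 bytes/s");

// Suggested test settings: 8 kHz, 8 bit (L8), mono.
static_assert(pcmBytesPerSecond(8000, 8, 1) == 8000, "8000 bytes/s");
```

Going from 16 kHz / 16 bit to 8 kHz / 8 bit cuts the raw data rate by a factor of four.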

I have never tested any duplex processing but I would have tried with 2 separate StreamCopy: one for reading from the ESPNow and one for writing.

1 reply

aeschimannr May 4, 2023
Author

Thank you for your quick answer and for the additional class OnOffOutput.

While trying to compile with your solution, I have the following error:

.pio/libdeps/esp32dev/audio-tools/src/AudioTools/AudioOutput.h:647:48: error: void value not ignored as it ought to be
bool begin() override { return setActive(true); }

pschatzmann · 2023-05-04T08:34:38Z

pschatzmann
May 4, 2023
Maintainer

Oops, my mistake: after testing, I thought that it might make sense to support the begin() and end() methods as well.
The correction has been committed...

0 replies

aeschimannr · 2023-05-04T09:11:49Z

aeschimannr
May 4, 2023
Author

Thanks for the quick fix, it's compiling now and I have full duplex audio but the quality is pretty bad.
As you suggest, I think reducing the bandwidth will help. What do you mean by L8 codec?

0 replies

pschatzmann · 2023-05-04T09:51:52Z

pschatzmann
May 4, 2023
Maintainer

https://pschatzmann.github.io/arduino-audio-tools/classaudio__tools_1_1_decoder_l8.html
https://pschatzmann.github.io/arduino-audio-tools/classaudio__tools_1_1_encoder_l8.html

1 reply

aeschimannr May 4, 2023
Author

I just tried it and wrote the following sketch. Am I doing something wrong? The sound quality is really not good, and not usable even for voice conversation.

EncodedAudioStream dec(&out, new DecoderL8());
EncodedAudioStream enc(&dec, new EncoderL8());
StreamCopy i2sInEncOut(enc, in);

// start I2S in
auto config_in = in.defaultConfig(RX_MODE);
config_in.sample_rate = 16000;
config_in.bits_per_sample = 16;
config_in.channels = 1;
config_in.i2s_format = I2S_STD_FORMAT;
config_in.is_master = true;
config_in.port_no = 0;
config_in.pin_bck = 14;
config_in.pin_ws = 15;
config_in.pin_data = 22;
in.begin(config_in);

// start I2S out
auto config_out = out.defaultConfig(TX_MODE);
config_out.sample_rate = 16000;
config_out.bits_per_sample = 16;
config_out.channels = 1;
config_out.i2s_format = I2S_STD_FORMAT;
config_out.is_master = true;
config_out.port_no = 1;
config_out.pin_bck = 18;
config_out.pin_ws = 19;
config_out.pin_data = 23;
out.begin(config_out);

enc.begin(config_in);

void loop()
{
i2sInEncOut.copy();
}

pschatzmann · 2023-05-04T11:12:54Z

pschatzmann
May 4, 2023
Maintainer

Here is my test case that was producing a perfect sound: https://github.com/pschatzmann/arduino-audio-tools/tree/main/examples/tests/codecs/test-codec-l8

Maybe you have a problem with your mic! Did you test it w/o codec ?

0 replies

aeschimannr · 2023-05-04T11:54:16Z

aeschimannr
May 4, 2023
Author

In the code presented above I had forgotten to call dec.begin(config_in).
But this did not change the quality of the sound.

I tested your case, and it produces a clear sound for me as well. I also tried my mic without the codec: the sound is clean, so the problem doesn't seem to come from there. I captured to CSV one recording from the mic without the codec and another with the codec. They are clearly different, and the resolution of the decoded one leaves a lot to be desired.

Without encoder

With encoder

4 replies

pschatzmann May 4, 2023
Maintainer

This is pretty difficult to interpret since I don't know your recorded sound. I suggest that you record a sine wave e.g. from https://onlinetonegenerator.com/

aeschimannr May 4, 2023
Author

I understand, I was saying "1, 2 Test" into the mic =)

I played a 440 Hz tone produced by the website, recorded it with the mic and pushed it to the CSV plotter. Here is the result:

without codec :

With codec:

pschatzmann May 4, 2023
Maintainer

Hmm, something seems to be wrong with the value range: L8 is converting between bits_per_sample 8 and bits_per_sample 16.
Are you sure you work with int16_t values ?

aeschimannr May 4, 2023
Author

No, I'm not sure. This is my config for the I2S mic (INMP441):

auto config_in = in.defaultConfig(RX_MODE);
config_in.sample_rate = 16000;
config_in.bits_per_sample = 16;
config_in.channels = 1;
config_in.i2s_format = I2S_STD_FORMAT;
config_in.is_master = true;
config_in.port_no = 0;
config_in.pin_bck = 14;
config_in.pin_ws = 15;
config_in.pin_data = 22;
in.begin(config_in);

pschatzmann · 2023-05-04T12:36:39Z

pschatzmann
May 4, 2023
Maintainer

yes: config_in.bits_per_sample = 16;
So the values will be between -32768 and 32767!
Your csv plotter does not make any sense: did you define it as CSVOutput<int16_t> ?

14 replies

pschatzmann May 4, 2023
Maintainer

Can you share the sketch, so that I can reproduce your issue ?

aeschimannr May 4, 2023
Author

Yes of course.

#include <Arduino.h>
#include "AudioTools.h"
#include "AudioCodecs/CodecL8.h"

int16_t sample_rate_mic=16000;
int8_t bits_per_sample_mic=16;

int16_t sample_rate_speaker=16000;
int8_t bits_per_sample_speaker=16;

int8_t channels = 1;

I2SStream in;
I2SStream out;

AudioInfo info(16000, 1, 8);
CsvStream<int16_t> csvStream(Serial, 1);
EncodedAudioStream decoder(&csvStream, new DecoderL8()); // decode and write to CSV
EncodedAudioStream encoder(&decoder, new EncoderL8()); // encode and write to the decoder
StreamCopy i2sEncCSV(encoder, in);

void setup(void) {
// Open Serial
Serial.begin(115200);
while(!Serial);
AudioLogger::instance().begin(Serial, AudioLogger::Warning);

// start I2S in
auto config_in = in.defaultConfig(RX_MODE);
config_in.sample_rate = sample_rate_mic;
config_in.bits_per_sample = bits_per_sample_mic;
config_in.channels = 1;
config_in.i2s_format = I2S_STD_FORMAT;
config_in.is_master = true;
config_in.port_no = 0;
config_in.pin_bck = 14;
config_in.pin_ws = 15;
config_in.pin_data = 22;
in.begin(config_in);

// start I2S out
auto config_out = out.defaultConfig(TX_MODE);
config_out.sample_rate = sample_rate_speaker;
config_out.bits_per_sample = bits_per_sample_speaker;
config_out.channels = 1;
config_out.i2s_format = I2S_STD_FORMAT;
config_out.is_master = true;
config_out.port_no = 1;
config_out.pin_bck = 18;
config_out.pin_ws = 19;
config_out.pin_data = 23;
out.begin(config_out);

decoder.begin(info);
encoder.begin(config_in);
}

void loop()
{
i2sEncCSV.copy();
}

pschatzmann May 4, 2023
Maintainer

I guess that's what you get if you look at a 16000Hz signal using the data format "Auto 2600Hz"...
You need to increase the resolution.

aeschimannr May 4, 2023
Author

I ran the test again, forcing the data format to 10000Hz (the plotter can't go higher).

Here is the result:

without codec:

with codec:

pschatzmann May 4, 2023
Maintainer

Hmm, I think this looks pretty good. If you can't go higher with the plotter, you can decrease the frequency of the audio signal to 8000. This is still pretty good for speech.

pschatzmann · 2023-05-04T15:36:21Z

pschatzmann
May 4, 2023
Maintainer

I tried to do some tests, and I can confirm that something seems to be wrong.
I have not figured out yet what it is...

0 replies

pschatzmann · 2023-05-04T18:12:33Z

pschatzmann
May 4, 2023
Maintainer

I just committed a correction. Now both the signed

EncodedAudioStream decoder(&out, new DecoderL8(true)); // decode and write
EncodedAudioStream encoder(&decoder, new EncoderL8(true)); // encode and write

and unsigned

EncodedAudioStream decoder(&out, new DecoderL8(false)); // decode and write
EncodedAudioStream encoder(&decoder, new EncoderL8(false)); // encode and write

should work. Unsigned is default and that was the version which had issues. There was also a problem with the result length of the write() in the codec
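
For readers unfamiliar with the signed/unsigned distinction, here is a minimal sketch of how a 16-bit sample can be mapped to 8 bits in both conventions. This is an illustration only, not the library's actual L8 implementation, and the function names are hypothetical.

```cpp
#include <cstdint>

// Unsigned 8-bit: 0..255, silence sits at 128.
inline uint8_t toUnsigned8(int16_t s) {
  // Shift the range to 0..65535, then keep the top 8 bits.
  return static_cast<uint8_t>((static_cast<int32_t>(s) + 32768) / 256);
}

inline int16_t fromUnsigned8(uint8_t b) {
  return static_cast<int16_t>((static_cast<int32_t>(b) - 128) * 256);
}

// Signed 8-bit: -128..127, silence sits at 0.
inline int8_t toSigned8(int16_t s) {
  return static_cast<int8_t>((static_cast<int32_t>(s) + 32768) / 256 - 128);
}

inline int16_t fromSigned8(int8_t b) {
  return static_cast<int16_t>(static_cast<int32_t>(b) * 256);
}
```

Encoder and decoder have to use the same convention: the byte 0x80 means silence in the unsigned form but full-scale negative in the signed form. The low 8 bits of every sample are discarded either way, which is the price of halving the data.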

6 replies

aqildad-create May 9, 2023

@aeschimannr Please share the code and the buffer values if possible.

aeschimannr May 9, 2023
Author

sure here it is:
#include "AudioTools.h"
#include "freertos-all.h"
#include "AudioLibs/Communication.h"
#include "AudioCodecs/CodecSBC.h"

/* Configure both esp32 with different MAC address
1 --> Peripheral
2 --> Central
*/
#define MENUCHOICE 2

#if MENUCHOICE == 1
const char *peers[] = {"A8:48:FA:0B:93:01"};
#elif MENUCHOICE == 2
const char *peers[] = {"A8:48:FA:0B:93:02"};
#endif

#define CHANNELS 1

#define BITS_PER_SAMPLE 16
#define SAMPLE_RATE 16000

I2SStream i2sOut; // going to the speaker
I2SStream i2sIn; // coming from the mic

TaskHandle_t TaskProcessOut;
TaskHandle_t TaskProcessIn;

ESPNowStream now; // esp now stream

// audio decode/encode stream
EncodedAudioStream encoder(&now, new SBCEncoder()); // encode and write to esp-now

EncodedAudioStream decoder(&i2sOut, new SBCDecoder(256)); // decode and write to I2S - esp-now is limited to 256 bytes
// send i2s from mic to the multi output
StreamCopy i2sInNow(encoder, i2sIn);
StreamCopy nowI2sOut(decoder, now);

void task_processOut(void * parameter)
{
for(;;)
{ // infinite loop
i2sInNow.copy();
}
}

void task_processIn(void *parameter)
{
for(;;)
{
nowI2sOut.copy();
}

}

void setup()
{
// Serial.begin(115200);
// AudioLogger::instance().begin(Serial, AudioLogger::Warning); // Debug, Info, Warning

// setup esp-now
auto config_esp = now.defaultConfig();

// setup i2s send to speaker
auto config_out = i2sOut.defaultConfig(TX_MODE);
config_out.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
config_out.is_master = true;
config_out.port_no = 1;
config_out.pin_bck = 18;
config_out.pin_ws = 19;
config_out.pin_data = 23;
config_out.bits_per_sample = BITS_PER_SAMPLE;
config_out.channels = CHANNELS;
config_out.sample_rate = SAMPLE_RATE;
i2sOut.begin(config_out);

// setup i2s mic input for transmit to esp-now
auto config_in = i2sIn.defaultConfig(RX_MODE);
config_in.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
config_in.is_master = true;
config_in.port_no = 0;
config_in.pin_bck = 14;
config_in.pin_ws = 15;
config_in.pin_data = 22;
config_in.bits_per_sample = BITS_PER_SAMPLE;
config_in.channels = CHANNELS;
config_in.sample_rate = SAMPLE_RATE;
i2sIn.begin(config_in);

xTaskCreatePinnedToCore(
task_processOut, // Function that should be called
"process Out", // Name of the task (for debugging)
10000, /* Stack size of task */
NULL, /* parameter of the task */
1, /* priority of the task */
&TaskProcessOut, /* Task handle to keep track of created task */
1 /* pin task to core 1 */
);

xTaskCreatePinnedToCore(
task_processIn, // Function that should be called
"udp In", // Name of the task (for debugging)
10000, /* Stack size of task */
NULL, /* parameter of the task */
1, /* priority of the task */
&TaskProcessIn, /* Task handle to keep track of created task */
0 /* pin task to core 0 */
);

// start encoding from sound converted
encoder.begin(config_in);

// start decode
decoder.begin(config_in);

#if MENUCHOICE == 1
// setup esp-now
config_esp.mac_address = "A8:48:FA:0B:93:02";
now.begin(config_esp);
now.addPeers(peers);
#endif

#if MENUCHOICE == 2
// setup esp-now
config_esp.mac_address = "A8:48:FA:0B:93:01";
now.begin(config_esp);
now.addPeers(peers);
#endif
}

void loop()
{
vTaskDelay(1000);
}

aqildad-create May 9, 2023

Can you tell me how your audio quality is, please? I am not getting good quality at a sample rate of 16000.
At 44100 the audio quality is good, but I get buffering issues.

How is your audio quality, and are you satisfied with it?

aeschimannr May 9, 2023
Author

Have you disabled all printf calls (Serial.print)?

aqildad-create May 9, 2023

Yes, as far as I remember all printf calls were disabled.

Over what distance did you test full duplex? Does the audio quality decrease with the distance between the two ESPs?

Does your ESP have an external antenna or not?
