speech communication detection #822
-
Hello,

First of all, I want to thank you for your work! It's really incredible! I come from the world of STM MCUs and I'm not used to seeing libraries of such quality =)

I'm working on a small project: a portable device that lets a group of people communicate. I am successfully using your implementation of ESP-NOW and the SBC codec, with an I2S microphone as the source and an I2S amplifier for the output. I think that right now the I2S stream continuously pushes data to ESP-NOW and overloads it. I would like a way to detect the presence of human speech, so that data is pushed to ESP-NOW only when needed. I have searched a lot but did not find anything on the subject. Would anyone be able to direct me to a possible solution?

Here is my current implementation of my half-duplex communication:

#include "AudioTools.h"
#include "AudioLibs/Communication.h"
#include "AudioCodecs/CodecSBC.h"
/*
1 --> Emitter
2 --> Receiver
*/
#define MENUCHOICE 2
#if MENUCHOICE == 1
const char *peers[] = {"A8:48:FA:0B:93:01"};
int menuChoice = 1;
#elif MENUCHOICE == 2
const char *peers[] = {"A8:48:FA:0B:93:02"};
int menuChoice = 2;
#endif
#define CHANNELS 1
#define BITS_PER_SAMPLE 16
#define SAMPLE_RATE 16000
uint8_t switchMode = 1;
const int32_t max_buffer_len = 512;
int32_t buffer[max_buffer_len][2];
I2SStream i2sOut; // access I2S as stream
I2SStream i2sIn;
ESPNowStream now;
// audio decode/encode stream
EncodedAudioStream encoder(&now, new SBCEncoder()); // encode and write to esp-now
EncodedAudioStream decoder(&i2sOut, new SBCDecoder(256)); // decode and write to I2S - esp-now is limited to 256 bytes
// esp send/receive stream implementation
StreamCopy espSend(encoder, i2sIn); // encode sound and send it to esp-now
// Variable to store if sending data was successful
String success;
// Callback when data is received
void OnDataRecv(const uint8_t * mac, const uint8_t *incomingData, int len)
{
// write audio data to the decoder
decoder.write(incomingData, len);
}
void setup()
{
Serial.begin(115200);
AudioLogger::instance().begin(Serial, AudioLogger::Warning); // Debug, Info
// setup esp-now
auto config_esp = now.defaultConfig();
config_esp.recveive_cb = OnDataRecv;
// setup i2s send to speaker
auto config_out = i2sOut.defaultConfig(TX_MODE);
config_out.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
config_out.is_master = true;
config_out.port_no = 1;
config_out.pin_bck = 18;
config_out.pin_ws = 19;
config_out.pin_data = 23;
config_out.bits_per_sample = BITS_PER_SAMPLE;
config_out.channels = CHANNELS;
config_out.sample_rate = SAMPLE_RATE;
i2sOut.begin(config_out);
// setup i2s mic input for transmit to esp-now
auto config_in = i2sIn.defaultConfig(RX_MODE);
config_in.i2s_format = I2S_STD_FORMAT; // or try with I2S_LSB_FORMAT | I2S_STD_FORMAT
config_in.is_master = true;
config_in.port_no = 0;
config_in.pin_bck = 14;
config_in.pin_ws = 15;
config_in.pin_data = 22;
config_in.bits_per_sample = BITS_PER_SAMPLE;
config_in.channels = CHANNELS;
config_in.sample_rate = SAMPLE_RATE;
i2sIn.begin(config_in);
// start encoding from sound converted
encoder.begin(config_in);
// start decode
decoder.begin();
switch(menuChoice)
{
case 1: // Emitter
// setup esp-now
config_esp.mac_address = "A8:48:FA:0B:93:02";
now.begin(config_esp);
now.addPeers(peers);
break;
case 2: // Receiver
// setup esp-now
config_esp.mac_address = "A8:48:FA:0B:93:01";
now.begin(config_esp);
now.addPeers(peers);
break;
}
}
void loop()
{
espSend.copy();
}
Replies: 9 comments 26 replies
-
You can use the VolumeOutput to determine if you have a signal and based on this, switch the forwarding to the ESPNow on or off.
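As a rough illustration of the idea (measure the signal level, then decide whether to forward), here is a minimal sketch in plain C++. It does not use VolumeOutput or any other library class: `SpeechGate`, its threshold and its hangover count are hypothetical names and values chosen for this example, and a simple level gate like this reacts to any loud sound, not only to speech.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper, NOT part of arduino-audio-tools.
// Decides per block of 16-bit samples whether audio should be forwarded.
class SpeechGate {
 public:
  // threshold: mean absolute sample value that counts as "signal present"
  // hangoverBlocks: number of quiet blocks still forwarded after the last
  //                 loud block, so that word endings are not cut off
  SpeechGate(int32_t threshold, int hangoverBlocks)
      : threshold_(threshold), hangoverBlocks_(hangoverBlocks) {}

  // Returns true if this block should be forwarded.
  bool update(const int16_t* samples, size_t count) {
    int64_t sum = 0;
    for (size_t i = 0; i < count; ++i) {
      sum += std::abs(static_cast<int32_t>(samples[i]));
    }
    const int32_t level =
        count > 0 ? static_cast<int32_t>(sum / static_cast<int64_t>(count)) : 0;

    if (level >= threshold_) {
      remaining_ = hangoverBlocks_;  // loud block: re-arm the hangover
      return true;
    }
    if (remaining_ > 0) {            // quiet block inside the hangover window
      --remaining_;
      return true;
    }
    return false;                    // quiet for long enough: stop forwarding
  }

 private:
  int32_t threshold_;
  int hangoverBlocks_;
  int remaining_ = 0;
};
```

In a sketch like the one in the question, the loop would read a block from the microphone stream, call `update()` on it, and write the block to the encoder only when the result is true. The threshold has to be tuned by ear for the microphone in use.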
To figure out whether you have a capacity problem, I suggest switching to the L8 codec and reducing the sample rate, e.g. to 8000. This should produce less data, and you are more flexible to play with the sample rate. Using 8 bits might even be good enough for the final solution if it is just for voice. I have never tested any duplex processing, but I would try it with two separate StreamCopy objects: one for reading from ESPNow and one for writing.
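To put numbers on the capacity question, the following sketch computes the uncompressed data rate for the two configurations mentioned in this thread. The helper names are made up for this example, and the 250-byte payload per ESP-NOW packet is an assumption (the sketch in the question mentions 256). The figures are for raw PCM or L8 data; a compressing codec such as SBC sends less.

```cpp
#include <cstdint>

// Hypothetical helpers for a back-of-the-envelope estimate.

// Uncompressed audio data rate in bytes per second.
constexpr uint32_t pcmBytesPerSecond(uint32_t sampleRate,
                                     uint32_t bitsPerSample,
                                     uint32_t channels) {
  return sampleRate * channels * bitsPerSample / 8;
}

// Packets needed per second for a given payload size (rounded up).
constexpr uint32_t packetsPerSecond(uint32_t bytesPerSecond,
                                    uint32_t payloadBytes) {
  return (bytesPerSecond + payloadBytes - 1) / payloadBytes;
}

// Assumption: usable payload per ESP-NOW packet.
constexpr uint32_t kAssumedPayloadBytes = 250;

// 16 kHz, 16 bit, mono (the settings in the question): 32000 bytes/s
constexpr uint32_t kRate16k16 = pcmBytesPerSecond(16000, 16, 1);
// 8 kHz, 8 bit, mono (the L8 suggestion): 8000 bytes/s
constexpr uint32_t kRate8k8 = pcmBytesPerSecond(8000, 8, 1);
```

Under these assumptions the 8 kHz / 8 bit setting needs a quarter of the bandwidth: 32 packets per second instead of 128.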
-
Oops, my mistake: after testing, I thought that it might make sense to support the begin() and end() methods as well.
-
Thanks for the quick fix. It compiles now and I have full-duplex audio, but the quality is pretty bad.
-
https://pschatzmann.github.io/arduino-audio-tools/classaudio__tools_1_1_decoder_l8.html
-
Here is my test case, which was producing perfect sound: https://github.com/pschatzmann/arduino-audio-tools/tree/main/examples/tests/codecs/test-codec-l8
Maybe you have a problem with your mic! Did you test it without the codec?
-
In the code presented above, I had forgotten to call dec.begin(config_in). I tested your case and it produces a clear sound for me as well. I also tried my mic without the codec and the sound is clean, so the problem doesn't seem to come from there. I captured one sound from the mic without the codec and another with the codec to CSV. They are clearly different, and the resolution of the decoded one leaves room for improvement.
-
Yes: config_in.bits_per_sample = 16;
-
I ran some tests, and I can confirm that something seems to be wrong.
-
I just committed a correction. Now both the signed

EncodedAudioStream decoder(&out, new DecoderL8(true)); // decode and write
EncodedAudioStream encoder(&decoder, new EncoderL8(true)); // encode and write

and unsigned

EncodedAudioStream decoder(&out, new DecoderL8(false)); // decode and write
EncodedAudioStream encoder(&decoder, new EncoderL8(false)); // encode and write

should work. Unsigned is default and that was the version which had issues. There was also a problem with the result length of the write() in the codec.
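For readers wondering what the signed/unsigned flag changes, here is a minimal sketch of 8-bit linear conversion in both conventions. This is not the library's implementation: the function names are hypothetical, and the scaling (keep the top 8 bits of each 16-bit sample) is an assumption made for illustration.

```cpp
#include <cstdint>

// Hypothetical helpers, NOT the arduino-audio-tools code.

// Signed 8-bit: keep the top 8 bits, silence is encoded as 0.
inline int8_t encodeL8Signed(int16_t s) {
  return static_cast<int8_t>(s >> 8);
}
inline int16_t decodeL8Signed(int8_t v) {
  return static_cast<int16_t>(v * 256);
}

// Unsigned 8-bit: same value shifted by 128, silence is encoded as 128.
inline uint8_t encodeL8Unsigned(int16_t s) {
  return static_cast<uint8_t>((s >> 8) + 128);
}
inline int16_t decodeL8Unsigned(uint8_t v) {
  return static_cast<int16_t>((static_cast<int>(v) - 128) * 256);
}
```

Both ends have to agree on the convention. If a byte written as unsigned is read back as signed, silence (128) decodes to the most negative sample value, which is heard as heavy distortion.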
I just added a new class OnOffOutput, so you can use the following output chain: