Add Pause between words of audio file

Calculate the energy of the audio samples using the STFT (short-time Fourier transform), then insert silence between the detected words. This is a quite basic approach, since it relies on energy alone to find word boundaries. To handle this better, use an ML-based ASR (automatic speech recognition) model to locate the words, and then insert the pauses. :)
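
The approach described above could look roughly like the sketch below. It is only an illustration, not this project's actual code: the function names (`frame_energy`, `find_segments`, `add_pauses`), the frame and hop sizes, the 10% energy threshold, and the 0.5 s pause are all assumptions chosen for the example. It uses NumPy only, takes a mono signal as a float array, and treats every run of high-energy frames as one "word".

```python
import numpy as np


def frame_energy(signal, frame_len=512, hop=256):
    """Per-frame energy: sum of squared STFT magnitudes (Hann window)."""
    if len(signal) < frame_len:
        # Pad very short inputs so at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    energy = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        spectrum = np.fft.rfft(frame)  # one-sided spectrum of this frame
        energy[i] = np.sum(np.abs(spectrum) ** 2)
    return energy


def find_segments(energy, threshold_ratio=0.1, frame_len=512, hop=256):
    """Return (start, end) sample ranges for runs of high-energy frames."""
    peak = energy.max()
    if peak == 0:
        return []  # completely silent input
    active = energy > threshold_ratio * peak
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i
        elif not is_active and start is not None:
            segments.append((start * hop, (i - 1) * hop + frame_len))
            start = None
    if start is not None:
        segments.append((start * hop, (len(active) - 1) * hop + frame_len))
    return segments


def add_pauses(signal, sample_rate, pause_sec=0.5):
    """Join the detected segments with a fixed block of silence between them."""
    segments = find_segments(frame_energy(signal))
    if not segments:
        return signal.copy()
    silence = np.zeros(int(pause_sec * sample_rate), dtype=signal.dtype)
    pieces = []
    for k, (start, end) in enumerate(segments):
        if k > 0:
            pieces.append(silence)
        pieces.append(signal[start:end])
    return np.concatenate(pieces)


# Synthetic test signal: two 440 Hz bursts separated by silence.
sr = 8000
t = np.arange(2000) / sr
burst = np.sin(2 * np.pi * 440 * t)
signal = np.concatenate(
    [np.zeros(1000), burst, np.zeros(2000), burst, np.zeros(1000)]
)

segments = find_segments(frame_energy(signal))
out = add_pauses(signal, sr, pause_sec=0.5)
```

Note that this sketch discards the original low-energy audio between segments and replaces it with a fixed-length pause. A fixed threshold relative to the peak energy is fragile on noisy recordings, which is one reason an ASR-based word alignment would be more robust.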