Convert G722 stream into raw pcm s16le stream #1915

msehnout · 2025-06-10T06:54:11Z

Hello,

I'm building a telco app and need to convert the audio codecs used in telephony to raw PCM (s16le) so that I can run voice activity detection, speech-to-text, and similar processing on the audio. I managed to build an A-law to PCM converter with PyAV (libav) like this:

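# working solution
    from collections.abc import Iterator

    import av

    def convert_audio_chunk_alaw_to_pcm16(chunk: bytes | None) -> Iterator[bytes]:
        # A fresh decoder and resampler are created on every call.
        decoder: av.AudioCodecContext = av.codec.CodecContext.create("pcm_alaw", "r")  # pyright: ignore
        decoder.sample_rate = 8000
        decoder.layout = "mono"
        # Upsample 8 kHz mono to 16 kHz mono, packed signed 16-bit.
        resampler = av.AudioResampler(format="s16", layout="mono", rate=16000)

        # Passing None instead of a packet flushes the decoder.
        if chunk:
            packet = av.packet.Packet(chunk)  # pyright: ignore
        else:
            packet = None

        for frame in decoder.decode(packet):
            resampled: list[av.AudioFrame] = resampler.resample(frame)

            for sample in resampled:
                # Mono s16 is packed, so all samples live in the first plane.
                yield bytes(sample.planes[0])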

I'm not sure why, but I had to strip a few bytes off the end of each packet:

# working solution
return bytes(output_buffer[:-128]) if output_buffer else None

But it works: the output audio is clear, and VAD/STT run without any issues.
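
One possible explanation for the extra bytes (an assumption, not something confirmed in this thread) is that the plane buffer is padded for alignment, so `bytes(sample.planes[0])` can be longer than the samples the frame actually holds. If that is the cause, trimming to the sample count would be more robust than a fixed `[:-128]`. A minimal sketch, where `trim_to_samples` is a hypothetical helper name:

```python
def trim_to_samples(raw: bytes, samples: int, channels: int = 1,
                    bytes_per_sample: int = 2) -> bytes:
    """Keep only the bytes that hold real samples and drop trailing padding."""
    expected = samples * channels * bytes_per_sample
    if len(raw) < expected:
        raise ValueError(f"buffer has {len(raw)} bytes, expected at least {expected}")
    return raw[:expected]


# Stand-in data: 320 mono s16 samples (640 bytes) followed by 128 bytes of
# made-up padding, mimicking an over-long plane buffer.
padded = bytes(640) + b"\xff" * 128
pcm = trim_to_samples(padded, samples=320)
```

In the converter above this would replace the `yield` with `yield trim_to_samples(bytes(sample.planes[0]), sample.samples)`, since each resampled frame reports its own sample count.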

Now I'm trying to build the same thing for the G722 codec, but I haven't been able to get the code to work. My first attempt was simply replacing the decoder in the code above, but that did not work. So I tried using an LLM to generate the code for me, but that also resulted in a non-working solution:

# non-working solution
    container = av.open(io.BytesIO(chunk), format="g722")
    stream = container.streams.audio[0]
    resampler = AudioResampler(format='s16', layout='mono', rate=16000)

    for packet in container.demux(stream):
        for frame in packet.decode():
            resampled_frames = resampler.resample(frame)
            for resampled in resampled_frames:
                pcm = resampled.to_ndarray().astype(np.int16).tobytes()
                yield pcm

Any ideas how to get this to work?
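
Two things may be worth checking here, both stated as assumptions rather than a confirmed diagnosis. First, G.722 audio is sampled at 16 kHz even though RTP advertises an 8000 Hz clock rate for it, so a decoder left at `sample_rate = 8000` would be misconfigured. Second, G.722 is an adaptive codec, so unlike A-law the decoder carries state from one chunk to the next; creating a new decoder (or opening a new container) per chunk resets that state. The sketch below keeps a single decoder alive for the whole stream. `G722ToPcm16` and `expected_pcm_len` are hypothetical names, and the code assumes the decoder emits packed mono s16 frames:

```python
from collections.abc import Iterator


def expected_pcm_len(g722_len: int) -> int:
    """PCM s16le byte count for `g722_len` bytes of 64 kbit/s G.722.

    Each G.722 byte encodes two 16 kHz samples, and each s16 sample is 2 bytes.
    """
    return g722_len * 2 * 2


class G722ToPcm16:
    """Hypothetical helper: one decoder instance reused for every chunk."""

    def __init__(self) -> None:
        import av  # PyAV, imported lazily so the size helper works without it

        self._av = av
        self._decoder = av.codec.CodecContext.create("g722", "r")
        self._decoder.sample_rate = 16000  # assumption: 16 kHz, not the RTP clock rate
        self._decoder.layout = "mono"

    def feed(self, chunk: bytes) -> Iterator[bytes]:
        packet = self._av.packet.Packet(chunk)
        for frame in self._decoder.decode(packet):
            # Assumes packed mono s16 output; slice to the real sample count
            # in case the plane buffer carries padding.
            yield bytes(frame.planes[0])[: frame.samples * 2]
```

Because the output would already be 16 kHz mono s16 under these assumptions, no resampler is needed. `expected_pcm_len` gives a quick sanity check: a 20 ms chunk of 160 G.722 bytes should produce 640 bytes of PCM.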

I also tried running ffmpeg as a subprocess, sending audio to STDIN and reading it back from STDOUT, like this:

ffmpeg -f g722 -ac 1 -i - -f s16le -ar 16000 -ac 1 -

But that introduced a delay, and I haven't been able to figure out where it comes from.
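
Delay in a piped setup like this can come from ffmpeg probing and buffering its input, from ffmpeg buffering its output, or from buffering on the Python side of the pipes. The sketch below keeps the command above and adds options that are commonly suggested for reducing latency. Whether they remove the delay in this particular case is an assumption that would need measuring; `build_ffmpeg_cmd` and `start_ffmpeg` are hypothetical helper names.

```python
import subprocess


def build_ffmpeg_cmd() -> list[str]:
    return [
        "ffmpeg",
        "-hide_banner", "-loglevel", "error",
        # Input side: keep probing and input buffering to a minimum.
        "-fflags", "nobuffer",
        "-probesize", "32",
        "-analyzeduration", "0",
        "-f", "g722", "-ac", "1", "-i", "-",
        # Output side: same target format as the original command,
        # with packets flushed as soon as they are written.
        "-f", "s16le", "-ar", "16000", "-ac", "1",
        "-flush_packets", "1",
        "-",
    ]


def start_ffmpeg() -> subprocess.Popen:
    # bufsize=0 makes the Python side of both pipes unbuffered.
    return subprocess.Popen(
        build_ffmpeg_cmd(),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,
    )
```

Reading from `stdout` blocks until data arrives, so writes and reads usually need to happen on separate threads (or through an async loop) to avoid stalling the pipe.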

Thanks for any ideas 🙏
