Twilio media stream over a WebSocket is very staticy when saved to a file

l7wslrjt · asked 6 months ago in Other

I have a Python program that opens a WebSocket for a Twilio phone call and saves the audio from the call to a WAV file on the file system. It works! My program upgrades the connection to a WebSocket, I am able to buffer the audio into a byte array, and I save that array to a WAV file. All of that is fine, but when I try to play back the audio file, it is very staticy and sounds very low quality. I am not sure whether that is the actual quality of the audio streamed over the WebSocket, or whether I am doing something wrong when receiving or saving the audio. The program is included here.

# Program to accept a phone call via Twilio and save what the speaker says to a WAVE file
# Need to have ngrok, FastAPI and Twilio all setup properly

import os
import json
import base64
import wave
from fastapi import FastAPI, WebSocket, Request, WebSocketDisconnect
from fastapi.responses import HTMLResponse, Response
from jinja2 import Template

app = FastAPI()

# Global variables
wsserver = []

# Set the filename for writing the media stream
output_filename = "output.wav"

# Our buffer where we will queue up all the streamed audio
pcmu_data = bytearray()

# Default values, but will be overridden when we get our "start" message
channels = 1  # Mono
sample_width = 1  # 8-bit samples for PCMU
frame_rate = 8000  # Typical frame rate for PCMU

@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    global wsserver
    await websocket.accept()
    wsserver.append(websocket)
    while True:
        message = await websocket.receive_text()
        await on_message(websocket, message)

async def on_message(websocket, message):
    global wsserver
    global frame_rate
    global channels
    global pcmu_data
    global sample_width

    try:
        msg = json.loads(message)
        event = msg.get("event")

        if event == "connected":
            print("A new call has connected.")

        elif event == "start":
            print(f"Starting Media Stream {msg.get('streamSid')}")
            print(msg)

            # Override our default values with what our start message tells us
            channels = msg['start']['mediaFormat']['channels']
            frame_rate = msg['start']['mediaFormat']['sampleRate']

        # The event that carries our audio stream
        elif event == "media":
            payload = msg['media']['payload']
        
            if payload:
                # Decode base64-encoded media data
                media_bytes = base64.b64decode(payload)

                # Add the data onto the end of our byte array
                pcmu_data.extend(media_bytes)

        elif event == "stop":
            print("Call Has Ended")

            # How long was our mulaw stream
            frames = len(pcmu_data)

            # Write the buffered mu-law bytes out as a WAV file
            with wave.open(output_filename, 'w') as wav_file:
                wav_file.setnchannels(channels)
                wav_file.setsampwidth(sample_width)
                wav_file.setframerate(frame_rate)
                wav_file.setnframes(frames)
                wav_file.writeframes(pcmu_data)

    except WebSocketDisconnect as e:
        print("WebSocket disconnectedL {e}")
        wsserver.remove(websocket)

@app.post("/")
async def post(request: Request):
    host = request.client.host
    print("Post call - host=" + host)
    xml = Template("""
    <Response>
        <Start>
            <Stream url="wss://83b4-73-70-107-57.ngrok-free.app/ws"/>
        </Start>
        <Say>Please state your message</Say>
        <Pause length="60" />
    </Response>
    """).render(host=host)
    return Response(content=xml, media_type="text/xml")

if __name__ == "__main__":
    print("Listening at Port 8080")
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

This works successfully, but the audio stream saved to the file system is very staticy.

pb3skfrl

I switched from the wave module to the pywav library, wrote the code as follows, and the result is very clear:

import pywav

data_bytes = bytes(pcmu_data)  # pcmu_data is the bytearray of raw mu-law bytes buffered above
wave_write = pywav.WavWrite(output_filename, 1, 8000, 8, 7)  # 1 stands for mono channel, 8000 sample rate, 8 bit, 7 stands for MULAW encoding
wave_write.write(data_bytes)
wave_write.close()
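
Why this fixes it: Twilio's media stream payload is 8 kHz, 8-bit mu-law (audio/x-mulaw), but the standard wave module only writes plain linear-PCM WAV headers, so players were decoding the mu-law bytes as if they were 8-bit PCM, which comes out as static. pywav writes a header whose format code (7) declares mu-law, so the same bytes play cleanly. If you would rather avoid the extra dependency, below is a minimal sketch of the same idea using only the standard library; it assumes Python 3.12 or earlier (audioop is deprecated since 3.11 and removed in 3.13), and the output filename is only illustrative.

# Alternative sketch: decode the mu-law bytes to 16-bit linear PCM,
# after which the wave module's plain PCM header is actually correct.
import wave
import audioop

# pcmu_data is the bytearray of raw mu-law bytes buffered in the handler above
pcm16_data = audioop.ulaw2lin(bytes(pcmu_data), 2)  # 2 = 16-bit output samples

with wave.open("output_pcm16.wav", "wb") as wav_file:
    wav_file.setnchannels(1)      # mono
    wav_file.setsampwidth(2)      # 16-bit linear PCM
    wav_file.setframerate(8000)   # Twilio media streams are 8 kHz
    wav_file.writeframes(pcm16_data)

Either way the file plays cleanly: pywav keeps the 8-bit mu-law data as-is (smaller file), while the conversion above doubles the size but needs no third-party package.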

