Chapter 25:

VOICE CHAT WITH CHATGPT

Setting Up Voice Chat on Raspberry Pi with ChatGPT and Wake Word "Porcupine"

This guide will help you set up a Raspberry Pi-based voice chat system using ChatGPT, with the wake word "Porcupine" for activation.

Requirements

  • Raspberry Pi with Raspbian OS

  • USB microphone and audio output device (speakers/headphones)

  • Internet connection

  • Basic Python programming knowledge

Step 1: Configure Audio Devices

  1. List Audio Devices:

    • Identify your audio output and input devices:

      aplay -l # Lists output devices arecord -l # Lists input devices

    • Note the card number and device number for each.

  2. Create .asoundrc Configuration:

    • Edit the .asoundrc file in your home directory:

      sudo nano /home/pi/.asoundrc

    • Include your noted card and device numbers in this configuration:

      pcm.!default {

      type asym

      capture.pcm "mic"

      playback.pcm "speaker"

      }

      pcm.mic {

      type plug

      slave {

      pcm "hw:2,0"

      rate 48000

      }

      }

      pcm.speaker {

      type plug

      slave {

      pcm "hw:1,0"

      }

      }

  3. Test Audio Setup:

    • Test speaker output:

      speaker-test -t wav

    • Test microphone recording:

      arecord --format=S16_LE --duration=5 --rate=16000 --file-type=raw out.raw

      aplay --format=S16_LE --rate=16000 out.raw

Step 2: Install Dependencies for Voice Chatbot

  1. Install Necessary Packages:

    • Run the following commands to install required packages:

      sudo apt install espeak espeak-ng flac

      pip3 install soundfile pyttsx3 sounddevice scipy openai

      CODE:

      import time

      import os

      import openai

      import sounddevice as sd

      import soundfile as sf

      import pyttsx3

      import speech_recognition as sr

      from scipy.io import wavfile

      # Set up OpenAI API credentials

      openai.api_key = "YOUR_OPENAI_API_KEY"

      # Define function to interact with ChatGPT and return its response

      def ask_chatbot(prompt):

      # Set up parameters for the API request

      model_engine = "text-davinci-002"

      prompt = f"{prompt}\nChatbot:"

      max_tokens = 1024

      temperature = 0.7

      # Send the prompt to the API and wait for the response

      response = openai.Completion.create(

      engine=model_engine,

      prompt=prompt,

      max_tokens=max_tokens,

      temperature=temperature,

      )

      time.sleep(1) # Wait a bit to avoid rate limiting

      return response.choices[0].text.strip()

      # Define function to convert text to speech

      def speak_response(response):

      engine = pyttsx3.init()

      engine.setProperty("rate", 150) # You can adjust the speech rate (words pe$

      engine.say(response)

      engine.runAndWait()

      # Define function to record audio from USB microphone

      def record_audio(file_name):

      sample_rate = 16000

      duration = 5 # Set the duration for recording (in seconds)

      print("Recording audio...")

      audio = sd.rec(int(duration * sample_rate), samplerate=sample_rate, channels=1, dtype=”float32”)

      sd.wait() # Wait until recording is finished

      print("Finished recording.")

      # Save the recorded audio to a file

      sf.write(file_name, audio, sample_rate)

      # Define function to transcribe audio to text

      def transcribe_audio(file_name):

      r = sr.Recognizer()

      with sr.AudioFile(file_name) as source:

      audio_data = r.record(source)

      text = r.recognize_google(audio_data)

      return text

      # Start voice chat with the Chatbot

      while True:

      record_audio("user_input.wav")

      user_input_text = transcribe_audio("user_input.wav")

      if user_input_text.lower() in ["exit", "quit", "bye"]:

      break

      bot_response = ask_chatbot(user_input_text)

      print("Chatbot:", bot_response)

      speak_response(bot_response)

      Change "YOUR OPENAI API KEY" to your actual API key.

  2. Install Porcupine for Wake Word Detection:

Step 3: Set Up ChatGPT and Voice Functions

  1. Set Up OpenAI API Credentials:

    • Replace "YOUR_API_KEY" in the script with your OpenAI API key.

  2. Implement Voice Chat Functions:

    • Use the provided Python script to define functions for:

      • Interacting with ChatGPT (ask_chatbot).

      • Converting text to speech (speak_response).

      • Recording audio from the USB microphone (record_audio).

      • Transcribing audio to text (transcribe_audio).

Step 4: Implement the Main Voice Chat Loop

  1. Set Up Porcupine Wake Word Detection:

    • Use the Porcupine instance to listen for the wake word "Porcupine".

  2. Voice Chat Process:

    • When "Porcupine" is detected, record the user's speech.

    • Transcribe the speech to text and send it to ChatGPT.

    • Use the ChatGPT response and convert it to speech.

  3. Start the Voice Chat:

    • The script runs in a loop, waiting for the wake word to initiate interaction.

Finalizing

  • Test the entire setup by saying "Porcupine" followed by your message.

  • Ensure your Raspberry Pi is connected to the internet for API access.

    CODE:

    Voice Chat with hot word and custom duration of question:

    import time

    import struct

    import openai

    import sounddevice as sd

    import soundfile as sf

    import pyaudio

    import pyttsx3

    import pvporcupine

    import speech_recognition as sr

    from pydub import AudioSegment

    from pydub.silence import split_on_silence

    import RPi.GPIO as GPIO

    GPIO.setmode(GPIO.BOARD)

    GPIO.setwarnings(False)

    led = 16

    GPIO.setup(led, GPIO.OUT)

    GPIO.output(led, GPIO.LOW)

    porcupine = None

    pa = None

    audio_stream = None

    # Set up OpenAI API credentials

    openai.api_key = "YOUR_OPENAI_API_KEY"

    # Define function to interact with ChatGPT and return its response

    def ask_chatbot(prompt):

    # Set up parameters for the API request

    model_engine = "text-davinci-002"

    prompt = f"{prompt}\nChatbot:"

    max_tokens = 1024

    temperature = 0.7

    # Send the prompt to the API and wait for the response

    response = openai.Completion.create(

    engine=model_engine,

    prompt=prompt,

    max_tokens=max_tokens,

    temperature=temperature,

    )

    time.sleep(1) # Wait a bit to avoid rate limiting

    return response.choices[0].text.strip()

    # Define function to convert text to speech

    def speak_response(response):

    engine = pyttsx3.init()

    engine.setProperty("rate", 150) # You can adjust the speech rate (words per minute)

    engine.say(response)

    engine.runAndWait()

    # Define function to record audio from USB microphone

    def record_audio(file_name):

    sample_rate = 16000

    duration = 10 # Set the initial duration for recording (in seconds)

    min_silence_length = 1000 # Minimum silence length for end-of-sentence detection (in milliseconds)

    silence_threshold = -40 # Silence threshold for end-of-sentence detection (in dB)

    print("Recording audio...")

    audio = sd.rec(int(duration * sample_rate), samplerate=sample_rate, channels=1, dtype="float32")

    sd.wait() # Wait until recording is finished

    print("Finished recording.")

    # Save the recorded audio to a file

    sf.write(file_name, audio, sample_rate)

    # Load the recorded audio file

    audio_segment = AudioSegment.from_file(file_name, format="wav")

    # Split the audio based on silence (end-of-sentence detection)

    chunks = split_on_silence(

    audio_segment,

    min_silence_len=min_silence_length,

    silence_thresh=silence_threshold

    )

    # Determine the longest chunk (assuming it contains the main question)

    main_chunk = max(chunks, key=len)

    # Export the longest chunk as the final audio file

    main_chunk.export(file_name, format="wav")

    # Define function to transcribe audio to text

    def transcribe_audio(file_name):

    r = sr.Recognizer()

    with sr.AudioFile(file_name) as source:

    audio_data = r.record(source)

    text = r.recognize_google(audio_data)

    return text

    # Start voice chat with the Chatbot

    try:

    porcupine = pvporcupine.create(access_key="e6zujir/A8i0cDt+7q8uUkD3QQORoKM+lqK6v20vYf2lpmjJHn76Ag==", keywords= ["picovoice", "blueberry"])

    pa = pyaudio.PyAudio()

    audio_stream = pa.open(

    rate=porcupine.sample_rate,

    channels=1,

    format=pyaudio.paInt16,

    input=True,

    frames_per_buffer=porcupine.frame_length)

    while True:

    pcm = audio_stream.read(porcupine.frame_length)

    pcm = struct.unpack_from("h" * porcupine.frame_length, pcm)

    keyword_index = porcupine.process(pcm)

    if keyword_index >= 0:

    GPIO.output(led, GPIO.HIGH)

    print("Hotword Detected")

    record_audio("user_input.wav")

    GPIO.output(led, GPIO.LOW)

    user_input_text = transcribe_audio("user_input.wav")

    if user_input_text.lower() in ["exit", "quit", "bye"]:

    break

    bot_response = ask_chatbot(user_input_text)

    print("Chatbot:", bot_response)

    speak_response(bot_response)

    finally:

    if porcupine is not None:

    porcupine.delete()

    if audio_stream is not None:

    audio_stream.close()

    if pa is not None:

    pa.terminate()

    https://github.com/Picovoice/Porcupine#python-demos

    https://picovoice.ai/docs/quick-start/porcupine-python/

Note: This guide requires familiarity with Python and basic Raspberry Pi operations. Always test each component (audio, wake word detection, ChatGPT interaction) individually before running the full system. Adjust the microphone and speaker settings according to your specific hardware.