Video Call with ChatGPT
Learn from our proof of concept that allows you to have a video call with ChatGPT. We added an avatar to the Large Language Model (LLM) to enable users to interact via chat on multiple topics.
Among the growing family of large language models (LLMs), OpenAI's ChatGPT is recognized for its advanced text-based interactions. However, many UI clients for LLMs, including ChatGPT, are currently limited to text.
We strive to enhance this experience by incorporating an avatar to humanize LLMs and enabling voice interactions. Our goal is to infuse the LLM with more personality and elevate conversations to a more enjoyable level, even though it already possesses an impressive personality!
What’s New?
- Voice Interactions: Now, you can speak to ChatGPT, and it will respond in a human-like voice.
- Avatar Integration: The avatar adds personality to your interactions, making conversations more natural and engaging.
How It Works
To make this possible, we integrated several technologies:
- Speech-to-Text: Converts your spoken words into text for the LLM to process.
- Text-to-Speech: Transforms the LLM's text responses into audio to respond to you.
- 3D Animation: Synchronizes the avatar’s facial movements with the speech (audio output) for a realistic effect.
This is the outcome!
https://youtu.be/MiiJ8rcU11A
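The flow described under How It Works (speech in, LLM reply, speech out, avatar animation) can be sketched as one conversation turn. This is a minimal sketch, not the PoC's actual code: the `Stages` interface, `runTurn`, and every method name are hypothetical placeholders for whatever wraps the speech, LLM, and rendering services.

```typescript
// Hypothetical types for illustration only; none of these names come from the PoC.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Each stage is injected, so any speech, LLM, or rendering service can sit behind it.
interface Stages {
  transcribe(audio: Uint8Array): Promise<string>;   // speech-to-text
  chat(history: ChatMessage[]): Promise<string>;    // LLM reply for the conversation so far
  synthesize(text: string): Promise<Uint8Array>;    // text-to-speech
  animate(speech: Uint8Array): Promise<void>;       // play audio and drive the avatar
}

interface TurnResult {
  userText: string;
  replyText: string;
  history: ChatMessage[];
}

// Runs one turn: user audio -> text -> LLM reply -> audio -> avatar.
async function runTurn(
  stages: Stages,
  micAudio: Uint8Array,
  history: ChatMessage[],
): Promise<TurnResult> {
  const userText = await stages.transcribe(micAudio);
  const withUser: ChatMessage[] = [...history, { role: "user", content: userText }];

  const replyText = await stages.chat(withUser);
  const speech = await stages.synthesize(replyText);
  await stages.animate(speech);

  // Return a new history array so the caller can pass it into the next turn.
  const updated: ChatMessage[] = [...withUser, { role: "assistant", content: replyText }];
  return { userText, replyText, history: updated };
}
```

Passing the stages in as an object keeps the turn logic independent of any one vendor, and makes it easy to test with fake stages before wiring in real services.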
Tech Stack
In this Proof of Concept (PoC), we used a combination of SaaS models, APIs, and open-source tools:
- ChatGPT: As the core Language Model.
- Azure Speech-to-Text: For converting user audio into text.
- Azure Text-to-Speech: For transforming text responses into audio.
- ThreeJS: For animating the 3D model.
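The post does not say how the avatar's mouth is kept in sync with the audio, so the following is only one possible approach. It assumes the text-to-speech stage can supply timestamped mouth-shape cues (Azure's Speech SDK exposes viseme events, though whether the PoC used them is not stated) and that the 3D model has one morph target per mouth shape. `VisemeCue`, `mouthWeights`, `applyToMesh`, and the 60 ms blend window are hypothetical choices for illustration.

```typescript
// Hypothetical cue: at timeMs into the audio, show the mouth shape stored at morphIndex.
interface VisemeCue {
  timeMs: number;
  morphIndex: number;
}

// Computes per-morph-target weights for the current playback time.
// The active cue fades in over blendMs while the previous cue fades out.
// Cues are assumed to be sorted by timeMs in ascending order.
function mouthWeights(
  cues: VisemeCue[],
  nowMs: number,
  morphCount: number,
  blendMs: number = 60,
): number[] {
  const weights = new Array<number>(morphCount).fill(0);

  // Find the last cue that has already started.
  let current = -1;
  for (let i = 0; i < cues.length; i++) {
    if (cues[i].timeMs <= nowMs) current = i;
    else break;
  }
  if (current < 0) return weights; // audio has not reached the first cue yet

  const cue = cues[current];
  const t = blendMs <= 0 ? 1 : Math.min(1, (nowMs - cue.timeMs) / blendMs);
  weights[cue.morphIndex] += t;
  if (current > 0) weights[cues[current - 1].morphIndex] += 1 - t;
  return weights;
}

// Structural stand-in for a three.js Mesh, which exposes morphTargetInfluences.
interface MorphableMesh {
  morphTargetInfluences?: number[];
}

// Copies the computed weights onto the mesh; intended to be called once per rendered frame.
function applyToMesh(mesh: MorphableMesh, weights: number[]): void {
  const influences = mesh.morphTargetInfluences;
  if (!influences) return;
  for (let i = 0; i < influences.length && i < weights.length; i++) {
    influences[i] = weights[i];
  }
}
```

In a render loop, `nowMs` would come from the audio element's playback position, so the mouth follows the sound even if frames are dropped.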
Looking Ahead
We're excited about future developments. Tools like LangChain, AutoGPT, or BabyAGI can extend the PoC's capabilities, for example:
- Enhanced English Teaching: Adding speech analysis for pronunciation feedback.
- Broader Knowledge Integration: Incorporating Internet and YouTube search capabilities for tailored lesson recommendations.
Explore the full blog article for an in-depth look at our journey, challenges, solutions, and future plans: https://www.codelink.io/blog/post/video-call-with-chat-gpt
Stay tuned for more updates!