Kyutai Labs Launches Moshi AI Chatbot with Real-Time Voice Features to Rival GPT-4

July 11, 2024

Kyutai Labs has introduced Moshi, an AI chatbot that responds verbally in real time. The French company says that Moshi’s audio language model was built entirely in-house. The chatbot can modulate its voice to convey emotion and adapt to different speaking styles. Moshi is publicly accessible at no cost, with conversations currently limited to five minutes. Meanwhile, OpenAI has announced similar speech features for GPT-4o, but they have not yet been released.

According to the company, a team of eight developers built the AI model in just six months. During the launch event in Paris, Kyutai Labs emphasized that Moshi is not merely an AI assistant but a versatile prototype for building other tools. To try the chatbot, users enter their email address to join a queue.

The platform’s interface is minimalist. A volume meter shows how loud the user’s voice is while speaking. Responses from the AI appear in a dedicated text box, and a separate box at the top displays technical details such as audio duration, latency, and any missed audio segments.

A button at the top allows users to disconnect the call, which currently has a maximum duration of five minutes. The description page emphasizes that Moshi can think, speak, and listen simultaneously, which improves the flow of conversation. Latency is usually low, with the AI often responding almost instantly. However, responses occasionally take more than 10 to 15 seconds, likely due to heavy server load. At times, verbal prompts are not registered even when the volume meter is nearly full.

The AI model responds with an expressive voice, using various styles and modulations. It is also connected to the Internet, enabling it to fetch information for web-based queries. Notably, the chatbot supports voice interaction only, with no option for text input. Kyutai Labs has announced plans to open-source the AI model. However, the model weights and code have not yet been published. Once they are released, users will be able to download, install, and run the AI locally on an offline device.

