XPENG integrates Microsoft Text-to-Speech in P7 Smart Sedan
XPENG has upgraded its automotive-grade voice assistant using Microsoft's custom neural voice capability, which is based on Neural Text-to-Speech (TTS), a feature of Azure AI.
XPENG installed the new voice assistant functionality via a major over-the-air (OTA) upgrade for its P7 smart sedan customers in China.
Microsoft research breakthroughs in speech, natural language and machine translation have significantly advanced the fluency, quality, fidelity and naturalness of voice assistant technology over the past several years. These innovations have been integrated into commercially available speech and language capabilities within Azure Cognitive Services and other Microsoft products, so that companies like XPENG can bring richer, more engaging experiences to their customers.
XPENG worked with Microsoft to overcome several key challenges in creating the new voice assistant integration. A moving car experiences telecommunication network jitter, yet the assistant must deliver continuous high-quality speech while limiting data traffic consumption and the burden on in-car hardware. To address this, XPENG introduced context-specific multi-level caches, which cache high-quality sound in advance and distribute it to minimize reliance on the network.
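As a rough illustration of the multi-level caching idea, the following Python sketch places a small in-memory cache in front of a larger local store and falls back to a network synthesis call only on a miss. It is not XPENG's implementation: the class `TieredSpeechCache`, the `fake_synthesize` stand-in for a cloud TTS request, and the two-tier layout are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable, Iterable


class TieredSpeechCache:
    """Hypothetical two-level cache for synthesized speech audio."""

    def __init__(self, synthesize: Callable[[str], bytes], memory_slots: int = 2):
        self._synthesize = synthesize    # assumed network-backed TTS call
        self._memory = OrderedDict()     # level 1: small, fast LRU cache
        self._storage = {}               # level 2: larger local store
        self._memory_slots = memory_slots
        self.network_calls = 0           # counts trips to the network

    def _fetch(self, phrase: str) -> bytes:
        self.network_calls += 1
        return self._synthesize(phrase)

    def preload(self, phrases: Iterable[str]) -> None:
        """Cache audio in advance, e.g. while connectivity is good."""
        for phrase in phrases:
            if phrase not in self._storage:
                self._storage[phrase] = self._fetch(phrase)

    def get(self, phrase: str) -> bytes:
        if phrase in self._memory:               # level 1 hit
            self._memory.move_to_end(phrase)
            return self._memory[phrase]
        audio = self._storage.get(phrase)        # level 2 lookup
        if audio is None:                        # miss: go to the network
            audio = self._fetch(phrase)
            self._storage[phrase] = audio
        self._memory[phrase] = audio             # promote to level 1
        while len(self._memory) > self._memory_slots:
            self._memory.popitem(last=False)     # evict least recently used
        return audio


def fake_synthesize(text: str) -> bytes:
    """Placeholder for a real TTS request; returns dummy bytes."""
    return text.encode("utf-8")


cache = TieredSpeechCache(fake_synthesize)
cache.preload(["Navigation started", "Battery low"])
cache.get("Battery low")        # served locally, no network call
print(cache.network_calls)      # 2 (both calls came from preload)
```

Phrases that are preloaded or requested once are served locally afterwards, so a brief loss of connectivity does not interrupt playback of cached prompts.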
To deliver natural-sounding, high-fidelity speech, XPENG uses Microsoft Azure with caching and compression to support XPENG’s high-quality voice sampling rate of 24 kHz and quantization level of 16 bits, without overburdening the data network or the car’s own CPU. XPENG also worked with Microsoft to minimize ambiguity and optimize accuracy in voice assistant speech.
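To give a sense of why caching and compression matter at this quality level, the short Python calculation below works out the raw data rate of audio sampled at 24 kHz with 16 bits per sample. It assumes a single (mono) channel and uncompressed PCM, neither of which is stated in the announcement; the function name is illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Raw data rate of uncompressed PCM audio, in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


# 24 kHz sampling, 16-bit quantization, mono (the channel count is an assumption)
rate = pcm_bytes_per_second(24_000, 16)
print(rate)             # 48000 bytes per second
print(rate * 8 / 1000)  # 384.0 kilobits per second
```

Under these assumptions, every second of speech amounts to about 48 KB before compression, which is why streaming it uncached over a mobile network would add up quickly.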
As a result, the new voice assistant function has achieved new levels of lifelike voice fidelity, functionality, and scenario-specific applicability. With these new capabilities, XPENG can deploy voice assistance in even more usage scenarios, making voice assistance an integral part of the intuitive driving experience.
“This is a cutting-edge exploration of vehicle voice interaction in the auto industry,” said Hao Chao, a Senior Expert with XPENG Automotive AI Products. “It required months of dedicated work by our team to overcome the challenges, and now delivers a whole new level of natural speech. With a deep understanding of urban mobility, we are finding many more scenarios to leverage AI technology for a high level of driver-machine intuition.”
“With advancements in research and technology, Azure Cognitive Services like vision and speech will play a pivotal role in defining unique in-vehicle experiences,” said Sanjay Ravi, General Manager, Automotive, Mobility, and Transportation Industry at Microsoft. “With speech as a primary interaction tool within the vehicle, Microsoft’s custom neural voice services enable automakers to develop their own differentiated and authentic branded experiences.”
XPENG has already rolled out the new voice assistant technology to P7 customers across China via OTA upgrades. In due course the company plans to introduce future generations of the upgraded voice assistant into other production models.