This first week of the New Year 2017, we’d like to extend our very best wishes and share our excitement as we count down the hours leading up to the 50th CES® conference – opening in Las Vegas on January 5th, 2017.
As we announced at the end of last year, we’ll be showcasing rSpeak at CES® and are looking forward to some exciting conversations about how speech-enabling content with lifelike TTS voices makes multimedia applications, embedded or mobile devices and platforms more engaging and relatable. If you’re planning to be in Las Vegas, let’s meet at the show! You’ll find us in the Holland Startup Pavilion at Booth 51431 in Eureka Park, Sands Level 1, Hall G.
Before we set off for one of the most influential tech events on the planet, we’d like to take you on a behind-the-scenes look at how best-of-breed rSpeak voices are created. The origins of this family of cutting-edge text-to-speech voices are easy to trace, but there is quite a lot of work involved!
To produce an outstanding text-to-speech voice, the voice development team at rSpeak Technologies carefully selects professional voice talents for each TTS language it is developing. Each stage of the process is extremely thorough to guarantee top quality. The initial selection involves a meticulous assessment of the artistic, commercial and technical characteristics of a speaker’s voice – and the monitoring of a comprehensive set of criteria for each of these. The voice talents rSpeak Technologies works with are typically actors or trained professional speakers.
Individual performance parameters for each voice candidate are measured and compared, and a shortlist is drawn up. Once a voice actor has been chosen, he or she works alongside rSpeak staff over a period of several weeks. During these sessions, the voice talent records a demanding script that captures a multitude of characteristics needed to cover the sound patterns of the language in fine detail. The recording process is closely monitored to detect the slightest deviation in pronunciation, accentuation, or style.
The outcome of this process is a massive collection of voice recordings, which have been constantly monitored for consistency. If even the slightest glitch is detected, it’s back to the recording studio. That’s the first phase for you, in a nutshell. Stay tuned for the rest of the story…