At the Ignite conference, Microsoft announced a service for creating photorealistic avatars of people whose lips are animated to match a given text. It also showed a tool for cloning a voice from an audio sample.
Here's What We Know
The new Azure AI Speech text to speech avatar service allows you to upload a photo of a person and compose a script. The service then generates a video of the avatar speaking that script.
The digital doppelgangers can speak several languages. They can also use artificial intelligence models such as OpenAI's GPT-3.5 to answer customer questions that fall outside the script.
Another feature, Personal voice, can recreate a user's voice in seconds. It requires a one-minute audio recording.
The company suggests using Personal voice to create personalised voice assistants, dub content into different languages and produce custom narration for stories, audiobooks and podcasts.
According to Microsoft, both tools will be available to a limited number of users and only for certain scenarios. In addition, customers must give explicit consent for their voice and image to be used.
This is intended to limit the potential misuse of the technology to create deepfakes without people's knowledge. Microsoft says it is taking a responsible approach to AI ethics.