OpenAI has unveiled a tool for voice cloning

By: Bohdan Kaminskyi | 01.04.2024, 16:33

Mariia Shalabaieva/Unsplash

OpenAI has unveiled Voice Engine, a voice cloning tool that can essentially duplicate someone's speech based on a 15-second audio sample.

Here's What We Know

Voice Engine has been in development since 2022, and OpenAI already uses a version of it to power the preset voices in its existing text-to-speech API.

The technology could be used for reading assistance, language translation and helping people with speech impairments. As an example, OpenAI described a pilot project at Brown University in which Voice Engine was used to create a voice clone for a patient with a speech disorder, based on previously recorded audio.

Despite these benefits, there are concerns that the technology could be misused to create fake audio content. For that reason, OpenAI is not yet ready to release Voice Engine to the public and is focusing on privacy and security concerns first.

The company said it is incorporating feedback from partners in government, media, civil society and other fields to ensure a safe launch of the product. All early testers must adhere to a usage policy that prohibits impersonating another person without consent.

OpenAI is also implementing security measures such as watermarking to trace the origin of audio, proactively monitoring system usage, and creating a "banned voices list" to prevent cloning of known personalities.

Price & When We Can Expect It

Neither a release date nor final pricing has been announced. According to TechCrunch, Voice Engine could cost $15 per million characters, which would make it cheaper than competing services. A more expensive "HD" version is also mentioned.

Source: Engadget
