Sesame releases its base AI model

Sesame, an artificial intelligence company, has released the base model that powers Maya, its strikingly realistic voice assistant.

The model, which is 1 billion parameters in size (“parameters” being the individual components of the model), is available under an Apache 2.0 license, meaning it can be used commercially with few restrictions. Called CSM-1B, the model generates “RVQ audio codes” from text and audio inputs, according to Sesame’s description on the Hugging Face AI development platform.

RVQ stands for “residual vector quantization,” a technique for encoding audio into discrete tokens called codes. RVQ is used in a number of recent AI audio technologies, including Google’s SoundStream and Meta’s EnCodec.
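To make the idea concrete, here is a minimal sketch of residual vector quantization in plain Python. It is not Sesame's code, and it does not reflect how CSM-1B, SoundStream, or EnCodec are implemented. The function names (`nearest`, `rvq_encode`, `rvq_decode`) and the tiny hand-written codebooks are assumptions made for illustration; real systems learn their codebooks from audio data and operate on high-dimensional frames.

```python
def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; each stage encodes what the previous stages missed."""
    residual = list(vec)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next stage only sees the leftover error.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Reconstruct an approximation by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Hypothetical two-stage codebooks: a coarse stage followed by a finer one.
EXAMPLE_CODEBOOKS = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.5, 0.0]],
]

if __name__ == "__main__":
    frame = [1.4, 1.0]
    codes = rvq_encode(frame, EXAMPLE_CODEBOOKS)
    print("codes:", codes)
    print("reconstruction:", rvq_decode(codes, EXAMPLE_CODEBOOKS))
```

Each stage contributes one integer per frame, so a frame is represented by a short list of codes rather than by raw samples. Adding stages refines the reconstruction at the cost of more codes per frame.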

CSM-1B uses a model from Meta’s Llama family as its backbone, paired with an audio “decoder” component. According to Sesame, a fine-tuned variant of CSM powers Maya.

“The model open-sourced here is a base generation model,” Sesame writes in the CSM-1B Hugging Face and GitHub repositories. “It is capable of producing a variety of voices, but it has not been fine-tuned on any specific voice […] The model has some capacity for non-English languages due to data contamination in the training data, but it likely won’t do well.”

It is unclear what data Sesame used to train CSM-1B. The company has not disclosed it.

It is worth noting that the model has no real safeguards to speak of. Sesame relies on an honor system, simply urging developers and users not to use the model to imitate a person’s voice without their consent, create misleading content such as fake news, or engage in “harmful” or “malicious” activities.

I tried the demo on Hugging Face, and cloning my voice took less than a minute. From there, it was easy to generate whatever speech I wanted, including on controversial topics like the election and Russian propaganda.

Recently, Consumer Reports warned that many of the popular AI-based voice cloning tools on the market lack “meaningful” safeguards to prevent fraud or abuse.

Sesame, co-founded by Oculus co-creator Brendan Iribe, went viral in late February for its assistant technology, which comes close to clearing the uncanny valley. Maya and Sesame’s other assistant, Miles, take breaths, speak with natural pauses and hesitations, and can be interrupted mid-conversation, much like OpenAI’s Voice Mode.

Sesame has raised an undisclosed amount of capital from Andreessen Horowitz, Spark Capital, and Matrix Partners. In addition to building voice assistants, the company says it is prototyping AI glasses “designed to be worn all day” that will be equipped with its custom models.
