Meta Introduces AudioCraft - Generative AI Tool for Music

Published: August 03, 2023

Meta has launched a new generative AI tool that lets users create high-quality audio and music from text prompts. Released on Wednesday, AudioCraft is the tech giant's latest effort to incorporate AI into its services.

"With AudioCraft, we simplify the overall design of generative models for audio compared to prior work in the field — giving people the full recipe to play with the existing models that Meta has been developing over the past several years while also empowering them to push the limits and develop their own models," the company wrote in a blog post.

The new tool consists of three different models, each with a corresponding use.

MusicGen generates music from text prompts and was trained on Meta-owned and specifically licensed music. The second model, AudioGen, was trained on public sound effects and generates audio from text prompts.

Though similar to MusicGen, AudioGen serves a different purpose: it produces sounds rather than music. In examples provided by Meta, the prompt "lo-fi song with organic samples, saxophone solo" was given to MusicGen to produce music, while "sirens and a humming engine approach and pass" was given to AudioGen to produce a sound effect.

"We’re also releasing our pre-trained AudioGen models, which let you generate environmental sounds and sound effects like a dog barking, cars honking, or footsteps on a wooden floor," the company shared.

Lastly, the third model, the EnCodec decoder, allows music to be generated in higher quality with fewer artifacts.

All AudioCraft models will be open-sourced by the company to give researchers and practitioners access to train their own models using their own datasets.

While Meta has been busy working on other projects, such as a new AI-powered chatbot scheduled for September, AudioCraft marks the first time the company has delved into audio creation using generative AI.

"While we’ve seen a lot of excitement around generative AI for images, video, and text, audio has seemed to lag a bit behind. There’s some work out there, but it’s highly complicated and not very open, so people aren’t able to readily play with it," Meta said, commenting on the launch of AudioCraft.

Edited by Nikola Djuric

