Meta has released the source code for AudioCraft, its generative AI framework for producing audio and music from text. AudioCraft consists of three models: MusicGen, AudioGen, and EnCodec.
MusicGen generates music from textual user inputs and was trained on Meta-owned and licensed music. AudioGen, on the other hand, creates audio such as environmental sounds from text inputs and was trained on public sound effects. Lastly, EnCodec is a neural audio codec that serves as encoder, quantizer, and decoder.
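The quantizer in a neural codec like EnCodec turns continuous audio embeddings into discrete codes that a language model can predict. The sketch below is a toy, hand-rolled residual vector quantizer in NumPy; the codebooks and vectors are invented for illustration, whereas the real EnCodec learns its codebooks from data and operates on neural embeddings of waveforms.

```python
import numpy as np

def rvq_encode(vector, codebooks):
    """Quantize `vector` in stages: each codebook quantizes the residual
    left over by the previous stage, yielding one index per stage."""
    residual = np.asarray(vector, dtype=float)
    indices = []
    for cb in codebooks:
        # Pick the codebook entry closest to the current residual.
        dists = np.linalg.norm(cb - residual, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices

def rvq_decode(indices, codebooks):
    """Reconstruct by summing the chosen entry from each codebook."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))

# Two tiny hand-made codebooks: stage 1 is coarse, stage 2 refines.
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]),
    np.array([[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]),
]
codes = rvq_encode([1.1, 0.9], codebooks)
approx = rvq_decode(codes, codebooks)
```

Each extra quantization stage shrinks the reconstruction error, which is why codecs of this family can trade bitrate for fidelity by varying the number of codebooks.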
With the release of a new version of the EnCodec decoder, Meta has improved the quality of music generation with fewer artifacts. They have also included the pre-trained AudioGen model, which can generate environmental sounds and sound effects like a dog barking or footsteps on a wooden floor. Researchers and practitioners can utilize these models to train their own models and contribute to the field.
AudioCraft, with its intuitive interface, allows users to produce professional-grade sound. It uses the EnCodec neural audio codec to compress raw audio into a vocabulary of discrete tokens. An autoregressive language model, trained over these tokens, then generates new token sequences from textual descriptions. Finally, the generated tokens are fed to the EnCodec decoder to synthesize audio and music.
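The three-stage pipeline above can be sketched end to end in a few lines. Everything in this toy (the two-entry "waveforms", the deterministic next-token rule) is invented for illustration; in the real system, EnCodec provides the token/audio mapping and a transformer language model samples the next token from a learned distribution.

```python
# Step 1: a pretend codec — a fixed mapping between tokens and short
# waveform chunks (EnCodec learns this mapping in the real system).
TOKEN_TO_WAVE = {0: [0.0, 0.0], 1: [0.5, -0.5], 2: [1.0, -1.0]}

def next_token(prompt, last_token):
    # Step 2: a pretend "language model": deterministically pick the next
    # token from the text prompt and the previous token. A real model
    # samples from a transformer's predicted distribution instead.
    return (last_token + len(prompt)) % len(TOKEN_TO_WAVE)

def generate(prompt, n_tokens):
    """Autoregressively generate a token sequence conditioned on text."""
    tokens, last = [], 0
    for _ in range(n_tokens):
        last = next_token(prompt, last)
        tokens.append(last)
    return tokens

def decode(tokens):
    """Step 3: concatenate the waveform chunk for each generated token."""
    wave = []
    for t in tokens:
        wave.extend(TOKEN_TO_WAVE[t])
    return wave

tokens = generate("hi", 3)  # the text prompt drives token generation
audio = decode(tokens)      # tokens are decoded back into a waveform
```

The key structural point the toy preserves is that generation happens entirely in the discrete token space; audio only appears at the final decoding step.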
Meta highlights what sets AudioCraft apart from conventional AI music generators. Instead of working with symbolic representations of music, AudioCraft uses self-supervised audio representation learning and multiple hierarchical models to generate music with longer-range structure. Although there is room for improvement, the generated audio already sounds good.
In line with Responsible AI principles, Meta is making the AudioGen and MusicGen model cards available to the research community. It has also released the audio research framework and training code under the MIT license. Meta believes these models can be useful to both amateur and professional musicians, especially as more sophisticated controls are developed.
To learn more about AudioCraft and its models, you can check out the GitHub repository and Meta AI Blog.