Computer chip giant Nvidia entered the AI music race on Tuesday (November 26) by announcing a new model called Fugatto. The company calls Fugatto, which stands for Foundational Generative Audio Transformer Opus 1, “a Swiss Army knife for sound.”
Fugatto generates new music at the click of a button using text or audio prompts. It also lets users modify existing audio within seconds, including adding instruments to a song or removing them, and changing vocal accents and emotions.
With Fugatto, Nvidia aims to challenge today’s top AI music models, including Suno, Udio, and more. Although Fugatto is a latecomer in the race to create the best music AI models, it appears to have crisp audio quality and a host of features that could change the music production process for producers and composers.
According to an announcement on Nvidia’s blog, “One of the most challenging parts of this effort was generating a mixed dataset containing millions of audio samples used for training.” The announcement says the work took more than a year to get right. “The team employs a multifaceted strategy to generate data and instructions that significantly expand the range of tasks that the model can perform, while achieving more accurate performance and allowing new tasks to be performed without the need for additional data.” The company says its models are trained on open source datasets under a Creative Commons license and comply with copyright laws.
Nvidia suggests many use cases for Fugatto, including generating scores for visual media and editing specific parts of a score. It can also change a voice to have different accents, emotions, and tones. “Fugatto makes trumpet calls and saxophone sounds. Anything a user can describe can be created in a model,” says Rafael Valle, manager of applied audio research at Nvidia.
“The history of music is also the history of technology,” says Ido Zmishlany, producer/songwriter, co-founder of One Take Audio, and member of Nvidia Inception, a program for cutting-edge startups. “With AI, we are writing the next chapter of music. We have new instruments, new tools to make music, and it’s so exciting.”
Nvidia claims this is the first AI music model to exhibit “emergent properties,” meaning capabilities that arise from the interaction of its different trained abilities, along with the ability to combine free-form instructions. Valle adds that Fugatto is “our first step toward a future where unsupervised multi-task learning in audio synthesis and transformation emerges from the scale of data and models.”
There’s just one catch. So far, the company says Fugatto is an internal research project and is not available to the public.