Nvidia says its new AI music editor can create “never-heard-before sounds, like a blaring trumpet.” The tool, called Fugatto, can generate music, sounds, and speech from text and audio inputs it has not been trained on.
As shown in the video embedded below, this allows Fugatto to put together songs based on wild prompts such as “Create electronic music where the saxophone howls, barks, and then the dog barks.”
Other examples shared by the company show its ability to generate unique sound effects from a description, such as “a deep, rumbling bass pulse combined with intermittent high-pitched digital chirps, like the sound of a giant sentient machine waking up.”
Fugatto can also modify someone’s voice, changing their accent or giving them a different tone, such as angry or calm. For editing music, it lets you isolate the vocals in a song, add instruments, and even change the melody by replacing a piano with an opera singer.
A paper published with the announcement lists all the datasets Nvidia says it used to train Fugatto, one of which includes the BBC’s sound effects library.
To build Fugatto, Nvidia says researchers had to assemble a dataset containing millions of audio samples. They then created instructions that “significantly expand the range of tasks that the model can perform, while achieving more accurate performance and enabling new tasks without the need for additional data.” Nvidia has not said whether or when the tool will be widely available.