Nvidia has unveiled an AI audio tool called Fugatto, which the company claims can generate “sounds never heard before,” such as a trumpet that meows. The tool can create music, sound effects, and speech using text and audio inputs it has not previously encountered, offering users a new level of creative possibilities.
In a demonstration video, Nvidia showcases Fugatto’s ability to compose music based on imaginative prompts. One example includes creating a piece that transitions from a howling and barking saxophone to electronic music interwoven with barking dogs. The tool’s versatility extends beyond music, allowing it to generate unique sound effects from descriptive inputs. For instance, it can produce audio described as “deep, rumbling bass pulses paired with intermittent, high-pitched digital chirps, like the sound of a massive sentient machine waking up.”
Fugatto’s functionality also includes the ability to modify voices. It can alter someone’s accent or adjust their tone to sound calm or angry. Additionally, it offers music editing capabilities, such as isolating vocals, introducing new instruments, or even changing a melody. For example, it can replace a piano with an opera singer, demonstrating its ability to transform compositions creatively.
Fugatto was trained on an extensive range of datasets, including a library of sound effects from the BBC, as detailed in a research paper accompanying the announcement. Nvidia’s researchers compiled a dataset containing millions of audio samples and developed specialized instructions to expand the range of tasks the model could perform. According to the paper, this approach enabled Fugatto to perform a wider array of tasks with improved accuracy and even handle new tasks without additional data.
Fugatto arrives as numerous AI audio tools are emerging, with competitors such as Stability AI, OpenAI, Google DeepMind, ElevenLabs, and Adobe offering their own solutions. Nvidia sets Fugatto apart by claiming it can generate entirely new and unprecedented sounds. The pitch is originality and creative flexibility rather than remixing or mimicking existing audio.
The rise of AI audio tools has not been without controversy. Several AI startups are facing copyright lawsuits over their music creation technologies. Additionally, a recent report revealed that Nvidia and other companies have trained AI models on subtitles extracted from thousands of YouTube videos. Despite these challenges, the potential for tools like Fugatto to revolutionize the creative industry remains significant.
Nvidia has not disclosed a release timeline or confirmed whether Fugatto will be made widely available. If it does ship, the ability to generate entirely new sounds and transform existing ones could open up new possibilities for musicians, sound designers, and content creators alike.