How are sounds selected for generation? The input prompt and the Mubert API tags are both encoded into latent-space vectors by a transformer neural network. The closest tag vector is then selected for each prompt, and the corresponding tags are sent to our API for music generation.
All sounds (separate loops for bass, leads, etc.) are created by musicians and sound designers; they are not synthesized by any neural network. Our paradigm is “from creators to creators”. We are musicians ourselves, and it is important to us that musicians stay in the equation.
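The selection step described above boils down to nearest-neighbor matching in an embedding space. Here is a minimal sketch of how such prompt-to-tag matching could work, assuming a sentence-transformers encoder and cosine similarity; the model name and tag list are illustrative placeholders, not Mubert's actual vocabulary or pipeline.

```python
# Sketch of prompt-to-tag matching via embedding similarity.
# Assumptions: sentence-transformers is installed; the model and the tag
# list below are placeholders, not Mubert's actual ones.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any text encoder would do

# Hypothetical subset of API tags; the real service has its own vocabulary.
tags = ["dark ambient", "epic orchestral", "lo-fi hip hop", "synthwave"]
tag_vecs = model.encode(tags, normalize_embeddings=True)

def closest_tag(prompt: str) -> str:
    """Encode the prompt and return the tag whose vector is most similar."""
    prompt_vec = model.encode([prompt], normalize_embeddings=True)[0]
    scores = tag_vecs @ prompt_vec  # cosine similarity (vectors are normalized)
    return tags[int(np.argmax(scores))]

print(closest_tag("trying to secure a Kubernetes cluster, dark opera"))
```

In a setup like this, the matched tags (rather than the raw prompt) are what get sent to the generation API, which is why oddly specific prompts still land on the nearest musical mood the tag vocabulary can express.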
Check it out here:
I’m gonna mix some beats and see how it goes! I’ll share them here afterwards.
If you play around with it, share your tunes here too.
This is absolutely fascinating and immediately one of my favorite new tools.
I immediately thought of a use case for Yahoo News. You know how Yahoo News and other media outlets like to present news as pictures with captions and music playing in the background, because we as a country are losing our attention span and journalists sometimes don’t even want to bother with writing articles? Well, with prompt engineering and AI-generated music, they can turn headlines into music that actually matches the theme of what’s going on in the article they’ve already turned into pictures.
I’ve done a couple of news-story related tests, and they work out great. My favorite, and least political, however, has nothing to do with the news and everything to do with Kubernetes.
My prompt was: “trying to secure a Kubernetes cluster, dark opera”
You can listen to the generated audio here: