Archive for December, 2022

Carol Of The Bells, arranged for Piano and Synthesizer

 

CJ Thomas shared his version of Ukrainian composer Mykola Leontovych’s Carol Of The Bells, arranged for piano and synthesizer.

 

“On a last-minute whim, I decided to create a version of Carol Of The Bells, originally composed by Mykola Leontovych. It’s one of my favorite pieces of holiday music. Hope you enjoy!”

 

 

New Project, Riffusion is a free web app that uses AI to create music from your text prompts

 

If you can type it, the robot can play it

You may well be aware of Stable Diffusion, the much-discussed open-source AI model that can generate images from text. Well, as a “hobby project”, developers – Seth Forsgren and Hayk Martiros – have now created Riffusion, which uses the same model to turn text into music.

With tools like Stable Diffusion, you can generate original images by supplying text prompts, like “photograph of an astronaut riding a horse”, as shown in the above image.

This artificial intelligence – so the program is not intentionally drawing a picture of an astronaut on a horse – it’s generating an image with qualities that are similar to the qualities of images that it’s indexed for ‘astronaut’ and ‘horse’.

This is an important difference, and helps explain why these AI images can be amazing, but are very likely to have some weirdness to them.

 

Did you notice that the horse only has three legs?

 

Riffusion is a new project that builds on the success of recent AI image generation work, but applies it to sound.

Riffusion works by generating images from spectograms, which are then converted into audio clips. We’re told that it can generate infinite variations of a text prompt by varying the ‘seed’.

Riffusion’s creators explain that a spectogram can be computed from audio using what’s known as the Short-time Fourier transform (STFT), which approximates the audio as a combination of sine waves of varying amplitudes and phases. But the process is limited by a small index of spectrograms, compared to the 2.3 billion images used to train Stable Diffusion. And it’s limited by the resolution of the spectrograms, which give the resulting audio a lofi quality.

However, in the case of Riffusion, the STFT is inverted so that the audio can be reconstructed from a spectogram. Here, the images from the AI model only contain the amplitude of the sine waves and not the phases – these are appromixmated by something called the Griffin-Lim algorithm when reconstructing the audio clip.

The web app enables you to type in prompts and will keep on generating interpolated content in realtime for as long as you let it, while giving you a visual 3D representation of the spectrogram. You can also skip immediately to the next prompt; if there isn’t one, Riffusion will interpolate between different seeds of the same prompt.

The approach shows potential, though. It’s currently up to the task of generating disturbing sample fodder – similar to the way AI image generation, even 6 months ago, was limited to generating lofi creepy images. This suggests that – with a much larger index, and higher resolution spectrograms – it’s likely that AI audio generation could make similar leaps in quality in the next year.

What do you think? Is there a future in music for AI directly synthesizing audio? Share your thoughts in the comments!

via John Lehmkuhl

 

 

Sinevibes introduces Vague Binaural Time Diffusion Processor

 

Sinevibes has introduced Vague, a binaural time diffusion processor for Mac and Windows.

 

What Sinevibes say about it:

“In its core, the signal travels through a virtual space comprised of 16 all-pass comb filter stages, with their delay times progressively increasing from start to end. The more stages the signal passes through, the more blurred and smeared it becomes, losing its definition and transient sharpness.

But Vague goes much further than this: it offers binaural expansion via alternating, stereo-opposed time offsets in the individual all-pass stages, which adds quite a unique, three-dimensional vibe. Another very special feature is the four output snapshots taken as the signal passes through the virtual space: smoothly crossfading between them not only changes the diffusion density, but also makes the timeline granulated.

Moreover, Vague can run multiple independent LFOs to modulate its main parameters, creating a wide variety of rich and charismatic effects: time diffusion and dimensional expansion, blurred unison, granular scrubbing, psychedelic reverberation, and much more”.

 

 

Features:

  • Diffused space made of 16 chained all-pass comb filters with smooth crossfade between four output snapshots
  • Unique binaural expansion implemented via alternating and stereo-opposing time offsets
  • Two-way pre-delay line, with an ability for the wet signal to precede the dry signal
  • Two main LFOs (sine wave) and four additional chaos LFOs (random triangle) enable intricate parameter modulation
  • Lag filters on all continuous parameters for smooth, click-free adjustment
  • Supports mono › mono, mono › stereo, and stereo › stereo channel configurations

 

Vague is available now for $29.