by regenschutz
1 subcomments
- The Fast demo model is already very impressive. It was way better than expected, but still required being a bit verbose since it didn't seem to understand rarer words ("sauna" didn't get me anything pleasant, "hot sauna" did).
The generated palette seems to be a great indicator of whether the model understood the prompt or not.
I haven't checked out the Python SDK yet, but it seems very interesting!
I'm curious to know if there is any reason why you picked Gemma 1B for the Expressive model. Did it generate more cohesive parameters than other 1B models? Or was it just the first one you tried?
by Brajeshwar
1 subcomments
- I tried, “In the mood for country cowboy-ish music played for someone like John Wick bleeding out on a cold, snow-covered park bench.”
I ended up with something kinda shrill. I was hoping for something that would sound like what I'd listen to while the coffee gets cold in a cabin.
by hackingonempty
1 subcomments
- I get clicks and pops every few seconds, using Librewolf.
But otherwise very cool!
by blasphemous_dev
1 subcomments
- I kinda liked how well you can fine-tune the parameters of the music. Could be useful for dynamic soundtracks in games in low-resource settings.
- I really like the idea. But my one attempt was disappointing. "playful energetic urban fantasy at night" ended up set to "very slow" by default.
I would really like to be able to run this on my phone. Use my Brilliant smart glasses to periodically take a picture, ask a model to describe the mood/setting, and get an ambient stream to match the mood.
by namnnumbr
1 subcomments
- I like the concept, but it's not picking up what I'm putting down...
'"epic vampiric doom" suggests a bright and uplifting soundscape'? I'm not so sure about that
... I was hoping for something more like Nightwish.
by saranshmahajan
1 subcomments
- Really elegant approach - mapping sentence embeddings to a deterministic synth feels more like building an instrument than generating content, and the instant playback makes it great for flow.
Would love to know if the same prompt always yields the same sound (reproducibility could be powerful), and whether you’ve considered semantic morphing between two moods over time.
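To make the two questions concrete, here is a rough sketch of what I mean. It is not the project's actual code: `toy_embedding` is a hypothetical, hash-based stand-in for a real sentence embedding, and the parameter vectors are made up. It only illustrates that a deterministic mapping gives the same output for the same prompt, and that morphing could be as simple as interpolating between two parameter vectors.

```python
import hashlib


def toy_embedding(prompt: str, dim: int = 4) -> list[float]:
    """Hypothetical stand-in for a sentence embedding.

    Derives `dim` values in [0, 1] from a SHA-256 hash of the prompt,
    so the same prompt always maps to the same vector.
    """
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32")
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(dim)]


def morph(a: list[float], b: list[float], t: float) -> list[float]:
    """Linearly interpolate between two parameter vectors.

    t = 0.0 returns `a`, t = 1.0 returns `b`.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


if __name__ == "__main__":
    calm = toy_embedding("hot sauna")
    tense = toy_embedding("epic vampiric doom")
    # Step from one mood to the other in five stages.
    for step in range(5):
        print(morph(calm, tense, step / 4))
```

In a real setup the interpolated vector would presumably be fed to the synth on each step, and a smarter curve than a straight line might sound better.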
by cprecioso
1 subcomments
- Server is down :(