Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon
216 points by MediaSquirrel
by LuxBennu
3 subcomments
I run Whisper large-v3 on an M2 Max with 96 GB, and even with just inference the memory gets tight on longer audio; I can only imagine what fine-tuning looks like. Does 64 GB vs. 96 GB make a meaningful difference for Gemma 4 fine-tuning, or does it just push the OOM wall back a bit? I've been wanting to try local fine-tuning on Apple Silicon, but the tooling gap has kept me on inference only so far.
by conception
0 subcomments
I’m pretty excited about the Edge Gallery iOS app with Gemma 4 on it, but it seems like they hobbled it: no access to Intents, and you have to write custom plugins for web search, etc. Does anyone have a favorite way to run these usefully? ChatMCP works pretty well but only supports models via API.
by sails
0 subcomments
> Accent, dialect, and low-resource language adaptation — adapt a base Gemma model to underrepresented voices and languages with your own labeled audio.
Is this for TTS? I've been looking for something to do a local fine-tune to get a specific accent.
by craze3
0 subcomments
Nice! I've been wanting to try local audio fine-tuning. Hopefully it works with music vocals too.
by mandeepj
1 subcomment
> I had 15,000 hours of audio data
Do you really need that much data for fine-tuning?
by dsabanin
1 subcomment
Thanks for doing this. Looks interesting, I'm going to check it out soon.
by yousifa
0 subcomments
This is super cool, will definitely try it out! Nice work
by m3kw9
0 subcomments
Will it work with 32 GB?
by neonstatic
2 subcomments
Just a heads up: I found NVIDIA Parakeet to be way better than Whisper. It's faster, uses less compute, the output is better, and there are more options for the output format. I'm using parakeet-mlx from the command line. Check it out!