by devinprater
9 subcomments
- Apple has a video understanding model too. I can't wait to find out what accessibility stuff they'll do with the models. As a blind person, AI has changed my life.
by RobotToaster
2 subcomments
- The license[0] seems quite restrictive, limiting its use to non-commercial research. It doesn't meet the open source definition, so it's more appropriate to call it "weights available".
[0]https://github.com/apple/ml-starflow/blob/main/LICENSE_MODEL
- Looking at text to video examples (https://starflow-v.github.io/#text-to-video) I'm not impressed. Those gave me the feeling of the early Will Smith noodles videos.
Did I miss anything?
by gorgoiler
3 subcomments
- It’s not really relevant to this release specifically but it irks me that, in general, an “open weights model” is like an “open source machine code” version of Microsoft Windows. Yes, I guess I have open access to view the thing I am about to execute!
This Apple license is click wrap MIT with the rights, at least, to modify and redistribute the model itself. I suppose I should be grateful for that much openness, at least.
- From the paper, this is a research model aimed at dealing with the runaway error common in diffusion video models - the latent space is (proposed to be) causal and therefore it should have better coherence.
For a 7B model the results look pretty good! If Apple gets a model out here that is competitive with Wan or even Veo, I believe in my heart it will have been trained with images of the finest taste.
by summerlight
0 subcomment
- This looks interesting. This project has some novelty as research and actually delivered a promising PoC, but as a product it implies that its training was severely constrained by computing resources, which correlates well with the report that their CFO overruled the CEO's decision on ML infra investment.
JG's recent departure and the follow-up massive reorg to get rid of AI, rumors of Tim's upcoming step-down in early 2026... All of these signals indicate that the non-ML folks have won the corporate politics and reduced the in-house AI efforts.
I suppose this was part of a serious effort to deliver in-house models, but the directional changes in AI strategy made them give up. What a shame... At least the approach itself seems interesting, and I hope others take a look and use it to build something useful.
- > STARFlow-V is trained on 96 H100 GPUs using approximately 20 million videos.
They don’t say for how long.
- Title is wrong, model isn’t released yet. Title also doesn’t appear in the link - why the editorializing?
by satvikpendem
2 subcomments
- Looks good. I wonder what use case Apple has in mind though, or I suppose this is just what the researchers themselves were interested in, perhaps due to the current zeitgeist. I'm not really sure how it works at big tech companies with regard to research. Are there top-down mandates?
- > Model Release Timeline: Pretrained checkpoints will be released soon. Please check back or watch this repository for updates.
> The checkpoint files are not included in this repository due to size constraints.
So it's not actually open weights yet. Maybe eventually once they actually release the weights it will be. "Soon"
by nothrowaways
1 subcomments
- Where do they get the video training data?
by giancarlostoro
0 subcomment
- I was upset the page didn't have videos immediately available, then I realized I have to click on some of the tabs. One red flag on their GitHub is that the license looks to be their own flavor of MIT (though much closer to MS-PL).
- The number of video models that are worse than Wan 2.2 and can safely be ignored has increased by 1.
- Interesting that this is an autoregressive ("causal") model rather than a diffusion model.
by camillomiller
0 subcomment
- Hopefully this will make it into some useful feature in the ecosystem and not contribute to having just more terrible slop. Apple has saved itself from the destruction of quality and taste that these models enabled; I hope it stays that way.
by Invictus0
1 subcomments
- Apple's got to stop running their AI group like a university lab. Get some actual products going that we can all use--you know, with a proper fucking web UI and a backend.
- <joke> GGUF when? </joke>
- "VAE: WAN2.2-VAE" so it's just a Wan2.2 edit, compressed to 7B.