Hacker News

Show HN: Lance – image/video generation and understanding in one model

62 points by cleardusk

by embedding-shape

Video understanding is still fairly new, especially done well, and it'd be great if it worked well for UI and UX. Current agents already struggle a bit with 2D space in ordinary screenshots of unconventional UIs, so I wonder if this model would do better with actual recordings of navigating and using applications. It feels like it could help a lot with understanding UX, at least. Will be fun to play around with :)

by wxw

What’s SOTA for video understanding? AFAIK most video search is powered by transcription and not the actual video. This seems impressive.

by Tsarp

Nice work. Wish they had picked another name given how popular lance/lancedb is.

by popalchemist

Seems like the video output is crippled. Resolution is low (720 or so), as is the frame rate. The samples are shown up-scaled and frame-interpolated. Why do that? Seems strange to be building sub-HD resolution video models in 2026.
