- HN discussion 11 days ago:
https://news.ycombinator.com/item?id=46284658
- Examples: https://apple.github.io/ml-sharp/
Paper: https://arxiv.org/abs/2512.10685
by RobotToaster
15 subcomments
- https://raw.githubusercontent.com/apple/ml-sharp/refs/heads/...
"Exclusively for research purposes" so not actually open source.
- This is a dupe. A couple of weeks ago I forked it and got the rendering to work in MPS: https://github.com/rcarmo/ml-sharp
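For context, a minimal sketch of what moving a PyTorch pipeline onto Apple's MPS backend typically involves; the model below is a stand-in, not the actual ml-sharp API or checkpoint loading:
```python
import torch

# Pick Apple's Metal backend (MPS) when available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Stand-in network; the real ml-sharp model and weights are not shown here.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 3),
)
model = model.to(device).eval()

# Dummy forward pass on the selected device.
with torch.no_grad():
    x = torch.randn(1, 16, device=device)
    out = model(x)

print(out.device)  # prints "mps:0" when the backend is active
```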
- Big day for VR pornography!
I'm not kidding. That's going to be >80% of the images/videos synthesized with this.
by coffeecoders
4 subcomments
- I feel like I'm in a time loop. Every time a big company releases a model, we debate the definition of open source instead of asking what actually matters. Apple clearly wants the upside of academic credibility without giving away commercial optionality, which isn't surprising.
Additionally, we might need better categories. With software, the flow is clear (source, build, and binary), but with AI/ML the actual source is an unshippable mix of data, infra, and time, and the weights can be both product and artifact.
- I’ve been using some time off to explore the space, and the related projects StereoCrafter and GeometryCrafter are fascinating. Applying this to video adds a temporal-consistency angle that makes it much harder and more compute-intensive, but I’ve “spatialized” some old home videos from the Korean War and it works surprisingly well.
https://github.com/TencentARC/StereoCrafter
https://github.com/TencentARC/GeometryCrafter
- Previous discussion: https://news.ycombinator.com/item?id=46284658
- I wonder if it helps that a lot of people take more than one picture of the same thing, thus providing them with effectively stereoscopic images.
- I was thinking of testing it, but I have an irrational hatred for Conda.
- I’m so sad I had this idea at least 6 years ago but didn’t have the connections to make it happen. Still, it’s nice that they released the project. Apple open sourcing their tech?
by victormustar
1 subcomment
- Hugging Face model: https://huggingface.co/apple/Sharp
and demo: https://huggingface.co/spaces/ronedgecomb/ml-sharp
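For anyone who wants to grab the weights locally rather than use the demo, a minimal sketch with the huggingface_hub client; it only downloads the repository snapshot from the link above, and the actual inference entry point lives in the ml-sharp code, so everything beyond this download step is an assumption:
```python
from huggingface_hub import snapshot_download

# Fetch the published checkpoint repository from the Hub (repo id from the link above).
local_dir = snapshot_download(repo_id="apple/Sharp")
print(local_dir)  # local cache path containing the model files
```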
- Is this already integrated into the latest iOS? If so, it’s not good: it only works on a few images, and for the most part the rendering feels fake and somehow incoherent.
- Is anyone aware of something similar for making interactive (or video) tours of apartments from photos?
by gjsman-1000
4 subcomments
- Is this the same model as the “Spatial Scenes” feature in iOS 26? If so, it’s been wildly impressive.
- Does it make a mesh?
It doesn't seem very accurate, and I have no idea how it handles a photo of a large scene; that could be useful for level designers.
- Is the model in ONNX format or PyTorch format?
- I don’t know when Apple turned evil, but it’s hard for me to support them further after nearly four decades. Everything they do now is the direct opposite of what they stood for in the past.
by backtogeek
0 subcomments
- License arguments aside, pretty cool.
by hermitcrab
2 subcomments
- "Sharp Monocular View Synthesis in Less Than a Second"
"Less than a second" is not "instantly".
- Facebook worked on a similar project almost 5 years back.
- Would love a multi-image version of this.
by burnt-resistor
0 subcomments
- Damn. I recall UC Davis was working on this sort of problem for CCTV footage 20 years ago, but this is really freakin' progress now.
- Apple is not a serious company if they can't even spin up a simple frontend for their AI innovations. I should not have to install anything to test this.
- Ah great. Easier for real estate agents to show slow panning around a room, with lame music.
I guess there are other uses? But this is just more abstracted reality. It will be inaccurate, just as summarized text is, and future people will again have no idea as to reality.