It's genuinely astonishing how much clearer this is than a traditional satellite map -- how it has just the right amount of complexity. I'm looking at areas I've spent a lot of time in, and getting an even better conceptual understanding of the physical layout than I've ever been able to get from satellite (technically airplane) images. This hits the perfect "sweet spot" of detail with clear "cartoon" coloring.
I see a lot of criticism here that this isn't "pixel art", so maybe there's some better term to use. I don't know what to call this precise style -- it's almost pixel art without the pixels? -- but I love it. Serious congratulations.
Feels like something is missing... maybe just a pixelation effect over the actual result? Seems like a lot of the images also lack continuity (something they go over in the article)
Overall, such a cool use of AI that blends art and technology well.
> I spent a decade as an electronic musician, spending literally thousands of hours dragging little boxes around on a screen. So much of creative work is defined by this kind of tedious grind. ... This isn't creative. It's just a slog. Every creative field - animation, video, software - is full of these tedious tasks. Of course, there’s a case to be made that the very act of doing this manual work is what refines your instincts - but I think it’s more of a “Just So” story than anything else. In the end, the quality of art is defined by the quality of your decisions - how much work you put into something is just a proxy for how much you care and how much you have to say.
Great insights here, thanks for sharing. That opening question really clicked for me.
If you had instead drawn this, there would be charm and fun details EVERYWHERE. Little buildings you know would have inside jokes, there would be references snuck into everything. Who YOU are would come through, but it would also be much smaller.
This is HUGE, and the zoomed out view is actually an insanely useful map. It's so cool to see reality shifted into a style like this, and there's enough interesting things in "real new york" to make scrolling around this a fun thing to do. I have no impression of you here other than a vague idea you like the older sim city games BUT I have a really interesting impression of NYC.
IMO, that's two totally different pieces of art, with two totally different goals. Neither takes away from the other, since they're both making something impactful rather than one thing trying to simulate the impact of the other. Really nice job with this.
Also, does someone have an intuition for how the "masking" process worked here to generate seamless tiles? I sort of grok it but not totally.
Maybe, though a guy did physically carve/sculpt the majority of NYC: https://mymodernmet.com/miniature-model-new-york-minninycity...
I know you'll get flak for the agentic coding, but I think it's really awesome you were able to realize an idea that otherwise would've remained relegated to "you know what'd be cool.." territory. Also, just because the activation energy to execute a project like this is lower doesn't mean the creative ceiling isn't just as high as before.
Firefox, Ubuntu latest.
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://isometric-nyc-tiles.cannoneyed.com/dzi/tiles_metadat.... (Reason: CORS header ‘Access-Control-Allow-Origin’ missing). Status code: 429.
Edit: I see now, the error is due to the Cloudflare worker being rate limited :/ I read the writeup though, pretty cool, especially the insight about tool -> lib -> application
Absolutely loved zooming around to see:
- my old apartment
- places I've worked
- restaurants and rooftop lounges I've been to etc
The explanation of how this was put together was even cooler: https://cannoneyed.com/projects/isometric-nyc
I wonder if for almost any bulk inference / generation task, it will generally be dramatically cheaper to (use fancy expensive model to generate examples, perhaps interactively with refinements) -> (fine tune smaller open-source model) -> (run bulk task).
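That three-stage workflow can be sketched as below. Everything here is a placeholder to show the shape of the pipeline, not a real implementation: stage 1 would call an actual frontier-model API, and stage 2 would be a real fine-tuning job.

```javascript
// Stage 1: use the expensive model on a small sample (refined
// interactively) to build a training set of [input, output] pairs.
function generateExamplesWithBigModel(prompts) {
  return prompts.map((p) => [p, `high-quality output for ${p}`]);
}

// Stage 2: "fine-tune" a cheap open model on those pairs. Here the
// learned weights are just a lookup table with a fallback.
function fineTuneSmallModel(pairs) {
  const learned = new Map(pairs);
  return (input) => learned.get(input) ?? `generalized output for ${input}`;
}

// Stage 3: run the cheap model over the full bulk workload.
function runBulkTask(model, inputs) {
  return inputs.map(model);
}

const sample = ["tile_001", "tile_002"];
const smallModel = fineTuneSmallModel(generateExamplesWithBigModel(sample));
const results = runBulkTask(smallModel, ["tile_001", "tile_002", "tile_999"]);
console.log(results);
```

The economics work out whenever (cost of generating examples + fine-tuning) is much smaller than (bulk size × per-call cost of the big model), which is usually true once the bulk task runs into the tens of thousands of calls.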
Transparency also exists, e.g. GPT Image supports it, and Nano Banana Pro should support it soon as well.
We have a blog post on a similar workflow here: https://www.oxen.ai/blog/how-we-cut-inference-costs-from-46k...
On the inference cost and speed: we're actively working on that and have a pretty massive upgrade there coming soon.
Gemini 3.5 Pro reverse engineered it - if you use the code at the following gist, you can jump to any specific lat/lng :-)
https://gist.github.com/gregsadetsky/c4c1a87277063430c26922b...
also, check out https://cannoneyed.com/isometric-nyc/?debug=true ..!
---
code below (copy & paste into your devtools, change the lat lng on the last line):
const calib = {
  p1: { pixel: { x: 52548, y: 64928 }, geo: { lat: 40.75145020893891, lng: -73.9596826628078 } },
  p2: { pixel: { x: 40262, y: 51982 }, geo: { lat: 40.685498640229675, lng: -73.98074283976926 } },
  p3: { pixel: { x: 45916, y: 67519 }, geo: { lat: 40.757903901085726, lng: -73.98557060196454 } },
};

// Solve the affine transform (geo -> pixel) from the three calibration
// points via Cramer's rule.
function getAffineTransform() {
  const { p1, p2, p3 } = calib;
  const det =
    p1.geo.lat * (p2.geo.lng - p3.geo.lng) -
    p2.geo.lat * (p1.geo.lng - p3.geo.lng) +
    p3.geo.lat * (p1.geo.lng - p2.geo.lng);
  if (det === 0) {
    console.error("Points are collinear, cannot solve.");
    return null;
  }
  const Ax = (p1.pixel.x * (p2.geo.lng - p3.geo.lng) - p2.pixel.x * (p1.geo.lng - p3.geo.lng) + p3.pixel.x * (p1.geo.lng - p2.geo.lng)) / det;
  const Bx = (p1.geo.lat * (p2.pixel.x - p3.pixel.x) - p2.geo.lat * (p1.pixel.x - p3.pixel.x) + p3.geo.lat * (p1.pixel.x - p2.pixel.x)) / det;
  const Cx = (p1.geo.lat * (p2.geo.lng * p3.pixel.x - p3.geo.lng * p2.pixel.x) - p2.geo.lat * (p1.geo.lng * p3.pixel.x - p3.geo.lng * p1.pixel.x) + p3.geo.lat * (p1.geo.lng * p2.pixel.x - p2.geo.lng * p1.pixel.x)) / det;
  const Ay = (p1.pixel.y * (p2.geo.lng - p3.geo.lng) - p2.pixel.y * (p1.geo.lng - p3.geo.lng) + p3.pixel.y * (p1.geo.lng - p2.geo.lng)) / det;
  const By = (p1.geo.lat * (p2.pixel.y - p3.pixel.y) - p2.geo.lat * (p1.pixel.y - p3.pixel.y) + p3.geo.lat * (p1.pixel.y - p2.pixel.y)) / det;
  const Cy = (p1.geo.lat * (p2.geo.lng * p3.pixel.y - p3.geo.lng * p2.pixel.y) - p2.geo.lat * (p1.geo.lng * p3.pixel.y - p3.geo.lng * p1.pixel.y) + p3.geo.lat * (p1.geo.lng * p2.pixel.y - p2.geo.lng * p1.pixel.y)) / det;
  return { Ax, Bx, Cx, Ay, By, Cy };
}

function jumpToLatLng(lat, lng) {
  const t = getAffineTransform();
  if (!t) return;
  const x = Math.round(t.Ax * lat + t.Bx * lng + t.Cx);
  const y = Math.round(t.Ay * lat + t.By * lng + t.Cy);
  console.log(`Jumping to Geo: ${lat}, ${lng}`);
  console.log(`Calculated Pixel: ${x}, ${y}`);
  localStorage.setItem("isometric-nyc-view-state", JSON.stringify({ target: [x, y, 0], zoom: 13.95 }));
  window.location.reload();
}

jumpToLatLng(40.757903901085726, -73.98557060196454);

100 people built this in 1964: https://queensmuseum.org/exhibition/panorama-of-the-city-of-...
One person built this in the 21st century: https://gothamist.com/arts-entertainment/truckers-viral-scal...
AI certainly let you do it much faster, but it’s wrong to write off doing something like this by hand as impossible when it has actually been done before. And the models built by hand are the product of genuine human creativity and ingenuity; this is a pixelated satellite image. It’s still a very cool site to play around with, but the framing is terrible.
Oh man...
I especially appreciated the deep dive on the workflow and challenges. It's the best generally accessible explication I've yet seen of the pros and cons of vibe coding an ambitious personal project with current tooling. It gives a high-level sense of "what it's generally like" with enough detail and examples to be grounded in reality while avoiding slipping into the weeds.
Feature idea: For those of us who aren't familiar with The City, could you allow clicks on the image to identify specific landmarks (buildings, etc.)? Or is that too computationally intensive? I can identify a few things, but it would sure be nice to know what I'm looking at.
Upvote for the cool thing I haven’t seen before but cancelled out by this sentiment. Oof.
> If you can push a button and get content, then that content is a commodity. Its value is next to zero.
> Counterintuitively, that’s my biggest reason to be optimistic about AI and creativity. When hard parts become easy, the differentiator becomes love.
Love that. I've been struggling to succinctly put that feeling into words, bravo.
Cool project!
An awesome takeaway from this is that self-hosted models are the future! Can't wait for hardware to catch up so we can run many more experiments on our laptops!
I am especially impressed with the “I didn’t write a single line of code” part, because I was expecting it to be janky or slow on mobile, but it feels blazing fast just zooming around different areas.
And it is very up to date too - a building across the street from me that was only finished last year is already there.
I found a nitpicky error though: in downtown Brooklyn, where Cadman Plaza Park is, your website makes it look like there is a large rectangular body of water there (e.g., a pool or a fountain). In reality, there is no water at all; it is just a concrete slab area.
Makes me feel insane that we're passing this off as art now.
That's what we call a monk's job in Holland. Kudos
One thing I would suggest is to also post-process the pixel art with something like this tool to have it be even sharper. The details fall off as you get closer, but running this over larger patch areas may really drive the pixel art feel.
Maybe you can use that^ to snap the pixels to a perfect grid
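For reference, grid snapping can be as simple as majority-voting each cell. Below is a minimal sketch of that idea (plain 2D arrays instead of canvas data, with the cell size as a free parameter), not the actual algorithm of any particular tool:

```javascript
// Snap an image to a pixel grid: for each cellSize x cellSize block,
// pick the most frequent color and fill the whole block with it.
// `img` is a 2D array of color values (e.g. hex strings).
function snapToGrid(img, cellSize) {
  const h = img.length, w = img[0].length;
  const out = img.map((row) => row.slice());
  for (let y = 0; y < h; y += cellSize) {
    for (let x = 0; x < w; x += cellSize) {
      // Count the colors that occur inside this cell.
      const counts = new Map();
      for (let dy = 0; dy < cellSize && y + dy < h; dy++) {
        for (let dx = 0; dx < cellSize && x + dx < w; dx++) {
          const c = img[y + dy][x + dx];
          counts.set(c, (counts.get(c) ?? 0) + 1);
        }
      }
      // The dominant color wins the cell.
      let best, bestCount = -1;
      for (const [c, n] of counts) {
        if (n > bestCount) { best = c; bestCount = n; }
      }
      // Fill the whole cell with the winner.
      for (let dy = 0; dy < cellSize && y + dy < h; dy++) {
        for (let dx = 0; dx < cellSize && x + dx < w; dx++) {
          out[y + dy][x + dx] = best;
        }
      }
    }
  }
  return out;
}

// A 2x2-cell image with one stray pixel gets cleaned up:
const noisy = [
  ["A", "A", "B", "B"],
  ["A", "C", "B", "B"], // "C" is the stray pixel
  ["D", "D", "E", "E"],
  ["D", "D", "E", "E"],
];
const snapped = snapToGrid(noisy, 2);
console.log(snapped); // the stray "C" becomes "A"
```

A fancier version would align the grid offset to the image first (e.g. by scanning for the phase that minimizes within-cell color variance), since generated "pixel art" rarely sits on exact cell boundaries.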
It would be neat if you could drag and click to select an area to inpaint. Let's see everyone's new Penn Station designs!
Would guess it'd have to be BYOK but it works pretty well:
https://i.imgur.com/EmbzThl.jpeg
Much better than trying to inpaint directly on Google Earth data
Edit: this submission has a few links that could be what I had in mind but most of them no longer work: https://news.ycombinator.com/item?id=2282466
It's as if NYC was built in Transport Tycoon Deluxe.
I'll be honest, I've been pretty skeptical about AI and agentic coding for real-life problems and projects. But this one might be the thing that finally changes my mind.
Thanks for making it, I really enjoy the result (and the educational value of the making-of post)!
As you say: software engineering doesn’t go away in the age of AI - it just moves up the ladder of abstraction! At least in the mid-term :)
Very cool work and great write up.
The 3D/street view version is an obvious and natural progression from here, but from what I've read in your dev log, it's also probably a lot of extra work.
I must admit I spent way too much time finding landmarks I visited when I last holidayed there from Australia. Now I'm feeling nostalgic.
Thanks so much for sharing!
Could even extend it to add a "night mode" too, though that'd require extensive retexturing.
Reminds me of https://queensmuseum.org/exhibition/panorama-of-the-city-of-...
SF/Mountain View etc. don't even have one! You get a little piece of the NYC brand just for you!
At first I thought this was someone working thousands of hours putting this together, and I thought: I wonder if this could be done with AI…
You probably need to adjust how caching is handled with this.
Nicely put.
To me, the appeal of pixel art is that each pixel looks deliberately placed, with clever artistic tricks to circumvent the limitations of the medium. For instance, look at the piano keys here [1]. They deliberately lack the actual groupings of real piano keys (since that wouldn't be feasible to render at this scale), but are asymmetrically spaced in their own way to convey the essence of a keyboard. It's the same sort of cleverness that goes into designing LEGO sets.
None of these clever tricks are apparent in the AI-generated NYC.
On another note, a big appeal of pixel art for me is the sheer amount of manual labor that went into it. Even if AI were capable of rendering pixel art indistinguishable from [0] or [1], I'm not sure I'd be impressed. It would be like watching a humanoid robot compete in the Olympics. Sure, a Boston Dynamics bot from a couple years in the future will probably outrun Usain Bolt and outgymnast Simone Biles, but we watch Bolt and Biles compete because their performance represents a profound confluence of human effort and talent. Likewise, we are extremely impressed by watching human weightlifters throw 200kg over their heads but don't give a second thought to forklifts lifting 2000kg or 20000kg.
OP touches on this in his blog post [2]:
> I spent a decade as an electronic musician, spending literally thousands of hours dragging little boxes around on a screen. So much of creative work is defined by this kind of tedious grind. [...] This isn't creative. It's just a slog. Every creative field - animation, video, software - is full of these tedious tasks. In the end, the quality of art is defined by the quality of your decisions - how much work you put into something is just a proxy for how much you care and how much you have to say.
I would argue that in some cases (e.g. pixel art), the slog is what makes the art both aesthetically appealing (the deliberately placed nature of each pixel is what defines the aesthetic) and awe-inspiring (the slog represents an immense amount of sustained focus).

[0] https://platform.theverge.com/wp-content/uploads/sites/2/cho...
[1] https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2Fu...
It really makes me not have any interest in the actual thing you're showing.
Imagine if Wendy Carlos wrote an article about how it's cool she could use this new piece of technology called a Moog Modular synthesizer to create a recording of Bach, and now harpsichords are dead, acoustic instruments suck, and all Baroque music will be made on modular synthesizers forever more.
Unfortunately with AI it's much worse because it's more like if Wendy Carlos has slurped up all Baroque music ever recorded by everyone and then somehow regurgitated that back out as slop with some algorithmic modifications.
I'm sorry, I'm unable to accept that's where we're at now.
Nope, it was Stalin who said that, in regard to his "meatwave" strategy.