You see the exact same patterns. AI uses more code to accomplish the same thing, less efficiently.
I'm not even an AI hater. It's just a fact.
A human then has to go through and clean up that code if you want to deliver a high-quality product.
Similarly, you can slap that AI-generated 3D model right into your game engine, terrible topology and all, and have it perform "ok". As you add more of these terrible models, you end up with crap performance, but who cares, you delivered the game on time, right? A human can then go and slave away fixing the terrible topology and textures, and take longer than they would have if the object had been modeled correctly to begin with.
The comparison of edge-loops to "high quality code" is also one that I mentally draw. High quality code can be a joy to extend and build upon.
Low quality code is like the dense mesh pictured. You have a million cross interactions and side-effects. Half the time it's easier to gut the whole thing and build a better system.
Again, I use AI models daily but AI for tools is different from AI for large products. The large products will demand the bulk of your time constantly refactoring and cleaning the code (with AI as well) -- such that you lose nearly all of the perceived speed enhancements.
That is, if you care about a high quality codebase and product...
Because they all use latent diffusion, and many techniques use voxelized intermediate representations of 3D models, often generated from images, topology is bound to be bad.
There is a lot of ongoing research around getting better topology. I expect these critiques to still be valid as much as 2 years from now, but the economics of modeling will change drastically as the models get better.
> AI models generate meshes using "isosurface extraction" or similar volume-to-mesh techniques
This creates the "lumpiness", the inability to capture sharp or flat features, and the over-refinement. A noisy surface is also harder to clean up. How do you define what's a feature and what's noise when there's no ground truth beyond the mesh itself?
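To make the sharp-feature point concrete, here's a minimal 2D sketch: marching squares on a single grid cell, not any particular generator's code. The implicit function `square`, the cell placement, and the grid spacing `h` are all illustrative assumptions on my part. The surface can only be placed where it crosses grid edges, so a corner that falls inside a cell gets chamfered off.

```python
import math

def square(x, y):
    """Implicit function of an axis-aligned unit square: negative inside."""
    return max(abs(x), abs(y)) - 0.5

def cell_crossings(f, x0, y0, h):
    """Zero crossings of f on the four edges of one grid cell,
    located by linear interpolation between the corner samples."""
    corners = [(x0, y0), (x0 + h, y0), (x0 + h, y0 + h), (x0, y0 + h)]
    points = []
    for a, b in zip(corners, corners[1:] + corners[:1]):
        fa, fb = f(*a), f(*b)
        if (fa < 0) != (fb < 0):  # sign change: the surface crosses this edge
            t = fa / (fa - fb)
            points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return points

def distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

h = 0.2
# This cell contains the true corner of the square at (0.5, 0.5).
a, b = cell_crossings(square, 0.4, 0.4, h)
error = distance_to_segment((0.5, 0.5), a, b)
print("extracted segment:", a, b)
print("distance from true corner to extracted surface:", error)
```

In this setup the extracted segment cuts diagonally across the cell and misses the corner by about `h / (2 * sqrt(2))`. A finer grid shrinks the chamfer but costs more elements, which matches the over-refinement complaint.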
Implicit surface methods are expensive (compared with the parametric alternative when everything goes right), but they have the advantage of being robust and simple to implement, with far fewer moving parts. So it's a pragmatic choice, why not.
3D generative algorithms might become much better once they can rely on parametric surfaces. Then you can do things like symmetry, flatness, curvature that makes sense, much more naturally. And the mesh generation on top will produce very clean meshes, if it succeeds. That is a crucial missing piece: CAD to mesh is hardly robust with human-generated CAD, so I can't imagine what it'd be with AI-generated CAD. An interesting challenge to be sure.
That’s why you see a lot of hype around setups and benchmarks but not a lot of well-polished products.
This article makes it clear for 3D modeling, but it also applies to code. Human touch is necessary for a commercial product. Otherwise it’s nothing more than a prototype.
It is actually much more difficult to maintain AI code and 3D models than to just make your own.
Either AI can one-shot it without human intervention, or it becomes a pain really quickly.
3D AI models are usually made with a grab bag of techniques. For example, you use diffusion to output color and depth maps from multiple angles, then try to fuse the results into a voxel grid with 3D reconstruction techniques, then run an algorithm like Marching Cubes to get a crappy mesh, and finally decimate the mesh with some algorithm.
So yeah, the output model is built nothing like a 3D artist would have built it. It might look okay, but you won't find any structure in the output.
I don't think it's worth using them as anything in an artist pipeline except maybe as reference for retopo.
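A rough sketch of why the voxel stage gives you triangle counts with no relation to the shape's actual complexity. To be clear about the assumptions: this uses naive boundary-face extraction (one quad per exposed voxel face) as a simpler stand-in for Marching Cubes, and `voxelize_sphere` is a hypothetical substitute for the reconstruction step, not what any real tool does.

```python
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def voxelize_sphere(n, radius_frac=0.4):
    """Set of occupied voxels for a sphere sampled on an n x n x n grid."""
    c = n / 2.0
    r2 = (radius_frac * n) ** 2
    return {
        (i, j, k)
        for i in range(n)
        for j in range(n)
        for k in range(n)
        if (i + 0.5 - c) ** 2 + (j + 0.5 - c) ** 2 + (k + 0.5 - c) ** 2 <= r2
    }

def boundary_triangle_count(voxels):
    """Count triangles if every exposed voxel face becomes two triangles."""
    exposed = sum(
        1
        for (i, j, k) in voxels
        for (di, dj, dk) in NEIGHBORS
        if (i + di, j + dj, k + dk) not in voxels
    )
    return 2 * exposed

for n in (16, 32):
    print(n, boundary_triangle_count(voxelize_sphere(n)))
```

Doubling the grid resolution roughly quadruples the triangle count for the same sphere, and none of those triangles follow anything like an edge loop. An artist would pick a fixed budget and lay the loops where the shape deforms.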
>On the AI model, I cannot. There are no loops. I would have to sculpt it like clay, destroying the texture in the process. It is actually faster to rebuild the entire model from scratch than to try and fix the AI's topology.
To play devil's advocate for a second, it seems like you didn't provide a requirement to the AI on how the handle should be made, then got frustrated that the result doesn't conform to unspoken norms. If I made you this model by just starting with a sphere and sculpting it in ZBrush, you'd get frustrated by the same problem too.
On the other hand, I would expect that the AI could perform the task if you just elongated the handle in the reference image. The same procedure would probably work if the client wanted to add cat ears to the top to make a Mario Tennis clone game, while it might be a whole new commission for human modelers.
Now, would the material mapping still be poor, and would it be a questionable use of electricity? Guilty on both counts, but it's exciting to anyone who just wants to make 3D printed items or low-fidelity video games/mods.
https://taoofmac.com/space/til/2026/02/16/1334
Claude Opus was able to perfectly replicate an angular/functional part without decimating it, so I would expect the next step to be explicitly instructing AI to clean up meshes.
Indeed, for now generative models generate triangle soup without much thought. The same was true for 2D illustrations where generative models like Deep Dream came up with horrendous images with eyes all over, dogs with multitudes of heads and oh did I mention the eyes? That was about 10 years ago. Things changed, models improved, the eyes were tamed. Yes, people had too many or too few fingers but that also changed. From nightmare fuelling imagery with many-eyed dog heads sticking out where you don't want them to fully animated hi-res video only took a decade and things are still speeding up. The triangle soup of current 3D generative models is like the eye soup of Deep Dream, something to remember somewhat fondly which is no longer relevant now.
Modeling is just so insanely time consuming and requires such a broad range of skills to do competently that the last day I have to look at a UV map I'm never going back.
On the plus side, I like the informal writing of the post. You can be serious about business and still be human.
Edit: Firefox reader mode works wonders on this article
“Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis”
Since the author can enumerate the problems and describe them, it’d be interesting to just use the one-shot pickleball racket model as a starting point. Generate it, look at the problems, then ask an agent to build “fixers” for each problem - small scripts (that they don’t need to build themselves!) which address each problem in turn. Then send the first pass AI output through a pipeline of fix scripts to get something far better but not quite there - and do final human tuneups on the result.
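Something like the following is what I have in mind for the "fixers" idea. It's only a sketch under my own assumptions: the mesh representation (a vertex list plus index triples), the three fixer names, and the tiny "triangle soup" input are all hypothetical, and real topology problems like missing edge loops need far more than this.

```python
def weld_vertices(mesh, decimals=6):
    """Merge vertices that coincide after rounding (a crude weld;
    near-duplicates straddling a rounding boundary will not merge)."""
    vertices, faces = mesh
    index_of, welded, remap = {}, [], []
    for v in vertices:
        key = tuple(round(c, decimals) for c in v)
        if key not in index_of:
            index_of[key] = len(welded)
            welded.append(key)
        remap.append(index_of[key])
    return welded, [(remap[a], remap[b], remap[c]) for a, b, c in faces]

def drop_degenerate_faces(mesh):
    """Remove triangles that reference the same vertex more than once."""
    vertices, faces = mesh
    return vertices, [f for f in faces if len(set(f)) == 3]

def drop_duplicate_faces(mesh):
    """Keep the first triangle for each vertex set (winding is ignored)."""
    vertices, faces = mesh
    seen, kept = set(), []
    for f in faces:
        key = frozenset(f)
        if key not in seen:
            seen.add(key)
            kept.append(f)
    return vertices, kept

def run_pipeline(mesh, fixers):
    """Apply each fixer in order, feeding the result to the next one."""
    for fix in fixers:
        mesh = fix(mesh)
    return mesh

# Hypothetical input: two triangles that should share an edge but do not,
# plus one collapsed face and one flipped duplicate.
soup = (
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
     (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0),
     (0.0, 0.0, 0.0)],
    [(0, 1, 2), (3, 4, 5), (0, 6, 1), (2, 1, 0)],
)
vertices, faces = run_pipeline(
    soup, [weld_vertices, drop_degenerate_faces, drop_duplicate_faces]
)
print(len(vertices), "vertices;", faces)
```

Order matters here: welding first is what exposes the collapsed and duplicate faces to the later steps. Each fixer stays small enough that an agent could write and test it in isolation.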
How good a mesh can a human produce in the time that the gen-AI took?
Nothing I tried with it got even close to the level of quality that they were advertising - felt like a bunch of examples were hand-picked, at best.
There's lots of software and tooling, automated and otherwise, to significantly improve topology. This is a very common problem in this space and not acknowledging that is silly. It's not perfect, and remodeling things is indeed a common solution - but retopo addons and software are big business because they're good enough for a whole lot of use cases.
I wrote a guide for this with voxel art and you can see some examples up top: https://www.tyleo.com/blog/game-ready-voxel-meshes-with-low-...
I'll leave here the note I've written down recently, while thinking about this fundamental limitation.
- The relationship between sentient/human thinking and its expression ("language") is similar to the one between abstract/"vector" image specification and its rendered form (which is necessarily pixel-based/rasterised)
- A "truly reasoning" system operates in the abstract/"vector" space, only "rendering" into "raster" space for communication purposes. Today's LLMs, by their natural design, operate entirely in the "raster" space of (linguistic) "tokens". But from an outside point of view the two are superficially indistinguishable.
- Today's LLMs are a brute-force mechanism, made possible by the availability of sheer computing power and ample training material.
- The whole premise of LLMs ("Large" and "Language" being load-bearing words here) is that they completely bypass the need to formalize the "vector" part, to conceptualize it in a useful manner. I call it "raster-vector impedance".
- Even if not formalized, it can be said that the internal "structures" that form within an LLM somehow encode/capture ("isomorphic to" is the phrase I like to use) the semantics ("vector"). I believe the same can be said about "computer vision" ML systems which learn to classify images after being fed billions of them.
- However, I believe that, by nature, such internal encoding is necessarily incomplete and maybe even incorrect.
- Despite the above, LLMs can still be a useful tool in many domains. I think language translation is a task that can be performed very successfully without necessarily "decoding" the emerging underlying structures. I.e. a sentence in the source language can be mapped onto a region of latent space; an isomorphic region of latent space based on the target language can be used to produce an output in the target language which will be representative of an equivalent meaning, from a human perspective. All without explicit conceptual decoding of the underlying token weight matrices. "Black-box" translation, so to speak. I am amazed (and disturbed, and horrified too!) that producing viable code in a programming language from a casual natural-language prompt turned out to be largely a subset of the general translation task. Well, at least on lower levels.
- To me it is intuitive that such a design (brute-force transforms of "rasterized" data instead of explicitly conceptualizing it into "vector" forms) is very limited and, essentially, a dead end.
Obviously it's not spewing $10,000 3D models, but results are much better than what you would get for under $500 from a human 5 years ago.
So yeah, you still need a human art director to make sure the actual source material used for generation fits your art style, but otherwise "good enough" models are 1000 times cheaper and 10000 times faster to get.
I wish I had his confidence (in eCommerce Standards)
We are living in an era of 'Statistical Harvest' where models prioritize a 'good enough' surface over structural integrity. In the spiritual supply chain of value, this is called Cutting Corners. A 3D model that breaks down upon closer inspection lacks what I call Internal Agency: it doesn't understand the 'Seed' of its own geometry. As we move towards an agent-centric world, we must distinguish between 'Generative Noise' and 'Authentic Creation'. True value definition requires a 'Watchman' who can see beyond the first-glance polish to the underlying breakdown of utility.