FRESH

Hacker News

Home

Vera: a programming language designed for machines to write

111 points by unignorant

by danpalmer

14 subcomments

> The empirical literature shows that models are particularly vulnerable to naming-related errors like choosing misleading names, reusing names incorrectly, and losing track of which name refers to which value.
I think Vera might be missing something here. In my experience, LLMs code better the less of a mental model you need, vs the more is in text on the page.
Go – very little hidden, everything in text on the page, LLMs are great. Java, similar. But writing Haskell, it's pretty bad, Erlang, not wonderful. You need much more of a mental model for those languages.
For Vera, not having names removes key information that the model would have, and replaces it with mental modelling of the stack of arguments.

by still_grokking

1 subcomments

Why would anybody use a vibe-coded and vibe-desinged language which effectively does not exist yet instead of an established one with such features, like Scala?
https://arxiv.org/html/2510.11151v1

by eranation

2 subcomments

There are many problems we will need to address in the future. A programming language that is easy for machines to write but hard for humans to read isn’t one of them.

by solomonb

1 subcomments

I think Hindley Milner (for decidability) + Linear Types (for resource management) + Refinement Types (for lightly asserting invariants) + Delimited Continuation based Effects (for tracking effectful code) + Unison style Content Addressability (for corralling code changes, documentation, and tests) would make a really nice language for an LLM.

by rtpg

1 subcomments

The lack of naming seems to indicate a fundamental misunderstanding of how LLM coding agents are successful, and just makes me doubt anything about this project being useful and workable.

by Taikonerd

0 subcomment

> The signature declares types, preconditions, postconditions, and effects. The compiler verifies the contract via SMT solver.
This reminds me of Dafny: https://dafny.org/
Actually, that's an interesting question: how good are LLMs at writing Dafny?

by offbyone42

0 subcomment

I feel like this misses how LLMs work.
Yes, you’re adding this layer of verification, but LLMs don’t think in ASTs or use formal logic.
They are statistical predictors, just predicting what the next token will be.
There is a reason they perform best with TS/PY and not Haskell. The difference in size of the code corpus for each language.
The premise behind this seems to ignore all of that.

by misja111

0 subcomment

> Every function is a specification that the compiler can verify against its implementation.
This has been tried so many times already. It works nice for functions that only do some arithmetic. But in any real life system that pushes data around over the network or to databases, most things will happen inside effects which leaves the compiler clueless as to whether the function implementation does what it's supposed to do or not.
Don't get me wrong, I'm a big fan of using the compiler to improve productivity and I also believe strong typing leverages LLM power. But this kind of function specification is a dead end IMO.

by signorovitch

1 subcomments

> The evidence suggests the biggest problem models face isn't syntax
So then why is the first mentioned and most obvious difference from other languages
> There are no variable names. @Int.0 is the most recent Int binding
LLMs are trained on code written by humans. They are most “familiar” with popular programming languages, have large datasets of examples and idioms to draw on. I don’t see the advantage of inventing a new language the machine must “learn” with syntax unlike anything it’s been trained on.
Validation and testing are also already things we do with human written code, too.

by ginko

1 subcomments

Is there any evidence that using structural references rather than names allows large language models to generate better code? This bit just feels like obfuscation for obfustcation’s sake.

by anilgulecha

0 subcomment

I think this is the wrong path in LLM and SWE optimizations:
1) Programming language training happens by volume, and the amount of JS/TS/python out there, and the rate it's growing at - is causing a training effects loop, which means for a few generations of models, these will be the best performing languages. Will be hard for a contender to spin up.
2) At some point, if we plateau on productivity - then efficiency improvements will happen, which will open a door for programming languages that maintains productivity, but is 10x cheaper on cost.
3) I think more immediate gains are at the cloud level. IMO, one of the reasons Google cloud is performing better(along with firebase) is much better overall CLI experience, leading to a pleasurable experience developing against it. This part of the market is ripe - whoever builds a most LLM friendly cloud has a shot of shooting up. Hence projects like exe.dev, and whatever cloudflare and vercel are trying. It would be good to have some shakeup in the cloud world.
Anyway, this is where my thoughts are currently.

by hyperhello

1 subcomments

> Division by zero is not a runtime error — it is a type error. The compiler checks every call site to prove the divisor is non-zero.
Elaborate a little here.

by unignorant

1 subcomments

This isn't my project, but I shared it here because it has a few important ideas I've been thinking about in my own work. Effect type systems in particular are a really good fit for LLMs because they allow you to reason very precisely about a program's capabilities before runtime (basically, using the type system for capability proofs). This helps you trust agent-created code (for example, you know it can't do IO), or, if the code does require certain capabilities, run it in a sandbox (e.g., mock network or filesystem). This kind of language design also provides a safer foundation for complex meta-systems of agents-that-create-agents, depending on how the runtime is implemented, though Vera may be somewhat limited in that particular respect.
The major design decision I'm a little skeptical about is removing variable names; it would be interesting to see empirical data on that as it seems a bit unintuitive. I would expect almost the opposite, that variable names give LLMs some useful local semantics.

by DonHopkins

1 subcomments

This is exactly the wrong approach. LLMs are good at writing programming languages they already know, that are well represented in the training data, not at writing programming languages that they have never seen before, so that you have to include the entire programming language manual and lots of example code in every prompt.

by rs545837

0 subcomment

I agree 100% with this thinking approach, I've been working in this domain for quite a few months now.
The right granularity for agents isn't files or lines, it's entities: functions, classes, methods. That's how both humans and agents actually think about code.
We built sem(Ataraxy-Labs/sem) which extracts entities from 30+ languages via tree-sitter and builds a cross-file dependency graph, so building semantic version control and semantic diff. weave (same org) takes it further and does git merges at entity level. Matches functions by name, merges their bodies independently.
The dependency graph also answers questions LLMs can't. I love the analysis based on ASTs.

by econ

0 subcomment

I just renamed a dozen variables from short rather poor descriptions to good long ones and replaced array offsets with vars. The code changed from somewhat confusing code into easily readable. I'm usually not a fan of giantCamelCaseVarNames but if I have to map two dozen things to other things in my head my brain starts to lag and the limit of my context window makes me hallucinate.
I do applaud the lang design effort as there are countless routes of accepting Jesus as your savior.

by firebot

1 subcomments

Why not prolog or one of the other logic languages? It's really old, should be lots of good training data for it and the declarative nature would seem to be a great fit for llms.

by tasuki

0 subcomment

> Traditional compilers produce diagnostics for humans: expected token '{'. Vera produces instructions for the model that wrote the code. Every error includes what went wrong, why, how to fix it with a concrete code example, and a spec reference.
Is this a thing for the llms? As a human, I also prefer being told what went wrong and why and how to fix it, rather than `expected {`

by MarceliusK

0 subcomment

The strongest idea here, IMO, is not the syntax but the feedback loop

by weddpros

0 subcomment

It feels wrong to dump identifiers to save tokens: now they're devoid of semantics, and can't be grep'ed or mapped to concepts. CPUs are good with numbers, but LLMs are good with words.

by hahahacorn

0 subcomment

I think the best language for LLMs is going to be as close to English as you can get with the compiler guarantees offered by Vera (or something similar).
Seemingly opposing forces.

by hybrid_study

0 subcomment

I love the ## Why README section! Every repo should have one :-)

by tasuki

0 subcomment

> Models struggle with maintaining invariants across a codebase, understanding the ripple effects of changes, and reasoning about state over time.
I do, too!

by 2001zhaozhao

4 subcomments

> There are no variable names. @Int.0 is the most recent Int binding; @Int.1 is the one before.
You already lost me here. There's a reason variable names are a thing in programming, and that's to semantically convey meaning. This matters no matter whether a human is writing the code or a LLM.

by rickcarlino

0 subcomment

Reminds me of http://cobra-language.com/

by kshri24

0 subcomment

Providing a blackbox to the blackbox to reason. We are screwed

by sas41

0 subcomment

I find the claims regarding LLMs and their mistake prone nature around variable names very confusing.
It appears that me and creator have had vastly different experiences with LLMs and their capabilities with complex code bases and complicated business logic.
My observations point to LLMs being much more successful when variables and methods have explicit, detailed names, it's the best way to keep them on track and minimize the chance of confusion, next closest thing being explicit comments and inline documentation.
Poorly named and poorly documented things in a codebase only cause it to reason more on what it could be, often reaching a (wrong) conclusion, wasting tokens, wasting time.
Perhaps this diversion in philosophy is due to fundamental differences in how we view the tool at hand.
I do not trust the machine, as such I review it's output, and if the variables lacked names, that would be significantly harder. But if I had a "Jesus, take the wheel!" attitude, perhaps I'd care far less.

by eddythompson80

0 subcomment

I’d ask for a refund on the tokens tbh

by peter_d_sherman

0 subcomment

Related:
Nanolang: A tiny experimental language designed to be targeted by coding LLMs
https://news.ycombinator.com/item?id=46684958
https://github.com/jordanhubbard/nanolang
SudoLang: A Powerful Pseudocode Programming Language for LLMs:
https://medium.com/javascript-scene/sudolang-a-powerful-pseu...
Programming Without People: Designing a Language for LLMs:
(ALaS (AI Language Specification)):
https://dshills.medium.com/programming-without-people-design...
LMQL ("LMQL is a programming language for LLMs."):
https://lmql.ai/
Which language is best for AI code generation? The answer might surprise you:
https://revelry.co/insights/artificial-intelligence/which-la...

by steffs

0 subcomment

[dead]

by ConanRus

0 subcomment

[dead]