FRESH

Hacker News

Home

Launch HN: Morph (YC S23) – Apply AI code edits at 4,500 tokens/sec

217 points by bhaktatejas922

by deepdarkforest

9 subcomments

> 1) Raw inference speed matters more than incremental accuracy gains for dev UX—agree or disagree?
I know you are trying to generate some controversy/visibility, but i think if we are being transparent here, you know this is wrong. People prefer using larger (or reasoning) models, with much bigger diff in tok/sec just for quality in coding, it comes first. Even if i have a big edit to apply, like 5k tokens, 200-300ms of difference in edit time are nothing. Edit speed is definitely not a bottleneck for dev UX, quality is. A dev who wants to save 200ms every code change over quality is someone who well, i cannot relate. If im using 1-2 agents in parallel, most of the time the edits are already applied while im reviewing code from the other agents. But again maybe that's just me.
Speaking of quality, how do you measure it? Do you have any benchmarks? How big is the difference in error rate between the fast and large model?

by laborcontract

2 subcomments

Really like this. I've been trying microsoft's copilot and it's so clunky, particularly when applying edits. One would assume they have the resources to train the model..
Request: please provide a system prompt in the docs to help the llm generate the diff format that performs best w/ your models. LLMs frequently change the way they present diffs on upgrades and I don't want to be guessing which format is best.
EDIT: Please clarify your privacy policy. If my interpretation is correct, paying users will have their data retained and trained on? Is there any way to pay to use the service (w/o picking up the phone) and not have my data trained on?
```
  4.1 Use of Service Data

  Depending on your subscription tier:

  Free Tier: We may use your submitted code data to train our models, improve our Services, and develop new features.
  Engineer Tier: We may use your submitted code data to train our models, improve our Services, and develop new features, subject to the confidentiality provisions in your service agreement.
  Enterprise Tier: We do not use your submitted code data for any purpose other than processing your immediate request. Your code data is never used for model training or service improvement.
```
[0] https://morphllm.com/privacy

by weird-eye-issue

1 subcomments

Seems completely broken.
I used the provided HTML example on https://morphllm.com/dashboard/playground/apply. Without editing anything at all, I pressed apply.
Your model added a bunch of CSS even though that wasn't in the update instructions at all. It also added a contact section, which again, wasn't in the update instructions that your demo provided.

by Workaccount2

1 subcomments

Just for clarification here because I am a bit confused,
Morph is a tool for integrating the output of other LLMs and not an LLM itself? It doesn't generate 4500 tok/sec, it can edit 4500 tok/sec?

by Kamshak

1 subcomments

It's more expensive than Gemini flash which can actually write pretty decent code (not just apply a diff). Fast AI edit application is definitely great but that's pretty expensive
Morph v3 fast: Input: $1.20 / M tokens, Output $2.70 / M tokens
Gemini 2.5 Flash: $0.30 / M tokens, Output $2.50 / M tokens
(Source: OpenRouter)

by seanw265

2 subcomments

Last time I looked into Morph, I noticed you weren’t yet on OpenRouter. I see that’s changed, but it looks like only an older model is listed. Any plans to be more active there?
Also, are there any benchmarks comparing your fast apply models to others like Relace or even Llama via Cerebras? I’m particularly interested in output accuracy.

by bijection

1 subcomments

How does this compare to relace, which I believe is also a YC company? They seem to have very similar functionality [0]
[0] https://www.relace.ai/

by krishvs

1 subcomments

Really impressive. I'm in the market for such a solution for our internal AI coding systems - how do you compare to the opensource https://huggingface.co/osmosis-ai/Osmosis-Apply-1.7B?
I am assuming your models are not opensource/openweights?

by nico

1 subcomments

Would be awesome to have a browser extension that could create a bridge between ChatGPT and VSCode, applying Morph in between (or Claude instead of ChatGPT). Essentially use the web interface, instead of the APIs for agentic coding

by scottpersinger

2 subcomments

I’d just like to put a pitch in here for someone to do “smart rebase+merge” with AI. Now THAT would really speed up development, if my AI was intelligently merging code from different users in the background, based on understanding the intent behind each conflicting change.

by zackangelo

1 subcomments

For anyone more curious about how this works, Fireworks wrote a blog post about it last year (I think):
https://fireworks.ai/blog/cursor

by w10-1

1 subcomments

(warning: outside, naive perspective)
> 1) Raw inference speed matters [most] for dev UX—agree or disagree?
Or maybe incremental content-assist and full-file problem-solving are two significantly different uses, though they're both dev UX use cases.
Because they're confusingly similar, comparing them (and denigrate full-file solutions) wastes time/energy. You muddy your own message.
Just concentrate on showing the value of what you do where and when. To wit...
In the inference case, you're really using context to provide affordances -- next steps. In the full-file case, you're starting instead from a goal statement, with context providing constraints.
I think where you want to go is to show when the tool anticipates where you *should* go; i.e., the extent to which it can lead junior developers to the next step, and senior developers to the next constraint/issue they're ignoring.
I believe just as "attention is all you need" surprised people, this kind of bottom-up approach has more legs than people expect.
I understand the naked probability model is trained on world code corpus; what would interest me is whether you can also create a model that learns the developer's biases.
Then the work is to see the issues in the context, but address them in the order and manner that the developer would. Lock-in would occur because, well, the system understands me. And it would be particularly nice when Programmer A wants to code like Programmer B. If your assistant has a model of Programmer B, the assistant could guide Programmer A in that direction.

by FridgeSeal

0 subcomment

> Raw inference speed matters more than incremental accuracy gains for dev UX
Now I can be wrong, faster!

by csomar

1 subcomments

1- I agree that speed is intelligence[+], but you're suggesting we can reduce accuracy, increase speed, and get better results. I don't buy it.
2- I'm confused. Claude Code and a Neovim plugin I used both do edits/diffs. Are you saying they're actually rewriting entire files instead?
3- Aren't "simple tasks" just things you train the model on? If so, are you solving a bunch of simple tasks or offering custom training?
> No more slow full-file rewrites or brittle search-and-replace hacks.
Here's the thing - LLMs are already blazing fast. I commented the other day that you could probably write Chrome's entire code base in a couple months at average speed. The bottleneck isn't speed, it's accuracy; that's of course my opinion.
+: https://omarabid.com/claude-magic

by simonw

2 subcomments

This uses an OpenAI-compatible endpoint, so got this working with my https://llm.datasette.io/ CLI tool.
First I added their models to my ~/Library/Application Support/io.datasette.llm/extra-openai-models.yaml file:
```
  - model_id: morph-auto
    model_name: auto
    api_base: https://api.morphllm.com/v1
    api_key_name: morph
```
Then I added the API key like this:
```
  llm keys set morph
  # Paste in API key from https://morphllm.com/api-keys
```
Then I saved an LLM template with their prompting pattern:
```
  llm -m morph-auto '<code>$code</code><update>$update</update>' --save morph
```
Now I can run operations like this:
```
  llm -t morph -p code "$(cat orig.txt)" -p update "$(cat update.txt)"
```
The -t option is the template I named when I ran --save. The -p name value options then set the content for the template $code and $update variables.
Example transcript here: https://gist.github.com/simonw/de67818603d448a3fee788ace2976...
One thing that worries me: since it's using XML-style tags <code> and <update>, if my own source code contains those tags I expect it may get confused.

by z3ugma

3 subcomments

How do I start using this on a codebase on my local computer? I'm quite confused by the quickstart. Do I use a VSCode extension? One of the Claude Code like clones but with this as a custom model?

by joshmlewis

1 subcomments

I'm going to test implementing this for my project https://promptslice.com and see how it does with text based edits. I assume it will do ok.
I'm also really curious about the XML tool calls in the documentation. I have not heard of this being the norm for tools like Cursor. Is that still the case? I feel like I'm pretty in the know about this stuff but must have missed that trend.

by kordlessagain

0 subcomment

This is what I use and it's free for anyone to use individually: https://github.com/kordless/gnosis-evolve/blob/main/contrib_...

by michaelneale

1 subcomments

Have been using morph for a while (I am one of the authors of goose) and was surprised when introduced at the boost it gave me (much less iteration with the main expensive LLM, and I can even make the editing process simpler to take a load off the agent). Used it with claude 3.5, 3.7, 4 and currently with a o3/openai and anthropic/claude4 + morphllm combo today.

by furyofantares

1 subcomments

Does Claude Code have a similar apply model? It does create diffs for you to accept/reject but then I feel like it's always using a find/replace tool to apply it rather than a model that rewrites the whole file. I don't know how the speed of this approach compares but the accuracy feels great.

by golergka

0 subcomment

I don't think this makes any sense as a standalone product, but I wish I had it in my claude code and aider as an intermediate step right away. The latter tool, while less popular, already supports _exactly_ this workflow in the architect mode — a good candidate to be your first integration, I think.

by handfuloflight

3 subcomments

Is there anyway to bring this into Claude Code?

by strogonoff

0 subcomment

The most important question for any new company based around generative ML: how did you source training data, and did you observe the licesning (e.g., never use code under GPL or default copyright conditions)?
Very few companies can or are willing to answer that.

by dsp_person

1 subcomments

2. Sometimes watching Claude do inline edits I cringe watching it delete almost the entire file line by line. I think when early in exploring a problem it's best to torch the previous version and fully re-write than to assume incremental.

by helsinki

1 subcomments

Doesn’t Apple’s new model do the same thing? https://huggingface.co/apple/DiffuCoder-7B-cpGRPO

0 subcomment

by amelius

1 subcomments

Can't you ask these LLMs to simply output a patch file?
https://man7.org/linux/man-pages/man1/patch.1.html

by elzbardico

1 subcomments

1) Raw inference speed matters more than incremental accuracy gains for dev UX—agree or disagree?
Yeah, I love reviewing and debugging thousands of lines of buggy and dirty AI generated code. Who cannot love it?

by boleary-gl

0 subcomment

Kilo Code team member here
Would love to chat about integrating the models into Kilo Code if you’re interested
You can contact me at brendan [at] kilocode [dot] ai

by agrippanux

1 subcomments

How does this compare to Google Diffusion? Diffusion writes out at seemingly the speed of thought.

by orge

1 subcomments

It would be great to have an integration with Aider or OpenCode.

by pcwelder

1 subcomments

Impressive demo! Is there any rate limit (TPM)?

by bigtimegangstar

1 subcomments

Gotta try this, maybe even a obsidian plugin?

by rs186

1 subcomments

Sounds interesting, but I imagine all the big players (Cursor, Windsurf, and maybe even OpenAI/Anthropic) will achieve something similar very quickly in their tools first-party, which will decimate the company. And I don't get the API part of this -- at the end of the day people use those IDEs, and I don't see developers/companies want to send their code to yet another endpoint.

by shifald

1 subcomments

Nice! Will give it a try. Congrats on the launch

by lastdong

1 subcomments

Is this similar to Gemini Diffusion? Thanks

by callamdelaney

0 subcomment

Yeah sounds like exactly what we need

by b0a04gl

0 subcomment

[dead]

by nimchimpsky

0 subcomment

[dead]

by kruxigt

0 subcomment

[dead]

by saltserv

0 subcomment

[dead]

by Bluestein

3 subcomments

"Let's be wrong faster! It's a feature, we swear!"

by kaycey2022

0 subcomment

[flagged]

by eabeezxjc

0 subcomment

why not ruby?
because ruby no need corecting. It works.

by Qerbz

2 subcomments

Heard some insane rumors of the efficacy increase of this in action even though I don't know how you do it