Hacker News

Nobody likes lag: How to make low-latency dev sandboxes

103 points by mnazzaro

by tuhgdetzhh

1 subcomments

I’m experiencing a similar issue hosting an MCP server on Cloud Run with scale-to-zero for cost optimization. As far as I know, Cloud Functions v2 and Cloud Run are both container-based, and they tend to have noticeable startup times.
In contrast, AWS Lambdas, which run on Firecracker, have sub-second startup latency, often just a few hundred milliseconds.
Is there anything comparable on GCP that achieves similarly low-latency cold starts?

by nicolaslecomte

1 subcomments

Thanks for sharing. Makes a lot of sense that removing that routing layer would improve e2e latency.
We had a similar bottleneck building out our sandbox routing layer, where we were doing a lookup against a centralized DB to route each query. We found that even with a fast KV store, that lookup still added too much overhead. We moved to encoding the routing information (region, cluster ID, etc.) directly into the subdomain/hostname. This allowed us to drop the DB read entirely on the hot path and rely on Anycast + latency-based DNS to route the user to the exact right regional gateway instantly. Also, if you ever find yourselves outgrowing standard HTTP proxies for those long-lived agent sessions, I highly recommend looking at Pingora. It gave us way more control over connection lifecycles than NGINX.
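To make the hostname-encoding idea concrete, here is a minimal sketch. The hostname layout, the `sandboxes.example.com` base domain, and every name in the code are assumptions for illustration, not the scheme described in this comment. The point is only that the gateway can recover region and cluster from the `Host` header with string parsing, with no DB read.

```python
from dataclasses import dataclass

# Hypothetical layout: <sandbox_id>-<cluster_id>.<region>.<base domain>
# The base domain is an assumption, not something from the thread.
BASE_DOMAIN = "sandboxes.example.com"


@dataclass(frozen=True)
class Route:
    sandbox_id: str
    cluster_id: str  # must not contain "-" in this layout
    region: str


def encode_host(route: Route) -> str:
    """Build the hostname handed to the client when the sandbox is created."""
    return f"{route.sandbox_id}-{route.cluster_id}.{route.region}.{BASE_DOMAIN}"


def decode_host(host: str) -> Route:
    """Recover the routing info from a Host header without any lookup."""
    # Drop an optional :port, a trailing dot, and normalize case.
    host = host.split(":", 1)[0].rstrip(".").lower()
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        raise ValueError(f"not a sandbox hostname: {host!r}")
    labels = host[: -len(suffix)].split(".")
    if len(labels) != 2:
        raise ValueError(f"unexpected hostname shape: {host!r}")
    leaf, region = labels
    # Split on the last "-" so sandbox IDs may themselves contain hyphens.
    sandbox_id, sep, cluster_id = leaf.rpartition("-")
    if not (sep and sandbox_id and cluster_id and region):
        raise ValueError(f"missing routing fields: {host!r}")
    return Route(sandbox_id=sandbox_id, cluster_id=cluster_id, region=region)
```

In a setup like this, the regional gateway would call `decode_host` on each request and forward straight to the named cluster. Anything that can change after creation (for example, a sandbox migrating between clusters) would still need some other mechanism.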
On the compute side, sandbox pooling is cool but might kill your unit economics, especially if at some point each tenant has different images. Have you looked into memory snapshots (that way you only pay storage costs, not for full VMs)?

by jasonjmcghee

0 subcomment

If you don’t have control over all these pieces and just need a terminal (ssh / remote vim etc.), I highly recommend https://mosh.org/

by iterateoften

2 subcomments

Why is there all of a sudden an explosion of sandbox-related posts and tools? LLMs and agents always needed sandboxes… was it just the collective consciousness deciding all at once that it mattered and was the area to focus on building tools?

by jpalepu33

1 subcomments

Great write-up on the evolution of your architecture. The progression from 200ms → 14ms is impressive.
The lesson about "delete code to improve performance" resonates. I've been down similar paths where adding middleware/routing layers seemed like good abstractions, but they ended up being the performance bottleneck.
A few thoughts on this approach:
1. Warm pools are brilliant but expensive - how are you handling the economics? With multi-region pools, you're essentially paying for idle capacity across multiple data centers. I'm curious how you balance pool size vs. cold start probability.
2. Fly's replay mechanism is clever, but that initial bounce still adds latency. Have you considered using GeoDNS to route users to the correct regional endpoint from the start? Though I imagine the caching makes this a non-issue after the first request.
3. For the JWT approach - are you rotating these tokens per-session? Just thinking about the security implications if someone intercepts the token.
The 79ms → 14ms improvement is night and day for developer experience. Latency under 20ms feels instant to humans, so you've hit that sweet spot.
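One way to reason about the pool size vs. cold start trade-off in point 1 is the classic Erlang B loss formula. This is only a sketch under stated assumptions: requests arrive as a Poisson process, every sandbox handed out is replaced after a fixed boot time, and a request that finds the pool empty takes a cold start. The function and parameter names are made up for illustration and say nothing about how the article's authors size their pools.

```python
def cold_start_probability(pool_size: int, arrivals_per_sec: float,
                           refill_secs: float) -> float:
    """Erlang B: chance a request finds every pool slot still being refilled."""
    offered_load = arrivals_per_sec * refill_secs  # average slots "in refill"
    blocking = 1.0  # with an empty pool, every request is a cold start
    for k in range(1, pool_size + 1):
        # Standard numerically stable Erlang B recurrence.
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


def smallest_pool(target: float, arrivals_per_sec: float,
                  refill_secs: float, max_pool: int = 10_000) -> int:
    """Smallest warm pool whose cold start probability is at most `target`."""
    offered_load = arrivals_per_sec * refill_secs
    blocking = 1.0
    for size in range(max_pool + 1):
        if size > 0:
            blocking = offered_load * blocking / (size + offered_load * blocking)
        if blocking <= target:
            return size
    raise ValueError("target not reachable within max_pool")
```

For example, `smallest_pool(0.01, arrivals_per_sec=2.0, refill_secs=5.0)` gives the pool size for one region that keeps cold starts under 1% at that traffic level. The idle cost is then that size times the per-sandbox price, per region.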

by rbbydotdev

0 subcomment

With so many apps in need of these sandboxes, I wonder if a browser plugin could be built which provisions a sandbox on the user's computer. A type of infra which could be utilized by different providers. The security implications are a little tough, but the attack surface could likely be reduced with the right practices.

by barishnamazov

1 subcomments

Not directly related, but I can't read the text on my phone. It's too thin; maybe you could increase the font weight a bit?

by imiric

0 subcomment

So they used edge servers? How is this novel or insightful?
This article reads like a thinly veiled ad. Certainly not the best way to start a technical blog. If you didn't have the technical insight to know that physics is a factor in latency, why should I trust you with the problems your product actually solves?

by mlhpdx

2 subcomments

Interesting. It seems to me that client-side prediction and lag compensation (aka the basics for games in similar situations) would have been a viable alternative.
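For readers who haven't seen the game-style technique, here is a rough sketch of client-side prediction applied to typed input. It assumes a simple append-only text buffer and a server that acknowledges keystrokes by sequence number. The class and method names are hypothetical and not taken from the article or from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PredictiveEcho:
    confirmed: str = ""  # last authoritative text received from the server
    pending: List[Tuple[int, str]] = field(default_factory=list)  # unacked keys
    next_seq: int = 0

    def type_key(self, key: str) -> int:
        """Record a keystroke locally; the returned seq is sent to the server."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, key))
        return seq

    def on_server_update(self, acked_seq: int, server_text: str) -> None:
        """Accept the server's state and forget keys it has already applied."""
        self.confirmed = server_text
        self.pending = [(s, k) for s, k in self.pending if s > acked_seq]

    def display(self) -> str:
        """What the user sees: server state plus still-unacked predictions."""
        return self.confirmed + "".join(k for _, k in self.pending)
```

The user sees `display()` update on every keystroke with no round trip. When the server's version disagrees with a prediction, the server wins and only the still-unacknowledged keys are replayed on top of it.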

by nickandbro

0 subcomment

Interesting, I use Cloudflare Containers and it takes roughly 6-7 seconds to boot up using a very lightweight image.

by sam_lowry_

1 subcomments

Shouldn't we stop sending 100 IP packets on every keystroke to start with?

by hinkley

1 subcomments

When Covid hit I wasn’t the only one working remotely at my company, but I was the only one working remotely in North America, and apparently the only one trying to Work Smarter. By then there were a handful of feature toggles I had implemented that I quickly set to always on in development, but chief among them was gzip: compressing service calls was a net loss in AWS but very, very handy while working from home.
I had also switched a head-of-line service call that was, for reasons I never sorted out, costing us 30ms TTFB per request for basically fifty bytes of data, to use a long poll in Consul, because the data was only meant to be changed at most once every half hour and in practice twice a week. So that latency was hidden in the dev sandbox except for startup time, where we had several Consul keys being fetched in parallel and applied in order, so one more was hardly noticeable.
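For anyone unfamiliar with the Consul long poll mentioned above, a minimal sketch of watching one KV key with a blocking query follows. The `index` and `wait` query parameters and the `X-Consul-Index` response header are Consul's documented blocking-query mechanism. The function names, the injectable `fetch` hook, and the key used in the usage note are assumptions for illustration, and a missing key (HTTP 404) is not handled.

```python
import base64
import json
import urllib.request
from typing import Callable, Iterator, Tuple

# A fetcher maps a URL to (X-Consul-Index header value, response body).
Fetch = Callable[[str], Tuple[str, bytes]]


def http_fetch(url: str) -> Tuple[str, bytes]:
    # Timeout is a bit longer than the 5m wait, since the server may hold
    # the request slightly past the requested wait before answering.
    with urllib.request.urlopen(url, timeout=330) as resp:
        return resp.headers["X-Consul-Index"], resp.read()


def watch_key(base_url: str, key: str, fetch: Fetch = http_fetch,
              wait: str = "5m") -> Iterator[str]:
    """Yield the key's value once at startup and again whenever it changes."""
    index = 0  # index=0 returns immediately with the current value
    while True:
        url = f"{base_url}/v1/kv/{key}?index={index}&wait={wait}"
        raw_index, body = fetch(url)
        new_index = int(raw_index)
        if new_index < index:
            index = 0  # index went backwards: start over
            continue
        if new_index == index:
            continue  # wait elapsed with no change: poll again
        index = new_index
        entry = json.loads(body)[0]
        # KV values come back base64-encoded; an empty value is null.
        yield base64.b64decode(entry["Value"] or "").decode()
```

A caller would run something like `for value in watch_key("http://127.0.0.1:8500", "some/key"): apply(value)` in a background thread, so the hot path reads a cached value instead of paying a round trip per request.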
The nasty one, though, was that Artifactory didn’t compress its REST responses, and when you have a CI/CD pipeline that’s been running for six years with half a hundred devs, that response is huge because npm is teh dumb. So our poor UI lead kept having npm install time out, and the UI team’s answer for “my environment isn’t working” started with clearing your downloaded deps and starting over.
They finally fixed it after we (and presumably half of the rest of their customers) complained, but I was on the back 9 of migrating our entire deployment pipeline to Docker, so I had nginx config fairly fresh in my brain and I set them up a forward proxy to do compression termination. It still blew up once a week, but that was better than him spending half his day praying to the gods of chaos.

by alooPotato

1 subcomments

@mnazzaro have you seen fly.io's new sprites.dev offering?

by globular-toast

0 subcomment

This is a problem that doesn't need to exist. Just run stuff locally on your dev machine with 12 cores and 32 GiB of memory. What the hell has happened that we need an entire computing cluster and all the network infrastructure in between just to write software?

by yellow_lead

0 subcomment

> TL;DR: If you want low latency sandboxes, cut out the middlemen and put your servers next to your users.
Valuable insight /s