FRESH

Hacker News

Home

Why I stopped using JSON for my APIs

178 points by barremian

by dfabulich

6 subcomments

> With JSON, you often send ambiguous or non-guaranteed data. You may encounter a missing field, an incorrect type, a typo in a key, or simply an undocumented structure. With Protobuf, that’s impossible. Everything starts with a .proto file that defines the structure of messages precisely.
This deeply misunderstands the philosophy of Protobuf. proto3 doesn't even support required fields. https://protobuf.dev/best-practices/dos-donts/
> Never add a required field, instead add `// required` to document the API contract. Required fields are considered harmful by so many they were removed from proto3 completely.
Protobuf clients need to be written defensively, just like JSON API clients.

by pyrolistical

6 subcomments

Compressed JSON is good enough and requires less human communication initially.
Sure it will blow up in your face when a field goes missing or value changes type.
People who advocate paying the higher cost ahead of time to perfectly type the entire data structure AND propose a process to do perform version updates to sync client/server are going to lose most of the time.
The zero cost of starting with JSON is too compelling even if it has a higher total cost due to production bugs later on.
When judging which alternative will succeed, lower perceived human cost beats lower machine cost every time.
This is why JSON is never going away, until it gets replaced with something with even lower human communication cost.

by pzmarzly

2 subcomments

> With Protobuf, that’s impossible.
Unless your servers and clients push at different time, thus are compiled with different versions of your specs, then many safety bets are off.
There are ways to be mostly safe (never reuse IDs, use unknown-field-friendly copying methods, etc.), but distributed systems are distributed systems, and protobuf isn't a silver bullet that can solve all problems on author's list.
On the upside, it seems like protobuf3 fixed a lot of stuff I used to hate about protobuf2. Issues like:
> if the field is not a message, it has two states:
> - ...
> - the field is set to the default (zero) value. It will not be serialized to the wire. In fact, you cannot determine whether the default (zero) value was set or parsed from the wire or not provided at all
are now gone if you stick to using protobuf3 + `message` keyword. That's really cool.

by davedx

8 subcomments

"Ultra-efficient"
Searched the article, no mention of gzip, and how most of the time all that text data (html, js and css too!) you're sending over the wire will be automatically compressed to...... an efficient binary format!
So really, the author should compare protobufs to gzipped JSON

by codewritero

2 subcomments

I love to see people advocating for better protocols and standards but seeing the title I expected the author to present something which would be better in the sense of supporting the same or more use cases with better efficiency and/or ergonomics and I don't think that protobuf does that.
Protobuf has advantages, but is missing support for a tons of use cases where JSON thrives due to the strict schema requirement.
A much stronger argument could be made for CBOR as a replacement for JSON for most use cases. CBOR has the same schema flexibility as JSON but has a more concise encoding.

by brabel

10 subcomments

Mandatory comment about ASN.1, a protocol from 1984, already did what Protobuf does, with more flexibility. Yes, it's a bit ugly but if you stick to the DER encoding it's really not worse than Protbuf at all. Check out the Wikipedia example:
https://en.wikipedia.org/wiki/ASN.1#Example_encoded_in_DER
Protobuf is ok but if you actually look at how the serializers work, it's just too complex for what it achieves.

by written-beyond

2 subcomments

Idk I built a production system and ensured all data transfers, client to server and server to client were proto buf and it was a pain.
Technically, it sounds really good but the actual act of managing it is hell. That or I need a lot of practice to use them, at that point shouldn't I just use JSON and get on with my life.

by JoshMock

3 subcomments

Protobuf is a great format with a lot of benefits, but it's missing one that I wish it could support: zero-copy. The ability to transport data between processes, services and languages with effectively zero time spent on serialization and deserialization.
It appears possible in some cases but it's not universally the case. Which means that similar binary transport formats that do support zero-copy, like Cap'n Proto, offer most or all of the perks described in this post, with the addition of ensuring that serialization and deserialization are not a bottleneck when passing data between processes.

by morshu9001

2 subcomments

Protos are great. Last time I did a small project in NodeJS, I set up a server that defines the entire API in a .proto and serves each endpoint as either proto or json, depending on the content header. Even if the clients want to use json, at least I can define the whole API in proto spec instead of something like Swagger.
So my question is, why didn't Google just provide that as a library? The setup wasn't hard but wasn't trivial either, and had several "wrong" ways to set up the proto side. They also bait most people with gRPC, which is its own separate annoying thing that requires HTTP/2, which even Google's own cloud products don't support well (e.g App Engine).
P.S. Text proto is also the best static config language. More readable than JSON, less error-prone than YAML, more structure than both.

by recursivecaveat

4 subcomments

My dream binary format is schema driven, as compact and efficient as Capt Proto or such, but just optionally embeds the entire schema into the message. Then we can write a vim plugin that just opens the file in human readable form without having to fish for the schema. Whenever I am using binary formats, it's because I have a list of millions of objects of the same types. Seems to me that you may as well tack 1KB of schema onto a 2GB message and make it self-describing so everyone's life is easier.

by mjmas

0 subcomment

> The same message in Protobuf binary
> → About 23 bytes.
This appears to be a nice ai-generated result. There are already at least 23 bytes of data there (number 42 (1 byte) + string of 5 chars + string of 17 chars + 1 boolean), so that plus field overhead will be more.

by rock_artist

2 subcomments

I feel few points weren’t addressed in the article.
1. Size, biggest problem with JSON can happen when things gets too big. So here other formats might be better. Yet, as a reminder JSON has the binary version named BSON.
2. Zero batteries. JSON is readable by humans but also almost self explanatory format. Most languages has built in or quick drop in for json. Still, it’s easy to implement a limited JSON parser from scratch when in need (eg. Pure on func in C on a tiny device).
Working with Protobuf and MsgPack in the past, You have much more tooling involved especially if data passes between parts written in different languages.
3. Validation, JSON is simple. But there are solutions such as JSON Schema.

by wg0

1 subcomments

Protos don't work out of the box in any browsers as far as I checked last time unless you're willing to deploy a proxy in front to do the translation and it requires extra dependency on the browser as well.
Plus - tooling.
JSON might not be simpler or strict but it gets the job done.
If JSON's size or performance is causing you to go out of business, you surely have bigger problems than JSON.

by socalgal2

0 subcomment

https://github.com/ajv-validator/ajv
138 million downloads from npm in the last week. Yes, you can validate your JSON

by PunchyHamster

1 subcomments

"I've used this optimization technique to make app faster"
The app 20req/sec
The app after optimizations: 20req/sec (It waits for db query anyway)

by madFlasher

0 subcomment

The author should check first the performance of protobuf serialization/deserialiazation in browsers
Due to very native nature of JSON in browsers and node backend it usually also the fastest data format
If for example you have C++ on backend and C++ on frontend - you'll definitely have some performance boost
But for browsers usage the goal is not so obvious

by coffeeaddict1

1 subcomments

> protobuf
As an aside, like all things Google, their C++ library is massive (14mb dll) and painful to build (takes nearly 10 minutes on my laptop).

by wilg

1 subcomments

One of the best parts of Protobuf is that there's a fully compatible JSON serialization and deserialization spec, so you can offer a parallel JSON API with minimal extra work.

by wallaconno

0 subcomment

My criticism is that protobuf is a premature optimization for most projects. I want to love protobuf, but it usually slows down my development pace and it’s just not worth it for most web / small data projects.
Distributing the contract with lock-step deploys between services is a bit much. JSON parsing time and size are not usually key factors in my projects. Protobuf doesn’t pose strict requirements for data validation, so I have to do that anyway. Losing data readability in transit is a huge problem.
Protobuf seems like it would be amazing in situations where data is illegible and tightly coupled such as embedded CAN or radio.

by gethly

0 subcomment

Reading the first few paragraphs and immediately seeing PB made me instantly think of "Every master was once a beginner.".
When you'll go through your own journey, and inevitably end back with json, do write another blog post :) ... we've all been there.

by cheema33

0 subcomment

I use GraphQL. It has a higher learning curve. But it addresses the shortcomings listed by the referenced blog article. It offers type safety, efficiency and modern tooling. And it is also human readable.
If you use good tooling, you can have a mutation change a variable type in the database and that type change is automatically reflected in the middleware/backend and the typescript UI code. Not only that libraries like HotChocolate for asp.net come with built-in functions for filtering, pagination, streaming etc.

by Jemaclus

3 subcomments

"Better than JSON" is a pretty bold claim, and even though the article makes some great cases, the author is making some trade-offs that I wouldn't make, based on my 20+ year career and experience. The author makes a statement at the beginning: "I find it surprising that JSON is so omnipresent when there are far more efficient alternatives."
We might disagree on what "efficient" means. OP is focusing on computer efficiency, where as you'll see, I tend to optimize for human efficiency (and, let's be clear, JSON is efficient _enough_ for 99% of computer cases).
I think the "human readable" part is often an overlooked pro by hardcore protobuf fans. One of my fundamental philosophies of engineering historically has been "clarity over cleverness." Perhaps the corollary to this is "...and simplicity over complexity." And I think protobuf, generally speaking, falls in the cleverness part, and certainly into the complexity part (with regards to dependencies).
JSON, on the other hand, is ubiquitous, human readable (clear), and simple (little-to-no dependencies).
I've found in my career that there's tremendous value in not needing to execute code to see what a payload contains. I've seen a lot of engineers (including myself, once upon a time!) take shortcuts like using bitwise values and protobufs and things like that to make things faster or to be clever or whatever. And then I've seen those same engineers, or perhaps their successors, find great difficulty in navigating years-old protobufs, when a JSON payload is immediately clear and understandable to any human, technical or not, upon a glance.
I write MUDs for fun, and one of the things that older MUD codebases do is that they use bit flags to compress a lot of information into a tiny integer. To know what conditions a player has (hunger, thirst, cursed, etc), you do some bit manipulation and you wind up with something like 31 that represents the player being thirsty (1), hungry (2), cursed (4), with haste (8), and with shield (16). Which is great, if you're optimizing for integer compression, but it's really bad when you want a human to look at it. You have to do a bunch of math to sort of de-compress that integer into something meaningful for humans.
Similarly with protobuf, I find that it usually optimizes for the wrong thing. To be clear, one of my other fundamental philosophies about engineering is that performance is king and that you should try to make things fast, but there are certainly diminishing returns, especially in codebases where humans interact frequently with the data. Protobufs make things fast at a cost, and that cost is typically clarity and human readability. Versioning also creates more friction. I've seen teams spend an inordinate amount of effort trying to ensure that both the producer and consumer are using the same versions.
This is not to say that protobufs are useless. It's great for enforcing API contracts at the code level, and it provides those speed improvements OP mentions. There are certain high-throughput use-cases where this complexity and relative opaqueness is not only an acceptable trade off, but the right one to make. But I've found that it's not particularly common, and people reaching for protobufs are often optimizing for the wrong things. Again, clarity over cleverness and simplicity over complexity.
I know one of the arguments is "it's better for situations where you control both sides," but if you're in any kind of team with more than a couple of engineers, this stops being true. Even if your internal API is controlled by "us," that "us" can sometimes span 100+ engineers, and you might as well consider it a public API.
I'm not a protobuf hater, I just think that the vast majority of engineers would go through their careers without ever touching protobufs, never miss it, never need it, and never find themselves where eking out that extra performance is truly worth the hassle.

by brunoluiz

0 subcomment

I am curious why the author did not consider ConnectRPC (http://connectrpc.com/), which could be a great middle ground since it is compatible with both Protobuf and JSON served APIs. It is developed by Buf, which has been a leader Protobuf tooling.

by asa400

0 subcomment

Whenever people bring up binary formats I have to bring up Erlang External Term Format (ETF)[0].
It's the native format Erlang nodes use to serialize data to communicate with each other, but it's just a simple Type-Length-Value binary format, so anything can implement it.
It's small enough that you can create a reasonably complete and fast implementation of it for your language in an afternoon. It's self-describing, so if you can read ETF you can read any ETF message, as there are no out of band schemas. I love it.
[0] - https://www.erlang.org/doc/apps/erts/erl_ext_dist.html

by lenkite

0 subcomment

There is no mention in this thread of how the author is using Dart and Shelf for his REST API's. Code is rather readable and elegant. This is not a combo I have every tried before. Does anyone have any experience of how it compares versus REST services written in Go/Rust/Python ?

by knallfrosch

0 subcomment

I have never encountered the use case where the data sent by the backend over the normal CRUD operations was the bottleneck.
But we have built protobuf into a web server that handles 2 requests per second. Why? We wanted to learn about it on the job.
I think that's 99% of Protobuf usage.

by teleforce

1 subcomments

Kudos to the poster and the author of this article. I think this is by far the most insightful technical post I've read this year on HN.
>If you develop or use an API, there’s a 99% chance it exchanges data encoded in JSON.
Just wondering if the inherent defiencies of JSON can somewhat be improved by CUE lang since the former is very much pervasive and the latter understand JSON [1],[2].
[1] Configure Unify Execute (CUE): Validate, define, and use dynamic and text‑based data:
https://cuelang.org/
[2] Cue – A language for defining, generating, and validating data:
https://news.ycombinator.com/item?id=20847943

by beders

0 subcomment

Know the consumer of your API.
If that is just your team, use whatever tech gets you there quick.
However, if you need to provide some guarantees to a second or third party with your API, embrace standards like JSON, even better, use content negotiation.

by globular-toast

0 subcomment

> An API (Application Programming Interface) is a set of rules that allow two systems to communicate. In the web world, REST APIs ... are by far the most widespread.
I too had this overly restrictive view of "APIs" for too long. One I started to think about it as the interface between any teo software components it really changed the way I did programming. In other words, a system itself is composed and that composition is done via APIs. There's no point treating "the API" as something special.

by pjacotg

0 subcomment

On the human readability concern, we use protobuf converted to text format. It looks JSON like so very readable and comes with all the other benefits of protobuf.

by cryptonector

0 subcomment

If you're gonna switch from JSON to... PB, then you might as well switch to flatbuffers before you realize it's better than PB and save yourself a lot of trouble.

by keithnz

0 subcomment

CBOR is a pretty good middle ground

by umvi

0 subcomment

Protobufs are a pain to debug and maintain compared to json and modern browsers support zstd compression making json "efficient"

by protobuf-ai

0 subcomment

I started working with protobuf on a project, and just went down the path of MCP. If anybody would like to try it out, it's here: https://www.protobuf.ai/. Just a lightweight MCP server for schemas that plugs into a schema registry.

by zabil

0 subcomment

I have a slight dislike for JSON+REST for API's.
The design overhead involved in determining the correct URL and HTTP method adds a layer of subjectivity to the design and bike shedding arguments.
I’m not a huge fan of Protobuf/GRPC either, if there’s a better alternative I believe RPC is the right approach for exposing APIs.

by sergiotapia

1 subcomments

Personally, I looked into protobuf for our Elixir/React Native wombocombo but the second I realized we would have to deploy app updates when we added or removed a field from the response structure it became a non-starter.
I can't imagine using protobuf when you're in the first 5 years of a product.

by stanfordkid

0 subcomment

They didn't even mention MessagePack. Also there is a huge amount of developer over-head for using things like ProtoBuf. You can always validate your API responses with Zod or JSONSchema so that is a bit of a moot point!

by undefeated

0 subcomment

Complaining about JSON and then proceeding to write your API in Dart is... interesting.

by baquero

0 subcomment

If you want to work with ProtoBuf as with other APIs, have a look at https://github.com/qaware/protocurl

by dhussoe

1 subcomments

I like that JSON parsing libraries (Serde etc.) allow you to validate nullability constraints at parse-time. Protobuf's deliberate lack of support for required fields means that either you kick that down to every callsite, or you need to build another parsing layer on top of the generated protobuf code.
Now, there is a serde_protobuf (I haven't used it) that I assume allows you to enforce nullability constraints but one of the article's points is that you can use the generated code directly and:
> No manual validation. No JSON parsing. No risk of type errors.
But this is not true—nullability errors are type errors. Manual validation is still required (except you should parse, not validate) to make sure that all of the fields your app expects are there in the response. And "manual" validation (again, parse don't validate) is not necessary with any good JSON parsing library, the library handles it.

by jtrn

0 subcomment

I like Python-like indentation, but I usually read Python in an IDE or code blocks. JSON in a non-monospace environment might be problematic with some fonts. Hell, I pass JSON around in emails and word processors all the time.

by whatevaa

0 subcomment

I can use text based API's (like with JSON) with nothing else but text editor and curl. No other tooling actually required. Meanwhile if binary protocol tooling in your stack sucks, then it just sucks.
Had the joy of implementing calling SOAP service with client generated from wsdl in .net core 2/3 times. Tooling was shit poorly undocumented amd with crappy errors at that time. Run only with very specific versions in very specific way. And not much you can do about it, rolling your own SOAP client would be too expensive for our team.
REST with JSON meanwhile is easy and we do it all the time, don't need any client, just give us request/response spec and any docs.

by borplk

0 subcomment

If you have a protobuf API, does it work in the js environment of browsers? Last time I checked (many years ago) the browser story wasn't good.

by lazy_afternoons

0 subcomment

IIRC from the red wild boar book, JSONs biggest win is that it got the adoption.
Getting everyone to agree on a standard was/is/will be the tougher part.

by spagoop

4 subcomments

Is it just me or is this article insanely confusing? With all due respect to the author, please be mindful of copy editing LLM-assisted writing.
There is a really interesting discussion underneath of this as to the limitations of JSON along with potential alternatives, but I can't help but distrust this writing due to how much it sounds like an LLM.

by liampulles

0 subcomment

I know that OpenAPI code gen support is spotty, and that protobuf codegen (in my experience) is quite good, but all of this starts from the idea that the SaaS I'm consuming has actually documented their API properly.
A sizable portion of the integrations I've built have had to be built by hand, because there are inevitable stupid quirks and/or failures I've had to work around. For these usecases, using JSON is preferable, because it is easy for me to see what I have actually been sent, not what the partially up to date spec says I should've been sent.
This is consistent with the idea that communication over the internet should consist of (encrypted and compressed) plain text. It's because human beings are going to have to deal with human reality at the end of the day.

by catchmeifyoucan

3 subcomments

I wonder if we can write an API w/ JSON the usual way and change the final packaging to send it over protobuf.

by _el1s7

0 subcomment

Why I stopped caring about "Why I stopped [insert something widely used here]" click bait articles

by bzmrgonz

0 subcomment

What about TOON op? I understand that's the standard poised to takeover from Json. Your thoughts??

by taco_emoji

0 subcomment

i really hate this blog post format of posting the new thing you discovered as if it's objectively better than the previous thing. no it's not, you just like it better, which is FINE, just own it

by _heimdall

0 subcomment

I need to dig deeper into Protobuf. I've never quite understood the benefit of Protobug over XML.

by firemelt

0 subcomment

this article makes me keep json

by loph

1 subcomments

How many times has this problem been "solved"?
https://en.wikipedia.org/wiki/DCE/RPC
DCE/RPC worked in 1993, and still does today.
Protocol buffers is just another IDL.

by esafak

1 subcomments

It's premature and maybe presumptuous of him to be advertising protobufs when he hasn't heard of the alternatives yet. I'll engage the article after he discovers them...

by rizky05

0 subcomment

[dead]

by volemo

0 subcomment

S-expressions exist since 1960, what more do you need? /s