If you're providing a service that needs to keep running, you need a strategy for handling unexpected errors. It can be as simple as "fail the request" or "reboot the system", or more complicated. But you need to consider system requirements and the recovery strategy for meeting them when you're writing your code.
Long, long ago I worked with some engineers who thought it was just fine that our big piece of (prototype) telecom equipment took half an hour to boot because of poor choices on their part. Target availability for the device was five nines, which allows about 5 minutes of downtime per year. They didn't seem to see the contradiction.
For example, division by zero often indicates an "unexpected" error, but it wouldn't if you were implementing a spreadsheet.
So to me the approach of using different forms of error reporting for the two kinds of error doesn't seem promising: if you imagine you had to implement division yourself, which kind of error should it report? Should you have two variants of every fallible function so the caller can choose?
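For what it's worth, Rust actually does ship both variants for integer division, so the caller chooses which kind of error it is: the `/` operator panics on a zero divisor (the "unexpected" flavor), while `checked_div` returns an `Option` so a zero divisor is just another result (the "expected" flavor). A minimal sketch:

```rust
// Rust ships both flavors of integer division; the caller picks one.

fn divide_or_panic(a: i32, b: i32) -> i32 {
    // The "unexpected" flavor: `/` panics on a zero divisor,
    // treating it as a bug in the caller.
    a / b
}

fn main() {
    // The "expected" flavor: checked_div returns None instead of
    // panicking, so a zero divisor is an ordinary outcome.
    assert_eq!(10i32.checked_div(2), Some(5));
    assert_eq!(10i32.checked_div(0), None);

    // Catch the panic purely to demonstrate it; real code would either
    // validate the divisor up front or let the panic propagate.
    let outcome = std::panic::catch_unwind(|| divide_or_panic(10, 0));
    assert!(outcome.is_err());
}
```

Of course, doubling every fallible function this way doesn't scale past a handful of primitives, which is rather the point of the question.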
Part of the problem is that devs often can't tell the difference between an error and a negative result. For example, I once worked in a code base that threw errors when a database query came back empty. That's not an error, that's a result! Errors should be _exceptional_ cases, like the database connection dropping, or the user providing bad input that makes the query impossible.
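That distinction can be made explicit in the return type. A sketch in Rust, with a hypothetical `find_user` over an in-memory store (the error variants are invented for illustration, not any real database API): an empty lookup is `Ok(None)`, while a dropped connection or bad input is an `Err`.

```rust
use std::collections::HashMap;

// Hypothetical failure modes, for illustration only.
#[derive(Debug, PartialEq)]
enum QueryError {
    ConnectionLost,
    BadInput(String),
}

// Ok(Some(_)): a row matched. Ok(None): the query ran fine and simply
// returned nothing -- a result, not an error. Err(_): something
// exceptional actually went wrong.
fn find_user(
    db: &HashMap<u32, String>,
    connected: bool,
    id: u32,
) -> Result<Option<String>, QueryError> {
    if !connected {
        return Err(QueryError::ConnectionLost);
    }
    if id == 0 {
        return Err(QueryError::BadInput("id 0 is reserved".into()));
    }
    Ok(db.get(&id).cloned())
}

fn main() {
    let mut db = HashMap::new();
    db.insert(1, "alice".to_string());

    assert_eq!(find_user(&db, true, 1), Ok(Some("alice".to_string())));
    assert_eq!(find_user(&db, true, 2), Ok(None)); // empty: not an error
    assert_eq!(find_user(&db, false, 1), Err(QueryError::ConnectionLost));
}
```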
Errors as happy path control flow also drive me nuts.
It led to a lot of boilerplate, and with hindsight it's generally seen as a bad design choice, as far as I know, because the line between the two kinds of error is not so clear to draw.
As a first-order approximation it's good to learn about, though, if you haven't heard of the concept. But I'd say this idea is usually introduced in the first programming course of an undergraduate program.
> Unexpected: function must be called with a non-empty string, and someone didn't
These seem like the same thing, I don't get why they are treated differently.
But there are some obvious follow up questions that I do think need better answers:
Why is recovery made so hard in so many languages?
Error recovery really feels like an afterthought. Sometimes that's acceptable, as with "scripting" languages, but the poor ergonomics and design of recovery systems is a baffling omission. We deserve better options for this kind of control flow.
Also, why do so many languages make it so hard to enumerate the possible outcomes of a computation?
Java's checked exceptions tried to ensure every method declared in its signature how it could succeed or fail. That went so poorly that we simply put everything under RuntimeException and gave up. Yet resilient, production-grade software still needs to know how things can fail, and which failures indicate a recoverable situation versus a process crash and restart.
Languages seem to want to treat all failures as categorically similar, yet they clearly are not. Recovery/retry, logging, and accumulation are all control-flow paths that production code needs to express when errors occur.
Following programming language development, the only major advancement I've noticed myself is the push to put more of the outcomes of a computation into its values, and then use a type system to constrain those values. That has helped with the enumeration aspect, leaving exceptions mainly to crash the system.
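In Rust terms that combination looks roughly like this (a sketch; the `FetchError` type, its variants, and their retryable/fatal classification are invented for illustration): the failure modes are enumerated in one type, and an exhaustive match forces the caller to decide which ones are retryable and which should crash the process.

```rust
#[derive(Debug)]
enum FetchError {
    Timeout,         // transient: retry
    RateLimited,     // transient: back off and retry
    InvalidResponse, // a bug somewhere: crash and restart
}

// Stand-in for a real network call: fails with a timeout on the
// first attempt, succeeds afterward.
fn fetch(attempt: u32) -> Result<String, FetchError> {
    if attempt == 0 {
        Err(FetchError::Timeout)
    } else {
        Ok("payload".to_string())
    }
}

fn fetch_with_retry(max_attempts: u32) -> String {
    for attempt in 0..max_attempts {
        match fetch(attempt) {
            Ok(body) => return body,
            // The compiler forces every variant to be classified:
            // adding a new FetchError variant breaks this match.
            Err(FetchError::Timeout) | Err(FetchError::RateLimited) => continue,
            Err(e @ FetchError::InvalidResponse) => {
                panic!("unrecoverable: {:?}", e)
            }
        }
    }
    panic!("gave up after {} attempts", max_attempts)
}

fn main() {
    assert_eq!(fetch_with_retry(3), "payload");
}
```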
The other advancement has been algebraic effects, which feel like the first genuinely new approach I've observed. Yet the feature is decried as too academic and/or complex. Yes, error handling is complex, and writing crappy software is easy.
Maybe AI will help us climb past the crabs in the bucket that is error handling.
"Type I error, or a false positive, is the incorrect rejection of a true null hypothesis in statistical hypothesis testing. A type II error, or a false negative, is the incorrect failure to reject a false null hypothesis"