Just a few minutes ago, while copying 63 GB worth of pics and videos from my phone to my laptop, KDE forwarded me the error "File <hard to retain name.jpg> could not be opened. Retry, Ignore, Ignore all, Cancel".
This was around file 7000 out of 15000. The file transfer stopped until I made a choice.
As a user, what am I supposed to do with such a popup?
It seems like a very good example of "Error Handling Without Purpose" as the article describes, but at user level.
Except that here, the audience is "a plain user who just dragged a folder to make a copy" and none of the four options (or even the act of stopping the file transfer until an answer is chosen) is actually meaningful for the user.
The "Putting It Together" for this scenario should look like: a non-modal section populates with "file <hard to retain name.jpg> failed due to reason; at the end of the file transfer you'll get a list with all the files that failed, and you'll have an option to retry them, navigate to their source position to double-check, and/or ignore".
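A minimal sketch of that "collect now, report at the end" flow, in Rust to match the article's language. This is not how KDE does it; `FailedCopy`, `TransferReport`, `copy_all` and `retry_failed` are names made up for illustration, and a real file manager would also have to deal with directories, progress reporting and cancellation.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// One file that could not be copied, with the reason kept for the final list.
#[derive(Debug)]
pub struct FailedCopy {
    pub source: PathBuf,
    pub reason: io::Error,
}

/// Outcome of a whole transfer. Nothing in here blocks on user input.
#[derive(Debug, Default)]
pub struct TransferReport {
    pub copied: usize,
    pub failed: Vec<FailedCopy>,
}

/// Copies every file in `sources` into `dest_dir`, recording failures
/// instead of stopping at the first one.
pub fn copy_all(sources: &[PathBuf], dest_dir: &Path) -> TransferReport {
    let mut report = TransferReport::default();
    for source in sources {
        let result = match source.file_name() {
            Some(name) => fs::copy(source, dest_dir.join(name)).map(|_| ()),
            None => Err(io::Error::new(
                io::ErrorKind::InvalidInput,
                "path has no file name",
            )),
        };
        match result {
            Ok(()) => report.copied += 1,
            Err(reason) => report.failed.push(FailedCopy {
                source: source.clone(),
                reason,
            }),
        }
    }
    report
}

/// Retries only the files that failed earlier and returns a fresh report.
pub fn retry_failed(report: &TransferReport, dest_dir: &Path) -> TransferReport {
    let sources: Vec<PathBuf> = report.failed.iter().map(|f| f.source.clone()).collect();
    copy_all(&sources, dest_dir)
}
```

The UI would show `report.failed` once the transfer is done and offer retry (via `retry_failed`), "open source location" or ignore per entry, rather than a modal dialog at file 7000.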
    Err(report) => {
        // For machines: find and handle the structured error
        if let Some(err) = find_error::<StorageError>(&report) {
            if err.status == ErrorStatus::Temporary {
                return queue_for_retry(report);
            }
            return Err(map_to_http_status(err.kind));
        }
They get it right elsewhere when they describe errors for machines as being "flat and actionable." `StorageError` is that, but the outer `Err(report)` is not. You shouldn't be guessing which types of error you might run into; you should be exhaustively enumerating them.
I'd rather have something like this:
    use std::panic::Location;

    struct Exn<T> {
        trace: Trace, // human-facing context, one entry per layer
        err: T,       // machine-facing structured error
    }
    impl<T> Exn<T> {
        #[track_caller]
        fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
            Exn {
                trace: self.trace.add_context(Location::caller(), msg),
                err: self.err.into(),
            }
        }
    }
That way your `err` field is always a structured error, but you still get a context trace. With a bit more tweaking, you can make the trace tree-shaped rather than linear, too, if you want.
I think actionable error types need to be exhaustively matchable, at least for any Rust error that you expect a machine to be handling. Details a human is interested in can be preserved at each layer by the trace, while details the machine cares about will be pruned and reinterpreted at every layer, so the machine-readable info is kept flat, relevant, and matchable.
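To make the "pruned and reinterpreted at every layer" part concrete, here is a rough usage sketch. `Trace` is filled in with a guess at what it might look like, and `Exn::new`, `StorageError`, `ApiError`, `read_blob`, `get_avatar` and `http_status` are all hypothetical names for illustration, not anything from the article.

```rust
use std::panic::Location;

/// Human-facing context, one frame per layer (assumed shape).
#[derive(Debug, Default)]
struct Trace {
    frames: Vec<String>,
}

impl Trace {
    fn add_context(mut self, at: &Location<'_>, msg: String) -> Trace {
        self.frames.push(format!("{msg} (at {}:{})", at.file(), at.line()));
        self
    }
}

#[derive(Debug)]
struct Exn<T> {
    trace: Trace,
    err: T,
}

impl<T> Exn<T> {
    fn new(err: T) -> Exn<T> {
        Exn { trace: Trace::default(), err }
    }

    #[track_caller]
    fn wrap<U: From<T>>(self, msg: String) -> Exn<U> {
        Exn {
            trace: self.trace.add_context(Location::caller(), msg),
            err: self.err.into(),
        }
    }
}

/// Lower layer: what the storage code can report.
#[derive(Debug, PartialEq)]
enum StorageError {
    NotFound,
    Timeout,
}

/// Upper layer: only what the HTTP handler needs to act on.
#[derive(Debug, PartialEq)]
enum ApiError {
    Missing,
    Retryable,
}

impl From<StorageError> for ApiError {
    fn from(e: StorageError) -> ApiError {
        match e {
            StorageError::NotFound => ApiError::Missing,
            StorageError::Timeout => ApiError::Retryable,
        }
    }
}

/// Stand-in for real storage: fails in two distinct ways for the demo.
fn read_blob(key: &str) -> Result<Vec<u8>, Exn<StorageError>> {
    match key {
        "" => Err(Exn::new(StorageError::NotFound)),
        "slow" => Err(Exn::new(StorageError::Timeout)),
        _ => Ok(key.as_bytes().to_vec()),
    }
}

fn get_avatar(user: &str) -> Result<Vec<u8>, Exn<ApiError>> {
    // The machine-readable error is converted; the human context is appended.
    read_blob(user).map_err(|e| e.wrap(format!("loading avatar for {user:?}")))
}

/// Exhaustive match: adding an `ApiError` variant is a compile error here.
fn http_status(e: &ApiError) -> u16 {
    match e {
        ApiError::Missing => 404,
        ApiError::Retryable => 503,
    }
}
```

The handler never has to guess which type is buried inside a report: it gets an `ApiError` and the compiler checks that every variant is handled, while `trace.frames` still carries the per-layer messages for whoever reads the logs.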
This seems akin to complaining that the CPU core has only one instruction pointer. There is nothing preventing a struct implementing `Error` from aggregating other errors (such as validation results) and still exposing them via the `Error` trait. The fact of the matter is that the call stack is linear, so the interior node in the tree the author wants still needs to provide the aggregate error reporting that reflects the call stack that was lost with the various returns. Nothing about that error type implementing `Error` prevents it from also implementing another error reporting trait that reflects the aggregate errors in all of the underlying richness with which they were collected.
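Roughly what I mean, as a sketch: `FieldError`, `ValidationErrors`, `validate` and the extra `AggregateError` trait are invented for this example; only `std::error::Error` and its `source` method are real standard library API.

```rust
use std::error::Error;
use std::fmt;

/// A single validation failure.
#[derive(Debug)]
struct FieldError {
    field: &'static str,
    problem: &'static str,
}

impl fmt::Display for FieldError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.field, self.problem)
    }
}

impl Error for FieldError {}

/// The interior node: aggregates every failure that was collected.
#[derive(Debug)]
struct ValidationErrors {
    failures: Vec<FieldError>,
}

impl fmt::Display for ValidationErrors {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} validation failure(s)", self.failures.len())
    }
}

impl Error for ValidationErrors {
    // The linear view that `Error` offers: expose the first failure.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.failures.first().map(|e| e as &(dyn Error + 'static))
    }
}

/// Hypothetical second trait for consumers that want the whole tree.
trait AggregateError: Error {
    fn children(&self) -> Vec<&(dyn Error + 'static)>;
}

impl AggregateError for ValidationErrors {
    fn children(&self) -> Vec<&(dyn Error + 'static)> {
        self.failures
            .iter()
            .map(|e| e as &(dyn Error + 'static))
            .collect()
    }
}

/// Collects all problems instead of returning at the first one.
fn validate(name: &str, age: i32) -> Result<(), ValidationErrors> {
    let mut failures = Vec::new();
    if name.is_empty() {
        failures.push(FieldError { field: "name", problem: "must not be empty" });
    }
    if age < 0 {
        failures.push(FieldError { field: "age", problem: "must not be negative" });
    }
    if failures.is_empty() {
        Ok(())
    } else {
        Err(ValidationErrors { failures })
    }
}
```

Generic tooling that only knows `Error` still gets something sensible out of `source()`, and code that knows about `AggregateError` can walk every child.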
With regard to context for the programmer, I still think tracing and color_eyre (see https://docs.rs/color-eyre/latest/color_eyre/) ultimately form a good-enough pair for service-style applications, with tracing providing the missing additional context. But it's nice to see a simpler approach to actionability.
The cause of an error can be upstream or downstream. If a function fails because the network is down, this is a downstream error. The user has not done anything wrong (unless they are also responsible for the network infrastructure), so a retry after a few moments might be the right approach. If the user provides bad function arguments, on the other hand, the cause is upstream and the user needs to be told that they are the one who has to make corrections. It is not always clear which case applies, though: if a user requests a non-existent file, there can be different reasons why the file does not exist (yet).
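One way to encode that distinction, as a sketch only: `Fault`, `Action` and `decide` are names invented here, and the `Unknown` variant is my reading of the "not always clear" case such as the missing file.

```rust
use std::time::Duration;

/// Where the cause of a failure lies, from the callee's point of view.
#[derive(Debug, PartialEq)]
enum Fault {
    /// The caller passed something wrong; only they can fix it.
    Upstream { hint: String },
    /// A dependency (network, disk, remote service) failed.
    Downstream { retry_after: Duration },
    /// Could be either, e.g. a file that does not exist (yet).
    Unknown,
}

/// What the handling code should do about it.
#[derive(Debug, PartialEq)]
enum Action {
    TellCaller(String),
    RetryAfter(Duration),
    Escalate,
}

fn decide(fault: &Fault) -> Action {
    match fault {
        Fault::Upstream { hint } => Action::TellCaller(hint.clone()),
        Fault::Downstream { retry_after } => Action::RetryAfter(*retry_after),
        // No safe automatic choice: surface it with as much context as possible.
        Fault::Unknown => Action::Escalate,
    }
}
```

The useful property is that the direction of the fault is decided where the error is created, which is usually the only place that has enough information to tell.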
- the ? operator is replaced either by runtime exceptions, which propagate through every function that doesn't catch them, or by simply declaring the raised exception in the signature
- message can be overloaded for humans
- the exception type itself is the structured data, but in practice it seldom contains structured data and most logic depends on the exception type.
Make of this what you will, but I didn’t say it’s great.
https://github.com/upspin/upspin/blob/master/errors/errors.g...
    type Error struct {
        // Path is the Upspin path name of the item being accessed.
        Path upspin.PathName
        // User is the Upspin name of the user attempting the operation.
        User upspin.UserName
        // Op is the operation being performed, usually the name of the method
        // being invoked (Get, Put, etc.). It should not contain an at sign @.
        Op Op
        // Kind is the class of error, such as permission failure,
        // or "Other" if its class is unknown or irrelevant.
        Kind Kind
        // The underlying error that triggered this one, if any.
        Err error
        // Stack information; used only when the 'debug' build tag is set.
        stack
    }
NotFound should instead have an instruction "create this object first using that SOP" or "stop the transaction from going through"
Ratelimited has an instruction "try again in x ms" or "raise your rate limit following this SOP"
PermissionDenied has an instruction "request permissions here" or "complete this oauth"
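A sketch of what that could look like in Rust, where each variant carries the data needed for its instruction. `RequestError`, `next_step` and the field names are hypothetical; the SOP identifiers and URL would come from wherever your organization keeps them.

```rust
use std::time::Duration;

/// Each variant carries what the caller needs in order to act on it.
#[derive(Debug)]
enum RequestError {
    NotFound { create_sop: &'static str },
    RateLimited { retry_after: Duration, raise_limit_sop: &'static str },
    PermissionDenied { request_url: &'static str },
}

impl RequestError {
    /// The instruction attached to the error, rendered for a human.
    fn next_step(&self) -> String {
        match self {
            RequestError::NotFound { create_sop } => {
                format!("create this object first using {create_sop}")
            }
            RequestError::RateLimited { retry_after, raise_limit_sop } => format!(
                "try again in {} ms, or raise your rate limit following {raise_limit_sop}",
                retry_after.as_millis()
            ),
            RequestError::PermissionDenied { request_url } => {
                format!("request permissions at {request_url}")
            }
        }
    }
}
```

A machine can match on the variant and use `retry_after` directly, while a human gets the rendered `next_step()` text.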
As far as the flat error definition goes, I think that rather than simple, it's easy. It's simpler to have each module define its own errors and have dedicated code that translates them to the library's errors, rather than keeping the translation and the equivalences between different modules' errors in the programmer's head and in code comments on the big error definition file.
I like errors that are unique and trivially greppable in a codebase. They should be stack efficient and word sized. Maybe a new calling convention where a register is reserved for error code and another register is a pointer to the source location string that is stored in a data segment.
The FP fanboy side of me likes the idea of algebraic effects and ADTs but not at the expense of stack efficiency.
It may be easier to just add the "?" operator everywhere (and we are lazy and will mostly do what is easier), but it often leads to the problem explained in the article.