There was an interesting discussion yesterday on the Elixir Slack about how libraries should handle errors. This is a more thought-through and elaborate expression of my views on the matter. In the post, I’ll present an idealised version of what I think a public API for functions that may produce errors should look like.
Errors vs Exceptions
In most applications we can distinguish two kinds of situations where some error occurs:
- actionable errors - i.e. expected errors. When those happen we need to handle them gracefully, so that our application can continue working correctly. A good example of this is invalid data provided by the user or data not present in the database. In general we want to handle this kind of error by using tuple return values ({:ok, value} | {:error, reason}) - the consumer can match on the value and act on the result.
- fatal errors - errors that place our application in an undefined state - violate some system invariant, or make us reach a point of no return in some other way. A good example of this is a required config file not present or a network failing in the middle of transmitting a packet. With this kind of error we usually want to crash and lean on the supervision system to recover - we handle all errors of this kind in a unified way.

That’s exactly what the “let it crash” philosophy is about - it’s not about letting our application burn in case of any errors, but about avoiding excessive code for handling errors that can’t be handled gracefully. You can read more about what “let it crash” means in the excellent article by Fred Hebert of the “Learn You Some Erlang for Great Good” fame - “The Zen of Erlang”.
Now comes the tough realisation - when we’re writing libraries, we often don’t know if something is an actionable or a fatal error - it’s not up to our library to decide, but up to the consumer. We could see this with the absent config file example - an absent file is hardly a fatal error, but an absent config file definitely is. Therefore, we need to design a flexible API that allows the user of the library to handle the errors the way they need to. The rest of the article focuses exactly on that.
Ok/Error tuples
We already mentioned those. They are extremely common in both Elixir and Erlang libraries - and there’s a reason for that! They are, in a way, an Elixir version of the Either or Result types present in many statically-typed languages. I will dare to say that functions returning ok/error tuples should be the main interface of libraries. Why?
- It’s easy to pattern match on them in a case expression and act accordingly in case the reported error is of the “actionable” type.
- It’s equally easy to convert the tuple return value into a crash if the reported error is, for our application, of the “fatal” type, e.g.:
{:ok, value} = YourLibrary.may_fail(foo, bar, baz)
An important thing to consider here is to make not only the entire return value easily matchable, but to apply the same rule to the error reasons. It might happen that one kind of error is “actionable”, while another one is “fatal”. An easy way to achieve that is to return atoms, or tuples of atoms and values, instead of strings. A string is the least structured type one can imagine, and it’s generally very hard to do something with it. Because of that, instead of returning {:error, "unknown value: :foo"} one would prefer to return {:error, {:unknown, :foo}}.
A common approach to returning easily-matchable errors while still providing nice messages, found in Erlang libraries, is for the module to export an additional format_error/1 function, to which we can pass the reason from the {:error, reason} tuple and get back a nice string. Continuing with the previous example, we would implement a function:
def format_error({:unknown, value}),
  do: "unknown value: #{inspect value}"
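To make the point about matchable reasons concrete, here is a minimal sketch. The ReasonSketch module, its lookup/1 function and the {:bad_key, _} reason are all hypothetical names invented for this illustration. The consumer treats one reason as actionable and turns every other reason into a crash, using format_error/1 only for the message.

```elixir
defmodule ReasonSketch do
  # Hypothetical library function: error reasons are structured, never strings.
  def lookup(:foo), do: {:ok, 1}
  def lookup(key) when is_atom(key), do: {:error, {:unknown, key}}
  def lookup(other), do: {:error, {:bad_key, other}}

  # Formatting is kept separate from the matchable reason.
  def format_error({:unknown, value}), do: "unknown value: #{inspect(value)}"
  def format_error({:bad_key, value}), do: "key must be an atom, got: #{inspect(value)}"

  # Hypothetical consumer code: an unknown key is actionable (fall back to a
  # default), any other reason is treated as fatal and converted into a raise.
  def lookup_or_default(key, default) do
    case lookup(key) do
      {:ok, value} -> value
      {:error, {:unknown, _key}} -> default
      {:error, reason} -> raise ArgumentError, format_error(reason)
    end
  end
end
```

With a string reason, the second clause of the case would have to parse the message to tell the two failures apart.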
Deeply recursive situations
I can almost hear some people ready to explode with things like: “Ok/error tuples are tedious in recursive functions” or “The constant wrapping and unwrapping is slow!”. And indeed, I fully agree that the wrapping and unwrapping can become tedious and that there are situations, especially in tight loops, where this could have a significant impact on performance.
Fortunately, Elixir has a feature that can help us here - catch and throw. We can wrap our call in a try/catch expression and use throw from our deeply nested code to signal errors. This gives us the convenience of non-local error reporting, while maintaining the nice ok/error tuple interface in our public functions.
Let’s see a practical example - the Ecto.UUID.dump/1 function:
def dump(<< a1, a2, a3, a4, a5, a6, a7, a8, ?-,
            b1, b2, b3, b4, ?-,
            c1, c2, c3, c4, ?-,
            d1, d2, d3, d4, ?-,
            e1, e2, e3, e4, e5, e6, e7, e8, e9, e10, e11, e12 >>) do
  try do
    << d(a1)::4, d(a2)::4, d(a3)::4, d(a4)::4,
       d(a5)::4, d(a6)::4, d(a7)::4, d(a8)::4,
       d(b1)::4, d(b2)::4, d(b3)::4, d(b4)::4,
       d(c1)::4, d(c2)::4, d(c3)::4, d(c4)::4,
       d(d1)::4, d(d2)::4, d(d3)::4, d(d4)::4,
       d(e1)::4, d(e2)::4, d(e3)::4, d(e4)::4,
       d(e5)::4, d(e6)::4, d(e7)::4, d(e8)::4,
       d(e9)::4, d(e10)::4, d(e11)::4, d(e12)::4 >>
  catch
    :error -> :error
  else
    binary ->
      {:ok, binary}
  end
end

def dump(_), do: :error

defp d(?0), do: 0
# many more cases
defp d(_), do: throw(:error)
Each of the d/1 calls could fail - handling this through wrapping and unwrapping tuples would be a nightmare - using throws gives us an elegant and efficient solution. Beware! When using throws, it’s important to be specific in the throws we’re catching and to usually contain them locally, within one module. This helps avoid accidental errors and lowers the complexity of understanding non-local returns.
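The same pattern can be sketched on a smaller scale. HexSketch and its functions are hypothetical names made up for this illustration, not part of Ecto. A private helper throws a specific, tagged term, and the public function catches exactly that term and converts it to an ok/error tuple, so the throw never leaves the module.

```elixir
defmodule HexSketch do
  # Public interface: always returns {:ok, integer} or {:error, reason}.
  def parse(string) when is_binary(string) do
    try do
      string
      |> String.to_charlist()
      |> Enum.reduce(0, fn char, acc -> acc * 16 + digit(char) end)
    catch
      # Catch only the specific term thrown by digit/1 below.
      {:invalid_digit, char} -> {:error, {:invalid_digit, <<char::utf8>>}}
    else
      value -> {:ok, value}
    end
  end

  # Private helpers signal failure with a throw instead of returning tuples,
  # so the reduce above needs no wrapping and unwrapping.
  defp digit(char) when char in ?0..?9, do: char - ?0
  defp digit(char) when char in ?a..?f, do: char - ?a + 10
  defp digit(char) when char in ?A..?F, do: char - ?A + 10
  defp digit(char), do: throw({:invalid_digit, char})
end
```

Throwing a tagged tuple rather than a bare atom is one way to stay specific about what is being caught.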
Bang functions
Have you noticed that, until this moment, I have mentioned neither exceptions nor the bang functions (foo! vs foo)? That’s because I consider them to be an “additional” API a library can offer. They replace pattern matching on the return value, and the MatchError that would be raised when the {:ok, value} = ... match I proposed a couple of paragraphs above fails.
Provided we already defined the format_error/1 function mentioned above, adding a “bang” version of a function to our public interface is fairly easy:
def foo!(arg1, arg2) do
  case foo(arg1, arg2) do
    {:ok, value} -> value
    {:error, reason} -> raise format_error(reason)
  end
end
We could also consider defining a custom exception struct for our library and instead use raise YourLibrary.Error, format_error(reason), or pass the raw reason to the exception struct and format it only in the exception’s message/1 callback. This keeps the exception struct easily pattern-matchable, should such a need arise.
defmodule YourLibrary.Error do
  defexception [:reason]

  def exception(reason),
    do: %__MODULE__{reason: reason}

  def message(%__MODULE__{reason: reason}),
    do: YourLibrary.format_error(reason)
end
Of course, that kind of unwrapping of the errors in every function is rather repetitive - I agree with that. But there is something you need to consider when devising solutions for this repetition: is the code you’re writing right now going to be read or written more often? Especially in a library, the code will be read much more often than it will be written - and some repetition in things like that, while a bit more laborious to write, can make the code much easier to understand when reading.
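As an illustration of why keeping the raw reason in the exception struct can pay off, here is a minimal sketch. RescueSketch, its fetch/2 and fetch!/2 functions and the {:missing, key} reason are hypothetical names for this example only. The caller rescues the exception, inspects the reason, and re-raises anything it does not know how to handle.

```elixir
defmodule RescueSketch do
  defmodule Error do
    defexception [:reason]

    # Store the raw reason; format it lazily in message/1.
    def exception(reason), do: %__MODULE__{reason: reason}
    def message(%__MODULE__{reason: reason}), do: RescueSketch.format_error(reason)
  end

  def fetch(map, key) do
    case Map.fetch(map, key) do
      {:ok, value} -> {:ok, value}
      :error -> {:error, {:missing, key}}
    end
  end

  def fetch!(map, key) do
    case fetch(map, key) do
      {:ok, value} -> value
      {:error, reason} -> raise Error, reason
    end
  end

  def format_error({:missing, key}), do: "missing key: #{inspect(key)}"

  # Hypothetical caller that only has the bang function available: it can
  # still act on the structured reason carried by the exception.
  def fetch_or_nil(map, key) do
    fetch!(map, key)
  rescue
    error in Error ->
      case error.reason do
        {:missing, _key} -> nil
        _other -> reraise error, __STACKTRACE__
      end
  end
end
```

Had the exception carried only a formatted string, the rescue clause would have nothing reliable to match on.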
Ok/error unwrapping macro
So, you still want to avoid that duplication? Let’s see what is, in my opinion,
the least bad way to do this.
It’s important to use a macro here, instead of a function - here’s why. The unwrapping function is often called in a tail position. Because of Erlang’s tail call optimisation, this call would remove the current function from the stacktrace, so when we reach the real raise, the stacktrace wouldn’t include the function where the error actually happened (!) - only the unwrapping function. That’s not really helpful for the user. Using a macro solves that issue.
defmacrop unwrap_or_raise(call) do
  quote do
    case unquote(call) do
      {:ok, value} -> value
      {:error, reason} -> raise YourLibrary.Error, reason
    end
  end
end

# we can now implement foo! in terms of unwrap_or_raise and foo
def foo!(arg1, arg2) do
  unwrap_or_raise foo(arg1, arg2)
end
This still leaves a bit of repetition that could be further removed with some more complex macros, but I think this strikes a good balance between clarity and verbosity.
Composing ok/error tuples in with
One issue you may notice with the ok/error tuples and the format_error function is: what happens in with pipelines, when you combine functions from different modules? How do you decide which format_error function to call?
That’s a valid concern, though for many use cases, I would say that handling this in the end-user code by wrapping groups of library functions is acceptable. Nonetheless, if you’re concerned about this, there are basically two paths you could take:
- return errors in the shape {:error, {__MODULE__, reason}} - this tags the error with the module name, where you can find the formatting function. This is the approach taken, for example, by yecc, leex and rebar3.
- return exception structs instead of errors. For example, in the code above, we would return {:error, YourLibrary.Error.exception(reason)} and later use the exception struct to raise, so our unwrapping function would have a clause: {:error, exception} -> raise exception. In this case the responsibility of the formatting function is taken over by the message/1 callback of the exception module. This is the approach taken, for example, by postgrex and db_connection.
Is this needed, or which approach is better? I’m not sure there’s one good answer that fits all. I leave this decision to you.
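The first path, tagging the reason with the module, might be sketched as follows. TaggedErrorsSketch and its nested Parser and Limits modules are hypothetical stand-ins for two independent libraries. Each returns {:error, {__MODULE__, reason}} and exports its own format_error/1, so a single else clause can dispatch to the right formatter.

```elixir
defmodule TaggedErrorsSketch do
  defmodule Parser do
    # Hypothetical "library" number one.
    def parse(input) do
      case Integer.parse(input) do
        {int, ""} -> {:ok, int}
        _ -> {:error, {__MODULE__, {:not_an_integer, input}}}
      end
    end

    def format_error({:not_an_integer, input}),
      do: "#{inspect(input)} is not an integer"
  end

  defmodule Limits do
    # Hypothetical "library" number two.
    def check(int) when int in 1..10, do: {:ok, int}
    def check(int), do: {:error, {__MODULE__, {:out_of_range, int}}}

    def format_error({:out_of_range, int}), do: "#{int} is outside 1..10"
  end

  # The module tag tells us whose format_error/1 to call.
  def run(input) do
    with {:ok, int} <- Parser.parse(input),
         {:ok, int} <- Limits.check(int) do
      {:ok, int}
    else
      {:error, {module, reason}} -> {:error, module.format_error(reason)}
    end
  end
end
```

The cost is that the else clause relies on a convention, that every tagged module exports format_error/1, which nothing checks at compile time.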
Full example
Let’s see what an example library following all the rules specified above would look like:
defmodule YourLibrary do
  defmodule Error do
    defexception [:reason]

    def exception(reason),
      do: %__MODULE__{reason: reason}

    def message(%__MODULE__{reason: reason}),
      do: YourLibrary.format_error(reason)
  end

  def get_one(1),
    do: {:ok, "one"}
  def get_one(value),
    do: {:error, {:not_one, value}}

  def get_one!(arg) do
    case get_one(arg) do
      {:ok, value} -> value
      {:error, reason} -> raise Error, reason
    end
  end

  def format_error({:not_one, value}),
    do: "#{inspect value} is not one"
end
Of course this silly example omits type-specs and documentation, both of which are required for a good library, but I think it serves as a good example of the ideas provided in this post.
Comment from Andrea Leopardi
Out of the options that Michał outlined, I think the most flexible and Elixir-y way of handling errors is returning {:error, exception} tuples. This has multiple benefits in my opinion.
- The returned exception can be formatted in a uniform way with Exception.message/1, whatever library this exception comes from. This eliminates the need for various __MODULE__ “hacks” in order to know which format_error/1 function to call.
- It plays great with with: in a with pipeline, you can either have just one else clause that matches {:error, exception} and treat exception uniformly (for example, Logger.error(Exception.message(exception))), or handle errors from each library specifically.

    with {:ok, tokens} <- Redix.command(:redix, ~w(SMEMBERS tokens)),
         {:ok, _} <- Postgrex.execute(:pg, "SELECT * FROM users", []) do
      :ok
    else
      {:error, %struct{} = exception} when struct in [Redix.Error, Redix.Connection.Error] ->
        Logger.error "Redis error: #{Exception.message(exception)}"
        :error
      {:error, %Postgrex.Error{} = exception} ->
        Logger.error "Postgres error: #{Exception.message(exception)}"
        :error
    end

- Since exceptions are just structs, they can still be documented publicly alongside their fields. This way, you can still have something like a :reason field in your exception that allows users of your library to pattern match on specific reasons (for example, %Redix.Connection.Error{reason: :timeout}) like Michał mentioned in the article.
All in all, this pattern requires a bit more typing than the {:error, atom} approach, but it has a few advantages, like a uniform interface and disambiguation (think of {:error, :timeout} in the with example above: would it come from Redis or Postgres?).
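A library following this pattern might look roughly like the sketch below. ExceptionTupleSketch, its connect/1 function and the reasons used are hypothetical and only serve the illustration. The non-bang function returns the exception struct inside the error tuple, callers can match on the :reason field or format it with Exception.message/1, and the bang variant simply raises the struct it was given.

```elixir
defmodule ExceptionTupleSketch do
  defmodule Error do
    defexception [:reason]

    def exception(reason), do: %__MODULE__{reason: reason}

    def message(%__MODULE__{reason: :timeout}), do: "operation timed out"
    def message(%__MODULE__{reason: reason}), do: "failed: #{inspect(reason)}"
  end

  # Hypothetical operation that returns {:error, exception} instead of a bare reason.
  def connect(nil), do: {:error, Error.exception(:no_host)}
  def connect(:slow_host), do: {:error, Error.exception(:timeout)}
  def connect(host), do: {:ok, host}

  # The bang variant needs no formatting step: it raises the struct as-is.
  def connect!(host) do
    case connect(host) do
      {:ok, conn} -> conn
      {:error, exception} -> raise exception
    end
  end

  # Consumer code: match on a specific reason, format everything else uniformly.
  def describe(host) do
    case connect(host) do
      {:ok, _conn} -> "connected"
      {:error, %Error{reason: :timeout}} -> "timed out, try again later"
      {:error, exception} -> Exception.message(exception)
    end
  end
end
```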