Some time ago I wrote an article on error handling in Elixir libraries. Today, I’d like to follow up with a piece on an equally important issue - proper handling of configuration.
The status quo
Mix configuration is great - it allows configuring libraries in a single,
unified way. You just add a couple of config lines to config/config.exs and
you’re done. Easy, right? Not so fast. There are a couple of caveats that you
need to think about. First, the OTP application configuration (mix config is
just an interface to it) is global for the whole application. Second, mix
config requires mix to run (duh!), but there are situations where mix is not
available - most notably releases.
When is mix config evaluated?
So here we arrive at the first misconception. I often hear that mix config is
evaluated at compile-time. This is not true. Mix config is evaluated when mix
starts - when you call mix phoenix.server or mix release. So what’s the
problem? With a release, mix config will be evaluated when you run mix release
and then serialised into a release-compatible sys.config file that contains
plain data. This means any dynamic config (like System.get_env calls) has to be
evaluated before the release is built and will be “frozen” inside the release
itself.
Reading configuration at compile-time
By far, one of the biggest problems people hit with configuration in Elixir stems from reading it at compile-time. Most often, this involves something similar to:
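A minimal sketch of that pattern, using hypothetical application, module, and key names (:my_app, MyApp.ApiClient, :api_key):

```elixir
defmodule MyApp.ApiClient do
  # The right-hand side runs once, while this module is being compiled,
  # and the resulting value is stored in the compiled module.
  @api_key Application.get_env(:my_app, :api_key)

  # Returns whatever the configuration held at compile time.
  def api_key, do: @api_key
end
```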
This has several disadvantages. The configuration needs to be available at
compile-time, and it will also be completely frozen: it can’t be changed
without recompiling the module in question. Even worse is assigning the result
of System.get_env to a module attribute. Mix will recompile the whole
application when the application configuration changes, but that’s not the case
with the system environment, since it would be awful to track.
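The "frozen" behaviour is easy to reproduce in a plain script. The sketch below uses a made-up variable name (MY_APP_API_KEY) and two hypothetical modules, one reading the variable into a module attribute and one reading it on every call:

```elixir
# Set the variable before the modules below are compiled.
System.put_env("MY_APP_API_KEY", "value-at-compile-time")

defmodule MyApp.Frozen do
  # Evaluated once, when the module is compiled.
  @api_key System.get_env("MY_APP_API_KEY")
  def api_key, do: @api_key
end

defmodule MyApp.Live do
  # Evaluated on every call.
  def api_key, do: System.get_env("MY_APP_API_KEY")
end

# The environment changes after compilation.
System.put_env("MY_APP_API_KEY", "value-at-runtime")

IO.inspect(MyApp.Frozen.api_key()) # still "value-at-compile-time"
IO.inspect(MyApp.Live.api_key())   # "value-at-runtime"
```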
Of course, there are situations where reading config at compile-time makes
sense: primarily when the configuration affects how a module is compiled (for
example, Ecto’s adapter) or for performance reasons (for example, in the mime
package). I would argue, though, that those cases are rather rare. Reading from
the application environment is really fast and shouldn’t be an issue in most
cases - don’t assume it will be an issue for you without measuring.
The :system tuple
The community arrived at a solution quite quickly. Instead of passing the real
value when dynamic configuration is needed, you could just use a
{:system, "VAR_NAME"} tuple and the library will call
System.get_env("VAR_NAME") to get the real value for you at runtime. Sounds
like a perfect solution, right? Unfortunately, once again we hit some issues:
- Every library that wants to support this pattern has to implement it explicitly.
- As a user of a library, you need to go spelunking in the docs to find out if the library you want to use supports it on the configuration key you actually need. If not - too bad.
- The system environment can only return strings. Do you need an integer? Too bad.
- This only works for configuration variables coming from the system environment. What if you want to load your configuration from, for example, HashiCorp’s Vault? Now you have to scramble for a solution.
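To make the first and third points concrete, here is roughly what a library has to write to support the tuple. MyLib.Config is a hypothetical helper, not code from any particular library, and the :my_lib application and PORT variable are made up for the sketch:

```elixir
defmodule MyLib.Config do
  # Hypothetical helper, not taken from any particular library.

  # Fetches a key from the application environment and resolves it at runtime.
  def get(app, key, default \\ nil) do
    app |> Application.get_env(key, default) |> resolve()
  end

  # A {:system, "VAR_NAME"} tuple is swapped for the environment variable's value.
  defp resolve({:system, var}) when is_binary(var), do: System.get_env(var)
  # Anything else is returned unchanged.
  defp resolve(value), do: value
end

# What the user would put in config/config.exs, simulated with put_env here.
Application.put_env(:my_lib, :port, {:system, "PORT"})
System.put_env("PORT", "4000")

IO.inspect(MyLib.Config.get(:my_lib, :port)) # "4000" - a string, not an integer
```

Every configuration key the library reads has to go through a helper like this, and any type conversion still has to be bolted on separately.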
The command line and --erl
This is rarely used, in my experience, but is generally a very powerful tool.
You can set application environment variables from the command line when
starting the VM - be it in a release, through mix, or even just running a
script. For example, passing the VM flag -app key value (with elixir or iex,
wrapped as --erl "-app key value") will make Application.get_env(:app, :key)
return :value. This allows setting any variable to any value - not only to
strings. The main problem? The syntax for value is obviously Erlang and not
Elixir. If you’re not sure what an Elixir value looks like in Erlang syntax,
you can use :io.format("~p~n", [value]) to print it - Erlang’s equivalent to
our IO.inspect(value).
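A few illustrative values (chosen for this sketch) and the Erlang syntax they print as:

```elixir
# Each call prints the Elixir term the way Erlang would write it.
:io.format("~p~n", [:value])                   # value
:io.format("~p~n", ["some string"])            # <<"some string">>
:io.format("~p~n", [[host: "db", port: 5432]]) # [{host,<<"db">>},{port,5432}]
:io.format("~p~n", [%{retries: 3}])            # #{retries => 3}
```

Note that Elixir strings are binaries on the Erlang side, which is why they print with the <<"...">> wrapper.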
Distillery’s REPLACE_OS_VARS
Distillery (the tool for building OTP releases for Elixir applications) and its
predecessor Exrm (through Erlang’s relx) offer an alternative solution. You can
set any configuration value to the string "${VAR_NAME}" and when the
application is started (and the “magic” REPLACE_OS_VARS environment variable is
set), those config values will be replaced with the value of the appropriate
system variable just before the release boots. While this solves the problem of
universality, it still only accepts strings and only works with the system
environment.
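In the config file this might look like the fragment below. The application name, key, and variable name are placeholders chosen for the sketch:

```elixir
# config/config.exs
# On Elixir versions before 1.9 this line would be `use Mix.Config`.
import Config

# Hypothetical application and key names. The placeholder is stored as a
# literal string when the release is built; the substitution happens later.
config :my_app, api_key: "${API_KEY}"
```

Without REPLACE_OS_VARS set at startup, the application would see the literal "${API_KEY}" string.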
Fortunately, we can link this technique with the previous one - passing
variables through the command line. A Distillery release includes a vm.args
file that lists the command line arguments passed to the VM whenever the
release is started. This means we can add a line such as -app key ${VAR_NAME}
to the vm.args file of a release and the REPLACE_OS_VARS mechanism will work
its magic to make the configuration available for us. And now we can use any
type of value we want - not just strings (we just need to remember to use
Erlang syntax).
The way forward
Let’s step back for a moment from the release-specific solutions to the generic
one - the :system tuples. The primary libraries that supported them were
Phoenix and Ecto, but both started offering an alternative in their latest
versions - the init/2 callback. This mechanism is also marked as the preferred
way of achieving dynamic configuration.
Both Phoenix’s Endpoint module and Ecto’s Repo module gained a new callback -
init/2 - that is called just before the endpoint or repo is started. It works
very much like an init/1 callback in a GenServer. Its purpose? Provide
configuration and establish the initial state of the subsystem. The callback
is capable of executing any code and thus allows you to read the configuration
from wherever you please - system variables, Vault, file system, you name it.
You can also do any required type conversions. Both Ecto and Phoenix still
default to reading from the application environment - the new mechanism is
preferred if you need to make your configuration a bit more dynamic. This means
the default experience is just as streamlined as before, but when you need more
flexibility, it’s there.
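A rough sketch of what such a callback body can look like. To keep it runnable without Ecto installed, it is written as a plain module; in a real project the function would live in the module that calls use Ecto.Repo. The module name and the DATABASE_URL and POOL_SIZE variables are assumptions made for this example:

```elixir
defmodule MyApp.RepoConfig do
  # Same shape as the init/2 callback: receives the config that was read
  # from the application environment and returns {:ok, final_config}.
  def init(_type, config) do
    config =
      config
      # Any source can be used here; this sketch reads system variables.
      |> Keyword.put(:url, System.get_env("DATABASE_URL"))
      # Type conversion happens in ordinary code, with a fallback of 10.
      |> Keyword.put(:pool_size, String.to_integer(System.get_env("POOL_SIZE") || "10"))

    {:ok, config}
  end
end

System.put_env("DATABASE_URL", "ecto://localhost/my_app")
System.put_env("POOL_SIZE", "20")

{:ok, config} = MyApp.RepoConfig.init(:supervisor, pool_size: 5)
IO.inspect(config[:pool_size]) # 20 - an integer, overriding the static 5
```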
Recommendations for library authors
When designing a library, very often you’ll need some sort of configuration. It might be the number of processes you’re going to start or an API key or something completely different. We can discern three different kinds of libraries:
Stateless libraries
This is probably the most common example. These are the libraries that don’t deal with processes at all or only start some short-lived processes (like tasks). Some examples of this category are bag-of-functions modules, API wrappers, and similar.
The simplest way to do configuration, in this case, is most of the time also the best one - just accept it as arguments to the functions. This gives all the power to choose the right configuration approach to the consumer of the library. They can easily read it from the application environment, system environment, or some third party config store. It’s also easy to make it not-too-verbose by wrapping the library functions in their own functions.
If you’re afraid that this will make your library verbose, you could leverage application environment for a default configuration. But I’m convinced accepting configuration for plain functions through plain arguments is a superior approach. It easily allows the library to be used “multiple times” in an application - with a different configuration in different subsystems. It can also make testing much easier.
Let’s take a look at an example interface of an API client library that supports default options in application environment that can be easily changed:
- The library exports a client/1 function that reads default configuration from application env and accepts overrides, and we wrap it in our own client/0 function.
- We set defaults or values that are not dynamic in config/config.exs.
- Later, when calling the API functions, we call our client function and pass it as an argument. If needed, we could further customise it to support passing some other options directly at the call site.
- If needed, we can define multiple functions like client/0 in our application or even have a client/1 that would accept further overrides from the final call site.
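The steps above can be sketched end to end as follows. Everything here is hypothetical: the ApiLib library, its struct fields, the URL, and the environment variable names exist only for illustration, and the application environment is filled in with put_env so the sketch runs as a plain script:

```elixir
# --- Library side (hypothetical library :api_lib) ---
defmodule ApiLib.Client do
  defstruct [:api_key, :base_url, timeout: 5_000]
end

defmodule ApiLib do
  # Builds a client from the :api_lib application environment;
  # explicitly passed overrides take precedence over those defaults.
  def client(overrides) do
    defaults = Application.get_all_env(:api_lib)
    struct(ApiLib.Client, Keyword.merge(defaults, overrides))
  end

  # Stand-in for a real API call: it only echoes what it was given.
  def get_user(%ApiLib.Client{} = client, id) do
    {:ok, %{id: id, fetched_from: client.base_url}}
  end
end

# --- config/config.exs would hold the static defaults, for example
#       config :api_lib, base_url: "https://api.example.com"
#     simulated here with put_env ---
Application.put_env(:api_lib, :base_url, "https://api.example.com")
System.put_env("API_KEY", "main-key")
System.put_env("BILLING_API_KEY", "billing-key")

# --- Application side: our own wrappers decide where dynamic values come from ---
defmodule MyApp.Api do
  def client, do: ApiLib.client(api_key: System.get_env("API_KEY"))

  # A second context with its own key and timeout.
  def billing_client do
    ApiLib.client(api_key: System.get_env("BILLING_API_KEY"), timeout: 30_000)
  end
end

{:ok, user} = ApiLib.get_user(MyApp.Api.client(), 42)
IO.inspect(user)
```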
Using an approach similar to what I presented above gives us the best of both worlds - keeps the defaults simple and yet when more flexibility is needed, it’s trivial to achieve because we’re just calling functions.
We also solve one of the biggest problems of configuring libraries globally through application environment - the case where your application needs to use the library in two different contexts (for example, with separate API keys). When all the configuration is loaded directly from application env, you can’t do anything about it, but with an approach similar to what I showed above, you could easily define multiple “client” functions.
Callback-module stateful libraries
The second kind of libraries is similar to Phoenix or Ecto. They deal with
processes, but instead of starting them in their own supervision tree, the
processes are started in the end application’s supervision tree. This means
there’s usually some sort of callback module that can contain an init callback,
or any required configuration can be accepted through a start_link function
(this is, for example, what Plug does).
Once again, application environment can be leveraged for defaults, but overall, the primary configuration mechanism should be passing arguments to functions. This also guarantees maximum flexibility and power for the user of the library.
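A minimal sketch of the start_link variant, assuming a hypothetical MyLib.Worker process with made-up :name and :interval options. The consuming application starts it in its own supervision tree, here twice with different settings:

```elixir
defmodule MyLib.Worker do
  use GenServer

  # Options passed by the caller win over defaults from the
  # :my_lib application environment.
  def start_link(opts) do
    opts = Keyword.merge(Application.get_all_env(:my_lib), opts)
    GenServer.start_link(__MODULE__, opts, name: Keyword.fetch!(opts, :name))
  end

  def interval(server), do: GenServer.call(server, :interval)

  @impl true
  def init(opts), do: {:ok, %{interval: Keyword.get(opts, :interval, 1_000)}}

  @impl true
  def handle_call(:interval, _from, state), do: {:reply, state.interval, state}
end

# In the consuming application: two instances, each with its own configuration.
children = [
  Supervisor.child_spec({MyLib.Worker, name: :fast_worker, interval: 100}, id: :fast_worker),
  Supervisor.child_spec({MyLib.Worker, name: :slow_worker, interval: 60_000}, id: :slow_worker)
]

{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

IO.inspect(MyLib.Worker.interval(:fast_worker)) # 100
IO.inspect(MyLib.Worker.interval(:slow_worker)) # 60000
```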
Full-blown applications
The last, and arguably hardest to configure, kind of libraries out there are full-blown applications. They start all of the processes in their own supervision tree and as a user, you have very little influence over what the application does. Fortunately, those libraries are very rare.
What makes them hard to configure is the fact that all of the configuration needs to be in place by the time they start, and if they are a dependency of some other application, all of the initialization code will be executed before the “master” application can even run. Because of those inherent issues, I would generally recommend converting the “full-blown” application libraries into the callback-module style discussed above.
Conclusion
Even though Elixir features a unified and pleasant configuration mechanism, there are some use cases that call for more flexibility. Fortunately, there are usually some solutions that allow achieving the desired outcome.
As to the libraries, my basic recommendation for authors of libraries is very similar to the one I gave in the error handling in Elixir libraries post - leave as much control as possible to the consumer of your library. Don’t assume how it will be used. This will save time both for the users and you with fewer support questions in the future.
Thanks to Wojtek Mach and José Valim for reviewing the draft version of this post.