As I wrote last week, this summer I’m working on bringing the power of NoSQL to Ecto. This week I’d like to share with you what I learned about Ecto adapters alongside some tips on how to implement Ecto adapters for new databases.
Maybe this will be your next weekend project?
What does it take?
Creating adapters for Ecto is really easy. The documentation is great (as in all core projects in Elixir), there are many helpful people, and the APIs you have to implement are straightforward and clear. It’s even easier if you want to create an adapter for a SQL database, as there is already a generic SQL adapter implementing a lot of common behaviour for you.
All you need is a little bit of will, a working driver for your database written in either Elixir or Erlang, and a couple of evenings.
How do I test it?
Ecto comes with a thorough integration suite that covers a huge part of what you need to do. The tests are divided into logical groups, each responsible for a different aspect of the API. The tests are tagged, so you can skip those that exercise functionality that is not possible to implement for your database.
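As a rough illustration of skipping by tag, here is a standalone ExUnit sketch. The tag names (:joins, :transactions) and the module name are made up for the example; check the integration suite itself for the tags it really uses. The `autorun: false` option and the explicit `ExUnit.run/0` call are there only so the file can be run directly as a script.

```elixir
# Standalone sketch of a test_helper.exs. The tag names are hypothetical.
ExUnit.start(exclude: [:joins, :transactions], autorun: false)

defmodule MyAdapter.TagSketchTest do
  use ExUnit.Case

  # Tagged with an excluded tag, so the body is never executed.
  @tag :joins
  test "joins two sources" do
    flunk("never runs, because the :joins tag is excluded")
  end

  # No excluded tag, so this one runs as usual.
  test "untagged tests still run" do
    assert 1 + 1 == 2
  end
end

result = ExUnit.run()
IO.inspect(result.failures, label: "failures")
```

In a real project you would normally keep only the `ExUnit.start(exclude: [...])` line in test_helper.exs and let the test files come from the suite.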
The maintainers of Ecto decided that adapters should not emulate behaviours that are not natural to the database, because doing so can (and will) confuse users. It’s better to be explicit than to do things automagically. When there are things you can’t handle, you should just raise (remember “Let it crash”?). The user should spot those things in development; if not, the Erlang runtime will take care of it.
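A minimal sketch of what "just raise" could look like. The module and function names are hypothetical, and a plain map stands in for the query struct, so this shows only the shape of the check, not Ecto's API.

```elixir
defmodule UnsupportedSketch do
  # `query` is a plain map standing in for the real query struct.
  # A non-empty list of joins is something this imaginary database cannot do,
  # so we fail loudly instead of emulating it.
  def ensure_supported!(%{joins: [_ | _]}) do
    raise ArgumentError, "joins are not supported by this adapter"
  end

  # Anything else passes through untouched.
  def ensure_supported!(query), do: query
end

# A query without joins is returned as-is.
UnsupportedSketch.ensure_supported!(%{joins: [], wheres: []})
```

The point of failing this early is that the error surfaces the first time the developer runs the offending query, with a message that names the missing feature.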
What will you need to do?
Ecto defines several relevant behaviours for adapters:
- Ecto.Adapter, with callbacks for the most important functions, such as delete_all, that eventually get called when users interact with the database through the repositories.
- Ecto.Adapter.Storage, with callbacks for creation and tear-down of databases.
- Ecto.Adapter.Connection, specifying callbacks for managing connections to the database as Elixir processes.
- Ecto.Adapter.Migration, responsible for migrating your data store with the help of Ecto’s friendly migration syntax.
- Ecto.Adapter.Transaction, which handles transactions and specifies the extent of support for them in the specific database.
The last two are optional if your database does not support migrations or transactions.
For SQL databases there is one extra behaviour that effectively replaces the last two and the majority of the main Ecto.Adapter behaviour.
For a SQL adapter you need to provide at least two modules:
- Your basic adapter module, which will use Ecto.Adapters.SQL (the generic SQL adapter).
- A Connection module that will do most of the work, implementing two behaviours.
For a NoSQL adapter you need to do a little bit more of the heavy lifting yourself, as you need to implement all the behaviours on your own. There is no imposed structure for the code, but I strongly suggest that you follow that of the SQL adapter, dividing your code into at least two modules:
- Your basic adapter module, which will implement the Ecto.Adapter behaviour and any other relevant ones.
- A Connection module that will implement the Ecto.Adapter.Connection behaviour and handle communication with the driver.
Ecto has built-in support for pooling, so you don’t have to do it yourself. It provides the benefits of concurrency without the hassle. It was recently improved significantly thanks to the Ecto.Adapters.Pool module, which splits pool handling into two parts - a generic one and an adapter-specific one. Currently there is only one adapter, for the excellent poolboy library, and support for sbroker was recently merged into master.
For SQL adapters you don’t have to do anything - the generic adapter has you covered.
With a NoSQL database there’s once again a little bit more to do, but not very much - you should start the pool in your start_link/2 callback and reuse it for requests later on. One important thing is to keep as much work as you can outside of the transaction (or whatever period you’re holding a worker), to limit the time each worker is checked out of the pool and so increase concurrency.
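To make the "keep work out of the checkout" advice concrete, here is a small self-contained sketch. It does not use Ecto's pool or poolboy: PoolSketch is a hypothetical stand-in in which a single Agent plays the only worker, and the in-memory map plays the database. What matters is the ordering of the three steps in `all/3`.

```elixir
defmodule PoolSketch do
  # Hypothetical stand-in for a real pool: one Agent acts as the only worker
  # and its state acts as the database contents.
  def start_link do
    Agent.start_link(fn -> %{"users" => [%{"name" => "ann", "age" => 31}]} end)
  end

  # Everything inside `fun` runs while the worker is busy.
  def with_worker(pool, fun), do: Agent.get(pool, fun)
end

defmodule AdapterSketch do
  def all(pool, collection, min_age) do
    # 1. Build the request before touching the pool.
    filter = fn doc -> doc["age"] >= min_age end

    # 2. Hold the worker only for the round trip to the database.
    raw =
      PoolSketch.with_worker(pool, fn db ->
        db |> Map.get(collection, []) |> Enum.filter(filter)
      end)

    # 3. Decode the results after the worker is free again.
    Enum.map(raw, fn doc -> {doc["name"], doc["age"]} end)
  end
end

{:ok, pool} = PoolSketch.start_link()
IO.inspect(AdapterSketch.all(pool, "users", 18))
```

With a real pool, steps 1 and 3 are typically where query generation and result casting live, which is why moving them outside the checkout pays off.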
Once again I advise mimicking the SQL adapter’s design - it will be easier not only in the beginning, when you don’t have to reinvent the wheel (unfortunately I did in some places with the MongoDB adapter and had to rewrite them later on), but also in the future, when you implement changes to the API.
Generating queries will probably be the most difficult task for you. You will need to handle the Elixir AST and transform it into a format consumable by your database. On each request you will be provided with a normalized Ecto.Query struct, with fields for all the clauses you can specify with Ecto - wheres, selects, joins, orders, group bys, etc. You’ll need to traverse them and generate your requests based on them.
It might sound challenging, but fear not! The existing adapters are great examples of how to handle each possible input - the PostgreSQL one is the reference you should always look to. With databases based on a text protocol (like SQL), query generation will be simpler - you can handle nearly all combinations easily, and if something is illegal the database will complain. With databases based on object or binary protocols (like MongoDB), you will need to keep track of what is allowed in each place of the query. There may be places and syntaxes you won’t be able to handle - you should just raise in those cases. You will also need to cast parameters from Ecto terms to those native to your database. But it’s not as bad as it sounds - Ecto cleanly wraps all the types for you, so handling them with pattern matching is really easy.
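To give a feel for the traversal, here is a small sketch that turns a quoted Elixir expression into a MongoDB-style selector map using pattern matching. It is a simplification under stated assumptions: it works on plain `quote` output with bare variables as field names, whereas the expressions Ecto hands to an adapter encode fields and parameters differently. SelectorSketch and its functions are hypothetical names; the `$and`, `$or`, `$gt` and similar keys are MongoDB's query operators.

```elixir
defmodule SelectorSketch do
  # Boolean connectives map to $and / $or over the translated operands.
  def to_selector({:and, _meta, [left, right]}) do
    %{"$and" => [to_selector(left), to_selector(right)]}
  end

  def to_selector({:or, _meta, [left, right]}) do
    %{"$or" => [to_selector(left), to_selector(right)]}
  end

  # Equality needs no operator: %{"field" => value}.
  def to_selector({:==, _meta, [{field, _, ctx}, value]})
      when is_atom(field) and is_atom(ctx) do
    %{Atom.to_string(field) => literal!(value)}
  end

  # Comparisons become %{"field" => %{"$op" => value}}.
  def to_selector({op, _meta, [{field, _, ctx}, value]})
      when op in [:>, :<, :>=, :<=, :!=] and is_atom(field) and is_atom(ctx) do
    %{Atom.to_string(field) => %{mongo_op(op) => literal!(value)}}
  end

  # Anything we do not recognise is rejected explicitly.
  def to_selector(other) do
    raise ArgumentError, "unsupported expression: " <> Macro.to_string(other)
  end

  defp mongo_op(:>), do: "$gt"
  defp mongo_op(:<), do: "$lt"
  defp mongo_op(:>=), do: "$gte"
  defp mongo_op(:<=), do: "$lte"
  defp mongo_op(:!=), do: "$ne"

  # Only simple literals are accepted as values in this sketch.
  defp literal!(value)
       when is_binary(value) or is_number(value) or is_boolean(value),
       do: value

  defp literal!(other) do
    raise ArgumentError, "unsupported value: " <> Macro.to_string(other)
  end
end

expr = quote(do: age > 18 and name == "ann")
IO.inspect(SelectorSketch.to_selector(expr))
```

The catch-all clause is where the "keep track of what is allowed" part shows up: every shape the translator does not explicitly know raises instead of producing a selector the database would misinterpret.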
Returning your data back to Ecto
With SQL databases - you guessed it - you’re covered.
You have only one callback that deals with this: query/4. It receives the generated SQL string and returns a map with :num_rows, specifying the number of affected rows, and a second field that will contain a list of tuples with the data from the database.
Ecto will handle typecasting and loading models for you.
With NoSQL databases, once again you need to do most things yourself.
Your database will probably return some objects instead of simple tuples of columns, and you will need to cast them to Elixir’s basic types. You will also need to load models yourself in the all/4 callback, but it’s not that difficult once you have the query generation figured out.
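Here is a sketch of the loading step, under a few assumptions: a plain struct stands in for an Ecto model, the driver returns documents as maps with string keys, and the `{:hypothetical_object_id, hex}` tuple is an invented placeholder for whatever wrapper type your driver uses. The module and function names are hypothetical too.

```elixir
defmodule LoadSketch do
  # A plain struct standing in for a model.
  defmodule User do
    defstruct [:id, :name, :age]
  end

  # Builds `module` from a driver document with string keys. Unknown keys in
  # the document are ignored and missing ones are left as nil.
  def load(module, doc) when is_map(doc) do
    fields =
      for key <- Map.keys(Map.from_struct(module.__struct__())) do
        {key, cast(Map.get(doc, source_key(key)))}
      end

    struct(module, fields)
  end

  # MongoDB stores the primary key under "_id".
  defp source_key(:id), do: "_id"
  defp source_key(key), do: Atom.to_string(key)

  # Hypothetical driver-specific wrapper, unwrapped into a plain binary.
  defp cast({:hypothetical_object_id, hex}) when is_binary(hex), do: hex
  # Values that are already basic Elixir terms pass through unchanged.
  defp cast(value), do: value
end

doc = %{
  "_id" => {:hypothetical_object_id, "55a1"},
  "name" => "ann",
  "age" => 31,
  "extra" => true
}

user = LoadSketch.load(LoadSketch.User, doc)
IO.inspect(user)
```

Driving the loop from the struct's own keys, rather than from the document's, means stray fields in the database never leak into the result.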
At this point all your integration tests should be passing - you have the basic functionality of your adapter implemented. There are probably some additional things you should handle, like calls specific to your database. You should also add logging to help users understand what is going on (SQL logging is once again already done). Once you reach this stage (or maybe even earlier) you should consider documentation - describe the things you do support, explicitly state what is not allowed, provide example usage of database-specific functions, and provide a type mapping table. Cover everything that might be confusing. Bad, incomplete or outdated documentation is a bug like any other.
Due to the power of Elixir and the clean design of Ecto, implementing custom adapters is a much easier task than it sounds. The ability to work with different databases - both SQL and NoSQL ones - with the same (or similar) syntax is really amazing. Ecto provides useful tools and abstractions for interacting with databases - models, callbacks, validations, pooling, and others. All of this requires a relatively small amount of work to port to new environments.
If you would like to give it a try, feel free to contact me or one of the maintainers of Ecto. I’m sure you will be welcome.