Creating Ecto Adapters

As I wrote last week, this summer I’m working on bringing the power of NoSQL to Ecto. This week I’d like to share with you what I learned about Ecto adapters alongside some tips on how to implement Ecto adapters for new databases.

Maybe this will be your next weekend project?

What does it take?

Creating adapters for Ecto is really easy. The documentation is great (as in all core projects in Elixir), there are many helpful people, and the APIs you have to implement are straightforward and clear. It’s even easier if you want to create an adapter for a SQL database, as there is already a generic SQL adapter implementing a lot of common behaviour for you.

All you need is a little bit of will, a working driver for your database written in either Elixir or Erlang, and a couple of evenings.

How do I test it?

Ecto comes with a thorough integration suite that covers a huge part of what you need to do. The tests are divided into logical groups, each responsible for a different aspect of the API. The tests are tagged, so you can skip those that exercise functionality your database cannot support.
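As a rough sketch of skipping tagged tests: ExUnit can exclude tests by tag when it starts. The tag names below are hypothetical placeholders, not the ones Ecto's suite uses, so check the integration tests for the real tags.

```elixir
# test/test_helper.exs
# The tag names are assumptions for illustration only; replace them
# with the tags actually used by the integration suite.
ExUnit.start(exclude: [:joins, :transactions])
```

Any test tagged with one of the excluded tags is then reported as excluded instead of failing.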

The maintainers of Ecto decided that adapters should not emulate behaviours that are not natural to the database, because it can (and will) confuse users. It’s better to be explicit than to do things automagically. When there are things you can’t handle, you should just raise (remember “Let it crash”?). The user should spot those things in development; if not, the Erlang runtime will take care of it.

What will you need to do?

Ecto defines several relevant behaviours for adapters:

Ecto.Adapter with callbacks for the most important functions, like all, update_all or delete_all, that eventually get called when users interact with the database through the repositories.
Ecto.Adapter.Storage with callbacks for creation and tear-down of databases.
Ecto.Adapter.Connection specifying callbacks for managing connections to the database as Elixir processes.
Ecto.Adapter.Migration responsible for migrating your data store with the help of Ecto’s friendly migration syntax.
Ecto.Adapter.Transaction that handles transactions and specifies the extent of support for them in the specific database.

The last two are optional if your database does not support migrations or transactions.

For SQL databases there is one extra behaviour that effectively replaces the last two and the majority of the main Ecto.Adapter behaviour - Ecto.Adapters.SQL.Query.

For a SQL adapter you need to provide at least two modules:

Your basic adapter module, which will use Ecto.Adapters.SQL (the generic SQL adapter) and implement the Ecto.Adapter.Storage behaviour.
Your Connection module that will do most of the work, implementing two behaviours: Ecto.Adapter.Connection and Ecto.Adapters.SQL.Query.

For a NoSQL adapter you need to do a little more of the heavy lifting yourself, as you need to implement all the behaviours on your own. There is no imposed structure for the code, but I strongly suggest that you follow that of the SQL adapter and divide your code into at least two modules:

Your basic adapter module, which will implement the Ecto.Adapter behaviour and any other relevant ones.
Your Connection module that will implement the Ecto.Adapter.Connection behaviour and handle communication with the driver.

Pool handling

Ecto has built-in support for pooling, so you don’t have to do it yourself. It provides the benefits of concurrency without the hassle. It recently improved significantly thanks to the introduction of the Ecto.Adapters.Pool module, which splits pool handling into two parts - a generic one and an adapter-specific one. Currently there is only one pool adapter, for the excellent poolboy library, and support for sbroker was recently merged into master.

For SQL adapters you don’t have to do anything - the generic adapter has you covered.

With a NoSQL database there’s once again a little more to do, but not very much - you should start the pool in your start_link/2 callback and reuse it for requests later on. One important thing is to keep as much work as you can out of the transaction (or the time you’re using a worker), to limit the time each worker is checked out of the pool and increase concurrency. Once again I advise mimicking the SQL adapter’s design - it will be easier not only in the beginning, when you don’t have to reinvent the wheel (unfortunately I did in some places with the MongoDB adapter and had to rewrite them later on), but also in the future, when you will be implementing changes to the API.

Query generation

This will probably be the most difficult task for you. You will need to handle the Elixir AST and transform it into a format consumable by your database. On each request you will be provided with a normalized Ecto.Query struct, with fields for all the clauses you can specify with Ecto - wheres, selects, joins, orders, group bys, etc. You’ll need to traverse them and generate your requests based on them.

It might sound challenging, but fear not! The existing adapters are great examples of how to handle each possible input - the PostgreSQL one is the reference you should always look to. With databases based on a text protocol (like SQL) query generation will be simpler - you can handle nearly all combinations easily, and if something is illegal the database will complain. With databases based on object or binary protocols (like MongoDB) you will need to keep track of what is allowed in each place of the query. There may be places and syntaxes you won’t be able to handle - you should just raise in those cases. You will also need to cast parameters from Ecto terms to those native to your database. But it’s not as bad as it sounds - Ecto cleanly wraps all the types for you, so it’s really easy with pattern matching.
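To give a feel for the traversal, here is a minimal sketch that turns a plain quoted Elixir expression into a MongoDB-style filter map. Everything in it is an assumption made for illustration: FilterSketch and to_filter/1 are hypothetical names, and the input is an ordinary quoted expression, not the normalized Ecto.Query struct a real adapter receives.

```elixir
defmodule FilterSketch do
  # Hypothetical helper, not part of Ecto or any MongoDB driver.
  @operators %{
    :== => "$eq",
    :!= => "$ne",
    :> => "$gt",
    :>= => "$gte",
    :< => "$lt",
    :<= => "$lte"
  }
  @comparison_ops Map.keys(@operators)

  # Boolean connectives become nested "$and" / "$or" lists.
  def to_filter({:and, _meta, [left, right]}),
    do: %{"$and" => [to_filter(left), to_filter(right)]}

  def to_filter({:or, _meta, [left, right]}),
    do: %{"$or" => [to_filter(left), to_filter(right)]}

  # A comparison between a bare field name and a literal value.
  def to_filter({op, _meta, [{field, _, context}, value]})
      when op in @comparison_ops and is_atom(field) and is_atom(context) and
             (is_binary(value) or is_number(value) or is_boolean(value)) do
    %{Atom.to_string(field) => %{Map.fetch!(@operators, op) => value}}
  end

  # Anything else is unsupported: raise instead of emulating it.
  def to_filter(other) do
    raise ArgumentError, "unsupported expression: " <> Macro.to_string(other)
  end
end

filter =
  FilterSketch.to_filter(
    quote do
      age > 18 and name == "Ann"
    end
  )

# filter is now:
# %{"$and" => [%{"age" => %{"$gt" => 18}}, %{"name" => %{"$eq" => "Ann"}}]}
```

The catch-all clause at the end is the important part: an expression the database cannot express fails loudly at the point of translation rather than being silently approximated.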

Returning your data back to Ecto

With SQL databases - you guessed it - you’re covered. You have only one callback that deals with data - query/4. It receives the generated SQL string and returns a map with two keys: :num_rows specifying the number of affected rows, and :rows that will contain a list of tuples with the data from the database. Ecto will handle typecasting and loading models for you.

With NoSQL databases you once again need to do most of the things yourself. Your database will probably return objects instead of simple tuples of columns, and you will need to cast them to Elixir’s basic types. You will also need to load models yourself in the all/4 callback, but it’s not that difficult once you have the query generation figured out.
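The loading step might look roughly like the sketch below, which copies a string-keyed document into a struct. LoadSketch, its nested User struct and load/2 are hypothetical names used only for illustration; a real adapter would load the user’s own model and also cast database-specific types along the way.

```elixir
defmodule LoadSketch do
  defmodule User do
    # Hypothetical target struct standing in for a user-defined model.
    defstruct [:id, :name, :age]
  end

  # Copy only the keys the struct defines, reading them from the
  # string keys a document store typically returns. Missing keys
  # become nil and unknown keys in the document are ignored.
  def load(struct_module, document) when is_map(document) do
    fields =
      struct_module.__struct__()
      |> Map.from_struct()
      |> Map.keys()
      |> Enum.map(fn key -> {key, Map.get(document, Atom.to_string(key))} end)

    struct(struct_module, fields)
  end
end

user = LoadSketch.load(LoadSketch.User, %{"id" => 1, "name" => "Ann", "extra" => true})
# user is %LoadSketch.User{id: 1, name: "Ann", age: nil}
```

Driving the conversion from the struct’s own field list means stray keys in a document never leak into the result.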

Other things

At this point all your integration tests should be passing - you have the basic functionality of your adapter implemented. There are probably some additional things you should handle, like calls specific to your database. You should also add logging to help users understand what is going on (SQL logging is once again already done). Once you reach this stage (or maybe even earlier) you should consider documentation - describe the things you do support, explicitly state what is not allowed, provide example usage of database-specific functions, and provide a type mapping table. Cover everything that might be confusing. Bad, incomplete or outdated documentation is a bug like any other.

Conclusion

Due to the power of Elixir and the clean design of Ecto, implementing custom adapters is a much easier task than it sounds. The ability to work with different databases - both SQL and NoSQL ones - with the same (or similar) syntax is really amazing. Ecto provides useful tools and abstractions for interacting with databases - models, callbacks, validations, pooling, and others. All of this requires a relatively small amount of work to port to new environments.

If you would like to give it a try, feel free to contact me or one of the maintainers of Ecto. I’m sure you will be welcome.