The Road to CouchDB 2.0

This is the first in a series of blog posts introducing the Apache CouchDB 2.0 release.

C.o.u.c.h.D.B. is famously a backronym for “Cluster of unreliable commodity hardware”. However, the 1.x series of CouchDB has been a single-node database system. While it was designed to be used in a cluster, and while there are clustering strategies and solutions built on top of CouchDB 1.x, none are built in.

In 2008 the startup Cloudant, founded by CouchDB contributor Adam Kocoloski together with his particle physicist colleagues Mike Miller and Alan Hoffman, started building a proprietary layer on top of CouchDB, using the same core technology, Erlang, turning CouchDB into a clustered database.

The Cloudant founders had built custom data storage systems during their tenure as particle physicists, working with the really big data sets of CERN’s Large Hadron Collider (years before “big data” was a thing).

Only one year before, in 2007, Amazon had published a paper about “Dynamo” that outlined Amazon’s solution for scaling their database layer to their ever-growing needs.

Building on the principles of the Dynamo paper, Cloudant’s clustering layer turned CouchDB into a genuinely Big Data-capable database.

Cloudant’s core business model is a managed database service using the CouchDB clustering technology they have developed. But their key value proposition is the management of that database, not the code itself, so in 2010 Cloudant made their clustering technology available as the open source project BigCouch.

And in summer 2013 they donated the BigCouch project to the Apache Software Foundation to be integrated into Apache CouchDB proper, fulfilling CouchDB’s original promise of supporting clusters of unreliable commodity hardware as Apache CouchDB 2.0.

Over the past three years, almost exactly, the two by then widely diverged codebases have been unified into one (with a few warts we are aiming to remove for 3.0 and beyond). The three milestones, roughly one per year, were:

  1. The Initial “Windsor” Merge: Named after the final of two hacking sessions in which CouchDB core contributors and Cloudant employees Robert Newson and Paul Davis brought the BigCouch source code into the Apache CouchDB repository and its `master` branch of development.
  2. When BigCouch branched off, CouchDB was at version 1.0.1. By the time of the Windsor merge, CouchDB was at version 1.4.0, and several substantial new features had not yet been added to the codebase, so we had some catching up to do.
  3. With all the pieces in place, we had to make sure CouchDB 2.0 was a coherent project: installation, documentation, all tests working, etc. So we spent the last year polishing the final experience.

With 2.0 in the release candidate phase, we have already identified the upcoming areas of work, and we are not going to delay releases for this long again.

Earlier this year, Cloudant began to upgrade their production clusters to the newly merged 2.0 codebase. Building on that smooth transition, we are now ready to release CouchDB 2.0 after a thorough release candidate process.

The latest release candidate is available for download. Files with -RC in their name are special release candidate tags, and files with a git hash in their name are builds of every commit to CouchDB master.

We are inviting the community to thoroughly test their applications with CouchDB 2.0 release candidates. See the testing and setup instructions for more details.
