The State of CouchDB

This is a rough transcript of the CouchDB Conf, Vancouver Keynote.

Welcome

Good morning everyone. I thank you all for coming on this fine day in Vancouver. I’m very happy to be here. My name is Jan Lehnardt and I am the Vice President of Apache CouchDB at the Apache Software Foundation, but that’s just a fancy title that means I have to do a bunch of extra work behind the scenes. I’m also a core contributor to Apache CouchDB and I am the longest active committer to the project at this point.

I started helping out with CouchDB in 2006 and that feels like a lifetime ago. We’ve come a long way, we’ve shaped the database industry in a big way, we went though a phoenix from the ashes time and came out still inspiring future generations of developers to do great things.

So it is with great honour that I get to be here on stage before you to take a look at the state of CouchDB.

Numbers

I’d like to start with some numbers:

  • In 2013 we added 15 committers to the project, up to a total of 30. Thats 2x the number of people regularly contributing to CouchDB!
  • The year isn’t yet over, but these committers already created 3x the commits of 2012. And they have committed more than in any other year in CouchDB’s history.
  • We have shipped eight releases: 1.0.4 1.1.2, 1.2.1, 1.2.2, 1.3.0, 1.3.1, 1,4.0 and 1.5.0 just this year, that is up from one(!) last year.
    • thanks to our new release schedule we are getting more features to more people faster by focusing on small iterative changes forward.
  • 20% more JIRA tickets and 50% more GitHub issues

We have made a lot of changes in 2012 to make 2013 a great year for CouchDB and it sure looks like we succeeded and that 2014 is only going to trump that.

I’d like to thank everyone on the team for their hard work.

Currently

We’ve just shipped CouchDB 1.5.0 last week and it comes with a few exciting new things as previews, for you to try out and play with and report any issues with back to us. And that is on top of all the regular bug fixing and other improvements.

  1. A completely new developed admin UI, nicknamed Fauxton, that is poised to replace the much-loved, but increasingly dated Futon. I’d like to personally thank the Fauxton team: Sue “Deathbear” Lockwood, Russell “Chewbranca” Branca, Garren Smith and many more volunteers for their work as well as the company Cloudant for sponsoring a good chunk of that work. Great job everyone! Fauxton is going to be replacing Futon in one of the next few releases and will give us the foundation for the next stage of CouchDB’s life.

  2. Plugins. While it was always possible to write plugins for CouchDB, you kind of had to be an expert in CouchDB to get started. We believe that writing plugins is a great gateway drug to getting more people to hack on CouchDB proper, so we made it simpler to build plugins and to install plugins into a running instance of CouchDB. It is still very early days, we don’t even have a plugin registry yet, but we are surely excited about the prospects of installing GeoCouch with a single click of a button in Futon or Fauxton. We also included a template plugin that you can easily extend and make your own, along with a guide to get you started.

    The plugins effort also supports a larger trend we are starting to follow with the CouchDB core codebase: decide on a well-defined core set of functionality and delegate more esoteric things to a rich plugin system That means we no longer have to decline the inclusion of useful code like we’ve done in the past, because it wasn’t applicable to the majority of CouchDB users. Now we can support fringe features and plugins that are only useful to a few of our users, but who really need them.

  3. A Node.JS query server. CouchDB relies on JavaScript for a number of core features and we want to continue to do so. In order to keep up with the rapid improvements made to the JavaScript ecosystem we have tentative plans to switch from a Spidermonkey-driven query server to a V8-driven one. In addition, the Node.js project has a really good installation story, something that we had trouble with in the past, and includes a few utilities that make it very easy for us to switch the query server over.

    All this however is not to blindly follow the latest trends, but to encourage the community to take on the query server and introduce much needed improvements. The current view server is a tricky mix of JS, Erlang and C and we are not seeing many people daring to jump into that. In a second step we expect these improvements to trickle down to the other query server implementations like Python or PHP and make things better for everyone. For now this is also a developer preview and we are inviting all Node.js developers to join us and build a a better query server.

  4. Docs landed in 1.4.0, but 1.5.0 is seeing a major update to the now built-in documentation system. With major thanks to Alexander Shorin, Dirkjan Ochtmann and Dave Cottlehuber who were instrumental in that effort, CouchDB now has “really good docs” instead of a “really crappy wiki”, that are shipped with every release and are integrated with Futon and Fauxton.

Beyond

The immediate next area of focus for the CouchDB project is the merging of two forks: BigCouch and rcouch.

BigCouch is a Dynamo implementation on top of CouchDB that manages a cluster of machines and makes them look as a single one, adding performance improvements and fault tolerance to a CouchDB installation. This is a major step in CouchDB’s evolution as it was designed for such a system from the start, but the core project never included a way to use and manage a cluster. Cloudant have donated their BigCouch codebase to the Apache project already and we are working on an integration.

rcouch is a what I would call a “future port” of CouchDB by longtime committer and contributor Benoit Chesneau. rcouch looks like CouchDB would, if we started fresh today with a modern architecture. Together with BigCouch’s improvements, this will thoroughly modernise CouchDB’s codebase to the latest state of the art of Erlang projects. rcouch also includes a good number of nifty features that make a great addition to CouchDB’s core feature set and some great plugins.

Finally, we’ve just started an effort to set up infrastructure and called for volunteers to translate the CouchDB documentation and admin interface into all major languages. Driven by Andy Wenk from Hamburg, we already have a handful of people signed up to help with translations for a number of different languages.

This is going to keep us busy for a bit and we are looking forward to ship some great releases with these features.

tl;dr

2013 was a phenomenal year for Apache CouchDB. 2014 is poised to be even greater, there are more people than ever pushing CouchDB forward and there is plenty of stuff to do and hopefully, we get to shape some more of the future of computing.

Thank you!

CouchDB Conf Videos

Videos from CouchDB Conf are now available.

CouchDB everywhere with PouchDB
Dale Harvey, Mozilla
PouchDB is CouchDB written in JavaScript, that means it runs everywhere, I mean everywhere. Find out about why this changes everything for CouchDB and app developers, as well as the latest updates and status of the project.

CouchDB Writ Large
Mike Miller, Cloudant
In this talk, Mike Miller describes how CouchDB fits into Cloudant’s software stack as one of many open source projects the company uses to run distributed database clusters around the world.

Replication, FTW!
Benjamin Young, Cloudant
Jan’s “call to arms” earlier this year made replication the central focus of Apache CouchDB. Replication is the feature that makes CouchDB so distinctive in any marketplace (open or otherwise) and sets it apart from the NoSQL pack. We’ll take a look at the various replicating databases, how we can extend the reach of replication, and how we can help others find their way into this world of JSON juggling.

10 Common Misconceptions about CouchDB
Joan Touzet, Atypical
I’ve consulted with hundreds of people who use CouchDB, and the same sorts of questions keep coming up. Come to this talk if you want to know more about the kinds of mistakes many users make when thinking about how to use the database in their application. I’ll talk a bit about the “rough edges” of CouchDB, and how to work them to your advantage.

Experiences using CouchDB inside Microsoft’s Windows Azure team
Will Perry and Brian Benz
Will Perry, a developer on the Windows Azure team, is part of a small team at Microsoft that has been using CouchDB extensively on an internal project. In this talk, he’ll share motivation for using CouchDB, experiences using and running CouchDB in Azure, design and architectural patterns the team used and a quick run-down of some of the things they love and struggle with on a team more used to relational SQL DBs.

Deep Dive into a Shallow Write Pool
Jason Johnson, IBM SoftLayer
When storing data safely, disk I/O is almost always a major bottleneck. In this talk, we’ll examine some of CouchDB’s write patterns and explore how we can tune underlying storage media to complement those patterns.

Say Hello to Fauxton, the New Web Dashboard for CouchDB
Russell Branca, Cloudant
In this talk, Cloudant Engineer Russell Branca discusses the new Web dashboard that will soon be released for Apache CouchDB.

Making CouchDB Plugins
Jason Smith, Nodejitsu
CouchDB now has plugin support! This talk explains the design of the plugin system and demonstrates extending CouchDB by building example plugins. From “my first Erlang project” to major CouchDB changes, plugins have something for everybody.

Personal Web Apps
Nathan Vander Wilt, freelance product developer
CouchDB is great at keeping track of all sorts of personal data — everything from notes and contacts, to sharing a live stream of photos from a balloon 250 feet above a construction site. I’ll demo of some of the more interesting apps I’ve been building for my “personal cloud” and talk about the CouchDB features which make them easy. We’ll also explore what CouchDB *doesn’t* do and how to get those things done anyway.

How CouchDB is powering the next-generation cloud storage
Sascha Reuter, doctape / tape.io GmbH
This talk will give you an overview on how we leverage CouchDB to be the heart of our cloud-storage service. It will cover a whole bunch of fascinating real-life experiences, as well as some nifty tricks on how to unleash CouchDB’s full potential in production. I’ll show off how the mighty changes-stream helps us interconnecting different components of our service, as well as pushing realtime UI updates & notifications to our users. I’ll also tell you about things to avoid, to guarantee full performance, data consistency and high availability in production. To the end, I’ll show off how we use CouchDB even as our global queue for processing & conversion jobs and why this is actually a pretty good idea.

Open Data with Couch, Pouch, and Cloudant: Liberating the Laws of Massachusetts
Calvin Metcalf
As a member of Code for Boston, the Boston Code for America Brigade I scrapped the complete laws of Massachusetts to JSON, put them into a CouchDB database on Cloudant, and then used Cloudant to implement full-text search and PouchDB to make a demo app for people to browse the laws. Learn how couch can help distribute open data when that data is too big to just dump into github.

War stories from npmjs.org
Charlie Robbins & Jason Smith, Nodejitsu
Q&A with Nodejitsu and npmjs.org, the package manager for the Node.js server-side JavaScript platform — running CouchDB on the back end.

Check out the full playlist on YouTube.

Thank you to Cloudant for recording and making these videos available.