This is the third in a series of blog posts introducing the Apache CouchDB 2.0 release. Read part one: The Road to CouchDB 2.0 and part two: Fauxton, the new CouchDB Dashboard.
CouchDB has always anticipated clustering as a core feature and, with 2.0, it has finally landed.
We’ve followed the Dynamo model made famous by Amazon where a database is divided into a number of equal, but separate, pieces, which we refer to as shards. Any given document belongs to one shard, and this is determined directly from its ID (and only its ID). This arrangement means that any node in the CouchDB cluster knows exactly where any document is hosted, allowing for scalable reading and writing. In addition, CouchDB 2.0 keeps multiple copies of each shard, so that the loss of any individual node is not fatal.
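To make the ID-to-shard mapping concrete, here is a minimal sketch of the idea. It assumes a CRC32 hash over a 32-bit key space split into Q equal ranges; the function names are ours for illustration and are not CouchDB's internal API, and the real hash and range layout are implementation details.

```python
import zlib

RING_SIZE = 2 ** 32  # assumed size of the hash space (32-bit keys)


def shard_ranges(q):
    """Split the hash space into q contiguous (start, end) ranges."""
    step = RING_SIZE // q
    ranges = [(i * step, (i + 1) * step - 1) for i in range(q)]
    # The last range absorbs any remainder when q does not divide the ring.
    ranges[-1] = (ranges[-1][0], RING_SIZE - 1)
    return ranges


def shard_for(doc_id, q):
    """Return the index of the shard that owns doc_id.

    The answer depends only on the document ID and q, so every node
    can compute it locally without asking anyone else.
    """
    key = zlib.crc32(doc_id.encode("utf-8"))  # unsigned 32-bit in Python 3
    return min(key // (RING_SIZE // q), q - 1)
```

Because the calculation uses nothing but the ID, two nodes that have never talked to each other will still agree on where a document lives.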
When creating a database, you can specify the number of shards (with ?q=) and the number of copies of those shards (with ?n=), or use the defaults. The default N is 3, which is almost always the right value: fewer is too risky, and more is too expensive. The default Q is 8, which is suitable for most uses. You are well advised to raise this number if your database will be large, or if you plan to increase the size of your cluster significantly over time. As a rule of thumb, aim for no more than 10 million documents per shard.
When a document is created, updated, deleted, or read, the node that processes the HTTP request (which can be any node in the cluster) spawns N processes that run in parallel, each attempting the desired operation at one copy of the document. The coordinating node waits for N/2+1 responses (2 when N is 3) before merging them into the HTTP response. This overlap helps to present a consistent view of the database, though that consistency is not guaranteed (CouchDB 2.0 is an Available/Partition-Tolerant system by design; we sacrifice Consistency for Availability).
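The quorum arithmetic is easy to sketch. The code below is a simplified model, not CouchDB's coordinator: it takes responses in arrival order and stops as soon as a majority of the N copies has answered. The function names and the `(responses, met)` return shape are our own invention.

```python
def quorum(n):
    """Majority of n copies: N/2+1 using integer division."""
    return n // 2 + 1


def collect_quorum(responses, n):
    """Consume responses in arrival order until a majority has answered.

    Returns (collected, met), where met is False if the responses ran out
    before a quorum was reached. What a real coordinator does in that case
    is outside the scope of this sketch.
    """
    needed = quorum(n)
    collected = []
    for response in responses:
        collected.append(response)
        if len(collected) >= needed:
            return collected, True
    return collected, False
```

With N=3 the quorum is 2, and any two groups of 2 copies out of 3 must share at least one copy. That shared copy is the overlap that helps a read see the result of an earlier write.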
All the usual CouchDB features work as normal, with only minor changes in some cases. The most noteworthy is the changes feed. The update sequences that CouchDB 2.0 returns are now strings, not numbers, as they encode the numeric sequences of each shard of the database. Treat them as you should always have treated the numeric sequence: opaquely. Pass the update sequence value back to the since= parameter of _changes and all is well.
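Here is a minimal sketch of treating the sequence opaquely: take last_seq from one _changes response and hand it back, untouched, as since on the next request. The base URL, database name, and sample sequence string are made up for illustration.

```python
import json
from urllib.parse import quote, urlencode


def changes_url(base_url, db_name, since=None):
    """Build a _changes URL, passing `since` through without inspecting it."""
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}/_changes"
    if since is not None:
        url += "?" + urlencode({"since": since})
    return url


def next_since(response_body):
    """Pull the checkpoint out of a _changes response body.

    The value is stored and returned as-is: never parsed, compared,
    or incremented.
    """
    return json.loads(response_body)["last_seq"]


# Hypothetical response body; real sequence strings are much longer.
body = '{"results": [], "last_seq": "42-g1AAAA"}'
checkpoint = next_since(body)
url = changes_url("http://localhost:5984", "mydb", since=checkpoint)
```

Code written this way does not care whether the server hands back a number or a string, so it behaves the same against either kind of sequence.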
A further note, though: it is possible for CouchDB 2.0’s changes feed to return updates from before your since parameter, so it is important to be idempotent when you process the rows you receive (the replication protocol is a good example of this). These so-called ‘rewinds’ happen when CouchDB cannot contact the specific copy of a shard included in the update sequence and must replace it with another copy. Since the updates to the multiple copies of a shard are not coordinated, and hence are not in the same order, CouchDB finds the most recent checkpoint between the two copies and rewinds you there. The rewind is typically small, but its size depends strongly on the update rate.
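One simple way to stay idempotent is to remember which (id, rev) pairs you have already handled and skip them when they show up again. The sketch below assumes rows shaped like _changes results (an id plus a list of changes, each with a rev); the in-memory set and the handler callback are illustrative choices, and a real consumer would persist what it has seen.

```python
def apply_changes(rows, seen, handler):
    """Process change rows, skipping any (id, rev) pair handled before.

    rows    -- iterable of rows shaped like _changes results
    seen    -- set of (id, rev) pairs already processed (mutated in place)
    handler -- callable invoked as handler(doc_id, rev) for each new pair
    Returns the number of pairs that were newly handled.
    """
    applied = 0
    for row in rows:
        for change in row.get("changes", []):
            key = (row["id"], change["rev"])
            if key in seen:
                continue  # a rewind replayed this update; nothing to do
            handler(*key)
            seen.add(key)
            applied += 1
    return applied
```

Feeding the same rows through twice, as a rewind effectively does, leaves the second pass with nothing to do.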
This has been a necessarily brief overview of the 2.0 architecture but we hope it covers enough ground to pique your interest. Oh, and relax.
Robert Newson is an engineer at Cloudant and
You can download the latest release candidate from http://couchdb.apache.org/release-candidate/2.0/. Files with -RC in their name are special release candidate tags, and files with the git hash in their name are builds of every commit to CouchDB master.
We are inviting the community to thoroughly test their applications with CouchDB 2.0 release candidates. See the testing and setup instructions for more details.