Vorgangs and Conflict Detection with CouchDB: Tobias Gesellchen Interview Part 2

In part 2 of our interview, Tobias Gesellchen shares another use case for CouchDB, building on the experience he described in part 1.

What made you choose CouchDB for this use case?

As mentioned in the previous use case, CouchDB was my preferred database for a straightforward setup. After several years of working with it from both a Dev and an Ops perspective, I trusted that we could maintain CouchDB for bigger projects at my main employer.

A colleague also had good experience with CouchDB, which helped us convince our team to choose it for a completely new project at Europace.

Did you have a specific problem that CouchDB solved?

Our decision was mainly driven by our experience with CouchDB (and our lack of experience with MongoDB, which had already been in use with another team). In this case we wanted to build an event-driven system, where several users might act in parallel on a shared resource, a so-called “Vorgang”. To ensure that every user always works on the most recent revision of the Vorgang, we use CouchDB’s conflict mechanism to detect stale writes.

For the folks who are unsure of how they could use CouchDB (because there are a lot of databases out there), could you explain this use case?

Each event on a specific Vorgang is modeled as a simple document in CouchDB. We applied a special pattern for the docId to enable conflict detection on the one hand and to keep resource revisions on the other. The docId pattern looks similar to “<vorgang-id>:<vorgang-revision>”, so that after three events on a Vorgang “abc” we’ll have three CouchDB documents with the IDs “abc:1”, “abc:2”, “abc:3”. A CouchDB view helps us to query all events for a Vorgang, and the application code performs a so-called “projection” to generate a secondary read model in memory.
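To make the pattern more concrete, here is a minimal sketch in Python. It is not Europace's code: the `InMemoryEventStore` class, the `changes` field, and the event contents are hypothetical, and a plain dictionary stands in for the database. The one CouchDB behavior it imitates is that creating a document under an ID that already exists is rejected with an HTTP 409 Conflict, which is what turns the docId pattern into a stale-write check.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Stands in for the HTTP 409 Conflict CouchDB returns for a clashing write."""


@dataclass
class InMemoryEventStore:
    # Hypothetical stand-in for a CouchDB database: maps docId -> document.
    docs: dict = field(default_factory=dict)

    def append(self, vorgang_id, expected_revision, event):
        """Store `event` as revision expected_revision + 1 of the Vorgang."""
        revision = expected_revision + 1
        doc_id = f"{vorgang_id}:{revision}"
        if doc_id in self.docs:
            # Someone else already wrote this revision, so the caller is stale.
            raise ConflictError(doc_id)
        self.docs[doc_id] = {"_id": doc_id, "vorgang": vorgang_id,
                             "revision": revision, **event}
        return doc_id

    def events_for(self, vorgang_id):
        """Imitates the view query: all events of one Vorgang, by revision."""
        found = [d for d in self.docs.values() if d["vorgang"] == vorgang_id]
        return sorted(found, key=lambda d: d["revision"])


def project(events):
    """Fold the events into an in-memory read model (the "projection")."""
    model = {"revision": 0, "data": {}}
    for event in events:
        model["data"].update(event["changes"])
        model["revision"] = event["revision"]
    return model


store = InMemoryEventStore()
store.append("abc", 0, {"changes": {"status": "new"}})
store.append("abc", 1, {"changes": {"amount": 1000}})
store.append("abc", 2, {"changes": {"status": "approved"}})

try:
    # A second user still believes revision 2 is the latest one.
    store.append("abc", 2, {"changes": {"status": "rejected"}})
    stale_write_detected = False
except ConflictError:
    stale_write_detected = True

read_model = project(store.events_for("abc"))
```

In this sketch the rejected writer would reload the events, rebuild the projection, and decide whether the change still makes sense before trying again with the new revision number.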

Disclaimer: The concept has been heavily inspired by Greg Young’s presentations about event sourcing, e.g. Event Store for Web Applications from 2014.

The secondary model isn’t appropriate for every consumer of our core data. We have additional views with only a subset of the complete model’s properties and with a simpler structure. Those views can be considered as snapshots of the projection and allow us to decouple several consumers from our core application. So, in addition to the event store for our core application we have other databases with ordinary documents.
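As a rough illustration of such a snapshot, the following Python sketch reduces a projected read model to a flat document with only a few properties. The function name `snapshot`, the field names, and the sample data are assumptions made for the example, not the actual Europace model.

```python
def snapshot(read_model, fields):
    """Return a flat document with only the requested properties of the model."""
    doc = {"_id": read_model["id"], "revision": read_model["revision"]}
    for name in fields:
        if name in read_model["data"]:
            doc[name] = read_model["data"][name]
    return doc


# Hypothetical result of a projection over all events of Vorgang "abc".
full_model = {
    "id": "abc",
    "revision": 3,
    "data": {
        "status": "approved",
        "amount": 1000,
        "applicant": {"name": "Jane Doe"},
    },
}

# A consumer that only cares about the status gets a much simpler document,
# which could be stored as an ordinary document in a separate database.
status_doc = snapshot(full_model, ["status"])
```

Because the snapshot carries the revision it was derived from, a consumer can tell how current its copy is without knowing anything about the underlying events.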

What would you say is the top benefit of using CouchDB in this instance?

Our trust has never been disappointed: even though we’re not using CouchDB in the usual way, we aren’t surprised that it works very well. Its simplicity makes it easy to handle, and it keeps up with our needs.

Database replication is another benefit: it gives us a hot standby and backups.

We can actually relax.

What tools are you using in addition for your infrastructure? Have you discovered anything that pairs well with CouchDB?

We created some tools to integrate CouchDB more conveniently with our infrastructure: A collection of Ansible modules helps us to provision our nodes and deploy our services. The CouchDB Prometheus exporter exposes metrics for consumption by the Prometheus monitoring system.

Sometimes we need to process all our CouchDB documents, e.g. when running migrations or other batch tasks. Couchtato allows us to perform anything on a complete database.

What are your future plans with your project? Any cool plans or developments you want to promote?

Our core database currently contains more than 27 million small documents, adding up to approximately 20 GB of data. This is probably not the biggest database in the world, but it can certainly be considered non-trivial. Though CouchDB still serves our needs, we’re touching its operational limits when working on a single master instance. That’s why we’re in the process of upgrading to the clustered version available as of CouchDB 2.x.

The automated setup of a CouchDB cluster involves the configuration of Erlang cookies, ports, and node names, which doesn’t necessarily lead to beautiful tasks in our Ansible playbooks. A more specific tool like couchdb-cluster-config could be a way to hide most of those details and might also help inexperienced users to get started.

If you have any questions about this use case, or if you just want to chat, you can get in touch with Tobias on Twitter @gesellix, and Europace @EuropaceTech.

Use cases are a great avenue for sharing useful technical information. Please consider joining the fun! Additionally, if there’s something you’d like to see covered on the CouchDB blog, we would love to accommodate. Email us!

For more about CouchDB, visit couchdb.org or follow us on Twitter at @couchdb.

 

 
