CouchDB Weekly News, April 6, 2017

Releases in the CouchDB Universe

  • couchbackup 1.2.0 – CouchBackup is a command-line utility that allows a CouchDB database to be backed up to a text file.

PouchDB

  • pouchdb-sync-to-anything 0.1.0 – This is a plugin that lets you use CouchDB’s replication algorithm with checkpointing, resuming, etc., but provide your own function to write the documents.
  • npmdoc-pouchdb 0.0.1 – API documentation for PouchDB (v6.1.2)

Opinions and other News in the CouchDB Universe

… and in the PouchDB Universe

CouchDB Use Cases, Questions and Answers

Stack Overflow:

no public answer yet:

PouchDB Use Cases, Questions and Answers

Stack Overflow:

For more new questions and answers about CouchDB, see these search results, and for PouchDB, see these.

Get involved!

If you want to get into working on CouchDB:

  • We have an infinite number of open contributor positions on CouchDB. Submit a pull request and join the project!
  • Do you want to help us with the work on the new CouchDB website? Get in touch on our new website mailing list and join the website team! – www@couchdb.apache.org
  • The CouchDB advocate marketing programme is just getting started. Join us in CouchDB’s Advocate Hub!
  • CouchDB has a new wiki. Help us move content from the old to the new one!
  • Can you help with Web Design, Development or UX for our Admin Console? No Erlang skills required! – Get in touch with us.
  • Do you want to help move the CouchDB docs translation forward? We’d love to have you in our L10n team! See our current status and the languages we’d like to provide CouchDB docs in on this page. If you’d like to help, don’t hesitate to contact the L10n mailing list at l10n@couchdb.apache.org or ping Andy Wenk (awenkhh on IRC).

We’d be happy to welcome you on board!

Events

Job opportunities for people with CouchDB skills

Time to relax!

  • “Researchers say they’ve pinpointed a scientific explanation for why sounds from nature have such a restorative effect on our psyche: According to a new study, they physically alter the connections in our brains, reducing our body’s natural fight-or-flight instinct.” – Why Nature Sounds Help You Relax, According to Science
  • “‘Direct your attention to the movement,’ the app Sway told me as I unconsciously sped up the rhythm of my languid to-and-fro. ‘You were going too slow,’ it prompted me a few minutes later when I accidentally dozed off. After five minutes of this repetitive motion, a gentle chime sounded, and a notification popped up telling me I had met my meditation goal for the day.” – An App That Tracks Your Movement to Help You Relax, Even in the Back of a Cab
  • “‘Evolutionarily, sleep is about the dumbest thing you would ever do.’ While you are sleeping, you are rendered useless: you’re unconscious, you aren’t foraging or socializing, and you’re vulnerable to predation. Sleep should have been weeded out of living creatures long ago, but it hasn’t.” – We sleep less as we age because our brains don’t think we’re tired
  • “There are obvious ways to tamp down the stress you inflict on others, such as refraining from yelling or making sarcastic comments. But those are only the most visible ways one risks alienating one’s coworkers; to truly stop the office pathology, you have to look deeper.” – 3 Small Things Every Person Can Do to Reduce Stress in Their Office
  • “The news, of course, is a big factor of stress for many of us. Earlier this year, the American Psychological Association found that we’re more stressed than ever — and the election was to blame for a large part of that. To combat that, Blink Fitness is taking National Stress Awareness Month to switch its television programming to mood-lifting, news-free content.” – This Gym Is Turning Off The News To Combat Stress

… and also in the news

PouchDB & CouchDB: An interview with Nolan Lawson

With so many databases out there to choose from, it’s hard to know which one will work best with your project’s infrastructure. Nolan Lawson, software developer and core maintainer of the popular JavaScript database PouchDB, understands firsthand the importance of examining a database’s tradeoffs before adding it to your stack. He recently offered us some of his database insights.

How did you hear about CouchDB, and why did you choose to use it?

In 2012 I was working for Health On the Net, which is a Geneva-based NGO focusing on healthcare-related tech. Mostly what we did was certify health websites as abiding by a specific ethical code, but we also built a lot of websites and apps for clients like the European Commission and Swiss organizations like Santé Romande.

One of these was a greenfield project called Khresmoi, where we had an opportunity to build a health-based search engine using our database of certified health web sites. The main architect of the project had already chosen the core technologies, but he had also accepted a job in the US, so I was his replacement. The project was built on Solr/Lucene, Perl, jQuery, and a weird database I had never heard of before called CouchDB.

I’m not really sure why he had chosen CouchDB, but it was extremely ill-suited for the project at hand. Essentially we were crawling websites and storing the entire content, along with some metadata, in CouchDB. We did this several times a day, and every time a page was updated, we simply overwrote the existing documents. We weren’t using CouchDB sync at all, and we weren’t checking to see if the content had changed before writing a new revision.

Since of course CouchDB is all about revisions, this meant that the size of the database kept blowing up. Our machines would get overloaded with tens of gigabytes of data. The original architect hadn’t foreseen any of these problems, so I had to learn from scratch what CouchDB was, and how to do things like “compaction” on a regular basis to keep the database from ballooning.
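As a rough illustration of the compaction step mentioned above: CouchDB exposes compaction over HTTP as a POST to the database's _compact endpoint. The sketch below only builds that request and does not send it. The server address and the database name crawl_results are assumptions made up for this example, and the helper name is hypothetical.

```python
import urllib.request
from urllib.parse import quote


def build_compaction_request(base_url: str, db_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST to CouchDB's /{db}/_compact endpoint."""
    url = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}/_compact"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the endpoint still expects a JSON content type
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical server and database name, used only for illustration.
request = build_compaction_request("http://localhost:5984", "crawl_results")
print(request.get_method(), request.full_url)

# To actually trigger compaction (admin credentials are normally required):
#     urllib.request.urlopen(request)
```

Running something like this on a schedule is one way to compact "on a regular basis"; checking whether the crawled content changed before writing a new revision would reduce the growth in the first place.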

We also had a lot of partners in the Khresmoi project who were very interested in aggregated views on our metadata, so I also had to learn how to performantly execute map/reduce queries, and keep those from growing out of control as well. It was pretty sink-or-swim, and to be honest I really disliked CouchDB at first, and I was always looking for opportunities to replace it with something else.

By learning all the rough edges of CouchDB, though, I eventually gained an appreciation for what CouchDB was actually good at: sync. It also impressed upon me the importance of understanding the tradeoffs of a database before using it in a project.

Did you have a specific problem that CouchDB solved?

In my mind, CouchDB has two killer features: sync and HTTP. We weren’t using either one in this project. The Perl crawler stored webpage data in CouchDB, and CouchDB was never exposed to the frontend via HTTP; it was just ferried into a Solr search database. This was also in the days before attachments, so we were storing all content as base64 strings.

What CouchDB did do fairly well was that we could do map/reduce queries on the data and then send a simple, queryable URL to our partners so that they could work with the data. It was also easy to set up authentication so that, for instance, only those with a username and a password could read it, but they couldn’t write it. The downside was that the views took a long time to build up; usually a partner would request a view on the data, and I’d say, “Okay, it’ll be done after the weekend.”
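To make the "simple, queryable URL" concrete, here is a minimal sketch of how such a view URL is put together. The path layout (/{db}/_design/{ddoc}/_view/{view}) and the group and key parameters are part of CouchDB's HTTP API; the server address, database, design document, and view names are invented for illustration.

```python
import json
from urllib.parse import quote, urlencode


def view_url(base_url, db_name, design_doc, view_name, **params):
    """Return the URL for querying a CouchDB map/reduce view."""
    path = "/".join([
        base_url.rstrip("/"),
        quote(db_name, safe=""),
        "_design",
        quote(design_doc, safe=""),
        "_view",
        quote(view_name, safe=""),
    ])
    # Keys must be JSON-encoded; booleans and numbers look the same in JSON.
    # Plain string options such as stale=ok would need separate handling.
    query = urlencode({name: json.dumps(value) for name, value in params.items()})
    return f"{path}?{query}" if query else path


# Hypothetical names: a "stats" design document with a "pages_by_language" view.
url = view_url(
    "http://localhost:5984", "crawl_results", "stats", "pages_by_language",
    group=True, key="fr",
)
print(url)
```

A partner with read-only credentials could then fetch that URL with curl or any HTTP client.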

For folks who are unsure how they could use CouchDB, given how many databases are out there, could you explain the use case?

CouchDB’s superpower is sync. Sometimes I even try to explain it to people by saying, “CouchDB isn’t a database; it’s a sync engine.” It’s a way of efficiently transferring data from one place to another, while intelligently managing conflicts and revisions. It’s very similar to Git. When I make that analogy, the light bulb often goes off.

Where this often fails is that folks may have an existing datastore, and they just want some sync mechanism on top of that. For instance, they have a MySQL or a MongoDB database, and they just want PouchDB to sync to that instead of syncing to CouchDB. The reason this doesn’t work, which is often hard to grasp, is that those other databases don’t have a concept of revisions built in. For instance, when you delete a row or an object, it’s just gone. CouchDB keeps a tombstone around so that it can remember what was deleted.

The analogy I would give, for people who struggle to understand why they can’t just slap CouchDB replication on top of Mongo or MySQL, is that it’s like saying, “Hey, I love Git, and the Git client is really cool, but can I use it with my FTP server?” Obviously that doesn’t work – an FTP server is just a flat filesystem, with no concept of branches or revisions. It’s exactly the same with CouchDB.
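The tombstone point can be sketched with a toy in-memory model. This is not how CouchDB or PouchDB are implemented: real revisions form a tree with hashed revision IDs, and conflict handling is left out entirely. The ToyStore class and sync function are hypothetical names for this example; the only idea shown is that a deletion has to be recorded for a sync to carry it to the other side.

```python
class ToyStore:
    """In-memory toy store that tracks a revision counter and tombstones."""

    def __init__(self):
        # doc_id -> {"rev": int, "deleted": bool, "body": dict or None}
        self.docs = {}

    def put(self, doc_id, body):
        rev = self.docs.get(doc_id, {"rev": 0})["rev"] + 1
        self.docs[doc_id] = {"rev": rev, "deleted": False, "body": body}

    def delete(self, doc_id):
        # Keep a tombstone (a new revision marked deleted) instead of dropping the entry.
        rev = self.docs[doc_id]["rev"] + 1
        self.docs[doc_id] = {"rev": rev, "deleted": True, "body": None}

    def get(self, doc_id):
        entry = self.docs.get(doc_id)
        return None if entry is None or entry["deleted"] else entry["body"]


def sync(source, target):
    """One-way sync: copy every entry whose revision is newer on the source."""
    for doc_id, entry in source.docs.items():
        known = target.docs.get(doc_id)
        if known is None or known["rev"] < entry["rev"]:
            target.docs[doc_id] = dict(entry)


# With tombstones, the deletion travels to the other replica.
server, phone = ToyStore(), ToyStore()
server.put("note", {"text": "hello"})
sync(server, phone)
server.delete("note")
sync(server, phone)
print(phone.get("note"))

# Without tombstones (a plain dict), the deleted entry simply vanishes on the
# source, so there is nothing to copy and the other side keeps its stale copy.
plain_server = {"note": {"text": "hello"}}
plain_phone = dict(plain_server)
del plain_server["note"]
plain_phone.update(plain_server)
print("note" in plain_phone)
```

In the plain-dict case the "phone" has no way to tell a deleted document from one it has not seen yet, which is the gap a revision-less datastore leaves for a sync protocol.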

What would you say are the top three benefits of using CouchDB?

Sync, reliability, and simplicity. As J. Chris Anderson has said, CouchDB doesn’t aim to be the Ferrari of databases; it wants to be the Honda Accord of databases. (See my old blog post on the subject.)

The append-only file format means that you can just kill -9 a running CouchDB process and your data is still recoverable. It never gets corrupted. Also, the HTTP/REST interface is very easy to use; you can use something like curl or Postman to learn how it works. When I was learning CouchDB, I would often just put some sample data into a database using Futon, and then I’d play around with URL parameters until I understood how it was working.
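The general idea behind append-only durability can be shown with a small sketch. This is a line-oriented toy log, not CouchDB's actual on-disk format, and the function names are made up for the example. Because existing bytes are never rewritten, a crash can only damage the record being written at that moment, and recovery can discard that incomplete tail.

```python
import json
import os
import tempfile


def append_record(path, record):
    """Append one JSON record per line; earlier bytes are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())  # ask the OS to push the write to disk


def recover(path):
    """Replay the log, ignoring a trailing record that was cut off mid-write."""
    state = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            if not line.endswith("\n"):
                break  # incomplete final write, e.g. the process was killed
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break
            state[record["id"]] = record["value"]
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.log")
    append_record(path, {"id": "a", "value": 1})
    append_record(path, {"id": "a", "value": 2})
    # Simulate a crash in the middle of a third write.
    with open(path, "a", encoding="utf-8") as log:
        log.write('{"id": "a", "val')
    recovered = recover(path)

print(recovered)
```

The two completed writes survive and the half-written one is dropped, which is the property the kill -9 remark relies on.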

What additional tools are you using in your infrastructure? Have you discovered anything that pairs well with CouchDB?

Well, as a co-maintainer of PouchDB, I obviously have to plug PouchDB here. PouchDB makes it trivially easy to sync between CouchDB on the server and IndexedDB, WebSQL, or LevelDB on the client. A lot of this can be credited to how well-thought-out CouchDB is as a whole.

There are other tools I find useful, though, like Postman, which is a neat tool for debugging HTTP APIs. I’ve also written a tool called pouchdb-dump-cli which can be used to “dump” an entire CouchDB or PouchDB database to a text file, which can then be loaded back using pouchdb-load. Of course the classic backup tool for CouchDB is called cp (i.e., just copy the .couch file), but pouchdb-dump/pouchdb-load can be nice for portability and to make it easy to inspect the full contents of a database.

What are your future plans with your project? Any cool plans or developments you want to promote?

Absolutely, we’ve got a lot of work going into PouchDB at the moment. Future improvements we plan to make are:

  • Greater customizability, to reduce the size of the core JavaScript package for those who don’t need polyfills, legacy support, niche features, etc.
  • A more performant secondary index system
  • The purge API, which is the major piece of CouchDB functionality that is still unsupported by PouchDB
  • Faster replication – there is still some low-hanging fruit in the replication algorithm where we can optimize the back-and-forth and speed up replication

For more about CouchDB, visit couchdb.org or follow us on Twitter at @couchdb. To learn more about PouchDB, visit pouchdb.com or follow the official project Twitter account, @pouchdb.

Have a suggestion on what you’d like to hear about next on the CouchDB blog? Email us!