The Road to CouchDB 3.0: Goodbye Travis, Hello Jenkins, Much Improved Continuous Integration

This is a post in a series about the Apache CouchDB 3.0 release. Check out the other posts in this series.

In the lead-up to CouchDB 3.0, we noticed that our automated testing needs had outgrown what the fine folks at Travis CI provide for free, so we started looking for alternatives.

We had multiple goals for this new CI solution:

  • quick startup time: during busy times, Travis’ free offering sometimes takes an hour to start working, which is entirely fair, but we needed something better.
  • better performance: we spent quite some time hardening our test suite against Travis’s environment of resource-constrained machines, to ensure we have no false-negative reports due to timeouts and other issues. In some cases, this meant writing a test case that did not represent real-world usage of the code it was testing. We are also always happy for a faster test runtime: even on modern machines, our test suite takes about 20 minutes to run. Add a matrix of different configurations, and we need some serious compute to move quickly and confidently.
  • broader platform support: CouchDB 3.0 now officially supports ARM and PPC platforms in 32bit and 64bit, mainly because our new solution can automatically run tests on these platforms each time we change code. We also now (again) test automatically on FreeBSD systems, and a macOS build machine is in the works.
  • nightly binaries: we wanted to be able to build binaries for all supported platforms for PRs and master for improved testing of as-of-yet unreleased features. This now happens automatically. You can subscribe to the Developer mailing list to get access to these binaries.

Our new CI solution is based on Jenkins via CloudBees, with worker servers donated by IBM/Cloudant and MacStadium, and it supports all of the above requirements.

However, none of this would have been possible without the untiring efforts of Joan “wohali” Touzet, who pulled all the strings together to make this happen, from organising and configuring donated hardware, to sorting out ASF policy to make sure we are all aligned, to writing oh so many build scripts to tie the whole service together. Some of Joan’s time was donated by Neighbourhoodie Software.

Meet the new https://ci.couchdb.org.

The Road to CouchDB 3.0: Security

This is a post in a series about the Apache CouchDB 3.0 release. Check out the other posts in this series.

To understand the security changes for CouchDB 3.0, we have to go all the way back to CouchDB 1.x and see how we got here.

Ease Of Use > *

In CouchDB 1.x, one of our overarching goals was to create a database that is easy to use. We spent many years refining the CouchDB REST API to make it as easy and as convenient as possible, so people don’t get turned away from using CouchDB by how hard it is to use.

We mainly succeeded at that goal, and still today, over a decade later, we keep hearing stories that people just used CouchDB for something small and that they just stuck with it, because it was so easy to use.

However, having an internet-connected database server that has no password by default leads to people accidentally leaving their CouchDB unsecured and losing their data.

In 2.x, we took great pains to build a scalable and highly-available clustered database underneath that same API, so applications built for CouchDB 1.x continued to run just fine on a 2.x installation. With 99% API compatibility, we pulled this off successfully as well.

One aspect of getting started easily was a 1.x-era design choice: the Admin Party. Admin Party means that, by default, any request made against CouchDB is executed in the context of an admin user, i.e. you are allowed to do anything.

To make CouchDB approachable, at the time, we didn’t want to burden people with setting up accounts and passwords, and messing with permissions and whatnot. This too was a good choice for getting people started, but when building modern, internet-connected database projects, not having admin passwords is a very bad idea.

Of course, you were able to create users and set up sophisticated permissions even in CouchDB 1.x, but over time it became clear that a default of “everyone is an administrator” is not only a very bad idea, but actively terrible. While CouchDB has not been part of a larger data breach just yet, we do have reports of people getting bitcoin miners installed on their database servers through an inadequately secured CouchDB setup.

In 2.x we took some steps to make it easier to set up a more secure CouchDB installation, like enforcing an admin password when you start with a cluster setup. But you could still install a single-node setup with an Admin Party, where newly created databases are accessible to anyone until locked down.

In CouchDB 3.0, we are switching our security philosophy from “open by default” to “closed by default”.

3.0 Secure by Default

We are making a number of changes to achieve a “closed by default” situation.

Admin Account Required

First and foremost, in all configurations of CouchDB, you will have to provide an admin password before the database server starts. You do this as usual by editing your local.ini file before starting CouchDB. With no admin configured, CouchDB prints a loud and clear error message about that fact. The Mac and Windows binaries are going to prompt you for a password during the installation phase. This effectively ends the era of the Admin Party. Fun was had, but we need to move on now.
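As a rough illustration, a deployment script could check that an admin is configured before it tries to start CouchDB. The sketch below assumes the usual `[admins]` section of local.ini with one `name = password` entry per admin; the helper name `has_admin` and the sample file contents are hypothetical, and Python's `configparser` is only an approximation of CouchDB's own ini parsing.

```python
import configparser

# Hypothetical local.ini contents; the user name and password are placeholders.
SAMPLE_LOCAL_INI = """
[couchdb]
uuid = 0123456789abcdef

[admins]
admin = s3cret
"""


def has_admin(ini_text):
    """Return True if the ini text defines at least one entry under [admins]."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.has_section("admins") and len(parser.options("admins")) > 0


print(has_admin(SAMPLE_LOCAL_INI))
```

A commented-out entry such as `;admin = mysecretpassword` does not count, because `configparser` skips lines that start with `;`.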

Database Security

A minor contributor to the Admin Party was the security mode for newly created databases: world read/writable. Even if you set up an admin password in CouchDB 1.x or 2.x, newly created databases were accessible by anyone until locked down.

2.x introduced the default security option that allows users to make newly created databases accessible only by admin users, but as opt-in settings go, most users don’t enable them.

Finally, 3.x defaults to the admin-only database security. All databases created are only accessible by server admin users, which, remember, must now exist. If you want to make a database accessible to other users, for regular application usage, you must explicitly grant access, per database.
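Granting access per database is done through the database's security object, which a server admin writes to `/{db}/_security`. The sketch below only builds the request and does not send it. The server URL, the database name `orders`, the user `alice`, the role `orders_app`, and the admin credentials are all placeholders, and keeping the `admins` list restricted to `_admin` is a choice made for this example.

```python
import base64
import json
import urllib.request


def build_security_request(base_url, db, member_names, member_roles,
                           admin_user, admin_password):
    """Build (but do not send) a PUT /{db}/_security request."""
    security = {
        # Database admins stay limited to server admins in this sketch.
        "admins": {"names": [], "roles": ["_admin"]},
        # The listed users and roles are granted regular member access.
        "members": {"names": list(member_names), "roles": list(member_roles)},
    }
    credentials = f"{admin_user}:{admin_password}".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/_security",
        data=json.dumps(security).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        },
    )


request = build_security_request(
    "http://127.0.0.1:5984", "orders", ["alice"], ["orders_app"],
    "admin", "s3cret",
)
# urllib.request.urlopen(request) would send it to a running server.
print(request.get_method(), request.full_url)
```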

Metrics Role

There are a number of endpoints on CouchDB that are accessible by admins only, even in 1.x and 2.x. Now that configuring an admin user is required, some of those endpoints can benefit from slightly laxer security, in particular the /_stats and /_system endpoints.

It is not uncommon to install a daemon on the same machine as CouchDB, which periodically polls those endpoints to report the result to a central metrics service. Such metrics daemons often use configuration files, and with /_stats and /_system being admin-only, those config files must include the CouchDB admin password: not ideal. It is prudent not to store administrative passwords in such a metrics service.

To solve this, 3.0 introduces a new system role for users defined in the _users database: _metrics. Users with that role will be able to access the /_stats and /_system endpoints, but they won’t have any other administrative permissions.
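A user document for such a metrics daemon might look like the sketch below. It follows the standard `_users` document layout (`org.couchdb.user:` ID prefix, `type` set to `user`); the user name, the password, and the helper name `metrics_user_doc` are placeholders invented for this example.

```python
import json


def metrics_user_doc(name, password):
    """Return a _users document for an account that only carries the _metrics role."""
    return {
        "_id": f"org.couchdb.user:{name}",
        "name": name,
        "type": "user",
        # The only role: access to the stats endpoints, no other admin rights.
        "roles": ["_metrics"],
        # CouchDB replaces the plain-text password with a hash when saving.
        "password": password,
    }


doc = metrics_user_doc("metrics", "not-the-admin-password")
print(json.dumps(doc, indent=2))
```

An admin would store this document in the _users database under its `_id`; the metrics daemon's config file then only needs these limited credentials.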

Require Valid User

Ever since 1.x, CouchDB has had the option to require that every request to CouchDB must be authenticated. This is another valuable building block for setting up CouchDB securely, but it has two weird edge-cases that we have addressed in CouchDB 3.0:

  • the /_up endpoint is used by load balancers to periodically check if a cluster node is capable of handling requests. With require_valid_user=true, load balancers would have to be set up with a username and password when running those checks. Many load balancers, however, have no configuration option for adding those credentials, making require_valid_user not an option for those configurations without a lot of networking workarounds.
  • the /_session endpoint allows users to exchange a username/password combination for a session token in the form of an HTTP cookie. Making session requests is a lot more efficient than the alternative of using HTTP Basic Authentication. With Basic Auth, the validity of the password has to be checked on each request. With modern settings for progressively slower password hashing like PBKDF2 (as used by CouchDB), this can lead to significant resource strains for the CouchDB servers. Session authentication does not have this problem, as validating a session token is a lot cheaper in terms of CPU resources. However, with require_valid_user=true, to get a session token, you had to send your credentials to /_session as a JSON body AND with HTTP Basic Authentication. This is not really a problem, but somewhat awkward.

To make require_valid_user more useful overall, we have relaxed the requirements for both /_up and /_session to no longer require an authenticated request, even when require_valid_user is enabled. This is in line with most user expectations of this feature.
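With the relaxed rule, a session login only needs the JSON body and no Basic Authentication header. The following sketch builds such a request and extracts the token from a `Set-Cookie` header. The server URL, the credentials, and the sample cookie value are made up; `AuthSession` is the cookie name CouchDB uses for session tokens.

```python
import json
import urllib.request
from http.cookies import SimpleCookie


def build_session_request(base_url, name, password):
    """Build (but do not send) a POST /_session request with a JSON body only."""
    body = json.dumps({"name": name, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_session",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def auth_session_token(set_cookie_header):
    """Return the AuthSession value from a Set-Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_header)
    morsel = cookie.get("AuthSession")
    return morsel.value if morsel is not None else None


request = build_session_request("http://127.0.0.1:5984", "alice", "pw")
# A made-up header in the shape of a session response.
token = auth_session_token("AuthSession=abc123; Version=1; Path=/; HttpOnly")
print(request.has_header("Authorization"), token)
```

Subsequent requests would send the token back in a `Cookie: AuthSession=...` header instead of repeating the password.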

Admin Only _all_dbs

In CouchDB 1.x, you could get a list of all databases on the server as an anonymous user. From a security perspective, this is less than ideal. For the longest time, folks have worked around this with path blocking in load balancers.

For CouchDB 2.x we wanted to change the /_all_dbs endpoint to be admin-only, but because 2.0 was a fairly large release with many moving parts, we overlooked making this change. Since it is a breaking change, we couldn’t just introduce it later in 2.x, but had to wait for 3.0. We did, however, add an option for this in 2.x.

In the lead-up to the 3.0 release, we nearly forgot to swap the default setting here again, but we remembered just in time and now /_all_dbs is an admin-only resource.

Conclusion

With these changes, we believe CouchDB is now sufficiently closed down by default, with enough options to selectively open access to the resources that need to be made available to both authenticated and anonymous users, on a per-setup basis. This brings CouchDB in line with modern security best practices, finally!