
MongoDB + RocksDB at Parse

If you've been paying attention to the MongoDB world lately, then you know that exciting things are afoot. The 3.0 release introduces a modular storage engine API, allowing third-party engines like RocksDB, TokuDB and WiredTiger to integrate seamlessly with the MongoDB data interface.

The RocksDB engine, developed by Facebook engineers, is one of the fastest, most compact and write-optimized storage engines available. RocksDB has been running for years as a storage layer for various services here at Facebook, so we have a lot of confidence in its maturity. It is also currently the only LSM engine available for use with MongoDB (WiredTiger supports LSM natively as well, but only its B-tree implementation will be available for MongoDB until later releases).

At Parse we have been working closely with other Facebook engineers to integrate and optimize RocksDB + MongoDB, by replaying a wide range of production workloads offline and comparing results between mmapv1, RocksDB, and WiredTiger.

And we now have some exciting news to share: we are running MongoDB on RocksDB in production for some workloads and we're seeing great results!

We're preparing a series of blog posts about our experiences with Mongo + Rocks, where we will share more about our testing and hardening process and specific benchmark numbers. You can watch for these blog posts next week. In the meantime, we wanted to give you a chance to try it out for yourself.

Build MongoDB with the RocksDB storage engine:

  1. Install the dev versions of compression libraries: snappy, bzip2 and zlib
    sudo yum install snappy-devel zlib-devel bzip2-devel # centos
    sudo apt-get install libbz2-dev libsnappy-dev zlib1g-dev libzlcore-dev # ubuntu
  2. Install RocksDB from the mongorocks branch:
    git clone https://github.com/facebook/rocksdb.git
    cd rocksdb
    git checkout mongorocks
    make static_lib
    sudo make install
  3. Compile MongoDB from the v3.0-mongorocks branch:
    git clone https://github.com/mongodb-partners/mongo.git
    cd mongo
    git checkout v3.0-mongorocks
    scons mongod mongo mongos --rocksdb=1

To run mongod with the RocksDB storage engine, just invoke mongod with the --storageEngine=rocksdb parameter. You can add the RocksDB node to an existing replica set by using rs.add() and performing an initial sync as you normally would.
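If you'd rather not expose a brand-new engine to client traffic right away, one cautious option is to add the node as a hidden, priority-0 member first. This is only a sketch of that idea, not something our setup requires: the helper name trialMemberConfig, the member id, the host name, the data path and the replica set name below are all made-up placeholders.

```javascript
// Hypothetical helper: build a replica set member document for a trial node.
// priority 0 keeps the node from being elected primary; hidden keeps drivers
// from routing reads to it. Both are standard member configuration fields.
function trialMemberConfig(memberId, host) {
  return { _id: memberId, host: host, priority: 0, hidden: true };
}

// Start the new node first (placeholder path and set name):
//   mongod --storageEngine=rocksdb --dbpath /data/rocksdb --replSet rs0
//
// Then, in a mongo shell connected to the current primary:
//   rs.add(trialMemberConfig(3, "rocks1.example.com:27017"))
```

Once the node has finished its initial sync and you're happy with how it behaves, you can reconfigure it as a regular member.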

Please treat this as a work in progress. Both of those branches will keep changing as we keep improving performance and stability. We will release and package a "stable" version at some point in the future.

Once you're up and running with MongoDB on RocksDB, run db.serverStatus()["rocksdb"] to check out all of the shiny new RocksDB metrics.
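If you script against those metrics, it helps to fail loudly when the node isn't actually running RocksDB. Here is a minimal sketch; the function name rocksdbStatus is ours for illustration, and we're assuming serverStatus() reports the engine under storageEngine.name, which is only used for the error message.

```javascript
// Hypothetical helper: return the "rocksdb" section of serverStatus(),
// or throw if the section is missing (for example, on an mmapv1 node).
function rocksdbStatus(db) {
  var status = db.serverStatus();
  // Assumption: the active engine name is reported here; used for messaging only.
  var engine = status.storageEngine ? status.storageEngine.name : "unknown";
  if (!status.rocksdb) {
    throw new Error("no rocksdb section in serverStatus (storage engine: " + engine + ")");
  }
  return status.rocksdb;
}

// In the mongo shell:
//   printjson(rocksdbStatus(db))
```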

RocksDB also offers some really cool features, such as online consistent backups and compaction.

Create a backup:

db.adminCommand({setParameter:1, rocksdbBackup: "/var/lib/mongodb/backup/1"})

If the destination is on the same filesystem as MongoDB's directory, this should be very fast. RocksDB's table files are immutable, so we just hard-link them.
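If you take backups regularly, you'll probably want each one in its own directory. The sketch below builds a UTC-timestamped path under a base directory and checks the command result. The helper names (pad2, backupDir, runBackup) and the naming scheme are just one possible convention, not anything the engine requires.

```javascript
// Hypothetical helpers for timestamped backup directories.
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Build <baseDir>/YYYYMMDD-HHMMSS from a Date, using UTC fields.
function backupDir(baseDir, date) {
  return baseDir + "/" +
    date.getUTCFullYear() + pad2(date.getUTCMonth() + 1) + pad2(date.getUTCDate()) +
    "-" + pad2(date.getUTCHours()) + pad2(date.getUTCMinutes()) + pad2(date.getUTCSeconds());
}

// Issue the backup command and throw if the server reports a failure.
function runBackup(db, baseDir, date) {
  var target = backupDir(baseDir, date || new Date());
  var res = db.adminCommand({ setParameter: 1, rocksdbBackup: target });
  if (!res.ok) {
    throw new Error("backup failed: " + (res.errmsg || "unknown error"));
  }
  return target;
}

// In the mongo shell:
//   runBackup(db, "/var/lib/mongodb/backup")
```

Keeping the base directory on the same filesystem as the data directory preserves the hard-link fast path described above.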

Compact the database:

db.adminCommand({setParameter:1, rocksdbCompact: 1})

Depending on the size of the database, this could take from minutes to hours. After it's done, your reads should be much faster. However, fear not! Even while compacting, you can continue both reading and writing to your database without any performance impact.

You can learn more about RocksDB internals here. Also check out this presentation by the RocksDB team about what they are currently working on.

Once you play around with it, please give us your feedback at @RocksDB on Twitter. We would love to hear about your experiences! We will also be talking more about RocksDB at MongoDB World, so don't forget to register soon.

Huge thanks to Igor Canadi (@igorcanadi), the Facebook engineer leading this project!